Abstract
With the development of computer and information technology and the arrival of the digital reading wave, more and more users have turned to digital reading systems to meet their reading needs, and a variety of such systems have been created. However, most digital reading systems focus on presenting content in a better reading style, and little research has been done on how to use artificial intelligence and big data technology to provide intelligent information services and user behavior analysis. In this context, a digital reading system that collects reading behavior and analyzes it intelligently has broad research prospects. The digital reading system studied in this paper provides reading behavior analysis and intelligent recommendation services for professional users based on artificial intelligence and big data technology. The system also uses artificial intelligence technology to support bilingual reading for language learning, new-word collection and related functions, which can improve users’ learning efficiency. In addition, the system uses big data technology to provide information services such as content exchange and communication. The main innovations of this digital reading system are automatic sentence segmentation based on Chinese and English syntactic features, a layered data processing mechanism that balances the speed and quality of book parsing, and a book encryption and decryption scheme that works across computer systems. The system is based on a hybrid C/S and B/S architecture and includes reading clients for PC and Android. It presents customized ePub electronic resources, collects users’ reading behavior through the touch screen, mouse and other input devices, analyzes the user data with artificial intelligence and big data technology, and finally generates user reading reports.
At present, the system has been deployed in many universities, where the reading level of students and the work efficiency of teachers have improved considerably, which indicates that this digital reading system has high practical value.
Introduction
Digital reading has grown rapidly worldwide in recent years. A Nielsen Book Company research report shows that the share of digital reading in total global sales increased from 9% in 2010 to 27% in 2015, showing rapid growth. Its market share rose from 2009 to 2011 and increased rapidly from 2011 to 2013; from 2014 to 2015 the upward trend slowed and stabilized. In 2016, the market share of digital books in the United States accounted for 23% of the global market. The electronic reading market in the UK accounted for 20% of the British book market. Digital reading in Germany has also grown in recent years: the growth rate was 2.4% in 2012 and reached 3.9% in 2013. Judging from e-book sales in various countries, the digital reading market has emerged in recent years as a strong competitor within the global book industry [1]. In China, the digital reading industry started to develop at the beginning of the 21st century and has grown explosively in recent years [2]. The “Digital reading white paper 2016” points out that the number of domestic electronic reading users has exceeded 300 million, up 12.3% in 2015, and that the total profit of the digital reading market has reached 12 billion yuan [3]. The Chinese people’s reading survey report published by the Chinese Academy of Press and Publication on April 18 also points out that in 2016 the comprehensive reading rate across all media reached 79.9% and the digital reading rate reached 68.2%, an increase of 4.2% over 2015 and the eighth consecutive year of growth [4].
In this context, digital reading has become a common focus of the publishing industry and the Internet industry. After investigating existing digital reading products, we found a large gap in the field of bilingual reading for teaching purposes [5]. How to improve the efficiency of bilingual reading for teaching has become a key issue in an era of global integration. Based on the needs of foreign language learning and of teachers and students, this study aims to design a reading product that serves teaching and research and raises the level of foreign language education. It therefore proposes a new concept of digital bilingual reading: starting from the way of reading, it creates an original bilingual reading and learning experience and at the same time produces reading data that can be used for teaching analysis, providing strong data support for improving the quality of foreign language teaching [6].
The Android-based bilingual reading teaching service platform proposed in this paper addresses the following problems of current electronic readers for teaching: (1) reading products currently on the market cannot present the original foreign-language text and can only offer mechanical translation. This paper studies this problem and designs a reading scheme that switches the language version in real time and provides reliable sentence-by-sentence bilingual comparison; (2) teachers lack effective control over students’ reading process in bilingual reading teaching [7]. The platform collects students’ reading time, reading progress, reading frequency and other key data, cleans and analyzes them, and presents them in visual charts to support teaching quality assessment; (3) the platform abandons the inefficient web-based parsing of ePub content and adopts a “safe and efficient” native parsing solution: instead of a relatively inefficient Java implementation, content parsing is implemented in C++ and interacts with the application layer through JNI, forming a highly decoupled independent module that ensures operational efficiency. In addition, the platform obtains a dynamic secret key from the server to decrypt the underlying book content, which ensures the safety of the book resources [8].
The launch of the platform will speed up the pace of bilingual reading and teaching informatization and improve the management efficiency of colleges and universities. It can quickly meet different and multi-level demands of different audiences for online reading services, and simultaneously promote the development of the publishing industry and the reading industry.
System overall design
The bilingual reading teaching service platform provides users with bilingual reading, online community exchange, foreign language learning, college management and other services. The platform follows a modular design concept [9]: the entire system is divided into five relatively independent but interlinked modules, forming a system framework with high cohesion and low coupling, as shown in Fig. 1.
The overall system architecture design.
The resource and data storage block is the most basic part of the system, and its configuration affects the stability and efficiency of the whole system [10]. It mainly includes book data, classified subject data, the book resource library, user data and other data resources. To achieve loose coupling, the system is divided into 6 functional modules by function set. Each module runs a relatively independent process and interacts with the others through specified data interfaces, which ensures the stability of the whole system and the security of data access. The security system and the standard system safeguard the operation of the platform in terms of security and standardization.
The above functional system provides support for the bilingual reading teaching service Android clients on the operating platform to meet the needs of the overall business.
Client architecture design
The bilingual reading teaching service Android client adopts a hierarchical and modular design that separates the functional modules, each of which is responsible for implementing its own functions. The modules are relatively independent and communicate through a publish/subscribe bus and callbacks, resulting in a system with high cohesion and low coupling that is easy to extend and maintain [11]. Based on the system requirements, the Android client is divided into six modules: bookstore, bookshelf, reader, community, user and database.
Android client architecture design.
The bookstore module is responsible for recommending, searching and classifying books, including intelligent classification, book recommendation and other functional points [12]; the bookshelf module displays the books added by the user and provides book management and cloud download of books; the reader module is the core module of the system, covering book parsing, book caching, book content display and other functions, and provides a local encryption scheme for books; the community module provides an online community, including communication circles, topics, reviews, a reading corner and other functions; the user module includes login and registration (third-party login, account association), the new-word book, experience and other functional points; the database module mainly consists of database tables that store user information, book information, bookmark and note information, new-word information, role relationships and so on. The modules are connected to each other and jointly support the bilingual reading teaching service platform. This article describes only the design and implementation of the core functions.
The reader module is the core function module of the bilingual reading teaching service platform. It implements bilingual reading, precise word translation and the copyright protection mechanism. The reader module is divided into a book parsing layer, a UI cluster, a business logic cluster, a data interaction layer and a data layer. These layers are supported by the tools provided by the public key library, and inter-layer communication is implemented with EventBus [13], a highly decoupled event publish/subscribe mechanism. Data is stored in an SQLite [14] database, XML files, cache file blocks and the server database [15]. The architecture of the reader is shown in Fig. 3.
Reader architecture diagram.
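The publish/subscribe communication between layers can be illustrated with a minimal sketch. This is not the API of the EventBus library used by the system; the class `EventBus`, its methods `register` and `post`, and the event `BookParsed` are hypothetical names chosen only to show how a publisher and a subscriber stay decoupled.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class BookParsed:
    """Hypothetical event: parsing of a book has finished."""
    book_id: str


class EventBus:
    """Minimal publish/subscribe bus keyed by event class (illustrative only)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def register(self, event_type, handler):
        # A subscriber states which event class it wants to receive.
        self._handlers[event_type].append(handler)

    def post(self, event):
        # Copy the list so that a handler may register others while we iterate.
        for handler in list(self._handlers[type(event)]):
            handler(event)


# The UI side subscribes; the parsing side publishes without knowing the UI.
bus = EventBus()
opened = []
bus.register(BookParsed, lambda event: opened.append(event.book_id))
bus.post(BookParsed(book_id="book-42"))
```

The publisher only knows the event class, not the receiver, which is the decoupling property the architecture relies on.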
The content of a book needs to be parsed before the reader presents it; the process is shown in Fig. 4.
Book parsing process.
First, a new thread is taken from the thread pool to decompress the Epub package and obtain its basic information, including the supported language versions, title, author and book id, following the standard Epub parsing specification [16]. The book record is then inserted into the database table.
When the user opens a book, the system checks whether the book has a local cache (cache block). If the cache exists, the parsing process is skipped and the cached book is read and displayed directly. Otherwise, the book is parsed: the content.opf file is read through the client API to get the list of XHTML files to be parsed, the encrypted XHTML files [17] are read in chapter order to get the byte stream of the encrypted content, and the SecretKey is obtained, which is converted into the book decryption key after a certain operation. This procedure protects the key and prevents the book content from being deciphered through key disclosure, thereby realizing the copyright protection mechanism [18].
The process of obtaining the plaintext decryption key is shown in Fig. 5. SecretKey is a set of 32-byte character arrays, each adjacent two digits of the byte array (adjacent high * 16
Decryption key generation process.
This key is used to decrypt the text, yielding an input stream of the text content that is passed to the reader’s parsing engine.
The reader parses the input stream according to the file format, structures the source data into the corresponding data structure (Model), and generates a local cache block so that the Model can be obtained directly from the cache the next time. Finally, the content is drawn to the canvas. The key parts of the reader’s design are described in detail below.
1. Bilingual reading design
Readers with bilingual reading capabilities are implemented as follows:
Metadata generation stage: When the Epub is produced, paragraph ids are assigned in sequence to the html files under the OEBPS folder, following the order given in content.opf [22]. The id generation rule is as follows: the first paragraph of the first chapter has paragraph id 1, and ids increase in turn; the Epub files of the different language versions correspond one to one. For example, the label ID of the second paragraph of the first chapter of the Chinese version is 2, and the label ID of the second paragraph of the foreign language version is also 2. The rule for paragraphs in other chapters is: assuming that the first chapter contains 5 paragraphs, the label ID of its last paragraph is 5, and the label ID of the first paragraph of the second chapter is 6*(5
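The numbering rule can be illustrated with a small sketch, under the assumption that ids simply continue across chapter boundaries (so that a first chapter of 5 paragraphs is followed by id 6) and that both language versions have the same paragraph structure. The function names `assign_paragraph_ids` and `align_versions` are hypothetical and not taken from the system.

```python
def assign_paragraph_ids(chapters):
    """Give every paragraph a running id, starting at 1 and continuing
    across chapters. `chapters` is a list of chapters, each a list of
    paragraph strings."""
    next_id = 1
    labelled = []
    for paragraphs in chapters:
        chapter_out = []
        for text in paragraphs:
            chapter_out.append((next_id, text))
            next_id += 1
        labelled.append(chapter_out)
    return labelled


def align_versions(version_a, version_b):
    """Map each paragraph id to the pair of texts from two language versions.
    Assumes both versions were produced with identical paragraph structure."""
    by_id_a = {pid: t for ch in assign_paragraph_ids(version_a) for pid, t in ch}
    by_id_b = {pid: t for ch in assign_paragraph_ids(version_b) for pid, t in ch}
    if by_id_a.keys() != by_id_b.keys():
        raise ValueError("language versions do not have matching paragraphs")
    return {pid: (by_id_a[pid], by_id_b[pid]) for pid in by_id_a}


# Two chapters: 5 paragraphs, then 1 paragraph.
english = [["p1", "p2", "p3", "p4", "p5"], ["p6"]]
chinese = [["段1", "段2", "段3", "段4", "段5"], ["段6"]]
pairs = align_versions(english, chinese)
```

Because both versions share ids, switching the language version only requires looking up the same id in the other file.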
(1) Book content display process
The book content loading and presentation process is shown in Fig. 6.
Book content loading and presentation.
Page data preparation process.
Word translation flow chart.
In the drawing phase, the page number currently being drawn is obtained from pageIndex, which counts up from 1, and the first character to be drawn is then determined: if it is the yth character of paragraph x, the last character of the previous page is the (y-1)th character of paragraph x. Based on the size of the container and the space occupied by each character, the number of lines and of characters per line on this page is calculated and the coordinates of each character on the page are generated. The page data is stored line by line in an array that holds the information of each line; once data processing is finished, the data is passed to the container for drawing [24]. The data preparation phase is shown in Fig. 7.
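The page data preparation can be sketched as follows. This is a simplified illustration, not the reader's actual layout code: it assumes fixed-size characters, ignores paragraph breaks and word wrapping, and treats the book as one flat string. The names `Glyph`, `layout_page` and `paginate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    char: str
    x: int  # left edge of the character inside the container
    y: int  # top edge of the line inside the container


def layout_page(text, start, width, height, char_w, line_h):
    """Lay out one page beginning at character index `start`.
    Returns (lines, next_start); each line is a list of Glyph."""
    chars_per_line = width // char_w
    lines_per_page = height // line_h
    if chars_per_line < 1 or lines_per_page < 1:
        raise ValueError("container is smaller than one character")
    lines = []
    pos = start
    for row in range(lines_per_page):
        if pos >= len(text):
            break
        chunk = text[pos:pos + chars_per_line]
        lines.append([Glyph(c, col * char_w, row * line_h)
                      for col, c in enumerate(chunk)])
        pos += len(chunk)
    return lines, pos


def paginate(text, width, height, char_w, line_h):
    """Split the whole text into pages; page i starts where page i-1 ended."""
    pages, pos = [], 0
    while pos < len(text):
        lines, pos = layout_page(text, pos, width, height, char_w, line_h)
        pages.append(lines)
    return pages


# 3 characters per line and 2 lines per page, so 6 characters per page.
pages = paginate("abcdefghij", width=30, height=40, char_w=10, line_h=20)
```

Each page starts at the character immediately after the last character of the previous page, which mirrors the relation between positions y and y-1 described above.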
(2) Design of sentence translation, bookmarks, notes and other features
The mechanism of sentence-to-sentence translation is as follows. When a bilingual version of a book is imported, the text is split into sentences according to a sentence-breaking regular expression, and the feature information of the aligned bilingual sentences is stored. After the user selects a sentence, its feature information is extracted, the corresponding translated text is found through feature comparison [25], and the translation is presented to the user.
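A minimal sketch of this mechanism is given below. The actual regular expression and feature information used by the system are not specified here, so the sketch makes two assumptions: sentences end at common English or Chinese terminal punctuation, and the "feature" of a sentence is its paragraph id together with its position inside the paragraph. Abbreviations and decimal numbers are not handled. All names are hypothetical.

```python
import re

# Assumed sentence boundary: right after ., !, ? or their Chinese counterparts.
SENTENCE_END = re.compile(r"(?<=[.!?。！？])\s*")


def split_sentences(paragraph):
    """Split a paragraph into sentences, keeping the terminal punctuation."""
    return [s for s in SENTENCE_END.split(paragraph) if s]


def build_sentence_index(source_paragraphs, target_paragraphs):
    """Both arguments map paragraph id -> text. Returns a dict that maps
    (paragraph id, sentence position) -> (source sentence, target sentence)."""
    index = {}
    for pid, source_text in source_paragraphs.items():
        source = split_sentences(source_text)
        target = split_sentences(target_paragraphs.get(pid, ""))
        for i, sentence in enumerate(source):
            # None marks a sentence with no counterpart in the other version.
            index[(pid, i)] = (sentence, target[i] if i < len(target) else None)
    return index


def translate(index, pid, selected_sentence):
    """Look up the translation of a sentence the user selected in paragraph pid."""
    for (p, _), (source, target) in index.items():
        if p == pid and source == selected_sentence:
            return target
    return None


english = {1: "Hello world. How are you?"}
chinese = {1: "你好，世界。你好吗？"}
index = build_sentence_index(english, chinese)
```

Position-based alignment only works when both versions contain the same number of sentences per paragraph, which is why labeling at production time matters.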
The phrase translation process is shown in Fig. 8.
In addition, to make reading more convenient, the system provides bookmarks, note taking, new-word collection, brightness adjustment, font size adjustment, page-turning mode selection and reading progress selection.
2. Data acquisition design
Data on the user’s reading habits, reading interests, reading time and other behavior are important for analyzing user reading behavior, recommending personalized content intelligently and analyzing teaching data. The implementation is shown in Fig. 9.
User reading data acquisition process diagram.
While the user is reading in the smart reader, the reader runs the above data collection algorithm to collect the user’s reading time period, reading terminal and other information, and uploads them to the intelligent user data processing system for analysis. The intelligent user data processing system then derives the user’s reading habits from the collected data, which guides the system in providing customized services. All reading operations performed by the user in the reader are collected, including the duration of each reading session, its start and end time points, the reading progress, the reading speed, the use of reading aids such as taking notes and adding new words, and the associated book information. This information provides valuable raw data for the development of bilingual reading teaching programs.
User’s daily reading habits output report.
The reading duration can be determined from the start and end time points of a reading session and reflects the user’s reading patience. The time period of reading reflects the user’s reading habits. The reading progress and reading speed reflect how seriously the user reads. The level of detail and the density of the notes taken during reading can be used to evaluate the user’s reading investment. The system analyzes and processes these data according to preset rules, and the output provides a valuable reference for the development of bilingual reading teaching programs. Part of the report that the system generates from the user reading data is shown below.
User reading habits chart.
The user reading habits chart plots the teaching week on the horizontal axis and the reading amount of a single user or of the user’s entire class on the vertical axis. The chart shows intuitively whether a reader reads habitually or in last-minute bursts (surprise reading), which helps to evaluate the overall teaching quality.
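The data behind such a chart can be sketched as a simple aggregation. The sketch assumes that each collected session records its teaching week together with the start and end reading progress, measured in words; the session format and the function name `weekly_reading_amount` are illustrative assumptions, not the system's actual schema.

```python
from collections import defaultdict


def weekly_reading_amount(sessions):
    """Sum the reading amount per teaching week.
    `sessions` is an iterable of (teaching_week, start_progress, end_progress),
    with progress assumed to be counted in words."""
    totals = defaultdict(int)
    for week, start, end in sessions:
        # A session that ends before it starts (e.g. paging back) adds nothing.
        totals[week] += max(0, end - start)
    return dict(sorted(totals.items()))


sessions = [(1, 0, 300), (1, 300, 500), (3, 500, 1500)]
chart_data = weekly_reading_amount(sessions)
```

A series with steady weekly values suggests habitual reading, while a series concentrated in a few weeks suggests surprise reading.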
The starting and ending progress of each reading can be used to calculate the amount of reading each time. The ratio of the amount of reading to the length of each reading is the reading speed. This paper uses big data technology to analyze a large number of samples, and summarizes a comparison table of reading speed, through which users can evaluate the reading effect.
User reading speed analysis report.
After a comparative analysis of the actual reading quality of a large number of users and the reading speed calculated by the system, this paper proposes a reference table for reading speed grading, which is divided into seven grades: “less than 50 words/minute”, “between 50 words/minute and 100 words/minute”, “between 100 words/minute and 150 words/minute”, “between 150 words/minute and 200 words/minute”, “between 200 words/minute and 300 words/minute”, “between 300 words/minute and 500 words/minute” and “greater than 500 words/minute”. A reading speed greater than 500 words/minute is generally considered invalid reading, because a normal reader cannot read with high quality at that speed. The remaining reading speeds can provide improvement suggestions for the publishing industry and the education industry. For example, if the statistical report shows that the reading speed for a certain part is generally low, this indicates that the content of this part is relatively difficult to understand. The publisher can adjust the expression and organization of this part, and the teacher can plan more explanation for it to improve teaching quality.
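The speed calculation and grading can be sketched as follows. The grade boundaries are those listed above; how a value that falls exactly on an interior boundary is graded is not stated, so the sketch assigns it to the higher grade, while exactly 500 words/minute stays in the "300 to 500" grade because only speeds greater than 500 are treated as invalid. The function names are hypothetical.

```python
from bisect import bisect_right

# Interior boundaries in words/minute; speeds above 500 are handled separately.
BOUNDS = [50, 100, 150, 200, 300]
LABELS = [
    "less than 50 words/minute",
    "50 to 100 words/minute",
    "100 to 150 words/minute",
    "150 to 200 words/minute",
    "200 to 300 words/minute",
    "300 to 500 words/minute",
    "greater than 500 words/minute",
]


def reading_speed(start_progress, end_progress, minutes):
    """Reading speed = amount read / reading duration (words per minute)."""
    if minutes <= 0:
        raise ValueError("reading duration must be positive")
    return (end_progress - start_progress) / minutes


def grade(speed):
    """Map a speed to one of the seven grades (boundary handling is assumed)."""
    if speed > 500:
        return LABELS[-1]
    return LABELS[bisect_right(BOUNDS, speed)]


def is_valid_reading(speed):
    """Speeds above 500 words/minute are treated as invalid reading."""
    return speed <= 500


speed = reading_speed(start_progress=1200, end_progress=1800, minutes=4)
```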
As a core function, the reader module realizes the functions of book parsing, book loading, local caching, content presentation and reading data statistics. In addition, cross-platform multi-encryption copyright protection mechanisms and metadata caching solutions are provided.
System architecture implementation
The digital bilingual reading system uses the MVC architecture [26] to split the whole system into View, Controller and Model. The operation mechanism of the MVC architecture is shown in Fig. 13. The View receives user requests and passes them to the Controller. The Controller processes the requests and notifies the Model to update the data. The Model notifies the View, and the View obtains the updated data to respond to the user.
MVC architecture diagram.
Based on the MVC system architecture, the implementation of the system is divided into view layer, service layer, logical business layer, data access layer, interlayer communication and public component, as shown in Fig. 14.
Architecture diagram of system implementation.
The view layer consists of Activity, Fragment, View templates, View adapters, etc., and is responsible for displaying the interface. The service layer provides back-end services for the entire system and mainly handles time-consuming operations, including resource and information collection. The business logic layer processes the business logic of the functional modules: a Controller instance for the corresponding business logic is declared in the view layer or the service layer, and the View asks the business logic layer to execute that logic by calling the Controller implementation. The main business logic of this system includes resource downloading, book parsing, book display, bookshelf management, community logic and personal management. The data exchange layer is responsible for the interaction between the business logic layer and the data layer, and provides the reflection-based ORMLite database operation framework [27], the Android SharedPreferences storage tool, a JSON data parsing scheme and a SAX parser [28] that improves the efficiency of processing large files. The data layer includes the SQLite database, XML files used to store key-value pairs, and cache files. The inter-layer communication module contains Intent (the API for communication among the four Android components), Gesture (responsible for the interaction between the user and the screen, including a relatively complex click event delivery mechanism), EventBus (an event publish/subscribe framework based on the observer pattern that provides highly decoupled communication between modules), Callback, OkHttp (an open-source network communication framework responsible for data transmission between the client and the server) and Reflect (building class objects, obtaining class methods, etc. through the reflection mechanism). The public module is composed of tools that support the whole system; the other layers can call the API of this module to achieve specific functions. It contains log printing tools, image loading tools, network state recognition tools, crash information collection tools, string processing tools, document processing tools and other service tools.
These seven layers constitute a complete digital bilingual reading system that offers readers an effective way to learn foreign languages and communicate online.
Reader module code logic structure is shown in Fig. 15.
Reader module code logic structure.
EpubKernel (the Epub parsing tool) preprocesses the Epub resource package, which includes decompressing the file (implemented by FileHelper), generating the corresponding book record in the local database, etc. Because this is a time-consuming operation, it runs in a sub-thread started with the AsyncTask provided by the Android system. ZLXMLReader (the book content parser) obtains the book content resources chapter by chapter, and DecrypDES is used to decrypt the data into the corresponding BookModel. BookModel holds the parsed book content and passes the data to ZLAndroidWidget for display. If the book does not exist in the local cache, a local cache file is generated by calling the CachedCharStorage interface and stored locally with a .cache suffix. ReadingActivity is the entry point of the entire reader and is responsible for:
(1) presenting the content: the data in BookModel format is drawn into a Bitmap by ZLAndroidWidget and presented to the user; (2) integrating plug-in features, including gesture-based bilingual content switching, sentence matching and rendering, content fragment retrieval, new-word addition and mastery analysis; (3) providing a user-friendly interface, including reader appearance, screen brightness and font size adjustment; (4) navigation: table-of-contents navigation (TocFragment), bookmarks (BookMarkerFragment) and cloud-synchronized notes (BookMarkerFragment).
Reader module implementation renderings.
In order to realize the copyright protection mechanism, a set of book content encryption and decryption scheme is designed based on DES encryption algorithm, as is shown in Fig. 17.
Encryption and decryption implementation plan.
Book content is encrypted in the Java layer and decrypted in the C++ layer, so the implementation involves data interaction between the Java language and the C++ language. This paper uses JNI to implement this cross-language interaction [29].
The source data is an e-book in Epub format. On the Java layer, the EpubXmlParser parser is called to parse the toc.ncx file and obtain tocList, the list of file paths to be encrypted. The tocList is traversed, the FileInputStream of each file is obtained according to its path index, and the desEncrypt method is called to produce the encrypted book source data. When a book is parsed, the C++ layer calls the ZLXmlReader parser to parse the toc.ncx file and obtain the tocList of paths to be decrypted, traverses the tocList, obtains the input stream of each file according to its path index, and calls the desDecrypt method to obtain the decrypted book source data, which is passed to the subsequent processing steps. Because this process spans two languages, attention must be paid to the encoding format, the block mode and the data padding method.
1. A unified coding format
Java internally uses 16-bit Unicode encoding (UTF-16) to represent strings, so each common Chinese or English character takes 2 bytes. JNI represents strings in (modified) UTF-8, a variable-length Unicode format in which ordinary ASCII characters take 1 byte and Chinese characters take 3 bytes. C/C++ works with the raw data: ASCII characters take 1 byte, and Chinese text usually uses GB2312 encoding, in which a Chinese character takes 2 bytes [30]. A UTF-16 string from the Java layer is passed from the JVM to the JNI layer and arrives in C/C++ as a jstring. JNI provides two methods for reading a jstring: GetStringUTFChars, which returns the string in UTF-8 format, and GetStringChars, which returns the string in UTF-16 format. In addition, if the input contains Chinese, the result of either method needs to be further converted into GB2312 encoding [31]. Transcoding is shown in Fig. 18.
Code conversion.
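The byte counts quoted above, and the conversion from UTF-8 to GB2312, can be checked with a short sketch. It uses Python's standard codecs rather than JNI, so it only illustrates the encodings themselves; Python's UTF-8 is standard UTF-8, which matches the modified UTF-8 used by JNI for the characters shown here. The function names are hypothetical.

```python
def byte_lengths(ch):
    """Number of bytes one character occupies in each encoding."""
    return {
        "utf-16": len(ch.encode("utf-16-le")),  # without byte-order mark
        "utf-8": len(ch.encode("utf-8")),
        "gb2312": len(ch.encode("gb2312")),
    }


def utf8_to_gb2312(data):
    """Transcode UTF-8 bytes (as handed over at the JNI boundary)
    into GB2312 bytes for the native side."""
    return data.decode("utf-8").encode("gb2312")


ascii_sizes = byte_lengths("A")
chinese_sizes = byte_lengths("中")
native_bytes = utf8_to_gb2312("中".encode("utf-8"))
```

The same text therefore has three different byte representations on its way from the Java layer to the C++ layer, which is why both sides must agree on the encoding before encrypting or decrypting.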
2. Unified block mode and padding method
The DES algorithm is a block cipher: encryption and decryption are performed block by block. Common block modes include CBC, ECB and CFB. When the data to be encrypted does not fill a whole block, it must be padded. Common padding schemes are NoPadding, PKCS5Padding, PKCS7Padding, etc. The encryption side and the decryption side must use the same block mode and padding scheme.
Parsing an Epub file is time-consuming. If the file had to be re-parsed every time the same book is opened, the response speed of the system would suffer seriously. An effective caching scheme greatly reduces the number of times books are parsed, speeds up system response and improves the user experience. The system’s local caching scheme is shown in Fig. 19.
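To show why both sides must agree on padding, the following sketch implements PKCS5-style padding for the 8-byte DES block size. It covers only the padding step, not the DES cipher itself, and which padding scheme the system actually uses is not stated, so this choice is an assumption. The function names are hypothetical.

```python
BLOCK_SIZE = 8  # DES operates on 8-byte blocks


def pkcs5_pad(data):
    """Append N bytes of value N so the length becomes a multiple of 8.
    A full extra block is added when the data is already aligned."""
    pad_len = BLOCK_SIZE - len(data) % BLOCK_SIZE
    return data + bytes([pad_len]) * pad_len


def pkcs5_unpad(padded):
    """Remove the padding and reject input that was not padded this way."""
    if not padded or len(padded) % BLOCK_SIZE != 0:
        raise ValueError("input is not a whole number of blocks")
    pad_len = padded[-1]
    if not 1 <= pad_len <= BLOCK_SIZE:
        raise ValueError("invalid padding length")
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid padding bytes")
    return padded[:-pad_len]


padded = pkcs5_pad(b"abc")
```

If the Java side pads like this but the C++ side expects unpadded data, the decrypted chapter ends with stray bytes or fails validation, which is the mismatch the unified scheme avoids.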
Library loading code logic.
When a book is opened, the system checks whether a local cache of the book exists. If it does, the metadata cache is loaded directly and the book content is drawn. If it does not, the book is parsed. The local cache space is designed to be 200 MB. Taking the user’s reading habits into account, LRUCache is used to implement the local cache. The design principle of LRUCache is that when the storage space is full, the least recently used resource is released [32]. LRUCache is realized by HashMap
The implementation of the caching scheme obviously improves the response speed of the system when loading the books. The comparison between the caching load time and the direct loading time is shown in Table 1.
Cache load time and direct load time comparison table
Caching mechanism.
As Table 1 shows, opening a book from the cache takes less time than opening it directly, and the larger the book file, the more obvious the improvement. As book content grows, the cache reading scheme brings greater performance gains.
Bilingual reading is the core function of the entire system, and a gesture for switching the language version is added to make reading more convenient. To improve switching efficiency, a content pre-loading mechanism is used, and the current language version and reading progress are restored from the user’s last reading record. Data collection points (buried points) are also added to the functional flow to gather data for user statistics. The bilingual reading implementation code is outlined in Fig. 21.
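The LRU-based local cache described in this section can be sketched as follows. This is an illustration of the eviction policy only: it uses Python's `OrderedDict` instead of the Android `LruCache` class, keeps entries in memory rather than in .cache files, and interprets the 200 MB limit as 200 * 1024 * 1024 bytes. The class name `LruBookCache` is hypothetical.

```python
from collections import OrderedDict


class LruBookCache:
    """Size-bounded cache that evicts the least recently used book first."""

    def __init__(self, capacity_bytes=200 * 1024 * 1024):
        self.capacity = capacity_bytes
        self.used = 0
        self._entries = OrderedDict()  # book id -> cached bytes, oldest first

    def get(self, book_id):
        """Return the cached data, or None so the caller parses the book."""
        if book_id not in self._entries:
            return None
        self._entries.move_to_end(book_id)  # mark as most recently used
        return self._entries[book_id]

    def put(self, book_id, data):
        if book_id in self._entries:
            self.used -= len(self._entries.pop(book_id))
        if len(data) > self.capacity:
            return  # larger than the whole cache: do not store it
        while self.used + len(data) > self.capacity:
            _, evicted = self._entries.popitem(last=False)  # oldest entry
            self.used -= len(evicted)
        self._entries[book_id] = data
        self.used += len(data)


# Tiny capacity so that eviction is visible.
cache = LruBookCache(capacity_bytes=10)
cache.put("a", b"aaaa")
cache.put("b", b"bbbb")
cache.get("a")           # "a" becomes the most recently used entry
cache.put("c", b"cccc")  # does not fit, so "b" is evicted
```

Opening a book then reduces to calling `get` first and falling back to parsing followed by `put` on a miss.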
Bilingual reading code logical structure.
The gesture is captured by the onTouchEvent method of the content display container ZLAndroidWidget. The system determines the user’s intention from the sliding distance of the gesture. When the sliding distance in the vertical direction is greater than 200 and the sliding distance in the horizontal direction is less than 80, the gesture is interpreted as a request to switch the language version, and the system calls the FBLayuage method to change FBangle and perform the language version switch [34].
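The decision rule can be sketched in a few lines. The thresholds 200 and 80 are taken from the text; their unit is not stated, so the sketch treats them as plain numbers in the same unit as the touch coordinates. The function name `is_language_switch` and the coordinate format are assumptions for illustration.

```python
def is_language_switch(start, end, min_vertical=200, max_horizontal=80):
    """Return True when a swipe from `start` to `end` (each an (x, y) pair)
    moves more than `min_vertical` vertically and less than
    `max_horizontal` horizontally."""
    dx = abs(end[0] - start[0])
    dy = abs(end[1] - start[1])
    return dy > min_vertical and dx < max_horizontal


mostly_vertical = is_language_switch((100, 100), (120, 400))
diagonal = is_language_switch((100, 100), (300, 400))
```

Requiring a small horizontal distance keeps the language switch from being confused with an ordinary page-turn swipe.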
At the same time, the ZLAndroidWidget also contains an entry for user reading behavior acquisition that transforms all user actions into validated research data.
Based on research into the current state of e-reading in domestic and foreign markets, this paper designs and implements an Android-based bilingual reading teaching service platform that meets the needs of the bilingual reading teaching segment. It studies reading functions and user experience in depth and abandons the inefficient web-based book parsing used by most terminal readers. The custom native reading container built for the system provides a more efficient, attractive and personalized reading experience, while the fine-grained collection of user reading behavior data provides scientific data support for bilingual teaching.
The highlights of this system are a flexible bilingual reading mode, original sentence-to-sentence translation, efficient e-book parsing and a more secure copyright protection mechanism. The user experience is closely combined with gestures so that e-reading can be more comfortable than paper reading. The translation function is realized by labeling custom features during book production, which makes it possible to match each sentence accurately. To speed up book parsing, a local book metadata caching mechanism based on the LRU algorithm is designed and implemented [35]. In addition, an efficient and reliable encryption mechanism spanning the local and remote ends provides complete copyright protection for book resources.
Designed for high cohesion and low coupling, the bilingual reading teaching service platform is a powerful, stable and reliable system that addresses the difficulty of supervising extracurricular reading in schools. It provides students with a platform that combines bilingual reading, accurate word translation, strong service support, online reading and foreign language learning exchange. It offers guidance for the development of bilingual reading and represents a significant exploration and practice in this field.
Acknowledgments
This paper is supported by Beijing Key Laboratory of Work Safety Intelligent Monitoring (Beijing University of Posts and Telecommunications).
