Abstract
This study evaluated a bilingual text-mining system, which incorporated a bilingual taxonomy of key words and provided hierarchical visualization, for understanding learner-generated text in learning management systems through automatic identification and counting of matching key words. A class of 27 in-service teachers studying a course, “e-Learning in primary mathematics,” was asked to reflect on “what is e-Learning” before and after the course. Their concept of “e-Learning” was investigated by counting the matching key words using the text-mining system and by a content analysis of learner-generated text using a rubric. The correlations between the results of these two methods were .823 and .840 in the preteaching and postteaching reflections, respectively. This text-mining system has potential as a supporting tool for teachers to gain a general understanding of learner-generated text, using the hierarchical visualization to support pedagogical decision-making, and it can be applied to massive open online courses with a large enrolment of learners.
The use of online discussion forums of learning management systems (LMSs) and social networks in virtual education environments not only enhances learners’ collaboration and discussion but also helps them reflect on their understanding of key concepts of domain knowledge. The growing volume of text in these learning environments documents what is actually happening in the learning process, and effective methods are needed to process and comprehend these textual data (Gašević, Dawson, & Siemens, 2015). In addition to extracting statistics about learners’ participation and engagement from these learning environments, topic modeling based on semantic and linguistic methods has been adopted to analyze the quality of learners’ collaboration using their textual data, such as identifying the structure and topics of online-threaded discussions and supporting collaborative writing processes by providing visualization of discovered topics as feedback to learners (Southavilay, Yacef, Reimann, & Calvo, 2013). Since topic modeling enables teachers to discover the topics discussed by learners, various text-mining techniques have been applied to assess the conceptual knowledge of learners (Ming & Ming, 2015; Sherin, 2012; Williams & D’Mello, 2010).
The studies above incorporate the techniques of topic modeling in identifying learners’ conceptual understanding without using expert content knowledge. This study focuses on exploring learners’ conceptual understanding of a domain and provides a domain-specific model from learner-generated text about their degree of proficiency in a domain. Some previous studies have presented the construction of conceptual models for identifying learners’ domain concepts covered by their text in English (Dzikovska, Steinhauser, Farrow, Moore, & Campbell, 2014; Pérez-Marín & Pascual-Nieto, 2010), in which a conceptual model may be specified by a concept map, conceptual diagram, hierarchical model, or other knowledge representation formats. In these models, each concept is represented by a set of predefined answers or references.
Although a variety of text-mining techniques have been presented for understanding learner-generated text, the majority of previous work has addressed the analysis of English text. Recently, text-mining techniques have been extended to Chinese text for tasks such as topic discovery (Zhao, Qin, Liu, & Tang, 2016), concept mining (Zhou & Wang, 2010), and domain ontology construction (Liu, He, Lim, & Wang, 2014). In Hong Kong, learners are allowed to choose the language, either English or Chinese, in which they can express themselves well in the learning process. This poses challenges in analyzing these text data using existing text-mining techniques. Bilingual text mining for understanding the domain concepts covered by learners’ text that mixes English and Chinese therefore becomes a research agenda in this context.
Background of Study
Text-Mining Applications in Learner-Generated Text
The term text mining, or knowledge discovery from text, was first mentioned by Feldman and Dagan (1995) for the machine-supported analysis of text. An emerging trend in text mining is to understand educational practices and research issues related to text generated by learners (Romero & Ventura, 2010). Data mining tools are usually designed to deal with structured data from databases. In contrast, text mining focuses on finding and extracting useful and interesting information from unstructured text. With the advancement of natural language processing techniques, text-mining algorithms have been employed to extract and analyze learning-related text from online discussion forums for educational purposes such as extracting opinions in e-Learning systems (Song, Lin, & Yang, 2007), facilitating the automatic coding process for the study of a discussion forum (Lin, Hsieh, & Chuang, 2009), and developing a visual analytics system for students to overview their contributions in online discourse (Teplovs, Fujita, & Vatrapu, 2011).
In recent years, studies have focused on understanding learners’ conceptual models from learner-generated text. Learner-generated text in this study refers to text generated from discussions and self-reflections in the learning process using features such as discussion forums in LMSs. Pérez-Marín and Pascual-Nieto (2010) studied the automatic generation of learners’ conceptual models using a hierarchical structure of domain knowledge with learner-generated text. Before this conceptual model was constructed for each learner, the domain knowledge and the related concepts were defined by the teacher, who provided a set of questions per concept and the corresponding correct answers or references. Lárusson and White (2012) developed an analysis method using a lexical reference to enable teachers to visualize the originality of the words used in learners’ writings. In this analysis, a set of key concepts covered by the lecture was manually input as the query terms to determine the originality of a learner’s blog post in an introductory course of Computer Science. Dzikovska et al. (2014) developed an adaptive feedback system supporting the analysis of learner-generated text to address the challenge of teaching conceptual topics in Physics. The system compared the objects and relations of topics in the answers of learners with the predefined reference answers to come up with feedback about the correctness, contradictory parts, missing chunks, and irrelevant fragments of answers. These studies indicate that the learners’ conceptual models distilled from learner-generated text match well with the manual model initially worked out by the teachers.
Multilingual Text Mining
The above techniques generally focus on the processing of English text from online discussions to understand and improve pedagogical practices. Nevertheless, text mining on multilingual text data has been increasingly investigated and is becoming a new trend in the field of text mining (Gupta & Lehal, 2009). Early studies on processing text corpora in different languages focused on word matching across languages. For example, Fung (1995) proposed a context heterogeneity measure between words and their translations to compile bilingual lexicon entries from a nonparallel English-Chinese corpus. In addition, statistical approaches have been adopted to analyze bilingual text for word matching (e.g., Smadja, McKeown, & Hatzivassiloglou, 1996). Information extraction is one of the major functions in the field of text mining, and considerable effort has been devoted to tackling this issue for multilingual text. Riloff, Schafer, and Yarowsky (2002) presented the integration of cross-language projection into an information extraction system to learn the extraction rules for a new language automatically. Some studies have addressed the development of lexicon or key word extraction from text in different languages. Chu, Nakazawa, and Kurohashi (2014) proposed an approach to bilingual lexicon extraction from comparable corpora based on topical and contextual knowledge on Chinese-English and Japanese-English Wikipedia, while Huang, Chen, and Yang (2015) presented a method to extract key words in one language with the help of the other by estimating preferences for topical key words and combining language-specific word statistics.
Recently, identifying underlying topics from text in different languages has become an active research area. Techniques derived from latent Dirichlet allocation (LDA) were commonly proposed to extract bilingual topics from unaligned corpora in two languages such as using term matching across languages (Boyd-Graber & Blei, 2009). Besides, a multilingual topic model with multilevel hyperpriors was proposed to capture the variation in topics across multilingual documents (Krstovski, Smith, & Kurtz, 2016). By detecting a set of common topics from multilingual nonparallel text data and finding the differences in perspectives on these topics across languages, a statistical model was developed to identify the cross-cultural differences (Gutiérrez, Shutova, Lichtenstein, de Melo, & Gilardi, 2016).
On the one hand, the above multilingual text-mining applications have focused on the development of automated techniques for information extraction, lexicon and key word extraction, and topic discovery from multilingual text. On the other hand, an increasing number of studies are committed to revealing learners’ conceptual understanding by examining learner-generated text (Pérez-Marín & Pascual-Nieto, 2010; Lárusson & White, 2012; Dzikovska et al., 2014). However, the text-mining techniques that attempt to understand the conceptual models of learners are limited to text generated in a single language, such as English. This study therefore addressed this research gap by exploring the use of a conceptual model together with multilingual text-mining techniques to understand the domain concepts covered by learner-generated text in LMSs in a multilingual context, focusing on the use of both English and Chinese.
Hierarchical Visualization
Users often find it difficult to understand the content and structure of a huge amount of data in complex form. In this regard, techniques of visual representation have been studied to enable data to be represented in a manageable overview and allow a structure of visualization at a desired level of detail (de Oliveira & Levkowitz, 2003). For this purpose, Elmqvist and Fekete (2010) presented a model using hierarchical aggregation for information visualization to improve the overview and scalability of large-scale visualization. Such hierarchically aggregated visualization provides users with an overview of the data with multiscale representations so that they can retrieve details of the data at various levels on demand (Shneiderman, 1996). Bostock, Ogievetsky, and Heer (2011) proposed the use of D3 (Data-Driven Documents) to build this hierarchical visualization on the web platform. By making use of scalable vector graphics, D3’s standardized representation enables not only better expressiveness and accessibility but also improved performance in web browsers.
The Design of a Bilingual Text-Mining System With Hierarchical Visualization
This study focused on the application of a bilingual text-mining system with hierarchical visualization support for teachers to understand learner-generated text in the learning process. It involves text mining using a bilingual taxonomy of key words on a topic to identify and count the matching key words from their text. The results of the bilingual text mining are displayed in layers using a hierarchical structure for visualization of matching key words. The bilingual taxonomy of key words is set up initially by the teacher from a theoretical perspective. It is enriched by looking up relevant words and terms used by learners as well as key words discovered by LDA modeling.
Bilingual Text-Mining Design
Concept hierarchy is important for any domain knowledge because it allows learners to formulate relations in an abstract and concise way for facilitating the development, refinement, and understanding of knowledge (Cimiano, Hotho, & Staab, 2005). With a focus on exploring learners’ conceptual understanding, this study adopted a conceptual model using a hierarchical structure of a bilingual taxonomy on a domain to understand learner-generated text. Such a bilingual taxonomy includes a set of concepts, and each concept consists of a number of subconcepts in order to capture domain knowledge at various granularities. This parent–child relationship provides an effective way to represent and organize the concepts of a domain. For example, Figure 1 shows the hierarchical structure of a bilingual taxonomy on a domain with n concepts, each comprising a number of subconcepts, where there are p and q subconcepts in the first and the nth concepts, respectively.
Figure 1. The hierarchical structure of a bilingual taxonomy on a domain.
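The parent–child structure described above can be sketched as a small tree of nodes. This is a minimal illustration, not the system's actual data model: the class name `Concept`, the helper functions, and the placement of the example key words (including the Chinese synonyms) are assumptions made for the example, and the study's finalized taxonomy is not reproduced here.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Concept:
    """A node of the taxonomy: a concept (or subconcept) with its key word sets."""
    name: str
    # Each inner list is one set of bilingual key words with similar scope.
    keyword_sets: list[list[str]] = field(default_factory=list)
    subconcepts: list[Concept] = field(default_factory=list)


def depth(node: Concept) -> int:
    """Number of layers in the subtree rooted at `node` (a leaf has depth 1)."""
    return 1 + max((depth(child) for child in node.subconcepts), default=0)


def walk(node: Concept, level: int = 1):
    """Yield (level, concept) pairs in depth-first order."""
    yield level, node
    for child in node.subconcepts:
        yield from walk(child, level + 1)


# Illustrative content only (hypothetical placement of key words).
taxonomy = Concept("e-Learning", subconcepts=[
    Concept("technology", subconcepts=[
        Concept("e-resources", keyword_sets=[["video clip", "短片"]]),
        Concept("digital ways of collecting data", keyword_sets=[["text mining"]]),
    ]),
    Concept("pedagogy", subconcepts=[
        Concept("principles/models/theories/strategies",
                keyword_sets=[["collaborative learning", "協作學習"]]),
    ]),
])

# Print the hierarchy with one indentation step per layer.
for level, concept in walk(taxonomy):
    print("  " * (level - 1) + concept.name)
```

Keeping the key word sets on the nodes means that any part or layer of the taxonomy can be inspected by walking the corresponding subtree.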
Setup of a bilingual taxonomy of key words
In Hong Kong, learners commonly use a mixture of English and Chinese in written communications, and therefore learner-generated text accumulated in LMSs usually includes English and Chinese words and terms. A bilingual taxonomy of key words is therefore adopted in this study to include a number of bilingual key words for each concept and subconcept for matching with text generated by the learners in order to comprehend their understanding of a domain. Following the work of Daems, Erkens, Malzahn, and Hoppe (2014) and Pérez-Marín and Pascual-Nieto (2010), the bilingual taxonomy adopted in this study sets a number of specific words and terms characterizing each concept and subconcept. When learners are asked to discuss their understanding of a domain, some of these key words will be covered in their elaborations of the related concepts and subconcepts of the domain. In other words, learners’ understanding of each concept and subconcept of the domain may be represented by the related key words used in their text. Since the concepts and subconcepts are structured hierarchically by this bilingual taxonomy, learners’ understanding of the domain can be overviewed, and their understanding of any part and layer of the bilingual taxonomy can be identified. In this study, the bilingual taxonomy of key words is initially designed by the teacher through reviewing the related literature of the domain and the relevant curriculum.
Enrichment of bilingual taxonomy of key words by manual search and the support from LDA modeling
Although a bilingual taxonomy of key words is built according to teachers’ review of the related literature and curriculum of the domain, learners are highly likely to use key words other than the ones listed in the bilingual taxonomy to express their understanding of the concepts and related subconcepts. Teachers therefore need to enrich the key words in the bilingual taxonomy in two ways: by extracting key words directly from a sample of the learner-generated text in related LMSs and by using key words discovered by LDA modeling. LDA is a generative probabilistic model developed to discover latent topics in a document collection (Blei, Ng, & Jordan, 2003). The top words of latent topics from the LDA model may be employed to complement the original key words designed by the teachers. This study follows the recommendations by Blei et al. (2003) and Yin and Wang (2014) to preprocess the input text by part-of-speech tagging using the Stanford CoreNLP toolkit (Manning et al., 2014) in order to extract nouns, verbs, and adjectives from the input text, whether in English or Chinese. The condensed text containing these three types of words is then processed by LDA to reveal the latent topics of the input text, where each topic is described by a probability distribution of words. With reference to the words with high probabilities for each topic, the teachers can finalize the bilingual taxonomy of key words, which includes synonyms in both English and Chinese to accommodate the languages used in learner-generated text.
English and Chinese Text Processing
The input English text is first divided into a sequence of words, symbols, and other meaningful elements known as tokens by the Stanford CoreNLP toolkit (Manning et al., 2014). The Stanford Parser (Klein & Manning, 2003) is adopted for English word segmentation in this study. The process of chunking the sentence is then conducted by grouping the parser leaves with the same parent as a term. In the final step, lemmatization is performed so that the lemma of each word can be determined. Then, the words with the same lemma are grouped as a single item to improve the quality of word indexing.
Chinese text segmentation is the most important part of Chinese text processing because a Chinese term may consist of several characters with no spaces between them. For example, “短片” is treated as a single term of two Chinese characters meaning “video clip” rather than as the individual characters “短” and “片,” which mean “short” and “slice,” respectively. This study adopted the conditional random field (CRF)-Lex Chinese word segmenter (Chang, Galley, & Manning, 2008) to split Chinese text into sequences of terms and identify the part of speech of the resulting terms; its external lexicons contain 423,224 distinct entries to improve the segmentation consistency of the CRF model.
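To illustrate why segmentation matters, the following toy sketch splits unspaced text by greedy longest match against a small lexicon. It is deliberately not the CRF-Lex segmenter used in the study: the function `segment`, the two-entry lexicon, and the sample string are assumptions for illustration, and a real lexicon would be far larger.

```python
def segment(text: str, lexicon: set[str]) -> list[str]:
    """Greedy forward longest-match segmentation (a toy stand-in, not CRF-based).

    At each position the longest lexicon entry is taken; if none matches,
    a single character is emitted.
    """
    max_len = max(map(len, lexicon), default=1)
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical lexicon: "短片" (video clip) and "學習" (learning).
lexicon = {"短片", "學習"}
print(segment("短片學習", lexicon))  # ['短片', '學習']
print(segment("短片", set()))        # ['短', '片'] when the term is unknown
```

With the lexicon entry present, “短片” survives as one term and can be matched against a key word; without it, the text falls apart into single characters that would match nothing.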
Hierarchical Visualization Design
A hierarchical structure of a bilingual taxonomy is adopted for understanding the domain concepts covered by learner-generated text; text-mining results which contain those matching key words with counting figures are recorded in a hierarchical structural list. Hierarchical visualization in this study refers to the provision of hierarchically aggregated results for visualization through web browsers in which users can overview the data with multiscale representations to retrieve details of data at various parts and levels on-demand. These features enable users to understand the hierarchical results obtained from text mining. The method for hierarchical aggregation presented by Elmqvist and Fekete (2010) is adopted in this study to facilitate the reporting and the easy comprehension of the subordinate or membership relations among the matching key words obtained from the text-mining results. D3 (Bostock et al., 2011) is then employed to enable users to visualize the text-mining results with a large-scale structure interactively through web browsers.
This study adopted the technique presented by Teoh and Ma (2002) for a ringed circular layout of nodes to visualize the hierarchical results from text mining; clicking on the visualization navigates through the different hierarchical layers. In this visualization model, each concept or subconcept is represented by a blue ringed circular layout with balls inside. The shade of blue becomes deeper with each additional layer of subconcepts (for details, see the part “Results and Discussion”). Each ball denotes a set of bilingual key words with similar scope. The ball is orange if at least one of these key words matches words or terms mentioned by learners and gray otherwise. The size of an orange ball is proportional to the matching counts of these key words. The matching of learner-generated text with the key words of any concept or subconcept can be studied by viewing the number of orange balls together with their sizes, which reveal, to a certain extent, learners’ conceptual understanding of the domain.
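The color and size rule for the balls can be sketched as a small mapping from a matching count to display attributes. This is an assumption-laden illustration: the text does not say whether "size" refers to radius or area, so the sketch scales the radius, and the function name `ball_style` and the default values are hypothetical.

```python
def ball_style(match_count: int, unit_radius: float = 2.0,
               gray_radius: float = 2.0) -> dict:
    """Map the matching count of one key word set to a ball's color and radius.

    Assumption: the radius (not the area) is proportional to the count.
    """
    if match_count < 0:
        raise ValueError("match_count must be non-negative")
    if match_count == 0:
        # No key word in the set was matched: a gray ball of fixed size.
        return {"color": "gray", "radius": gray_radius}
    # At least one match: orange, with radius proportional to the count.
    return {"color": "orange", "radius": unit_radius * match_count}


print(ball_style(0))  # {'color': 'gray', 'radius': 2.0}
print(ball_style(3))  # {'color': 'orange', 'radius': 6.0}
```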
The Use of Bilingual Text-Mining System to Examine Learner-Generated Text
Figure 2 shows users’ view of the bilingual text-mining process, which has three parts: text extraction, use of the bilingual text-mining system, and hierarchical visualization. Text data are generated in LMSs, for example, as text produced by learners in the discussion forums of these platforms. In order to conduct text mining, learner-generated text needs to be extracted from these platforms. In this study, a Moodle plug-in is designed for teachers to extract the text data from discussion forums for text mining. The bilingual text-mining system of this study consists of a text-mining interface, a text-mining server, and a database. The web-based text-mining interface is developed in PHP 5.5. There are five webpages in this user interface, namely, “template download,” “manage key word,” “import text,” “text mining,” and “result” (see Figure 3). This user interface is designed to facilitate the preparation of the text data extracted from LMSs for mining. The interface also has functions for the design and input of the bilingual taxonomy of key words, the administration of the text-mining process, and the postprocessing of text mining for generating hierarchical visualization results.
Figure 2. Users’ view on text extraction, using the bilingual text-mining system and hierarchical visualization of the bilingual text-mining process.
Figure 3. The “Result” webpage of the user interface of the text-mining system.

Through the web-based interface, users can instruct the server to perform text mining over the Internet without installing the required programs on their own computers. The text-mining interface enables the matching of the words and terms in learner-generated text with the ones in the bilingual taxonomy and is therefore able to identify and count the matching key words. The database is built using MySQL 5.6 to store text-mining data, including learner-generated text, the bilingual taxonomy of key words, and text-mining results, so that users can access these data through the network. The hierarchical results produced from text mining are saved in text file format, which can be viewed and downloaded from the “Result” webpage. To help users browse the text-mining results conveniently, the text-mining interface allows them to produce an online visualization of the hierarchical results interactively through web browsers (for details, see Figure 5).
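The core matching step, identifying and counting key words from the taxonomy in learner-generated text, can be sketched as follows. This is a minimal illustration rather than the system's implementation (which runs on a PHP and MySQL stack): it assumes the text has already been segmented into terms, labels each key word set by its first key word, and uses hypothetical key words and sample terms.

```python
from collections import Counter


def count_matches(terms, keyword_sets):
    """Count, for each key word set, how many terms match any of its key words.

    `terms` is assumed to be already segmented (English chunks, Chinese terms).
    `keyword_sets` maps a label to its bilingual key words.
    """
    # Reverse lookup from a lowercased key word to the label of its set.
    lookup = {}
    for label, keywords in keyword_sets.items():
        for keyword in keywords:
            lookup[keyword.lower()] = label

    # Start every set at zero so unmatched sets still appear in the result.
    counts = Counter({label: 0 for label in keyword_sets})
    for term in terms:
        label = lookup.get(term.lower())
        if label is not None:
            counts[label] += 1
    return dict(counts)


# Hypothetical key word sets, each labeled by its first key word.
keyword_sets = {
    "video clip": ["video clip", "短片"],
    "text mining": ["text mining"],
    "collaborative learning": ["collaborative learning", "協作學習"],
}
terms = ["Text mining", "短片", "video clip", "homework"]
print(count_matches(terms, keyword_sets))
# {'video clip': 2, 'text mining': 1, 'collaborative learning': 0}
```

Because English and Chinese synonyms share one label, a learner writing “短片” and another writing “video clip” contribute to the same count, which is the point of using a bilingual taxonomy.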
Evaluating the Bilingual Text-Mining System With Hierarchical Visualization
Method
Two evaluation studies were conducted in order to validate the usefulness of the bilingual text-mining system with hierarchical visualization. One study was to evaluate learners’ domain concepts by counting the number of matching key words using the text-mining system in the preteaching and postteaching reflections. The other study was to conduct a content analysis of the two reflections. In content analysis, researchers are required to code and quantify the findings of a research topic in order to extract information from the qualitative data sets (Newby, 2014). In this study, two researchers who were familiar with the rubric designed for this study were assigned as the coders to independently code and score learners’ reflections and then discussed discrepancies on their scoring results in order to obtain a consensus on the final scores. The Cronbach’s α reliability coefficient between the two coders was .828.
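For readers who wish to reproduce this kind of reliability check, Cronbach’s α for k coders can be computed from the variance of each coder’s scores and the variance of the summed scores. The sketch below uses the standard formula with sample variances; the function name and the scores are hypothetical and are not the study’s data.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for k raters (or items) scoring the same subjects.

    `ratings` holds one list of scores per rater, all of equal length.
    alpha = k / (k - 1) * (1 - sum of per-rater variances / variance of totals)
    """
    k = len(ratings)
    if k < 2:
        raise ValueError("at least two raters are required")
    totals = [sum(scores) for scores in zip(*ratings)]
    rater_variance = sum(variance(scores) for scores in ratings)
    return k / (k - 1) * (1 - rater_variance / variance(totals))


# Hypothetical rubric scores given by two coders to four reflections.
coder_a = [1, 2, 3, 4]
coder_b = [2, 2, 4, 4]
print(round(cronbach_alpha([coder_a, coder_b]), 3))
```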
Participants
The evaluation involved 27 in-service teachers who joined a 5-week full-day intensive professional development course on “e-Learning in primary mathematics.” These in-service teachers had 13 years of teaching experience on average (SD = 7.265 years), and they were majoring in the teaching of mathematics. By a statistical power analysis, the effect sizes in the text mining and content analysis were 1.480 and 2.679, respectively, considered as very large according to Sawilowsky (2009). The sample size with 27 participants is further justified to be sufficient for high-power statistical tests (α = .05 and power = .8) in the text mining and content analysis.
Bilingual taxonomy of key words on e-Learning for counting matching words
Enriched Key Words Which Were Found Manually From Learner-Generated Text in Their Reflections in This Study.
The Top 20 Words/Terms From a 4-Topic LDA Model on Learners’ Preteaching and Postteaching Reflections.
Note. LDA = latent Dirichlet allocation. The highlighted words/terms were selected to enrich the designed key words in this study.

Figure 5. The hierarchical visualization of text-mining results of learners’ preteaching and postteaching reflections of their understanding of e-Learning.
Rubric for content analysis of learner-generated text
The learners’ reflections were also evaluated by a content analysis to measure their understanding of e-Learning. In this regard, a rubric based on the e-Learning framework was developed for researchers to assign scores and quantify the quality of learners’ reflections on their understanding of the concept of “e-Learning” (for details, see rubric in the Appendix). Learners’ understanding in the “technology” concept was assessed by Components A and C of the rubric while the other two components were used to measure their understanding in the “pedagogy” concept.
Results and Discussion
After the text mining of the learner-generated text from the preteaching and postteaching reflections was completed, the matching key words were identified, counted, and reported using hierarchical visualizations (for details, see Figure 5), where a set of bilingual key words with similar scope in learning was represented by an orange ball if at least one of these key words was matched with learner-generated text and by a gray ball otherwise. Note that each ball was labeled by the first key word of its corresponding set of key words (see Figure 4). Learners’ learning progress could be studied using this visualization tool. As Figure 5 shows, 15 sets of key words were covered by the preteaching reflections, whereas 27 sets of key words were covered by the postteaching reflections. Also, the count of matching key words increased from 107 to 410 after the teaching of the course. These figures indicated that learners had a better understanding of e-Learning after studying the course because more key words of e-Learning were recognized, and these key words were used more frequently to elaborate their understanding of e-Learning in the postteaching reflections.
Figure 4. The finalized bilingual taxonomy of key words on e-Learning for this study.
Since the hierarchical aggregation of matching key words was available through web browsers, the learning progress of any concept and subconcept of e-Learning could be examined visually based on the change in color and size of the corresponding key word balls in the two reflections. For example, there were marked changes in the number and size of orange balls in the subconcept “digital ways of collecting data” of the technology concept, including “text mining,” “summative data,” and “formative data,” as shown in the extended parts of Figure 5. This learning outcome was consistent with the emphasis on this concept in the course. The visualization also shows that some key word balls in the “e-resources” and “digital ways of communication” subconcepts remained gray after the teaching of the course.
The Counts of Matching Key Words and Proportions of Matching Counts at Various Levels of the Bilingual Taxonomy in the Preteaching and Postteaching Reflections.
Also, the proportions of matching counts at the second level were adopted to examine learners’ understanding of the technology and pedagogy concepts. As for the examination of learners’ technology concepts, for example, the proportion of matching counts in “digital technology” was
Likewise, learners’ understanding of the subconcepts of e-Learning could be examined using this measure at the third level. For example, “collaborative learning” was the most popular key word set of the “principles/models/theories/strategies” subconcept mentioned by learners in the postteaching reflection with the proportion of matching counts equal to
These examples demonstrate that the counts of matching key words found by bilingual text mining on learner-generated text enable teachers to make informed pedagogical decisions based on the strengths and weaknesses that learners demonstrate in the learning process.
The Counts and Scores of the Learners in the Preteaching and Postteaching Reflections by Counting Matching Key Word Using Text-Mining and Content Analysis With a Rubric, Respectively.
Statistical Results of Learners’ Reflections Measured by Text-Mining and Content Analysis Methods.
p < .001.
The Pearson correlation coefficient was adopted in this study to measure the relationship between the results obtained from the text-mining and content analysis methods. This correlation analysis showed that there was a strong correlation between the results of the two methods in both the preteaching reflection (r = .823, p < .001) and the postteaching reflection (r = .840, p < .001). These correlation results validated the usefulness of the text-mining method in understanding learners’ reflections. Whereas content analysis involves a labor-intensive and time-consuming process, text mining with the bilingual taxonomy of key words provides instant feedback about learner-generated text and therefore has potential as a less labor-intensive tool to monitor learners’ learning progress in online courses. This will be especially useful for online courses with a large enrolment of participants, such as massive open online courses (MOOCs), in which it is not easy for teachers to gain an understanding of learner-generated text without text-mining support.
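The correlation between per-learner matching counts and rubric scores can be computed with the standard Pearson formula. The sketch below is a plain implementation under that definition; the function name and the sample values are hypothetical and do not reproduce the study’s data or its reported coefficients.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical values: matching key word counts and rubric scores of four learners.
keyword_counts = [2, 4, 6, 8]
rubric_scores = [1, 3, 2, 5]
print(round(pearson_r(keyword_counts, rubric_scores), 3))
```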
Summary and Future Work
This study reports the evaluation of a bilingual text-mining system that is capable of understanding the domain concepts covered by learner-generated text accumulated in LMSs in English, Chinese, or a mixture of both, in order to support pedagogical practices in the bilingual learning and teaching environment. With a bilingual taxonomy of key words on the domain concerned, the text-mining system automatically identifies and counts the matching key words covered by learners’ text. In this study, a tailor-made Moodle plug-in was developed; it can be redeveloped to work with other LMSs to facilitate the extraction of text for mining. An online web-based text-mining interface was designed to facilitate the process of text mining and result reporting, which includes the preparation of text data and the bilingual taxonomy of key words as well as the processing and reporting of text-mining results through hierarchical visualization.
Summary
A class of 27 in-service teachers studying a course about e-Learning in primary mathematics took part in the evaluation of the usefulness of the text-mining system. After the bilingual taxonomy of key words on e-Learning had been designed, learners’ reflections on the concept of “e-Learning” before and after the course were processed by the text-mining system in order to identify and count the matching key words of learner-generated text in Moodle’s forum. The text-mining results of both reflections were reported through hierarchical visualization and employed to determine the proportions of matching counts at various levels of the bilingual taxonomy for understanding the learning progress of learners. The same sets of learners’ reflections were evaluated again by content analysis using a rubric to measure their understanding of e-Learning. The correlation analysis indicated that there was a strong correlation between the results of these two evaluation methods, with r = .823 and r = .840 in the preteaching and postteaching reflections, respectively. Both evaluation methods indicated that learners gained from the teaching and had a much better understanding of e-Learning in school education.
Contributions
This bilingual text-mining system contributes a text-mining technique for the automatic analysis of online learner-generated text in English and Chinese, which supports an efficient understanding of learners’ level of domain knowledge. With the bilingual taxonomy of key words, the system can provide instant feedback by counting matching key words in learner-generated text based on the conceptual model of domain knowledge. With the hierarchical visualization of matching key words, the system can help teachers easily gain a general understanding of learner-generated text by visualizing the proportions of these matching counts at various levels of a bilingual taxonomy for online courses. The bilingual text-mining system is therefore a potential supporting tool for teachers to improve pedagogical practices based on learners’ learning data.
Limitations and future work
This study had two limitations. The first limitation was related to research design: the bilingual taxonomy of key words on the related domain must be designed and then enriched before this bilingual text-mining system can be used. The second limitation was related to data collection: the sample size for evaluating this bilingual text-mining system was small. Two initiatives could address these limitations and increase the use of this text-mining system in future: reusing the bilingual taxonomy of key words across different cohorts of the same course and applying this text-mining system to online courses with a high enrolment of learners, such as MOOCs. In this context, the designed key words can become more stable if they are enriched by working with more than one cohort of learners. In addition to extracting key words by sampling a proportion of the text generated by learners in related LMSs or MOOCs, teachers are recommended to enrich the key words in the bilingual taxonomy using an LDA model to discover key words from learner-generated text and other algorithms that support the building of a domain taxonomy and its domain vocabulary, such as excerpting Wikipedia’s pages and their underlying structure of categories for the initial knowledge base (Daems et al., 2014). The support from these automatic processes is especially important for these online courses because it is generally impractical for teachers to manually search for the key words used by learners when the number of learners is large. Hence, further research will explore the usefulness of this bilingual text-mining system in MOOC environments.
Appendix: A Rubric for Evaluating Learners’ Understanding of E-Learning
Note. ICT = Information and Communication Technology.
Footnotes
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The study was funded by the Hong Kong UGC (University Grants Committee) TDG (Teaching Development Grant) (Ref: HKIED7/T&L/12-15).
