Abstract
With the continued development of big data and machine learning technology, their application in literary research has gradually attracted attention. This study explores how big data analysis techniques can reveal deep themes and emotional trends in 19th century British fiction. Combining a questionnaire survey, text mining and sentiment analysis, the paper analyzes a large corpus of 19th century English novels. Preliminary results show that deep neural networks and latent Dirichlet allocation (LDA) models can effectively reveal thematic and emotional changes in literary works. The analysis also traces emotional changes in the literature of 19th century English society against the background of industrialization, urbanization and other major developments. Overall, the study confirms the value of big data technology in literary research and provides new perspectives and methods for future work.
Introduction
With the rapid development of information technology, big data has found applications across many fields. In literary research in particular, big data analysis offers new perspectives and methods, making it possible to study traditional literary works more deeply and systematically. 19th century British novels not only reflect the social and cultural characteristics of their era but also carry rich humanistic feeling and intellectual insight. How to grasp and analyze the meaning of these works more accurately with modern technology remains an open question.
In the 19th century, Britain underwent great changes brought about by the Industrial Revolution. As a new literary form, the novel rose quickly to prominence and became the dominant form of cultural expression in the society of the time. Through their works, writers such as Jane Austen, Charles Dickens and George Eliot portrayed the life of the era and reflected changes in people’s emotions, thoughts and social status. How to understand and analyze the deeper meaning of these novels has therefore long been a focus of literary scholars. Traditional methods of literary analysis, such as close textual interpretation, comparative study and investigation of historical background, can reveal the meaning of the works to a certain extent, but they are poorly suited to processing large amounts of text and to exploring similarities and differences across works. In recent years, with the rise of big data technology, many scholars have begun to apply it to literary research, hoping that it can make literary analysis more scientific and systematic. The application of big data in literary research has so far produced some preliminary results, for example the use of word frequency statistics and topic modeling to analyze the themes, emotions and writing styles of novels.
In general, big data analysis opens a new research direction for the study of 19th century British novels and brings both new challenges and new opportunities for literary research.
At the intersection of big data and literary research, many meaningful studies have appeared in recent years. Many of them use big data and machine learning techniques to analyze literary works, each with its own focus. Rithani et al. argued that deep neural networks are powerful tools for processing big data, especially data with complex structures and non-linear relationships [1]. This provides theoretical support for using big data methods to analyze 19th century British novels, particularly in handling the complexity and multi-layered nature of literary works. Hu et al. studied the dynamic evolution of emotion in the novel Never Let Me Go using a fractal-based method and proposed that the dynamic change of emotion in the novel follows specific rules of its own, further expanding the scope of quantitative analysis of literary works [2]. This provides a strong basis for the sentiment analysis of 19th century English novels in this study. Alkan et al. used sentiment analysis and a latent Dirichlet allocation (LDA) model to analyze the texts of Nobel laureates in literature and reported interesting findings about their themes and emotions [3]. That work not only confirms the value of big data and machine learning in literary research but also serves as a methodological reference for this study. Goswami and Kumar reviewed the application of deep learning technology in big data analysis [4]. Although their review concentrates on technical applications, it also highlights the potential of deep learning and big data techniques for processing complex data sets, including the underlying patterns in literature.
To sum up, big data and machine learning technology have broad application prospects in literature research. These existing studies provide a solid theoretical and methodological foundation for this paper, and also show the necessity and potential value of continuing to explore this field.
The purpose of this study is to explore the application and performance of big data analysis in the study of 19th century British novels. First, the study uses big data technology to mine and analyze the texts of 19th century British novels in depth, in order to identify their main literary themes, emotional trends and social and cultural reflections. Second, it evaluates the effectiveness and accuracy of big data analysis in this field, compares it with traditional methods of literary analysis, and examines its advantages and limitations. Finally, it discusses how to combine this new method with traditional literary research to provide a more comprehensive and in-depth perspective.
In today’s data-driven era, literary research must also keep up with the times and continually seek new methods and perspectives. By introducing big data analysis, researchers can not only process large numbers of texts quickly and systematically but also reveal hidden information and patterns in literary works that are difficult to reach with traditional methods. In addition, as an important part of the history of Western literature, 19th century British novels have high artistic value and are also important historical records of British society at the time. Big data analysis can help researchers understand more accurately the cultural characteristics, social changes and emotional world of the period, which has historical and cultural significance for modern readers seeking to understand and evaluate this era.
This research project systematically explores the application of big data analysis to the study of 19th century British novels from multiple dimensions. The research content is divided into several key parts. First, the study sets out its theoretical foundations, giving an overview of the basic principles of big data analysis and the theoretical framework for literary studies. This part lays the theoretical foundation for the whole study and supports the subsequent empirical analysis. Second, it addresses data and questionnaire design: samples of 19th century British novels are selected, and questionnaires are designed to collect different views and interpretations of the works as a reference for data analysis. Third, the study moves to data preprocessing and model construction. The collected text data are preprocessed to meet the needs of big data analysis, and a topic analysis model (LDA) and a literary emotion analysis model are constructed to interpret the works in greater depth and detail. Finally, the results of these models are evaluated comprehensively and, combined with traditional literary interpretation, used to discuss the practical application and limitations of big data analysis in the study of 19th century British novels. The study closes by summarizing its main findings and their academic impact and societal value for literary studies and the application of big data.
Through these stages, the study aims to form a complete, logical, rigorous and reliable research framework for evaluating the effectiveness and feasibility of big data analysis in the study of 19th century British novels. It is hoped that the framework can offer theoretical support and a practical reference for combining big data applications with literary research and with other fields of the humanities.
Theoretical basis and research methods
Fundamentals of big data analysis
Big data analysis technology must be closely integrated with literary research rather than applied in isolation from it. Technical applications should be tied to the specific needs and contexts of literary research, so that the results of the analysis improve the understanding and interpretation of literary works [5, 6].
Core dimensions of big data analytics:
Volume of data: For example, a full collection of 19th century English novels may contain millions of words, which poses a challenge to traditional research methods, but big data technology makes it possible to process such a large scale of text.
Data speed: Real-time or near-real-time data analysis, such as instant sentiment analysis of reader responses, provides a new dimension to literary research.
Data Diversity: The linguistic, emotional, contextual, and thematic diversity of literary works requires a high degree of flexibility and adaptability in data analysis methods.
Data authenticity: In literary research, it is essential to ensure the authenticity and accuracy of texts [7, 8].
Specific analysis techniques and methods:
Building on big data analytics, analytical techniques in the literary field include, but are not limited to, text mining, sentiment analysis, topic modeling such as latent Dirichlet allocation (LDA), and deep learning techniques.
For example, text mining can be used to identify key themes and patterns in literary works, sentiment analysis can reveal emotional trends in works, topic models (LDA) can extract meaningful information from large amounts of text, and deep learning is capable of processing more complex literary datasets [9, 10].
Big data analytics offers a novel and powerful approach to literary research, enabling researchers to look at literary works from a whole new perspective, revealing patterns and trends that are difficult to detect with traditional methods. This integration of methods not only improves the depth and breadth of literary analysis, but also provides a new, data-driven approach to understanding literary works and the sociocultural phenomena behind them.
Theoretical framework of literary research
Before exploring the application of big data analysis to the study of 19th century British novels, it is necessary to gain an in-depth understanding of the characteristics and historical background of novels of the period, and to clarify the literary theoretical framework on which this study relies [11].
Characteristics of the 19th century English novel:
Nineteenth-century English fiction is often noted for its detailed description, complex characterization, and rich narrative strategies. Works of fiction from this period often reflect socio-cultural phenomena of the time, such as social changes brought about by the Industrial Revolution, class differences, and gender issues [12].
Theoretical framework:
Discourse analysis: focuses on the exploration of language, narrative structure and themes, especially the in-depth analysis of object description, characterization, dialogue and narrative strategy in 19th century English novels.
Socio-cultural analysis: Consider how a work of fiction reflects the social, cultural, and historical context of its time, such as the impact of the Industrial Revolution on social structures and individual lives.
The specific theories and characteristics of the 19th century English novel:
When discussing the theoretical framework of literary studies, the characteristics of the 19th century English novel, such as attention to detail, the depiction of social class and gender issues, need special attention in the analysis.
Works of this period often contain direct or metaphorical reflections of the political, social, and cultural changes of the time, making them not only products of literature and art, but also mirror images of society at the time.
Combined with big data analysis, these theoretical frameworks can help researchers conduct traditional literary analysis more systematically and effectively, while revealing patterns and connections that traditional research methods may overlook. In summary, the literary theoretical framework of this study aims to provide a comprehensive and flexible approach to the integration of big data analysis and traditional literary analysis, not only deepening the understanding of the 19th century British novel, but also providing a strong theoretical support for the innovation of literary research methodology.
Application and limitation of big data analysis in literature research
In the field of literature research, big data analysis has shown remarkable application prospects.
(1) Application of big data analysis:
Text management and analysis: Big data technology can effectively manage and analyze large amounts of text data, providing new opportunities for in-depth understanding of 19th century English fiction [13]. For example, it is possible to quickly identify themes, character traits, and narrative patterns in a novel and relate these elements to a socio-cultural context.
Deepening traditional literary criticism: By using modern text mining techniques such as topic modeling (LDA) and sentiment analysis, researchers can conduct in-depth quantitative analysis of literary works from different dimensions, thus expanding the methods and perspectives of traditional literary criticism [14, 15].
Comparative literature research: Big data technology can efficiently cross-analyze different literary genres, regions and cultures to reveal more complex and diverse literary phenomena.
(2) Limitations and challenges of big data analysis:
Difficulty in capturing subtle meanings: Quantitative analysis has difficulty capturing subtle meanings in literary works, such as satire, symbolism, or deep cultural and psychological connotations. These often require more in-depth qualitative analysis and interpretation.
Data quality and selection criteria: Not all literary works are suitable for quantitative analysis, and data preprocessing can introduce bias or error [16].
Limitations of technology and computing power: Big data analysis often requires specialized software and computing power, which to some extent limits its popularity and application in literary research [17].
(3) A targeted study of 19th century British novels:
This research is particularly concerned with the application of modern text mining techniques to the analysis of 19th century English novels, especially in terms of theme extraction and sentiment analysis. The application of this technique can help to more accurately reveal the social and cultural characteristics and emotional tendencies of the novels in this period [18, 19].
To sum up, big data analysis has important application value in literature research, but there are also limitations and challenges that cannot be ignored. Therefore, this study aims to provide a comprehensive, accurate and profound exploration of 19th century English fiction through a comprehensive methodological framework, combining quantitative analysis and qualitative interpretation.
Data and questionnaire design
Data source and selection criteria
The data in this study come mainly from two sources: first, publicly available literature databases, such as Project Gutenberg and the Internet Archive; second, reader feedback data obtained through a questionnaire survey.
Literary works database
A total of 200 works of fiction published in Britain in the 19th century were selected from databases such as Project Gutenberg and the Internet Archive. These works cover a variety of genres and themes, such as social criticism, love and adventure. The specific data sources are shown in Table 1.
Data collection
Questionnaire survey: A questionnaire is designed, which mainly includes the evaluation, emotional response and theme understanding of the selected literary works. The questionnaire will be distributed through an online platform and is expected to collect 500 valid questionnaires. Table 2 shows the main contents of the questionnaire survey.
Questionnaire survey contents
In selecting the literary works and designing the questionnaire, the research applied the following criteria:
(1) Selection criteria for literary works:
Each work must be a novel published in Britain in the 19th century; representative and influential works are preferred; the selection should cover multiple genres and topics to ensure diversity of data.
(2) Criteria for questionnaire design:
Questions should be concise and clear and should avoid leading wording; they should cover multiple aspects of the literary work, such as evaluation of the work, emotional response and thematic understanding; respondents should form a representative sample, including people of different ages, genders and cultural backgrounds.
Through this rigorous set of selection criteria and design guidelines, the research aims to obtain comprehensive and accurate data to support subsequent big data analysis and literary interpretation.
In order to gain an in-depth understanding of readers’ feelings and understandings of 19th century British novels, this study designed a questionnaire. The questionnaire consists of three parts: work evaluation, emotional response, and topic understanding. The specific design of each part is as follows:
Evaluation of the work: This part includes five questions, aiming to collect readers’ comments on the plot and characterization of the novel. Emotional response: This section contains three questions to understand the main emotional response of readers during the reading process. Thematic understanding: This section has four questions aimed at understanding how the reader understands the main social values and moral perspectives of the novel.
The design of the questionnaire is shown in Table 3.
Question type design
Question details:
Evaluation: How would you evaluate the story? (Very poor/poor/average/good/very good) Are you satisfied with the characterization of this work? (Very dissatisfied/dissatisfied/average/satisfied/very satisfied)
Emotional response: What are the main emotions you feel when reading this work? (Sad/happy/nervous/bored/other)
Understanding of the theme: What social values do you think this work mainly expresses? From your point of view, does this work successfully convey its theme? (Yes/no)
The intended purpose of the questionnaire:
The questionnaire aims to collect comprehensive data on 19th century British fiction, which will be used to support subsequent big data analysis and literary interpretation.
By collecting direct feedback from readers, research will be able to better understand how readers receive and interpret literary works, thus providing an empirical basis for literary criticism and theoretical analysis.
Overall, the questionnaire design will provide important first-hand data for the study of 19th century British novels, which will help deepen the understanding and evaluation of literary works of this period.
Data preprocessing
Data preprocessing is a crucial step in the analysis process and directly affects the accuracy of subsequent model construction and analysis results. There are two main sources of data in this study: the text data of 19th century English novels, and the questionnaire data about these novels.
For the text data of 19th century English novels, the following preprocessing steps were taken:
Stop-word removal: Words such as “the”, “is” and “in”, which carry no practical analytical value, are removed. Stemming: Words are reduced to their stem or root form. Text segmentation: NLP tools are used to segment the text into tokens in preparation for the subsequent topic model.
The text data preprocessing steps are shown in Table 4.
Text data preprocessing
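The three text preprocessing steps described above can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: the stop-word list is a tiny assumed sample, and the suffix-stripping `stem` function is a hypothetical stand-in for an established stemmer such as the Porter algorithm.

```python
import re

# Tiny illustrative stop-word list (an assumption; a real study would use a fuller list).
STOP_WORDS = {"the", "is", "in", "a", "an", "and", "of", "to", "was", "it"}

# Suffixes stripped by the toy stemmer (hypothetical, not the Porter algorithm).
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def stem(word):
    """Strip one common suffix, keeping at least three characters of the word."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Tokenize, drop stop words, then stem the remaining tokens."""
    return [stem(tok) for tok in tokenize(text) if tok not in STOP_WORDS]


sample = "The carriage was waiting in the crowded streets of London."
tokens = preprocess(sample)
print(tokens)
```

Stop words are removed before stemming so that the stop-word list can be written in ordinary word forms.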
Questionnaire data preprocessing
For the questionnaire survey data, the research mainly carried out the following pre-processing steps:
Missing value processing: Missing values in a record are filled with the mean or median of the corresponding item.
Numerical coding: Categorical answers (such as very dissatisfied, dissatisfied, average, satisfied, very satisfied) are coded numerically to facilitate calculation.
Data normalization: All data are converted to the same scale.
The pre-processing steps of questionnaire data are shown in Table 5.
After these pre-processing steps, the data is fully prepared for subsequent model construction and analysis.
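The questionnaire preprocessing steps can be illustrated with a short sketch. The 1 to 5 coding of the satisfaction scale, the choice of the median for filling, and min-max scaling are assumptions made for the example; the source does not state which coding or normalization was actually applied.

```python
from statistics import median

# Ordinal coding of the five-point satisfaction scale (mapping values are an assumption).
LIKERT = {
    "very dissatisfied": 1,
    "dissatisfied": 2,
    "average": 3,
    "satisfied": 4,
    "very satisfied": 5,
}


def encode(responses):
    """Map category labels to integers; missing answers (None) stay None."""
    return [LIKERT[r] if r is not None else None for r in responses]


def fill_missing(values):
    """Replace None with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def min_max(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


raw = ["satisfied", None, "very satisfied", "dissatisfied", "satisfied"]
coded = encode(raw)
filled = fill_missing(coded)
scaled = min_max(filled)
print(scaled)
```

Filling is done after coding so that the median is computed on the numeric scale rather than on text labels.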
Topic analysis model (LDA)
On the basis of the pre-processing, the model is constructed and verified. In this study, Latent Dirichlet Allocation (LDA) was selected as the topic analysis model. LDA is an unsupervised machine learning model used to automatically identify topics from a large number of documents.
Assumptions of the LDA model:
Document-topic distribution: Each document is treated as a mixture of topics, whose proportions are drawn from a Dirichlet distribution. This assumption reflects the fact that literary works often contain multiple interwoven themes, such as love, adventure, or war.
Topic-word distribution: Each topic is treated as a mixture of words, described by a multinomial distribution over the vocabulary. This assumption supports associations between specific themes and specific words in literature.
Word assignment in each document: Each word in a document is assigned to a topic, and this process is governed by the Dirichlet and multinomial distributions. This allows the same word to be associated with different themes in different contexts, reflecting the diversity of word usage in literary works.
The LDA model can be expressed in the following mathematical form:
Document-topic distribution: $\theta_d \sim \mathrm{Dirichlet}(\alpha)$
Topic-word distribution: $\phi_k \sim \mathrm{Dirichlet}(\beta)$
For each word $n$ in document $d$: $z_{d,n} \sim \mathrm{Multinomial}(\theta_d)$, $w_{d,n} \sim \mathrm{Multinomial}(\phi_{z_{d,n}})$
Where Dirichlet is the Dirichlet distribution, Multinomial is the multinomial distribution, and $\alpha$ and $\beta$ are the hyperparameters of the document-topic and topic-word priors; $z_{d,n}$ is the topic assigned to the $n$-th word of document $d$, and $w_{d,n}$ is that word.
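To make the generative assumptions concrete, the following sketch samples a single toy document in the way LDA assumes documents arise. It illustrates the generative story only and is not the inference procedure used to fit the model; the vocabulary, the number of topics and the prior values are all assumed for the example.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet distribution by normalizing Gamma samples."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_categorical(probs, rng):
    """Draw one index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_document(n_words, vocab, topic_word, alpha, rng):
    """Generate one document following the LDA generative story."""
    theta = sample_dirichlet(alpha, rng)  # document-topic distribution
    words, topics = [], []
    for _ in range(n_words):
        z = sample_categorical(theta, rng)          # topic assignment for this word
        w = sample_categorical(topic_word[z], rng)  # word drawn from that topic
        topics.append(z)
        words.append(vocab[w])
    return theta, topics, words


rng = random.Random(0)
vocab = ["estate", "marriage", "factory", "wages", "duty", "virtue"]  # toy vocabulary
n_topics = 3
alpha = [0.5] * n_topics    # document-topic prior (assumed value)
beta = [0.5] * len(vocab)   # topic-word prior (assumed value)
topic_word = [sample_dirichlet(beta, rng) for _ in range(n_topics)]

theta, topics, words = generate_document(20, vocab, topic_word, alpha, rng)
print([round(t, 3) for t in theta])
print(words)
```

Fitting LDA reverses this process: given only the observed words, it estimates the topic proportions of each document and the word distribution of each topic.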
After LDA model analysis, the following topics and related keywords are obtained, as shown in Table 6.
Analysis results
Using these simulated data and the LDA model, the research can explore themes and emotions that recur in 19th century English fiction, providing a deeper perspective for literary research. At the same time, the validity of the model needs to be further verified by cross-validation or other evaluation indicators (such as perplexity and topic coherence scores).
This step not only helps to understand the thematic preferences in literary works of this period, but also provides powerful data support for cross-literary and cross-cultural comparative research.
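For reference, the perplexity of a fitted topic model on a held-out set of $M$ documents is commonly defined as below. The notation is assumed for this illustration and is not taken from the source: $\mathbf{w}_d$ denotes the words of document $d$ and $N_d$ its length. Lower perplexity indicates that the model generalizes better to unseen text.

```latex
\mathrm{perplexity}(D_{\mathrm{test}})
  = \exp\!\left( - \frac{\sum_{d=1}^{M} \log p(\mathbf{w}_d)}{\sum_{d=1}^{M} N_d} \right)
```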
In order to evaluate the accuracy of the two models constructed in this study, the Topic Analysis Model (LDA) and the literary sentiment analysis model, several commonly used model evaluation indicators were used.
Assumptions of the literary emotion analysis model:
Emotional diversity: Linguistic expressions in literary works are assumed to map to specific emotional categories (such as happiness, sadness or anger).
Textual emotional consistency: Certain textual fragments (such as sentences or paragraphs) are assumed to be emotionally consistent, so that they can be used for emotion classification.
Algorithmic applicability: Specific machine learning algorithms, such as sentiment analysis algorithms, are assumed to be able to extract emotional tendencies accurately from text.
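Precision is shown in Eq. (1):
$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{1}$$
Recall is shown in Eq. (2):
$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{2}$$
The F1 score is shown in Eq. (3):
$$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{3}$$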
Where TP (True Positive) is the number of samples correctly predicted as positive, FP (False Positive) is the number of samples incorrectly predicted as positive, and FN (False Negative) is the number of samples incorrectly predicted as negative.
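The three metrics can be computed directly from TP, FP and FN counts, as in the sketch below. The label lists are toy data invented for illustration and are not results from the study; the function name is likewise hypothetical.

```python
def precision_recall_f1(actual, predicted, positive):
    """Compute precision, recall and F1 for one class treated as positive."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if p == positive and a == positive)
    fp = sum(1 for a, p in pairs if p == positive and a != positive)
    fn = sum(1 for a, p in pairs if p != positive and a == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Toy labels for illustration only; they are not results from the study.
actual = ["positive", "negative", "positive", "positive", "neutral", "negative"]
predicted = ["positive", "positive", "positive", "negative", "neutral", "negative"]

p, r, f1 = precision_recall_f1(actual, predicted, "positive")
print(p, r, f1)
```

For a multi-class problem such as topic or emotion classification, the same function can be called once per class, treating that class as positive and all others as negative.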
Technical details of the model and implementation strategy:
Use machine learning methods, such as support vector machines (SVM) or deep learning networks, to train sentiment analysis models.
The text data are first preprocessed, including word segmentation, stop-word removal and part-of-speech tagging.
The model is then trained and tested using sentiment dictionaries or annotated datasets.
Precision, recall and F1 score were used as the main indicators for evaluating model performance.
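The dictionary-based variant of the strategy above can be sketched in a few lines. The lexicon here is a hypothetical miniature word list, and the scoring rule (summing word polarities) is one simple choice among many; a trained SVM or neural classifier would replace `classify` in a fuller implementation.

```python
import re

# Hypothetical miniature sentiment lexicon; a real study would use a published one.
LEXICON = {
    "joy": 1, "happy": 1, "love": 1, "hope": 1, "delight": 1,
    "grief": -1, "sad": -1, "misery": -1, "fear": -1, "despair": -1,
}


def sentiment_score(text):
    """Sum lexicon polarities over the tokens of the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(LEXICON.get(tok, 0) for tok in tokens)


def classify(text):
    """Label a passage Positive, Negative or Neutral from its score."""
    score = sentiment_score(text)
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


passages = [
    "Her heart was full of hope and love.",
    "He sank into misery and despair.",
    "The coach arrived at noon.",
]
labels = [classify(p) for p in passages]
print(labels)
```

A pure lexicon approach ignores negation and irony, which is one reason the text also mentions annotated datasets and learned models.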
The topic analysis model and the emotion analysis model were evaluated separately. The results for the topic analysis model are shown in Fig. 1.
Results of topic analysis model (LDA).
For topic A:
The analysis of predicted emotion and actual emotion is shown in Fig. 2.
Results of literary emotion analysis model.
For Positive:
This verification shows that both models achieve relatively high precision and recall. However, they still need further optimization and adjustment before big data analysis can be applied more accurately to 19th century British novels.
In this study, the efficacy and accuracy of the constructed topic analysis model (LDA) and literary sentiment analysis model were rigorously evaluated. Several widely accepted evaluation metrics were used to quantify the performance of these models.
Specific assumptions of the model:
(1) Topic Analysis Model (LDA):
Assume that each document is an amalgam of multiple underlying topics.
Each topic can be represented by a specific set of keywords.
(2) Literary emotion analysis model:
It is assumed that emotions in literary texts can be identified by analyzing language usage and sentence structure.
Sentiment analysis models can accurately map text to specific sentiment categories.
Assessing the accuracy of the model involves three key metrics: Precision, Recall, and F1 Score. The mathematical expression is as follows:
Precision measures the proportion of samples correctly classified as positive among all samples the model identifies as positive, as shown in Eq. (4):
$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{4}$$
Recall measures the proportion of samples correctly classified as positive among all samples that are actually positive, as shown in Eq. (5):
$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{5}$$
The F1 score is the harmonic mean of precision and recall and is used to evaluate a model under a single standard, as shown in Eq. (6):
$$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{6}$$
Where TP (True Positive) is the number of samples correctly predicted as positive, FP (False Positive) is the number of samples incorrectly predicted as positive, and FN (False Negative) is the number of samples incorrectly predicted as negative.
Model implementation details:
The LDA model automatically extracts topics and their keywords by modeling the topic structure of the document set.
Literary sentiment analysis models use natural language processing techniques, such as sentiment dictionaries and deep learning, to analyze the emotional tendencies of texts.
In order to ensure the accuracy and reliability of the model, cross-validation and parameter adjustment are carried out.
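The cross-validation step can be sketched as a k-fold split over the corpus. The choice of five folds and the helper name `k_fold_indices` are assumptions for illustration; the sample count of 200 matches the number of novels in the corpus.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Distribute the shuffled samples as evenly as possible across the k folds.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(n_samples=200, k=5))
for train, test in splits:
    print(len(train), len(test))
```

Each sample appears in exactly one test fold, so averaging a metric over the folds uses every novel once for evaluation.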
The topic analysis model and the emotion analysis model were evaluated separately. The results for the topic analysis model are shown in Fig. 3.
Results of topic analysis model (LDA).
Taking topic A as an example, the above mathematical model can be used to calculate:
These evaluation indicators show that topic A is classified quite well by the model, with a precision of 75% and a recall of 68%.
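As a worked check, and taking the reported 75% and 68% for topic A as precision and recall respectively (an interpretation of the figures quoted above), the corresponding F1 score follows from the harmonic-mean definition:

```latex
F1 = \frac{2 \times 0.75 \times 0.68}{0.75 + 0.68}
   = \frac{1.02}{1.43}
   \approx 0.713
```

The F1 score lies between the two values and closer to the lower one, which is the expected behavior of a harmonic mean.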
The analysis of predicted emotion and actual emotion is shown in Fig. 4.
Results of literary emotion analysis model.
For “Positive” emotions:
These data show that “Positive” emotions are classified very well, with precision and recall both above 80%.
Both models perform well in terms of precision and recall, but some limitations are also exposed: the classification of some topics and emotions still has room for improvement. These results point to directions for further optimizing the models so that they can be applied more accurately to big data analysis of 19th century British novels.
Results of big data analysis
In the process of big data analysis, this study applies the previously constructed and verified topic analysis model (LDA) and literary emotion analysis model. These two models provide a comprehensive and in-depth analysis of the textual data of 19th century English novels.
Implementation of deep data mining:
Through the LDA model, this study successfully extracted three main themes from the study sample: social class, family and marriage, and moral values. These themes are derived from the analysis of word frequency and co-occurrence frequency in the text, and reveal the core issues concerned by the 19th century English novels.
For emotion analysis, this study adopts the emotion analysis model based on machine learning, combined with the emotion dictionary and natural language processing technology, to conduct a quantitative analysis of the emotional tendency of each novel.
This research also uses advanced data mining techniques, such as association rule learning and network analysis, to explore correlations and patterns between different themes and emotions. The topic analysis results are shown in Fig. 5.
Topic analysis results.
The numbers here represent the weight of each theme in a particular novel. For example, in “Novel 1”, the theme of social class has the highest weight, reaching 0.7, indicating that the novel mainly focuses on social class issues.
The emotion analysis model also gives the emotional tendencies of each novel, as shown in Fig. 6.
Analysis results of literary emotion.
Where,
For example, in “Novel 2”, the proportion of Negative emotions is the highest, reaching 0.7, which means that the work mainly expresses a negative or pessimistic emotional tendency.
Combining the theme analysis with the sentiment analysis allows the novels to be interpreted in a more complex and multi-level way. For example, “Novel 1” focuses mainly on the issue of social class and shows an obvious tendency toward positive emotions, so it may explore how social status can be changed through effort.
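A combined reading of this kind can be sketched by pairing each novel's dominant theme with its dominant emotion. Only the 0.7 social-class weight for “Novel 1” and the 0.7 negative proportion for “Novel 2” come from the text; every other number, and the presence of a Neutral category, is a placeholder assumed for the example.

```python
# Theme weights and emotion proportions per novel. Only the 0.7 social-class weight
# for Novel 1 and the 0.7 Negative proportion for Novel 2 come from the text;
# every other number is a made-up placeholder.
theme_weights = {
    "Novel 1": {"social class": 0.7, "family and marriage": 0.2, "moral values": 0.1},
    "Novel 2": {"social class": 0.2, "family and marriage": 0.5, "moral values": 0.3},
}
emotion_shares = {
    "Novel 1": {"Positive": 0.6, "Negative": 0.2, "Neutral": 0.2},
    "Novel 2": {"Positive": 0.1, "Negative": 0.7, "Neutral": 0.2},
}


def dominant(distribution):
    """Return the key with the largest value."""
    return max(distribution, key=distribution.get)


def profile(novel):
    """Pair a novel's dominant theme with its dominant emotion."""
    return dominant(theme_weights[novel]), dominant(emotion_shares[novel])


profiles = {novel: profile(novel) for novel in theme_weights}
for novel, (theme, emotion) in profiles.items():
    print(f"{novel}: {theme} / {emotion}")
```

Such a profile is only a starting point for interpretation; the qualitative reading of what a theme and emotion pairing means still has to come from the text itself.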
Through the above big data analysis, this study not only reveals the themes and emotional tendencies of common concern in 19th century British novels, but also provides strong data support for further literary interpretation and academic research.
With the big data analysis complete, the research turns to the interpretation of literary connotations supported by the data. The literary works are first analyzed at the thematic and emotional levels, and the study then explores how the two levels influence each other in a particular text.
Based on the results of the topic analysis model (LDA), the interpretation focuses on three themes: social class, family and marriage, and moral values. Each of these three themes has its own weight coefficient, expressed as
Where
As shown in Fig. 7.
Theme and literary connotation.
For example, in “Novel 1”, the literary content index (LQI) reached 0.75, meaning that the work has a high literary value in the three themes of social class, family and marriage, and moral values.
A simple model, Eq. (9), is then used to quantify the literary connotation of each novel in the emotional dimension.
The models employed in the analysis, such as Eqs (8) and (9), not only focus on statistics, but also take into account the cultural and social context of the literary work.
The results are shown in Fig. 8.
Emotion and literary connotation.
Here, “Novel 1” scores highest on the Emotional Literature Index (ELI), at 0.43.
Comprehensive interpretation of literary connotation
Finally, a comprehensive model, Eq. (10), integrates the thematic and emotional aspects to provide a more comprehensive perspective on literary works.
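How the two indices might be merged can be sketched in Python. The exact combination rule is not reproduced in this text, so the sketch assumes a convex combination controlled by a hypothetical mixing parameter `alpha`. The inputs 0.75 and 0.43 are the LQI and ELI values reported earlier, on the assumption that both describe the same novel. The function name `comprehensive_index` is likewise illustrative.

```python
def comprehensive_index(lqi, eli, alpha=0.5):
    """Blend thematic (LQI) and emotional (ELI) scores.

    Assumed form: alpha * LQI + (1 - alpha) * ELI, with alpha in [0, 1].
    This is an illustration, not the paper's Eq. (10).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * lqi + (1.0 - alpha) * eli


# LQI and ELI values quoted in the text; the alpha values are hypothetical.
for alpha in (0.25, 0.5, 0.75):
    cli = comprehensive_index(0.75, 0.43, alpha)
    print(f"alpha={alpha:.2f} -> CLI={cli:.2f}")
```

Sweeping `alpha` shows how sensitive the combined score is to the relative emphasis on theme versus emotion, which is worth checking before comparing novels on a single number.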
To preserve literary value and cultural connotation, this study does not rely on the quantitative results alone but interprets them in combination with literary theory and critics’ views.
The differences and similarities between the new data-driven perspective and the traditional interpretation:
New Perspectives: Data-driven interpretation allows us to quantify and objectify the themes and emotions of literary works. For example, through the LDA model, the weight of topics such as social class, family and marriage, and moral values can be quantitatively analyzed, providing a new analytical dimension.
Similarities and differences with traditional interpretations: While traditional literary interpretations rely on the subjective understanding and cultural context of critics, data-driven interpretations offer a more objective, evidence-based approach. This approach emphasizes quantitative analysis, but at the same time needs to be combined with traditional interpretation to gain a deeper literary and cultural understanding.
Ensure the literary value and cultural content of the analysis results:
Through this model, the research can more comprehensively understand the literary connotation contained in the 19th century British novels, and provide a more solid data foundation for the subsequent literary criticism and theoretical research.
The purpose of this study is to explore the application and value of big data analysis in the study of 19th century British novels. By building data models based on topic modeling (LDA) and sentiment analysis, the study quantitatively analyzes selected literary works and translates the results into in-depth interpretations of their literary connotations.
This research has successfully used big data tools to quantify themes and emotional expression in 19th century British fiction, enhancing understanding of the social, cultural and emotional complexity of these works.
The proposed Comprehensive Literary Index (CLI) provides a comprehensive evaluation framework for literary works and promotes the development of literary criticism and theoretical research.
Education and outreach recommendations:
Literature researchers are encouraged to participate in interdisciplinary training to improve their capabilities in data analysis, enabling them to utilize big data tools more effectively.
Include modules on big data analysis in literary studies and education courses to promote the understanding and application of this approach.
Hold workshops and seminars to share examples and best practices of big data analytics in literary studies.
Promoting the field of literary research:
This study demonstrates the great potential of big data analysis in literary research and provides a new perspective for the interpretation of literary works.
This interdisciplinary approach not only broadens the application of big data analytics in the humanities, but also provides new avenues for methodological innovation in literary studies.
Despite the remarkable results achieved in this study, there are also limitations, such as limited sample size and the need for further optimization of model parameters. Future studies can improve the accuracy and reliability of the model by expanding the dataset and adopting more refined data analysis methods.
Overall, this study not only provides a novel and in-depth analysis of the 19th century English novel, but also provides a valuable reference for the practical application and education of literary studies.
Footnotes
Funding
Key Research Project of Higher Education Institutions in Henan Province in 2022: Innovative Path Research on Strengthening Science Education and Promoting Science Popularization in Higher Education Institutions under the Background of New Media Era (No. 22A880027).
