Abstract
In recent years, growing college enrollment and changing social pressures have drawn increasing public attention to students’ mental health. The purpose of this study is to systematically evaluate and analyze the mental health status of college students and to explore the core factors that affect it. Using big data technology and the SCL-90 scale, this study assessed the mental health of a sample of college students. The results show that academic pressure, interpersonal relationships, living habits, and personal experience are the main factors influencing students’ mental health. Based on these findings, the study provides a series of specific mental health intervention recommendations and strategies for colleges and universities, and it offers a new research perspective and method for mental health research.
Introduction
With the rapid development of society, modern college students face increasing academic pressure and social challenges, and their mental health has become a matter of wide concern. Mental health is not only directly related to the personal development of students but also affects the teaching quality of schools and the overall stability of society. However, because many complex factors are involved, the mental health problems of college students are diverse and often hidden, and traditional evaluation methods often fail to identify and address them accurately and in time. In recent years, research on college students’ mental health has deepened. Studies have shown that the common psychological problems of college students include anxiety, depression, social fear, and academic pressure [ref.]. These problems are often closely related to students’ living environment, study pressure, interpersonal relationships, and personal background. In addition, the rise of big data technology provides new perspectives and means for research and solutions in this field. By collecting and analyzing large amounts of data, researchers can gain a deeper understanding of students’ psychological conditions, uncover potential risk factors, and propose more targeted intervention strategies.
In recent years, big data technology has been applied increasingly in the field of mental health, especially in assessing and analyzing the mental health status of college students. According to Xiao et al., 1 big data technology has significant application value in screening the mental health of Chinese college students. Mao and Chen 2 further demonstrated the importance of big data in investigating the mental health status of contemporary college students by building a risk prediction model. These studies show that big data can not only provide broad data support but also help pinpoint mental health issues. Research by Liu 3 and Zhang 4 highlights the promise of AI in determining the mental health status of students in school settings, demonstrating a new trend in the integration of technology and the mental health field. Chu and Yin 5 used a cluster analysis algorithm to analyze the mental health data of college students, showing the potential of big data analysis in revealing the dynamics of students’ mental health. Liu et al. 6 went further and used facial recognition and neural network technology to diagnose the mental health status of college students, indicating the application of multidimensional data analysis in mental health assessment. Big data technology is not limited to data collection and analysis; it also supports the exploration of key factors that affect students’ mental health. For example, Marino et al. 7 studied how personality traits and metacognition predict positive mental health in college students. In addition, Deng 8 used fuzzy qualitative simulation to study the mental health status of college students, providing a new perspective on assessment methods. In summary, these studies show that big data technology and advanced analytical methods are playing an increasingly important role in the study of college students’ mental health status.
Future studies can further explore the effectiveness and impact of these technologies in a broader context of application and provide more accurate mental health assessment and intervention strategies for universities.
The purpose of this study is to make comprehensive use of SCL-90 scale and big data technology to deeply explore the mental health status of college students. Through the precise analysis of large-scale data, the research is expected to reveal the key influencing factors of students’ mental health and the complex relationship between these factors and students’ daily life and academic performance. From the perspective of practical application, this study aims to provide a more objective and detailed mental health assessment tool for colleges and universities. For educators and mental health professionals, this will not only help them more accurately identify and intervene in potential mental health problems but also provide data to support the development of mental health education strategies. From a broader societal perspective, a mentally healthy student population will have a more positive impact on society, which undoubtedly enhances the importance and urgency of this study.
This study moves from theory to empirical analysis in order to discuss and analyze college students’ mental health status comprehensively. The theoretical part explores the definition and importance of mental health and introduces the evaluation mechanism and application value of the SCL-90 scale in detail. It also sets out the concept, characteristics, and potential applications of big data in the field of mental health. The empirical part presents the data collection strategies and preprocessing steps, including the design and implementation of the SCL-90 questionnaire and the sampling strategy. It then introduces the model design method, covering how to choose an appropriate statistical or machine learning model and how to build the model according to the index system of SCL-90. The results analysis section provides insights into the mental health status of college students, the differences in their mental health scores, and the core factors that influence their mental health. Based on these findings, the study makes specific recommendations and strategies to help colleges and universities respond more effectively to students’ mental health issues.
Mental health concept, evaluation, and big data application
Definition and importance of mental health
Mental health is generally viewed as a stable and balanced state of a person’s emotional, behavioral, and cognitive functioning. It relates not only to the absence of mental illness or disorder but also to an individual’s subjective well-being, quality of life, and ability to cope with everyday stress. 9 A good state of mental health enables individuals to cope with the stresses of normal life, perform effectively at work, and contribute to society.
For example, a mentally healthy college student may be more able to actively participate in academic and social activities, demonstrate better learning and adaptability, build positive interpersonal relationships with classmates, and thus gain more satisfaction and a sense of accomplishment in campus life. Conversely, psychological issues such as anxiety or depression may cause students to avoid social activities, affect academic performance, and even lead to more serious health problems.
For college students, the importance of mental health is particularly prominent. A good psychological state is not only related to students’ academic performance but also directly affects their interpersonal relationships, quality of life, and future career development. 10 For colleges and universities, students' mental health is an important part of their educational goals, related to the quality of education, training quality, and social image of the school. Ignoring students’ mental health problems may lead to a series of negative consequences, such as academic failure, interpersonal tension, reduced quality of life, and even serious mental illness or behavioral problems, with long-term negative effects on students, schools, and even society. 11
Therefore, it is particularly important to pay attention to and promote the mental health of college students. Understanding the complexity of mental health and its profound impact on individuals and society is essential for developing effective intervention strategies and improving the quality of education.
Overview and application value of SCL-90 scale evaluation
The SCL-90 scale is a widely used assessment tool in the field of mental health. By detecting individual psychological symptoms and stress responses, it covers nine main dimensions of mental health, including anxiety, depression, interpersonal sensitivity, and hostility.12,13 Each dimension contains a series of items designed to assess the specific symptoms associated with it.
The SCL-90 scale has wide application value. From a clinical point of view, it can help mental health experts quickly and accurately identify patients’ psychological symptoms, providing an important basis for diagnosis and treatment. 14 In the field of mental health research, SCL-90 serves as a standardized, reliable, and effective tool for researchers to assess mental health status in different groups or cultural contexts.
In the university environment, the application of SCL-90 is particularly important. For example, in one practical application case, a university used SCL-90 to screen incoming students for timely detection and intervention for mental health issues. By analyzing the scale results, the school mental health center was able to identify students with psychological distress and provide them with personalized counseling and support. In addition, the scale has also been used to assess the impact of specific events (such as the epidemic period) on students’ mental health, providing data support for schools’ crisis response and psychological support strategies.15,16
The SCL-90 scale can not only help schools detect students’ mental health problems in time but also evaluate the educational environment and the effect of mental health education in schools, thereby promoting continuous improvement in these areas.
Concept, characteristics, and technological development of big data
Big data is a key term in the field of modern information technology, referring to huge data sets that are difficult for traditional data processing applications to process. 17 Its main characteristics are summarized as the “5V”: data Volume, data Velocity, data Variety, data Veracity, and data Value.
Volume refers to the size and complexity of the data.
Velocity is the rate at which data is generated and processed.
Variety covers the types of data generated from diverse sources, including structured, semi-structured, and unstructured data.
Veracity concerns the quality and reliability of the data.
Value is the useful information that can be extracted from big data.
In recent years, the technological development of big data includes key technologies such as cloud computing, distributed storage, and efficient parallel computing. 18 In particular, open source frameworks such as Hadoop and Spark provide important tools for processing and analyzing big data. At the same time, the development of machine learning and artificial intelligence, especially the application of deep learning techniques, makes it possible to extract complex patterns and predictions from huge data sets. 19
Recent trends include the development of real-time data processing and stream computing technologies that allow rapid analysis and instant decisions on large amounts of data. In addition, the rise of edge computing, where data is processed at the point where it is generated, has reduced the time and cost of data transmission and improved the efficiency and security of data processing.
In today’s society, big data technology has been widely used in business, medical care, transportation, education, and other fields. It provides decision-makers with more accurate and deeper insights to help them make more informed decisions. For example, in the field of education, big data technology has begun to be applied in many aspects such as curriculum design, learning habits analysis, and educational resource optimization.
Role and application of big data in mental health research
In the field of modern mental health research, big data technology has become a key tool. Its core value is to provide researchers with rich and multidimensional data resources, so that mental health analysis can draw more accurate and representative conclusions based on a broader and more realistic data background. 20
Big data technology enables researchers to capture and analyze a variety of data related to mental health, such as traditional psychological test results, social media behavior, online interaction records, and biomarkers. 21 These data sources provide multiple perspectives into an individual's mental state and help uncover complex mental health factors and their interrelationships.
In addition, big data technology has driven innovation in mental health research methods. Traditional research methods relied on small sample sizes and simple statistical methods. With machine learning and more complex algorithmic models, researchers can now work with large-scale data sets to reveal patterns and trends in mental health that were previously difficult to detect.
In specific applications, big data technology has been widely used to predict mental health risks, optimize psychological treatment methods, and personalize intervention strategies. 22 For example, some studies have successfully predicted anxiety and depression trends in students by analyzing their posts and interactions on social media. Another study used physiological data collected from wearables to assess students’ stress levels and provide personalized stress-reduction recommendations accordingly. In addition, certain universities have begun to use big data to analyze students' learning behaviors and lifestyle habits to identify early signs of mental health problems and provide timely interventions.
All in all, the role and application of big data in mental health research is deepening, bringing unique opportunities and challenges to the field of mental health, especially in the accurate identification, prevention, and intervention of mental health problems.
Data acquisition strategy and preprocessing
Data source and collection method
SCL-90 scale design and implementation strategies
The SCL-90 scale, the central data tool of this study, was converted into an online questionnaire form. Given the high usage rate of electronic devices among college students, this approach not only facilitates data collection but also enhances the accuracy of data.
The scale covers the nine main dimensions of SCL-90, each consisting of a series of related questions that students are asked to rate based on how they actually feel.
Examples of SCL-90 scale contents.
The questionnaire implementation strategies are as follows:
(1) Target audience: College students, including undergraduate and graduate students.
(2) Distribution strategy: Questionnaire links are sent to students through official school channels, such as student email and the school APP.
(3) Data privacy: The questionnaire is anonymous and complies with relevant data protection regulations to protect students’ privacy.
(4) Encouragement mechanism: Students are encouraged to participate through small rewards, such as books or learning materials.
(5) Data summary: Data is collected automatically through the online platform and then imported into the big data processing system for subsequent analysis.
With this strategy, the research is expected to obtain a large number of high-quality SCL-90 data, which will provide strong support for subsequent analysis.
Collection range and sample strategy
To ensure the representativeness and diversity of the data, this study defined a detailed data collection scope and a precise sampling strategy.
(1) Collection scope
Target audience: college students.
Geographical scope: To account for the impact of regional differences on students’ mental health, universities in the northern, southern, eastern, western, and central regions of China were selected for data collection.
Types of schools: Comprehensive universities, engineering schools, liberal arts schools, and arts schools are included to ensure the comprehensiveness of the sample.
(2) Sample strategy
In this study, stratified random sampling was used to select samples. The specific steps are as follows:
Classification: Schools are first classified by geographical region and school type.
Random selection of schools: A specific number of schools are randomly selected in each category.
Random selection of students: Students are randomly selected from the selected schools for investigation.
Data validity
Completeness requirement: Ensure that each participant provides complete questionnaire feedback.
Control questions: Set control questions to identify and exclude random and inauthentic responses.
With such a collection scope and sample strategy, this study aims to obtain a comprehensive and in-depth data set that can lay a solid foundation for big data analysis and mental health research.
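The three sampling steps described above (classification, random selection of schools, random selection of students) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's actual procedure: the function name `stratified_sample`, the school and student identifiers, and the sample sizes are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(schools, students_by_school, schools_per_stratum,
                      students_per_school, seed=0):
    """Two-stage stratified random sample.

    schools: list of (school_name, region, school_type) tuples.
    students_by_school: dict mapping school_name -> list of student ids.
    Returns a dict mapping each selected school to its sampled students.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible

    # Step 1: classify schools into strata by (region, school type).
    strata = defaultdict(list)
    for name, region, school_type in schools:
        strata[(region, school_type)].append(name)

    sample = {}
    for stratum in sorted(strata):
        names = sorted(strata[stratum])
        # Step 2: randomly select schools within each stratum.
        chosen = rng.sample(names, min(schools_per_stratum, len(names)))
        for school in chosen:
            roster = students_by_school[school]
            # Step 3: randomly select students within each chosen school.
            k = min(students_per_school, len(roster))
            sample[school] = rng.sample(roster, k)
    return sample


# Hypothetical toy data: 2 regions x 2 school types, 3 schools per stratum,
# 50 students per school.
schools = [(f"{r}-{t}-{i}", r, t)
           for r in ("north", "south")
           for t in ("comprehensive", "engineering")
           for i in range(3)]
students_by_school = {name: [f"{name}-s{j}" for j in range(50)]
                      for name, _, _ in schools}

sample = stratified_sample(schools, students_by_school,
                           schools_per_stratum=2, students_per_school=10)
print(len(sample), sum(len(v) for v in sample.values()))
```

Sampling within each stratum, rather than from the pooled list of schools, keeps every region and school type represented even when some strata are much smaller than others.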
Data quality assurance and preprocessing technology
Ensuring data quality is the basis of this study. We employ a range of strategies and techniques to ensure the quality of the data and to preprocess it.
(1) Data quality assurance strategy:
Integrity check: Perform an integrity check on all collected data to ensure that there are no missing entries or unfinished questionnaires.
Consistency verification: Control questions are set to rule out random or deceptive answers and ensure that participants’ answers are logically consistent.
Feedback evaluation: Set a valid rating range for each item of the SCL-90 scale, and flag and review any responses outside that range.
(2) Data preprocessing technology:
Data cleansing: Removal or correction of abnormal, inconsistent, or missing data based on quality assurance policies.
Data conversion: All data is converted to a uniform format, for example, text ratings are converted to numeric values.
Feature engineering: Based on the structure of SCL-90 scale, a comprehensive mental health score is generated as a new feature.
Data normalization: Use Z-score or Min–Max methods to standardize data and eliminate the impact of scale differences.
Through the above data quality assurance strategies and preprocessing techniques, this study ensured the reliability and validity of the data, and laid a solid foundation for the subsequent big data analysis and mental health research.
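The cleaning, conversion, and normalization steps listed above can be illustrated with a short Python sketch. It is only an illustration under assumptions: the text labels in `LABEL_TO_SCORE`, the three-item questionnaire, and the sample responses are invented for the example, and the per-respondent mean is used as a simple stand-in for the comprehensive score.

```python
from statistics import mean, pstdev

# Hypothetical text labels for a 1-5 rating scale.
LABEL_TO_SCORE = {"none": 1, "mild": 2, "moderate": 3, "severe": 4, "extreme": 5}
N_ITEMS = 3  # shortened for illustration; the full scale has 90 items


def clean_and_convert(responses):
    """Drop incomplete or out-of-range responses, then map labels to numbers."""
    cleaned = []
    for answers in responses:
        if len(answers) != N_ITEMS:                        # integrity check
            continue
        if any(a not in LABEL_TO_SCORE for a in answers):  # range check
            continue
        cleaned.append([LABEL_TO_SCORE[a] for a in answers])  # data conversion
    return cleaned


def z_score(values):
    """Standardize to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def min_max(values):
    """Rescale to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up responses: the second is incomplete, the third has an invalid label.
raw = [
    ["none", "mild", "severe"],
    ["mild", "moderate"],
    ["extreme", "extreme", "unknown"],
    ["moderate", "moderate", "moderate"],
    ["extreme", "severe", "extreme"],
]

cleaned = clean_and_convert(raw)
overall = [mean(row) for row in cleaned]  # simple per-respondent feature
print(min_max(overall))
print(z_score(overall))
```

Z-score standardization suits models that assume roughly centered inputs, while Min–Max scaling keeps values in a fixed range; which one is preferable depends on the downstream model.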
Analysis model design and selection
Define research variables and evaluation criteria
Analysis of mental health impact factors
Mental health impact factors.
Based on the above factors, a linear regression model was constructed to describe the relationship between these factors and students’ mental health status, as shown in the following formula (1):
By analyzing the coefficients of each factor, the research can determine which factors have the greatest impact on students' mental health status and propose corresponding intervention measures or recommendations accordingly. At the same time, this model can also be used to predict students’ future mental health status and provide more targeted support and help for schools and students.
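For reference, a linear regression of this kind is conventionally written as below. This is a sketch in assumed notation rather than the study's own formula: $Y$ (the mental health score), $\beta_0$ (the intercept), and $\varepsilon$ (the error term) are notational assumptions, the factors are written $F_1, \dots, F_5$ to match the labels used later in this section, and $\beta_k$ is the coefficient of factor $F_k$.

```latex
Y = \beta_0 + \sum_{k=1}^{5} \beta_k F_k + \varepsilon
```

Under this form, a larger $|\beta_k|$ indicates that a one-unit change in $F_k$ is associated with a larger change in $Y$, provided the factors are measured on comparable scales.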
Index structure design based on SCL-90
The core value of the SCL-90 scale lies in its nine main dimensions, which provide a wealth of mental health indicators for research. To derive meaningful information from these dimensions, studies need to design an indicator structure that combines scores from these dimensions to produce a comprehensive mental health score.
The nine main dimensions of SCL-90 are as follows: anxiety, depression, interpersonal sensitivity, hostility, fatigue, paranoid thinking, psychosis, anxious insomnia, and somatization.
Each dimension has a score range, such as 1 to 5. To build a comprehensive index, the weighted average method is used: the score of each dimension is multiplied by a weight, and the products are summed to give the total score. The weights can be determined from the preceding analysis of mental health impact factors, such as F1 to F5 above. The expression is shown in formula (2):
Based on the previous analysis, the study assigns a weight to each dimension.
Weights of the indicator dimensions of SCL-90.
The sum of weights is 1. These weights reflect the relative importance of each dimension to overall mental health.
Using this index structure design, the study can not only obtain the comprehensive mental health score of each student but also better understand which dimensions have the greatest impact on students’ mental health and provide a more targeted reference for subsequent strategies and suggestions.
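The weighted-average scoring described above can be sketched as follows. This is an illustration under assumptions: the equal weights and the student's scores are placeholders invented for the example, not the weights assigned in the study, and the function name `composite_score` is hypothetical.

```python
DIMENSIONS = [
    "anxiety", "depression", "interpersonal sensitivity", "hostility",
    "fatigue", "paranoid thinking", "psychosis", "anxious insomnia",
    "somatization",
]


def composite_score(scores, weights, tol=1e-9):
    """Weighted sum of dimension scores; the weights must sum to 1."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same dimensions")
    if abs(sum(weights.values()) - 1.0) > tol:
        raise ValueError("weights must sum to 1")
    return sum(weights[d] * scores[d] for d in weights)


# Placeholder weights: equal importance for every dimension (an assumption).
weights = {d: 1 / len(DIMENSIONS) for d in DIMENSIONS}

# Made-up scores for one student on a 1-5 scale.
scores = {d: 2.0 for d in DIMENSIONS}
scores["anxiety"] = 2.8

total = composite_score(scores, weights)
print(round(total, 3))
```

Because the weights sum to 1, the composite score stays on the same 1 to 5 scale as the individual dimensions, which makes it directly comparable with them.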
Analysis model and method strategy
Suitable statistical and machine learning models
Based on the indicator structure of SCL-90 and the mental health impact factors defined in this study, selecting appropriate statistical and machine learning models is key. Several candidate models are listed below, each with a brief description and an analysis of its suitability.
(1) Multiple linear regression
Model: As shown in the following formula (3):
Applicability: It is suitable for predicting mental health scores and understanding the contribution of each influencing factor.
Reason for choice: When the data set has multiple continuous variables and the relationship is assumed to be linear, multiple linear regression provides an intuitive, interpretable estimate of each factor’s contribution.
(2) Decision trees
Model: Predicts the output by splitting the input space and making independent decisions in each partition.
Applicability: Provides clear, interpretable rules to predict mental health status.
Reason for choice: Suitable for scenarios that require intuitive and interpretable results, especially for identifying key mental health influences.
(3) Random forests
Model: Integrate multiple decision trees for greater accuracy.
Applicability: Used to improve prediction accuracy and provide importance assessment of variables.
Reason for choice: When the data set has many variables and complex relationships, random forests can provide more stable and comprehensive prediction results.
(4) Support vector machines (SVMs)
Model: As shown in the following formula (4):
Applicability: Suitable for high-dimensional data sets, especially linearly separable or approximately linearly separable data.
Reason for choice: SVM can effectively distinguish complex mental health states when processing high-dimensional mental health data.
When choosing a model, the key is to consider the explanatory and predictive power of the model. To ensure optimal performance, research can use cross-validation to evaluate and compare the effects of different models, thus selecting the best model for final data analysis.
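To make the cross-validation comparison concrete, here is a minimal pure-Python sketch of k-fold cross-validation. It deliberately swaps in two very simple models (a mean-only baseline and a one-feature linear regression) in place of the models discussed above, and the data are synthetic; the names `cv_mse`, `fit_mean`, and `fit_line` are hypothetical.

```python
from statistics import mean


def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_mean(xs, ys):
    """Baseline model: always predict the training mean."""
    mu = mean(ys)
    return lambda x: mu


def fit_line(xs, ys):
    """One-feature least-squares line."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


def cv_mse(fit, xs, ys, k=5):
    """Mean squared error over all held-out points in k-fold cross-validation."""
    errors = []
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        predict = fit(train_x, train_y)
        errors.extend((predict(xs[i]) - ys[i]) ** 2 for i in fold)
    return mean(errors)


# Synthetic data: a stress-like feature in shuffled order and a score that
# depends on it linearly, plus a small alternating disturbance.
xs = [((7 * i) % 20) / 4 for i in range(20)]
ys = [1.0 + 0.4 * x + 0.05 * (-1) ** i for i, x in enumerate(xs)]

scores = {"mean": cv_mse(fit_mean, xs, ys), "line": cv_mse(fit_line, xs, ys)}
best = min(scores, key=scores.get)
print(scores, best)
```

Each model is scored only on data it did not see during fitting, so the comparison reflects generalization rather than fit to the training set.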
Model tuning and parameter selection
The performance of a model depends not only on the algorithm chosen but also on its parameters. Optimizing these parameters can significantly improve the predictive ability of the model.
(1) Multiple linear regression:
Main parameters: Regularization coefficient, used to avoid overfitting, for example through L1 or L2 regularization.
Tuning strategy: Select the regularization strength by cross-validation.
(2) Decision tree:
Main parameters:
Maximum depth of tree: Limit the size of the tree to avoid overfitting.
Minimum number of samples for splitting: The minimum number of samples that a node must have to split again.
Feature selection criteria: Such as information gain and Gini impurity.
Tuning strategy: Using grid search and cross-validation, different combinations of the above parameters are evaluated to select the combination that makes the model perform best.
(3) Random forest:
Main parameters:
Number of trees: More trees can provide a more stable forecast, but at a higher computational cost.
Maximum feature number: The number of features considered for each split.
Other decision tree parameters also apply here.
Tuning strategy: As with decision trees, grid search and cross-validation are used.
(4) Support vector machine:
Main parameters:
Kernel function: Such as linear, polynomial, and radial basis function (RBF).
For nonlinear kernel functions such as RBF, additional kernel parameters also need to be selected.
Tuning strategy: Use cross-validation and grid search to evaluate different kernel functions and parameter settings.
Examples of random forest tuning results.
To sum up, selecting the right model parameters is crucial to ensure the accuracy and generalization ability of the model. Through methods such as cross-validation and grid search, the research can effectively adjust and select these parameters to obtain the best model performance.
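Grid search with cross-validation can be sketched in pure Python as follows. To keep the example short, it tunes only the regularization strength of a one-feature ridge regression, which stands in for the larger parameter grids described above; the parameter name `lam`, the grid values, and the synthetic data are assumptions made for the illustration.

```python
from itertools import product
from statistics import mean


def fit_ridge(xs, ys, lam):
    """One-feature ridge regression with an unpenalized intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / (sxx + lam)  # lam = 0 reduces to ordinary least squares
    return lambda x: my + slope * (x - mx)


def cv_mse(xs, ys, k, **params):
    """k-fold cross-validated mean squared error using interleaved folds."""
    n, errors = len(xs), []
    for f in range(k):
        test_idx = set(range(f, n, k))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        predict = fit_ridge(train_x, train_y, **params)
        errors.extend((predict(xs[i]) - ys[i]) ** 2 for i in sorted(test_idx))
    return mean(errors)


def grid_search(xs, ys, param_grid, k=5):
    """Evaluate every parameter combination and return the best one."""
    names = sorted(param_grid)
    results = {}
    for combo in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        results[combo] = cv_mse(xs, ys, k, **params)
    best = min(results, key=results.get)
    return dict(zip(names, best)), results


# Synthetic data: a linear relationship with a small alternating disturbance.
xs = [i / 4 for i in range(20)]
ys = [1.0 + 0.4 * x + 0.05 * (-1) ** i for i, x in enumerate(xs)]

param_grid = {"lam": [0.0, 0.1, 1.0, 10.0, 100.0]}
best_params, results = grid_search(xs, ys, param_grid)
print(best_params)
```

Adding another key to `param_grid` extends the search to every combination of values, which is why the cost of grid search grows quickly with the number of tuned parameters.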
Result analysis
Preliminary analysis and descriptive statistics
Based on the collected data and subsequent model analysis, the study first computed descriptive statistics, including the average score, to obtain an overview of the data.
(1) The score distribution of SCL-90 in each dimension is shown in Figure 2.
(2) The distribution of mental health impact factors is shown in Figure 3.
Figure 2. Scores of SCL-90 in each dimension.
Figure 3. Distribution of mental health impact factors.
The descriptive statistics above show that:
On the SCL-90 scale, scores for anxiety, depression, and fatigue were relatively high, suggesting that the student population may have some problems in these dimensions.
For mental health impact factors, academic stress and interpersonal stress had higher average scores, suggesting that these two factors may be the main causes of students’ mental health problems.
These preliminary descriptive statistics provide the general situation of the data for the research and provide a valuable reference for the subsequent in-depth data analysis and strategy formulation.
Detailed analysis of mental health scores
Comparison and interpretation of scores in each dimension
In order to have a deeper understanding of students’ mental health status, this study conducted a detailed analysis and comparison of nine dimensions in the SCL-90 scale.
The average score and standard deviation of each dimension were compared.
(1) The average score for the anxiety dimension is 2.8, the highest among the nine dimensions. This could mean that students generally face considerable anxiety, which is consistent with the high academic stress found in the analysis of mental health impact factors.
(2) The average score for the psychosis dimension is 2.2, the lowest of the nine dimensions, but still in the middle of the score range. This suggests that although some students may show such symptoms, they are not very common.
(3) The standard deviation describes the dispersion of scores in each dimension. For example, the standard deviation of the paranoid thinking dimension is 1.1, the highest of the nine dimensions. This means that students’ scores in this dimension are widely distributed, with some students likely to score very high while others score very low.
Comparison of SCL-90 scores in each dimension.
Through this meticulous analysis, not only can students’ mental health status be better understood but also which dimensions need the most attention. For higher-scoring dimensions, for example, higher education institutions can provide more mental health services and resources to help students cope with and manage these issues.
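The per-dimension comparison of mean and standard deviation can be reproduced with the Python standard library. The sketch below uses a few made-up scores for three dimensions purely for illustration; they are not the study's data, and the function name `describe` is hypothetical.

```python
from statistics import mean, stdev


def describe(scores_by_dimension):
    """Return {dimension: (mean, sample standard deviation)}."""
    return {dim: (mean(vals), stdev(vals))
            for dim, vals in scores_by_dimension.items()}


# Made-up scores on a 1-5 scale for five hypothetical students.
scores_by_dimension = {
    "anxiety": [3, 4, 3, 3, 4],
    "psychosis": [2, 2, 3, 2, 2],
    "paranoid thinking": [1, 5, 2, 4, 1],
}

summary = describe(scores_by_dimension)
highest_mean = max(summary, key=lambda d: summary[d][0])
widest_spread = max(summary, key=lambda d: summary[d][1])

for dim, (m, s) in summary.items():
    print(f"{dim}: mean={m:.2f}, sd={s:.2f}")
print(highest_mean, widest_spread)
```

Reporting the standard deviation alongside the mean matters because a dimension with a moderate average can still contain a subgroup of students with very high scores.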
Core factors affecting students’ mental health
After a detailed analysis of the mental health scores, in order to more accurately identify the core factors affecting students’ mental health, the study weighted and ranked the previously defined impact factors.
Using the regression model, the coefficient of each impact factor can be obtained. Each coefficient reflects the strength of the association between that factor and the mental health score.
The coefficient of each impact factor ($\beta$) is shown in Figure 5.
(1) The coefficient of academic stress is 0.35, the highest among the four influencing factors, indicating that academic stress has the greatest impact on students’ mental health. This is consistent with the high anxiety and depression scores found in the SCL-90 dimensional analysis.
(2) Interpersonal stress and personal experience are also important influencing factors, with coefficients of 0.28 and 0.22, respectively. These two factors were strongly correlated with interpersonal sensitivity and paranoia scores.
(3) Although the impact of living habits is weaker than that of the other three factors, it cannot be ignored. Poor lifestyle habits, such as staying up late and an unhealthy diet, can have a negative impact on students’ mental health.
Figure 5. Coefficient of impact factor.
Through this analysis, schools and educational institutions can gain a clearer understanding of which factors have the greatest impact on students’ mental health and develop mental health intervention and support strategies accordingly.
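As an illustration of how such coefficients can be estimated and ranked, the sketch below fits an ordinary least-squares regression by solving the normal equations in pure Python. Everything in it is an assumption made for the example: the data are synthetic and noiseless, the "true" coefficients are arbitrary values chosen so that recovery can be checked, and the function names are hypothetical.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ols(rows, y):
    """Least-squares fit; returns [intercept, beta_1, ..., beta_p]."""
    X = [[1.0] + list(r) for r in rows]  # prepend an intercept column
    p = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    xty = [sum(x[i] * t for x, t in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


names = ["academic stress", "interpersonal stress",
         "personal experience", "living habits"]
true_intercept, true_betas = 1.0, [0.4, 0.3, 0.2, 0.1]  # arbitrary values

# Synthetic, noiseless data: 30 hypothetical students with factor scores in [1, 5].
rng = random.Random(0)
rows = [[rng.uniform(1, 5) for _ in names] for _ in range(30)]
y = [true_intercept + sum(b * f for b, f in zip(true_betas, r)) for r in rows]

coefs = fit_ols(rows, y)
beta = dict(zip(names, coefs[1:]))
ranking = sorted(names, key=lambda n: abs(beta[n]), reverse=True)
print(ranking)
```

Ranking by coefficient size is only meaningful when the factors share a common scale, as they do here; otherwise the inputs would need to be standardized first.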
Result-based recommendations and strategy design
Based on the previous data analysis and model results, this paper provides the following suggestions and strategies for universities to improve students’ mental health status:
(1) Establish dedicated mental health education courses: Since academic pressure is the factor with the greatest impact on students’ mental health, it is suggested that colleges and universities set up mental health courses on academic pressure management, teaching students how to balance study and leisure and how to face exam pressure.
(2) Strengthen psychological counseling and guidance services: Considering that interpersonal stress and personal experience also have an important impact on students’ mental health, it is recommended that colleges and universities strengthen psychological counseling and guidance services. In addition, regular mental health talks and workshops can be organized to help students build healthy relationships and deal with past traumatic experiences.
(3) Promote healthy living habits: Living habits also have a certain impact on students’ mental health. Colleges and universities can therefore promote healthy eating and physical exercise, for example by setting up healthy food restaurants and encouraging students to take part in physical exercise.
(4) Establish a mental health platform and social network: Use modern technology to establish an online mental health platform that provides students with psychological assessment, psychological consultation appointments, mental health knowledge sharing, and other services. In addition, a mental health social network can be established to encourage students to share experiences and support each other.
(5) Conduct regular mental health assessments: It is recommended to assess students’ mental health every semester or every academic year, so that problems are detected in time and the corresponding support and help can be provided.
(6) Cooperation with parents: Family environment and parents’ educational style have an important impact on students' mental health. It is suggested that colleges and universities cooperate with parents to organize regular parent education lectures to teach parents how to communicate with their children and how to support their children’s mental health.
In summary, based on the results of data analysis and model, colleges and universities can adopt the above suggestions and strategies to improve students’ mental health in a targeted way and create a healthy and harmonious learning and living environment for students.
Conclusion
This study used big data technology and the SCL-90 scale to systematically evaluate and analyze the mental health status of college students. The study found that academic pressure is the main factor affecting students’ mental health. In addition, interpersonal relationships, living habits, personal experience, and other factors are also closely related to students’ mental health status. Based on these findings, this study puts forward suggestions and strategies for universities to increase mental health education courses, strengthen psychological counseling services, promote healthy living habits, establish mental health platforms, and cooperate with parents.
However, there are some limitations in this study. First, the diversity of data sources and the breadth of sample selection may affect the generalization of research results. The current sample focuses on specific regions and school types and fails to cover a broader student population. Secondly, the factors affecting mental health are diverse and complex, and this study only considers a limited number of factors, which fails to fully reveal the whole picture of mental health problems.
Future research could expand the sample to include more regions and different types of universities, and introduce more factors affecting mental health, such as family background, economic status, and cultural factors. More advanced data analysis techniques could be used to improve the accuracy and depth of the research. In addition, interdisciplinary research methods can be explored, combining knowledge from fields such as pedagogy, psychology, and sociology to provide more comprehensive solutions for improving the mental health of college students.
In conclusion, this study provides a valuable reference for colleges and universities to better understand the mental health status of college students and formulate effective education and intervention strategies. Through these efforts, the research hopes to create a healthier and more harmonious learning and living environment for students, thereby promoting their all-round development.
Statements and declarations
Conflicting interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author received no financial support for the research, authorship, and/or publication of this article.
