Abstract
Background
Short video platforms have become important channels for the public to obtain health information, but the quality of health science popularization content varies greatly. Existing studies lack a comprehensive exploration of the determinants of video quality and their interaction mechanisms.
Objective
This study aimed to identify the key features influencing the quality of cerebrovascular disease health science popularization short videos and clarify their configurational effects.
Methods
Python web-crawling technology was used to collect health science popularization short videos on TikTok related to cerebrovascular diseases over the past year, and two medical professionals evaluated video quality using the Global Quality Scale (GQS). Eight machine learning models were constructed to identify key quality-related features. The joint effects of six features were analyzed for necessity and sufficiency using fuzzy set qualitative comparative analysis (fsQCA). Finally, the Kruskal-Wallis H test was employed to evaluate differences in quality among videos of varying duration.
Results
A total of 541 valid videos were collected. Most videos were posted by medical staff (77.27%), yet high-quality videos (GQS > 3) accounted for only 14.42% of the sample. Video duration was the most important feature affecting video quality, with an importance of 30.5%. The fsQCA results indicated that short duration was one of the conditions for high-quality videos, and the segmented comparison of durations suggested an optimal duration of 3 to 5 minutes.
Conclusions
Video duration was the main determinant of the quality of cerebrovascular disease health science popularization short videos. Improving the short-video communication skills of medical professionals and optimizing video duration are effective ways to enhance the quality of health science popularization content.
Keywords
1. Introduction
The rapid growth of the internet has made social media an essential channel for the public to obtain information, especially in the field of health.1 A growing number of individuals now rely on these platforms to seek health-related information.2,3 At the same time, licensed and certified healthcare professionals are using platforms like Weibo, short-form video services, and WeChat official accounts to share knowledge about disease prevention, progression, and prognosis.4–6 This development has significantly improved the public’s ability to access health information and has also boosted public health literacy and treatment adherence.7,8 However, the quality of health content created by different users varies widely, even as the public continues to demand higher-quality information.
Short-form video has become an indispensable part of daily life for a significant segment of the population, with approximately 1.068 billion users in China.9 In particular, TikTok, a widely popular short-form video platform with global reach, has emerged as a key channel for the public to access health information.10,11 With comprehensible language and an entertaining style, short-form video platforms can rapidly disseminate complex medical knowledge, making them highly popular among viewers.12 However, as open and interactive platforms, short-video websites host a vast number of health science popularization videos. While audiences have access to high-quality content, they are also exposed to low-quality and even pseudo-scientific health information.13,14 The widespread dissemination of such misleading content not only distorts people’s perceptions of certain diseases but also undermines their capacity to make sound health decisions and adopt rational health behaviors.15 In contrast, high-quality health science popularization short videos can drive positive changes in public health behaviors and help enhance public health literacy to a certain extent.16,17 A large number of studies have examined the quality of health science popularization short videos on short video platforms and found it to be generally low.18–20 Moreover, because of the algorithmic recommendation mechanisms of these platforms,21 social interactions such as likes and shares of low-quality health science popularization short videos further expand their reach, thereby overshadowing the dissemination of high-quality counterparts.
Existing studies have confirmed that the quality of health science popularization short videos depends not only on the accuracy and comprehensiveness of health information content, but also on inherent video characteristics such as duration.22 Meanwhile, numerous studies have revealed that video quality is significantly positively correlated with communication performance indicators, including the number of shares, likes, and comments.23,24 These findings have clarified the relationships among video characteristics, quality, and communication effects, providing an important basis for understanding how short health education videos spread, for optimizing video communication, and for improving health education effectiveness. However, the communication indicators examined in these studies mostly accumulate gradually after a video is released and are easily influenced by factors such as video style and user preferences, so they offer limited guidance to creators during the video planning stage. Although existing research has preliminarily explored the correlations between some video characteristics and quality,25,26 systematic investigations of the relationships between video quality and inherent characteristics that can be defined before release, such as duration and emotional tendency, remain insufficient. In addition, current studies mostly adopt correlation analysis, which can determine only the direction of linear correlations between characteristics and quality;27 it can hardly quantify the relative importance of each characteristic in affecting video quality and therefore provides little support for optimizing video creation.
Based on the achievements and limitations of previous research, this study aims to deeply explore the significance of the inherent features of videos that can be defined before release for the quality of short health education videos, and to quantify the importance of key features. The relevant results not only provide a practical basis for video creators to optimize content quality, but also help more effectively identify and select high-quality health science popularization short videos.
Machine learning methods have shown outstanding performance in exploring key factors from health-related data and can serve as reliable analytical tools. Khan et al. 28 developed a random forest model based on linguistic and emotional feature extraction to identify COVID-19-related health misinformation and achieved an accuracy of 88.5%. Chen YF et al. 29 further applied text feature fusion combined with machine learning algorithms to detect derivative health rumors, reaching an accuracy rate of 97.01%. These studies have verified that fine-grained feature extraction can effectively improve the accuracy of health information identification. More importantly, machine learning approaches are also capable of quantifying the relative importance of influencing factors, which provides a feasible and effective way to further explore the intrinsic relationship between pre-defined video features and the quality of health science popularization short videos.
This study systematically combined eight machine learning models with fuzzy set qualitative comparative analysis (fsQCA) to identify the key determinants of the quality of health science popularization short videos on the TikTok platform. Unlike previous studies that only relied on correlation analysis or single model prediction, our comprehensive approach not only determined the most influential individual predictors but also revealed the configuration effects and sufficient conditions for generating high-quality videos. This methodological integration provides a more comprehensive understanding of how multiple features jointly influence video quality. Our research findings will provide practical references for audiences to distinguish and select high-quality health science popularization short videos, offer guidance for creators to optimize video design and content production, provide evidence-based strategies for short video platforms to improve health content review mechanisms and algorithm recommendation logic, and lay the foundation for the government to formulate standardized management policies for online health science popularization information.
Therefore, this study selected health science popularization short videos related to cerebrovascular diseases on the TikTok platform as the research object, systematically extracting the relevant features of these videos, including content features and publisher features. Subsequently, eight machine learning models were constructed to identify the key features that have a significant impact on the quality of health science popularization short videos. Additionally, the fsQCA method was integrated to explore the configuration effects among the most important features.
2. Methods
2.1. Data collection
Using Python web scraping technology, we collected short videos from the Chinese version of the TikTok platform by searching the following seven Chinese keywords: “cerebrovascular disease”, “cerebral apoplexy”, “stroke”, “cerebral infarction”, “transient ischemic attack”, “cerebral hemorrhage”, and “subarachnoid hemorrhage”. We included all videos related to these keywords that were posted between December 10, 2023, and December 10, 2024. Using these seven keywords, 2,205 videos were retrieved from TikTok. After removing duplicates, 1,467 videos remained. After manual screening to exclude videos unrelated to health science popularization, videos with duplicate content, image-only videos, and videos that were no longer available, a total of 541 videos were finally included in this study. We collected video information, including video titles, account types, number of account followers, number of likes received by accounts, number of accounts followed, release area, release times, video duration, number of video likes, number of video favorites, number of video comments, and number of video shares.
2.2. Text preprocessing
Text extraction was conducted on the videos to facilitate subsequent processing. Specifically, SPSSAU software was employed to extract the text and dialogue content from the videos, converting the video materials into text format. Based on word cloud technology, the most frequently occurring single words and word combinations were extracted, thereby achieving the simplification and relevant visualization of the text data set derived from videos. Subsequently, textual emotion analysis was performed, resulting in the classification of the videos into five categories: negative, slightly negative, text-absent, slightly positive, and positive.
2.3. Manual evaluation
Global quality scale.
The continuous GQS score was further dichotomized with a threshold of GQS ≥ 3 to categorize videos into high-quality and low-quality groups. This cutoff was chosen as the midpoint of the 1–5 rating scale and with reference to existing published standards for grading the quality of health science popularization short videos. A stricter threshold of GQS ≥ 4 would markedly reduce the number of high-quality videos and aggravate class imbalance, whereas a lenient threshold of GQS ≥ 2 would admit low-quality and irrelevant content into the high-quality group and weaken the discriminative performance of the prediction model. Setting GQS ≥ 3 balances a reasonable sample distribution, content representativeness, and practical applicability for quality screening.
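The trade-off between candidate cutoffs can be illustrated with a minimal Python sketch (Python is used here for illustration only; the study's modeling was done in R). The scores in `example_scores` are hypothetical and are not study data, and the helper names are assumptions introduced for this example.

```python
def dichotomize(gqs_scores, threshold=3):
    """Label a video 1 (high quality) if its GQS is >= threshold, else 0."""
    return [1 if score >= threshold else 0 for score in gqs_scores]


def high_quality_share(gqs_scores, threshold):
    """Fraction of videos labeled high quality under the given cutoff."""
    labels = dichotomize(gqs_scores, threshold)
    return sum(labels) / len(labels)


# Hypothetical GQS ratings on the 1-5 scale (illustrative only).
example_scores = [1, 2, 2, 3, 3, 3, 3, 4, 4, 5]

for cutoff in (2, 3, 4):
    share = high_quality_share(example_scores, cutoff)
    print(f"GQS >= {cutoff}: {share:.0%} labeled high quality")
```

A stricter cutoff shrinks the positive class and a lenient one inflates it, which is the class-balance consideration described above.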
2.4. Pre-release feature selection
We selected six features before the video was released: the features of the publisher (account type, release area, account influence) and the content features (video duration, release time, emotional score). The account influence was calculated using the following formula:31 Uinfl =
2.5. Validation of machine learning algorithms
Eight machine learning models were constructed using R software: Random Forest (RF), Extreme Gradient Boosting (XGBoost), Support Vector Machine (SVM), Logistic Regression (LR), Decision Tree (DT), K-Nearest Neighbors (KNN), Adaptive Boosting (AdaBoost), and Gradient Boosting Decision Tree (GBDT). Accuracy, precision, recall, and F1-score were employed as the evaluation metrics for model performance. The area under the receiver operating characteristic curve (ROC AUC) was used to assess the probability predictions of the algorithms and served as the primary overall measure for evaluating the models: the higher the ROC AUC, the better a model ranks high-quality videos above low-quality ones. All models used fixed and uniform hyperparameters, which were not optimized via grid search, random search, or Bayesian optimization. This setup guarantees reproducibility under identical experimental conditions and enables a fair comparison of model performance. A global random seed was set throughout the entire modeling process, and the number of computing threads was limited to eliminate random fluctuations and computational bias. The detailed hyperparameter configurations were as follows. The random forest model used 500 decision trees, enabled variable importance calculation, and operated in single-threaded mode. XGBoost was configured with a maximum tree depth of 3, a learning rate of 0.1, 50 boosting iterations, and a binary logistic objective function. The support vector machine used a linear kernel (vanilladot) with a penalty coefficient of 1 and output class probabilities for model evaluation. KNN was set with k = 5 nearest neighbors. AdaBoost ran 30 iterations with a learning rate of 0.1. GBDT used 100 decision trees with a tree depth of 3, a shrinkage coefficient of 0.1, and the Bernoulli distribution as the loss function. Logistic regression and decision trees were implemented with default algorithm parameters.
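The evaluation metrics named above can be written out in a few lines. The following is a minimal Python sketch (not the R code used in the study) that computes accuracy, precision, recall, and F1 from hard predictions, and the ROC AUC from predicted probabilities using its rank interpretation: the probability that a randomly chosen high-quality video receives a higher score than a randomly chosen low-quality one. The labels and scores are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = high quality)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical test-set labels and predicted probabilities (illustrative only).
y_true = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.6, 0.7, 0.2, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(classification_metrics(y_true, y_pred))
print("AUC:", roc_auc(y_true, scores))
```

Because this AUC depends only on the ordering of the scores, any strictly increasing transformation of the probabilities leaves it unchanged; it measures discrimination rather than calibration.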
2.6. Statistical analysis
Data analysis was mainly conducted using R software (version 4.5.2) and SPSS 27.0, with machine learning analysis performed by R software. This study developed eight machine learning algorithms for video quality prediction. To prevent data leakage, the data set was randomly split into an 80% training set and a 20% independent test set before any resampling operation. To ensure the stability of the results, the random split was independently repeated 10 times. After the data split, the SMOTE oversampling technique was strictly applied to the training set only, while the independent test set retained its original class distribution without any resampling or data modification. Model performance was evaluated by accuracy, precision, recall, F1 score, and the area under the receiver operating characteristic curve (AUC), with all metrics presented as the mean ± standard deviation of 10 repeated tests. The overall performance differences among models were assessed using the Friedman test, followed by paired Wilcoxon tests with Holm–Bonferroni correction for post hoc pairwise comparisons, with significant differences indicated by letter annotations. Normalized feature importance was calculated for each model, and an AUC-weighted integrated feature importance was generated to identify key predictors. All modeling processes were executed with a fixed random seed to ensure complete reproducibility of the results.
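The ordering of the split and the oversampling step is what prevents leakage, and it can be sketched as follows. This is a simplified Python illustration, not the R implementation used in the study: `smote` below is a bare-bones version of the SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbors), it assumes purely numeric features, and the toy data, function names, and `k` value are assumptions made for the example.

```python
import random


def train_test_split(rows, labels, test_fraction=0.2, seed=0):
    """Shuffle indices with a fixed seed and hold out a test fraction."""
    rng = random.Random(seed)
    idx = list(range(len(rows)))
    rng.shuffle(idx)
    n_test = round(len(rows) * test_fraction)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    take = lambda ids: ([rows[i] for i in ids], [labels[i] for i in ids])
    return take(train_idx), take(test_idx)


def smote(rows, labels, minority=1, k=3, seed=0):
    """Add synthetic minority rows until both classes have equal size."""
    rng = random.Random(seed)
    minority_rows = [r for r, y in zip(rows, labels) if y == minority]
    n_needed = len(rows) - 2 * len(minority_rows)
    if len(minority_rows) < 2 or n_needed <= 0:
        return list(rows), list(labels)  # nothing to interpolate or balance
    synthetic = []
    for _ in range(n_needed):
        base = rng.choice(minority_rows)
        # k nearest minority neighbors of the base row (squared distance).
        neighbours = sorted(
            (r for r in minority_rows if r is not base),
            key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return list(rows) + synthetic, list(labels) + [minority] * n_needed


# Toy data: 20 videos with two numeric features, 5 of them high quality.
features = [[float(i), float(i % 3)] for i in range(20)]
labels = [1 if i % 4 == 0 else 0 for i in range(20)]

# 1) Split first; 2) oversample the training part only.
(train_x, train_y), (test_x, test_y) = train_test_split(features, labels)
balanced_x, balanced_y = smote(train_x, train_y)

print("train before:", len(train_y), "after:", len(balanced_y))
print("test size (untouched):", len(test_y))
```

Repeating this with ten different seeds and averaging the test metrics corresponds to the repeated random splits described above.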
The analysis of multi-feature configurations driving high-quality health science popularization short videos was completed using the QCA package in R software. The outcome variable was defined as GQS ≥ 3, and the condition variables were video duration, account influence, emotional score, account type, release region, and release time period. For continuous variables, the 95th, 50th, and 5th percentiles were used as the anchor points for full membership, the crossover point, and full non-membership, respectively. For categorical variables, membership was assigned based on the proportion of high-quality videos in each category. A necessity analysis was conducted for all conditions, with a consistency threshold of 0.9 used to identify necessary conditions. A truth table was constructed with a case frequency threshold of 2, a raw consistency threshold of 0.8, and a PRI consistency threshold of 0.7. All rows meeting these thresholds and with an output of 1 were extracted from the truth table, and the raw coverage and unique coverage of each combination were calculated. Non-redundant combinations with a unique coverage greater than zero were retained as the final results, and their consistency and coverage were recorded. To ensure the robustness of the results, the analysis was repeated under three separate modifications: adjusting the calibration percentiles of continuous variables (to the 90th, 50th, and 10th percentiles, and to the 75th, 50th, and 25th percentiles), raising the raw consistency threshold to 0.85, and increasing the case frequency threshold to 3.
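The calibration and necessity steps can be made concrete with a small Python sketch. It assumes the commonly used direct method of calibration, in which the three anchors are mapped to log-odds of roughly +3, 0, and -3 (memberships of about 0.95, 0.5, and 0.05); the exact transformation applied by the QCA package in R may differ in detail. The durations and outcome memberships are hypothetical, and the function names are introduced only for this illustration.

```python
import math


def percentile(values, q):
    """Percentile with linear interpolation between order statistics."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def calibrate(x, full_non, crossover, full_in):
    """Map a raw value to a fuzzy membership score in (0, 1)."""
    if not full_non < crossover < full_in:
        raise ValueError("anchors must satisfy full_non < crossover < full_in")
    span = (full_in - crossover) if x >= crossover else (crossover - full_non)
    log_odds = 3.0 * (x - crossover) / span
    return 1.0 / (1.0 + math.exp(-log_odds))


def necessity_consistency(condition, outcome):
    """Degree to which the outcome is a subset of the condition."""
    overlap = sum(min(c, o) for c, o in zip(condition, outcome))
    return overlap / sum(outcome)


def sufficiency_consistency(condition, outcome):
    """Degree to which the condition is a subset of the outcome."""
    overlap = sum(min(c, o) for c, o in zip(condition, outcome))
    return overlap / sum(condition)


def negate(memberships):
    """Membership in the negated set (written with a tilde in QCA tables)."""
    return [1.0 - m for m in memberships]


# Hypothetical video durations in seconds and outcome memberships.
durations = [20, 45, 60, 90, 120, 180, 240, 300, 420, 600]
outcome = [0.1, 0.2, 0.3, 0.4, 0.6, 0.8, 0.9, 0.9, 0.6, 0.3]

anchors = (percentile(durations, 5), percentile(durations, 50),
           percentile(durations, 95))
long_duration = [calibrate(d, *anchors) for d in durations]
short_duration = negate(long_duration)

print("anchors:", anchors)
print("necessity of long duration:",
      round(necessity_consistency(long_duration, outcome), 3))
print("necessity of short duration:",
      round(necessity_consistency(short_duration, outcome), 3))
```

A condition would be flagged as necessary in this scheme only if its necessity consistency reached the 0.9 threshold stated above.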
The Kruskal-Wallis H test was employed to assess the quality disparity among health science short videos of varying duration.
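For readers unfamiliar with the test, the H statistic can be computed directly from pooled ranks. The sketch below is a plain Python illustration with the standard tie correction (which matters here because GQS takes only a few distinct values); it is not the software routine used in the study, and the duration groups and scores are hypothetical.

```python
def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic and its df."""
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    rank_of = {}   # value -> average rank among the pooled observations
    tie_term = 0   # sum of (t^3 - t) over groups of tied values
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sum_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g)
                        for g in groups)
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction, len(groups) - 1


# Hypothetical GQS scores grouped by duration bin (illustrative only).
gqs_by_duration = {
    "<1 min": [1, 2, 2, 2, 3],
    "1-3 min": [2, 3, 3, 3, 4],
    "3-5 min": [3, 3, 4, 4, 5],
    ">5 min": [2, 3, 3, 4, 4],
}

h, df = kruskal_wallis_h(list(gqs_by_duration.values()))
print(f"H = {h:.3f} with {df} degrees of freedom")
```

Under the null hypothesis, H approximately follows a chi-square distribution with k - 1 degrees of freedom, where k is the number of duration groups.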
3. Results
3.1. Basic information about the videos
Video feature statistics.
Two experts conducted quality assessments on the 541 videos in accordance with the GQS scoring criteria. Agreement between the two raters was good, with an ICC of 0.831 (95% confidence interval, 0.800–0.858). The results indicated that the overall quality of the videos was relatively low: 126 videos (23.29%) scored below 3 points, corresponding to low quality, 337 videos (62.29%) were of moderate quality, and only 78 videos (14.42%) scored above 3 points, representing high quality.
Text processing was performed on the video transcripts. The results revealed that conceptual terms such as “blood vessel”, “stroke”, and “aneurysm” appeared frequently. Terms related to disease manifestations, including “hemorrhage”, “prevention”, and “treatment”, were also commonly observed, and disease-related factors such as “hypertension”, “diet”, and “exercise” were present as well. The word cloud is presented in Figure 1, and the detailed word frequency table can be found in the supplementary materials. In terms of emotional scores, 64.7% of the videos scored ≤ 3 points and exhibited negative emotions, while only 5.18% of the videos scored > 3 points and demonstrated positive emotions.
Word cloud of text content from health science popularization short videos on cerebrovascular diseases.
3.2. Comparison of machine learning models’ performance
The performance of machine learning algorithms.

ROC curves (10-run average on test set).

Model performance metrics (10-run average).
There was no statistically significant difference in the performance of these models (P > 0.05). The detailed performance results and comparative rankings of the eight machine learning models are presented in Figure 4.
Model performance: AUC with 95% confidence interval.
3.3. The importance of AUC-weighted aggregated features in all models
The AUC-weighted ensemble feature importance across all eight models is shown in Figure 5. Video duration had the highest importance (30.5%), followed by release region (16.6%), account type (15.5%), emotional score (14.9%), and account influence (14.6%). In contrast, the importance of release time was relatively low.
Feature importance analysis. (a) Weighted ensemble importance. (b) Individual model importance.
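One plausible way to form such an AUC-weighted ensemble importance is sketched below in Python. The exact aggregation formula is not spelled out in the text, so this is an assumption: each model's importances are normalized to sum to 1, and the models are then averaged with weights proportional to their AUC. The model names, raw importances, and AUC values are hypothetical.

```python
def normalize(importances):
    """Rescale one model's feature importances so they sum to 1."""
    total = sum(importances.values())
    return {feature: value / total for feature, value in importances.items()}


def auc_weighted_importance(model_importances, model_aucs):
    """Average normalized importances with weights proportional to AUC."""
    total_auc = sum(model_aucs[m] for m in model_importances)
    combined = {}
    for model, importances in model_importances.items():
        weight = model_aucs[model] / total_auc
        for feature, value in normalize(importances).items():
            combined[feature] = combined.get(feature, 0.0) + weight * value
    return combined


# Hypothetical raw importances from two models (illustrative only).
model_importances = {
    "rf": {"duration": 6.0, "region": 2.0, "account_type": 2.0},
    "lr": {"duration": 1.0, "region": 1.0, "account_type": 2.0},
}
model_aucs = {"rf": 0.80, "lr": 0.75}

ensemble = auc_weighted_importance(model_importances, model_aucs)
for feature, share in sorted(ensemble.items(), key=lambda kv: -kv[1]):
    print(f"{feature}: {share:.1%}")
```

Because every model's importances are normalized before weighting, the ensemble shares also sum to 1 and can be read as percentages.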
3.4. Analysis of the combined effects of the key features
Necessity analysis of a single feature.
“∼” represents negation.
Analysis of the configurations for videos with GQS ≥ 3.
Note. “●” indicates the presence of a condition, meaning its calibrated fuzzy-set membership score is equal to or greater than 0.5. “⊗” indicates the absence of a condition, meaning its membership score is below 0.5.
Results of robustness test.
Note. √ indicates the existence of a solution; blank indicates that no solution exists. “*” represents “AND”. It connects different conditions and indicates that these conditions must be combined together.
3.5. The optimal duration for high-quality health science popularization short videos
To further explore the quality of health science popularization short videos of different durations and identify the most suitable duration, we conducted a segmented comparison of videos of various durations. The comparison results are presented in Figure 6. Videos with durations of 1 to 3 minutes, 3 to 5 minutes, and over 5 minutes were all of higher quality than those with a duration of less than 1 minute (P < 0.001). Videos with a duration of 3 to 5 minutes were of higher quality than those with a duration of 1 to 3 minutes (P < 0.001).
Segmented comparison of video quality across different durations.
4. Discussion
This study systematically explored the pre-release features that affect the quality of health science popularization short videos on cerebrovascular diseases, combining machine learning modeling with fuzzy set qualitative comparative analysis (fsQCA) to clarify the key features influencing video quality and their interaction mechanisms. Video duration and release region were the main features associated with high-quality health science popularization short videos.
4.1. Discrepancy between content producers and video quality
In this study, health science short videos with a Global Quality Scale (GQS) score below 3 were classified as low-quality. The results indicated that although videos from medical staff accounts comprised the largest proportion (77.27%), only 14.42% of all videos were rated as high-quality. This finding revealed a mismatch between the professional skills of medical professionals and the dissemination effect of health science popularization short videos. While medical professionals possess specialized knowledge, they may lack the skills to adapt complex scientific information to the short video format.32 Cerebrovascular disease knowledge is inherently complex and requires sufficient duration to be explained clearly and comprehensibly. The platform’s algorithm, however, inherently favors concise and emotionally engaging content.33 Moreover, low-quality or even false health science popularization videos typically employ sensational language to capture attention.34,35
Meanwhile, the platform’s algorithmic recommendation exacerbated this issue. 36 Videos released during peak user hours (12:00–18:00), which account for 43.81% of all video releases, gained more exposure. However, due to heavy clinical workloads, medical staff may be unable to align their video release times with periods of high user activity. Furthermore, regional disparities reflected the uneven distribution of medical communication resources. While developed regions produced more content, they failed to create videos of higher quality.
4.2. Textual and emotional patterns in science communication
Word cloud analysis showed frequent use of terms like “blood vessel,” “stroke,” and “aneurysm,” indicating that videos prioritize disease-specific concepts. In contrast, the word cloud from the study by Hongyu Wu et al.11 of non-alcoholic fatty liver disease (NAFLD) health science videos on TikTok leaned more toward terms related to lifestyle interventions, such as “diet” and “reversible.” In terms of quality, the overall excellence rate of videos in our study was lower than that in the NAFLD research, reflecting the greater challenges in producing high-quality science videos that balance rigor and accessibility in the more specialized field of cerebrovascular diseases.
The distribution of emotional scores for videos of different qualities indicated that videos of poorer quality tend to have lower emotional scores. Such fear-inducing communication can stir up negative emotions among the public and may cause excessive psychological pressure and fear for those who are fighting the disease or have just been diagnosed.37 Moderate fear appeals, however, can also raise self-protection awareness,38 as exemplified by the classic case of graphic warning images on cigarette packages triggering negative emotional responses to promote self-protective motivation and action. Emotional scores exhibited only a weak correlation with user engagement, indicating that users’ interaction with health science popularization videos was not driven solely by emotion; practical value, such as actionable prevention advice, also played a crucial role. Although high-quality videos provide more accurate information and have a more neutral or rational emotional tone, they struggle to compete with fear-inducing false content in capturing immediate attention.
4.3. Machine learning model performance and feature importance
This study employed eight classic machine learning models to predict the quality of health science popularization short videos on cerebrovascular diseases, including Logistic Regression, Support Vector Machine, Decision Tree, Random Forest, K-Nearest Neighbor, AdaBoost, GBDT, and XGBoost. All models performed well on ten independent, randomly divided test sets, with the area under the receiver operating characteristic curve (AUC) exceeding 0.75 for each. There was no statistically significant difference in AUC among the models. Therefore, model selection should be based more on interpretability and computational efficiency rather than pursuing minor improvements in AUC. This approach aligns with the conclusion of Ding et al., 39 who found that traditional machine learning models can provide higher interpretability while maintaining predictive performance and at a lower training cost. The machine learning models used in this study can also be trained quickly in a regular computing environment and directly output feature importance. The basic hyperparameters of each model, such as the number of trees, maximum depth, learning rate, and kernel function, have been publicly reported to ensure reproducibility. As Cao et al. 40 emphasized, strictly adhering to methodological norms is crucial for enhancing the credibility of machine learning research. Although this study did not conduct extensive hyperparameter searches for each model, it still provides practical and interpretable machine learning evidence for predicting the quality of health science popularization videos with a moderate sample size and clear features.
Feature importance analysis derived from the weighted AUC ensemble of all models indicates that video duration is the most critical predictor. This finding was consistent with our correlation analysis and the results reported by Rongguang Ge et al., 24 which documented a positive correlation between video length and quality. Furthermore, release region, account type, emotional score, and account influence were also assigned relatively high importance scores. From the perspective of account type distribution, videos published by medical professionals account for the largest proportion. This distribution suggested that professional medical personnel were the primary producers of high-quality video content, and their professional knowledge backgrounds might directly improve the accuracy and credibility of the content, which explained the high importance of account type in the prediction models. In terms of releasing region, North China had the largest number of published videos, while Northwest China had the fewest. This geographical distribution characteristic was highly consistent with the regional disparities of medical resources and economic development in China. North China, especially Beijing, hosts the country’s top-tier medical institutions and medical education resources. East China and Central China boast developed economies and dense populations, where the dissemination of online medical information is more active. In contrast, Northwest China has relatively insufficient medical resources and, accordingly, has lower production capacity for high-quality health science popularization videos. The uneven regional distribution further confirms the importance of the publishing region feature. This characteristic not only reflects the geographical agglomeration effect of medical resources but may also affect the applicability of video content to audiences in different regions. 
These results are consistent with social cognitive theory, 41 which posits that users generally evaluate content based on the authority or popularity of the content source. In the domain of medical videos, videos published by medical professionals from regions with developed medical resources inherently carry higher credibility signals and thus are more likely to be regarded as high-quality content by users. In comparison, releasing time showed relatively low predictive importance, which may be attributed to the fact that releasing time has no inherent logical association with the intrinsic quality of video content, and only reflects users’ uploading habits or peak traffic periods of the platform. Therefore, compared with attributes directly related to content quality and source credibility, such as video duration and account origin, contextual factors, such as releasing time, have significantly weaker explanatory power.
4.4. Identification and equivalence analysis of configuration paths
This study found that the main configuration of high-quality health science popularization short videos comprises five key conditions: short duration, low account influence, a high-credibility account type (such as medical professional accounts), high-activity release regions, and high-activity release time periods. Among them, the core driving role of short duration is particularly prominent. Existing empirical studies have shown that the completion rate of 15–30 second short videos is significantly higher than that of longer videos, and the platform algorithm accordingly gives them stronger recommendation weighting.42 Meanwhile, short videos (those lasting less than 60 seconds) have a much higher information density than long videos: shots change more frequently, and the core information is presented upfront. This evidence supports the results of this study. Controlling the duration not only helps creators obtain algorithmic recommendations but also enhances users’ efficiency in absorbing health information. Additionally, the emergence of low account influence as a core condition is a counterintuitive but positive finding, indicating that even if the creator’s fan base is limited, high-quality videos can still be produced as long as the content is concise and powerful and is released by a highly credible account type (such as a hospital or official science popularization account) in active regions or time periods. For short video users, this result means they will have the opportunity to access more high-quality health information from medical professional accounts. Short video creators do not need to overly pursue the accumulation of followers but should focus on optimizing duration, account verification, and release timing to improve content quality.
This study identified two equivalent configurations that differed only in the level of emotional intensity. One required a low emotional score, meaning a calm and objective presentation; the other required a high emotional score, indicating an enthusiastic and engaging expression. This interchangeability suggests that emotional intensity is not a decisive factor for high-quality health science popularization videos. Previous research has indicated that, for general social media content,43 audiences often prioritize emotional stimulation and emotional resonance over the authenticity of information or depth of thought, so that rational cognition often gives way to emotional preferences. In the specific field of health science popularization, however, the equivalence of the high and low emotional score paths suggests that the clarity of fact transmission and scientific accuracy may be more important than the intensity of emotions. This finding complements the traditional view that emotional content spreads more easily. At the same time, account influence was low in all configurations, again indicating that the quality of the content itself can compensate for a creator’s limited traffic base.
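To make the notion of equivalent configurations concrete, the following minimal Python sketch computes the standard fsQCA consistency and coverage measures for two paths that differ only in the emotional score. The membership values, the condition names (`short`, `low_influence`, `emotion`, `high_quality`), and the helper functions are hypothetical and are not taken from the calibrated data of this study; the configurations reported here contain more conditions than the three shown.

```python
# Hypothetical fuzzy-set membership scores in [0, 1]; NOT the study's data.
cases = [
    {"short": 0.9, "low_influence": 0.8, "emotion": 0.2, "high_quality": 0.8},
    {"short": 0.8, "low_influence": 0.9, "emotion": 0.9, "high_quality": 0.9},
    {"short": 0.7, "low_influence": 0.7, "emotion": 0.1, "high_quality": 0.6},
    {"short": 0.2, "low_influence": 0.3, "emotion": 0.8, "high_quality": 0.1},
    {"short": 0.9, "low_influence": 0.6, "emotion": 0.7, "high_quality": 0.7},
    {"short": 0.3, "low_influence": 0.9, "emotion": 0.3, "high_quality": 0.2},
]


def config_membership(case, conditions):
    """Fuzzy AND over a configuration: the minimum membership across conditions.

    `conditions` is a list of (name, present) pairs; an absent condition
    uses the fuzzy negation 1 - membership.
    """
    return min(
        case[name] if present else 1.0 - case[name]
        for name, present in conditions
    )


def consistency(cases, conditions, outcome):
    """Sum of min(X, Y) divided by sum of X (sufficiency consistency)."""
    overlap = sum(min(config_membership(c, conditions), c[outcome]) for c in cases)
    total_x = sum(config_membership(c, conditions) for c in cases)
    return overlap / total_x


def coverage(cases, conditions, outcome):
    """Sum of min(X, Y) divided by sum of Y (raw coverage)."""
    overlap = sum(min(config_membership(c, conditions), c[outcome]) for c in cases)
    total_y = sum(c[outcome] for c in cases)
    return overlap / total_y


# Two illustrative paths that differ only in the emotional condition.
calm_path = [("short", True), ("low_influence", True), ("emotion", False)]
vivid_path = [("short", True), ("low_influence", True), ("emotion", True)]

for label, path in (("calm", calm_path), ("vivid", vivid_path)):
    print(
        label,
        round(consistency(cases, path, "high_quality"), 3),
        round(coverage(cases, path, "high_quality"), 3),
    )
```

When both paths reach an acceptable consistency level, either one is treated as sufficient for the outcome, which is the sense in which the emotional score is interchangeable across the two configurations.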
4.5. The importance of controlling the duration of health science popularization short videos
For high-quality health science popularization short videos, choosing an appropriate duration is of great importance. Although the above configuration analysis indicated that short durations had advantages in completion rate and recommendation weighting, overly short durations often lead to fragmented knowledge presentation, making it difficult to explain disease prevention or rehabilitation knowledge systematically. Such fragmented content not only weakens the audience’s capacity for in-depth thinking and rational judgment but also impairs memory.44 More seriously, the frequent context switching caused by a continuous flow of information significantly impairs prospective memory, that is, the ability to remember to carry out an intention in the future.45 Our results further suggested that the optimal duration for truly high-quality health science popularization videos is between 3 and 5 minutes. A duration that is too short cannot fully explain the causes, symptoms, and countermeasures of diseases, which compromises the completeness and scientific rigor of the content. Conversely, a duration that is too long may exceed the audience’s attention span and reduce their willingness to like, collect, or share. Therefore, we call on creators of health science popularization short videos, especially medical professionals, to consciously keep the video duration within 3 to 5 minutes to balance information density and systematic coverage of knowledge. At the same time, short video platforms should optimize their recommendation mechanisms, giving appropriate preference to videos with moderate duration and complete content and avoiding excessive rewards for fragmented content.
By reasonably controlling video duration, creators can not only improve video quality but also help the public more efficiently and systematically acquire knowledge on disease prevention and health management, thereby promoting a substantial improvement in health literacy.
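As an illustration of how quality differences across duration groups can be tested, the sketch below implements the Kruskal-Wallis H statistic with tie correction in plain Python. The three duration bands and the GQS scores are invented for demonstration and do not reproduce the data or grouping of this study; the p-value shortcut applies only to three groups (two degrees of freedom) and relies on the chi-square approximation.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current block of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    if len(groups) < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    pooled = [value for group in groups for value in group]
    n = len(pooled)
    ranks = average_ranks(pooled)

    # Sum of squared rank sums, each divided by its group size.
    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * weighted - 3 * (n + 1)

    # Tie correction: 1 - sum(t^3 - t) / (n^3 - n) over blocks of tied values.
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction


def p_value_three_groups(h):
    """Chi-square survival function for exactly 2 degrees of freedom."""
    return math.exp(-h / 2)


# Hypothetical GQS scores (1 to 5) in three assumed duration bands.
under_3_min = [2, 2, 3, 2, 3, 1]
three_to_5_min = [4, 3, 4, 5, 4, 3]
over_5_min = [3, 2, 3, 4, 3, 2]

h = kruskal_wallis_h([under_3_min, three_to_5_min, over_5_min])
print(f"H = {h:.3f}, approximate p = {p_value_three_groups(h):.4f}")
```

A significant H only indicates that at least one group differs; identifying which duration band scores highest would require post hoc pairwise comparisons with a multiplicity correction.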
4.6. Limitations
This study had several limitations. First, the data were exclusively derived from videos on the Chinese TikTok platform collected over the past year, and it remains unclear whether the findings can be generalized to other platforms or extended to longer time periods. Second, the GQS scores rely on subjective ratings from two experts. Although the inter-rater reliability was relatively high, individual bias cannot be ruled out. Third, the model only utilizes ex-ante features available before content release and excludes ex-post user behavior data. While this design avoids reverse causality, it also limits the model’s capability to leverage user feedback signals. Finally, this study only includes Chinese-language health science popularization short videos, and its applicability to contexts with other languages, cultures, or healthcare systems remains to be investigated.
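Agreement between two raters can be quantified in several ways, and the statistic used in this study is not restated here. As one common option, the sketch below computes unweighted Cohen's kappa for the GQS scores of two hypothetical raters; the scores and the function name are assumptions for illustration only. For ordinal scales such as the GQS, a weighted kappa or an intraclass correlation coefficient may be preferable.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equally long lists of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)

    # Observed proportion of items on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance from each rater's marginal frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical GQS scores (1 to 5) from two raters; NOT the study's ratings.
scores_rater_1 = [1, 2, 3, 4, 5, 3, 2, 4]
scores_rater_2 = [1, 2, 3, 4, 5, 3, 3, 4]

print(round(cohens_kappa(scores_rater_1, scores_rater_2), 3))
```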
5. Conclusion
This study systematically identified the main determinants of the quality of cerebrovascular disease-related health science popularization short videos on TikTok by integrating machine learning algorithms and fuzzy set qualitative comparative analysis (fsQCA). The results showed that all eight machine learning models achieved robust predictive performance, and no statistically significant differences were observed among the models. Feature importance and configuration effect analysis indicated that video duration was a key feature affecting video quality, and short duration was one of the core conditions for high-quality health education short videos. The optimal duration range for such high-quality videos was determined to be 3 to 5 minutes. Overall, this study confirmed that optimizing video duration is a core strategy for improving the quality of health education short videos on cerebrovascular diseases.
Based on these findings, creators of health science popularization short videos should keep the video duration within 3 to 5 minutes, as this range was identified as optimal for information dissemination. Creators should also enhance the credibility of their accounts by prominently displaying their professional qualifications. Users can choose videos from creators affiliated with more authoritative hospitals or located in more economically developed regions, and should give priority to videos from certified medical professionals, hospitals, or government health accounts. Since posting time is relatively less important, users do not need to focus on when a video was posted.
Supplemental material
Supplemental material for The pivotal role of video duration in health science popularization: A mixed-methods analysis integrating machine learning and fuzzy-set qualitative comparative analysis by Xueping Jiao, Xingyu Liu, Mengting Liu, Yueting Wang, Shuhan Yang, Xueqin Yang, Yuhuan Xie, Yufang Guo, Fanghong Yan and Yanan Zhang in Digital Health.
Footnotes
Acknowledgements
The authors thank the Gansu Provincial People’s Hospital for supporting this study.
Ethical considerations
This study has been approved by the Medical Ethics Committee of the School of Nursing, Lanzhou University, with the approval number: LZUHLXY20250052. All research data were derived from publicly accessible health science popularization short videos on TikTok, and did not involve personal privacy information.
Author contributions
Yanan Zhang conceived and designed the study. Xueping Jiao and Xingyu Liu collected the videos. Mengting Liu and Xueqing Yang collected the characteristics of the videos and authors. Yueting Wang, Shuhan Yang, Yuhuan Xie, and Yufang Guo were responsible for reviewing, classifying, and scoring the videos. Xueping Jiao and Xingyu Liu analyzed and visualized the data. Fanghong Yan managed the project, and Yanan Zhang provided financial support. Xueping Jiao and Xingyu Liu wrote the original draft. Yanan Zhang, Xueping Jiao, and Xingyu Liu reviewed and edited the manuscript. All the authors contributed to manuscript writing and editing, and approved the final draft for submission.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by the Project of Gansu University Teachers Innovation Fund [No. 2025B-016], the 2025 Research Project of the Chinese Nursing Association [No. ZHKYQ202516], and the General Project of the Gansu Provincial Department of Science and Technology [No. 26JRRA195].
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
The datasets generated or analyzed during this study are available from the corresponding author on reasonable request.
Supplemental material
Supplemental material for this article is available online.
References
