Abstract
BACKGROUND:
Machine learning offers diverse options for effectively managing blood glucose levels in diabetes patients. Selecting the right ML algorithm is critical given the array of available choices. Integrating data from IoT devices presents promising opportunities to enhance real-time blood glucose management models.
OBJECTIVE:
This meta-analysis aims to evaluate the effectiveness of machine learning models utilizing IoT device data for predicting blood glucose levels.
METHODS:
We systematically searched electronic databases for studies published between 2019 and 2023. We excluded studies lacking ML model derivation or performance metrics. The Quality Assessment of Diagnostic Accuracy Studies tool assessed study quality. Our primary outcomes compared ML models for BG level prediction across different prediction horizons (PHs).
RESULTS:
We analyzed ten eligible studies across prediction horizons of 15, 30, 45, and 60 minutes. ML models exhibited mean RMSE values of 15.02 (SD 1.45), 21.488 (SD 2.92), 30.094 (SD 3.245), and 35.89 (SD 6.4) mg/dL, respectively. Random Forest demonstrated superior performance across these PHs.
CONCLUSION:
We observed significant heterogeneity across all subgroups, indicating diverse sources of variability. As the PH lengthened, the RMSE for blood glucose prediction by the ML model increased, with Random Forest showing the highest relative performance among the ML models.
Introduction
Diabetes is a global health problem that is expected to worsen over the next decade [1, 2]. Uncontrolled diabetes has a significant impact on an individual’s health and wellbeing, which underscores the urgent need for effective management strategies that prevent complications and improve outcomes [3, 4]. Machine learning (ML) combined with Internet of Things (IoT) applications is expected to revolutionize diabetes management [5]. The IoT plays a significant role in the healthcare industry, both in application-oriented tasks and in maintaining patient health records, and it enables automatic and continuous monitoring, which is particularly useful in mobile healthcare applications [6, 7, 8]. IoT devices such as continuous glucose monitors (CGMs) record physiological data continuously and in real time, providing information that ML algorithms can use for predictive modeling and personalized care.
ML models are valuable tools for identifying and managing diabetes. They have been reported to predict diabetes development well, leveraging data from a person’s medical history, risk factors, and genetic makeup [9, 10]. ML algorithms can analyze complex data from CGMs, electronic health records (EHRs), and lifestyle factors to predict glycemic changes and improve glycemic control [11, 12]. Various ML algorithms, such as random forests (RF), support vector machines (SVM), neural networks, and autoregressive models, have been examined for their effectiveness in predicting blood glucose (BG) levels and diabetes complications. However, the effectiveness of these models may vary between studies because of differences in data elements, designs, and patient populations [13, 14, 15].
ML therefore has an important role in management for people with diabetes. Given the continuous improvement of ML models and the rapid development of IoT devices, this meta-analysis examines the trends and changes observed over the last five years. By reviewing studies published between 2019 and 2023, we aim to provide an up-to-date assessment of ML for blood glucose monitoring with IoT technology. Specifically, we evaluate the effectiveness of ML models in predicting blood glucose outcomes and improving glycemic control in patients with diabetes, and we synthesize the existing literature to identify advances, challenges, and trends in the integration of ML and IoT technologies for diabetes management. We also examine the specific ML algorithms and IoT architectures that have been shown to be useful in diabetes prediction and management. The findings are intended to guide future research and inform ML-driven solutions in clinical practice, ultimately improving glucose management and reducing the burden of diabetes complications.
Methods
This study strictly followed the reporting guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA; Liberati et al., 2009) [16, 17]. PRISMA provides a transparent and reproducible framework for article selection, evaluation, and analysis. A predefined protocol was established to document the analysis methodology and the inclusion criteria.
Study design
This section outlines the research design and methodology utilized in this study. It covers the eligibility criteria, information sources, research inquiries, study selection procedures, data collection methods, and the article selection process for publication.
Research questions
General questions (GQ):
What advancements and trends have emerged in diabetes management with the integration of machine learning and IoT technologies over the past five years? What are the key challenges and gaps identified in the literature regarding the practical implementation of machine learning and IoT solutions in diabetes management during this period?
Specific questions (SQ):
These questions aim to delve deeper into the ways in which specific health parameters and physiological data are monitored and analyzed using machine learning and IoT devices in the context of diabetes management. They focus on understanding: what are the outcomes and effectiveness observed from the application of machine learning and IoT techniques in the management and control of diabetes, particularly in predicting blood glucose levels and optimizing glycemic control? Which machine learning algorithms and IoT architectures are predominantly employed in the development and deployment of solutions for diabetes management?
The answers to these research questions will provide comprehensive insights into the current state of machine learning and IoT applications in diabetes management. They will help evaluate the impact, challenges, and opportunities associated with these technologies in improving diabetes care and patient outcomes.
Constructing the search string involved searching scientific databases and cross-referencing known terms, including synonyms, acronyms, and word combinations relevant to the study’s context. We used the PICOS method to refine our search strings [18]. PRISMA recommends this approach for defining objectives, research questions, and eligibility criteria. Each letter in PICOS denotes one domain: participants (P), intervention (I), comparison (C), outcome (O), and study design (S).
Participants: Adult individuals diagnosed with diabetes mellitus, including those with type 1 diabetes, type 2 diabetes, or gestational diabetes.
Interventions: Utilization of machine learning algorithms and IoT technologies, including wearable devices and smart sensors, for monitoring, management, and prediction of blood glucose levels in diabetic patients.
Comparisons: Comparison of the effectiveness and outcomes achieved through the integration of machine learning and IoT technologies with traditional methods of diabetes management.
Outcomes: Assessment of outcomes related to glycemic control, blood glucose prediction accuracy, improvement in patient outcomes (such as quality of life, morbidity, and mortality rates), identification of challenges and gaps in the implementation of machine learning and IoT solutions in diabetes management.
Study Design: Inclusion of research articles, clinical trials, observational studies, and feasibility studies that investigate the integration of machine learning and IoT technologies in diabetes management. Emphasis on studies reporting outcomes related to the application of machine learning algorithms and IoT architectures in predicting blood glucose levels, optimizing glycemic control, and addressing challenges in diabetes management.
Based on this search strategy, we defined the following search string for querying the databases:
For article selection, we retrieved studies published within the last five years (2019–2023) from electronic databases using our predefined search string. The databases surveyed included Scopus, Springer, IEEE Xplore, PubMed, CINAHL, Embase, Web of Science, and Nature. These databases were selected due to their comprehensive coverage of relevant articles in the field addressed in this paper. Moreover, they offer access to full-text journals and conference proceedings from prominent health conferences focusing on patient self-care, IoT, diabetes, wearable devices, and related topics. The last search was done on January 15th, 2024.
Exclusion criteria
Articles focused on pediatric populations, including children and adolescents (up to 18 years of age), were excluded.
Articles not involving Continuous Glucose Monitoring (CGM) technologies were excluded, because our meta-analysis specifically focuses on CGM technologies used in diabetes management.
Articles not reporting primary research studies, such as theses, opinions, abstracts, dissertations, criticisms, books, protocols, posters, reviews, and oral presentations, were excluded.
Articles that do not specifically discuss the utilization of IoT techniques, including wearable electronic devices, for monitoring, self-care, and management during the treatment phase of diabetes patients were excluded.
Inclusion criteria
Studies involving adult men and women diagnosed with diabetes mellitus, including type 1 diabetes, type 2 diabetes, or gestational diabetes.
Studies published within the last 5 years to capture recent advancements and trends in the field of diabetes management.
Articles written in English to ensure accessibility and comprehensibility for analysis and interpretation in the meta-analysis.
Data extraction and management
Two reviewers independently conducted data extraction and quality assessment. Any disagreements were resolved by an impartial third reviewer. When a study reported multiple test results for the same ML model, the most favorable outcome was chosen for extraction. Similarly, if a study evaluated multiple ML models, performance metrics for each model were extracted individually. In studies focusing on blood glucose level prediction, root mean square errors (RMSEs) for different prediction horizons (PHs) were extracted. For studies not specifying PHs, performance metrics such as the R-squared value and accuracy of the ML models were extracted.
Specifically, the following information was extracted:
General characteristics: first author, publication year, country, data source, and study purpose (i.e., predicting blood glucose).
Experimental information: participants (type of DM, type 1 or 2), sample size (patients, data points, and hypoglycemia), demographic information, models, study place and time, model parameters (i.e., input and PHs), model performance metrics, IoT applications used.
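To make the extracted metric concrete, the following minimal sketch shows how an RMSE at a given prediction horizon could be computed from a CGM series. It is an illustration only: the 5-minute sampling interval, the linear toy trace, and the naive "last value carried forward" forecaster are assumptions and are not taken from any included study.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between two equal-length sequences (mg/dL)."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def persistence_rmse(readings, horizon_min, sample_min=5):
    """RMSE of a naive forecast that repeats the current reading.

    Assumes readings are evenly spaced every `sample_min` minutes
    (a hypothetical CGM sampling interval).
    """
    steps = horizon_min // sample_min
    if steps < 1 or steps >= len(readings):
        raise ValueError("horizon does not fit the series")
    predicted = readings[:-steps]  # value known at time t
    observed = readings[steps:]    # value actually seen at t + horizon
    return rmse(predicted, observed)


# Hypothetical trace: glucose rising by 2 mg/dL every 5 minutes.
cgm = [100 + 2 * i for i in range(13)]
for ph in (15, 30, 45, 60):
    print(f"PH {ph} min: RMSE = {persistence_rmse(cgm, ph):.2f} mg/dL")
```

With this strictly linear toy trace the naive forecast error grows in proportion to the horizon, which mirrors in a simplified way why RMSE values are extracted separately for each PH.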
The quality of the included studies was assessed using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool. This tool evaluates studies across four domains: patient selection (5 items), index test (3 items), reference standard (4 items), and flow and timing (4 items). All four domains were used to assess the risk of bias, while the first three domains were specifically used to evaluate concerns regarding applicability. Each domain consists of a set of questions (totaling 7) related to either risk of bias or applicability [19].
Data synthesis and statistical analysis
The performance metrics of models used for blood glucose level prediction were evaluated independently based on their specified prediction horizons. Studies that did not specify prediction horizons were analyzed separately. The primary performance metric used was the root mean square error (RMSE) of ML models in predicting BG levels. For each study, effect sizes (Cohen’s d) and standard errors were calculated. Study heterogeneity was assessed using I2 values obtained from multivariate random-effects meta-regression, which accounted for within- and between-study correlations. Heterogeneity was categorized into quartiles based on these values: 0% to
Furthermore, studies focusing on BG levels were divided into four subgroups based on different prediction horizons (15, 30, 45, 60, and 120 minutes). A two-sided
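The quantities named in this section (Cohen's d, its standard error, and the I2 statistic) can be illustrated with a short sketch. Note the assumptions: this is a simple univariate DerSimonian-Laird random-effects pooling, not the multivariate meta-regression described above, and all input numbers are hypothetical rather than data from the included studies.

```python
import math


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d using the pooled SD, with its approximate standard error."""
    pooled_sd = math.sqrt(
        ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    )
    d = (mean1 - mean2) / pooled_sd
    se = math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    return d, se


def random_effects(effects, ses):
    """DerSimonian-Laird pooling: pooled effect, Cochran's Q, tau^2, I^2 (%)."""
    w = [1 / se ** 2 for se in ses]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_star = [1 / (se ** 2 + tau2) for se in ses]    # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    pooled_se = math.sqrt(1 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {"pooled": pooled, "se": pooled_se, "Q": q, "tau2": tau2, "I2": i2}


# Hypothetical per-study effect sizes and standard errors.
example_effects = [0.20, 0.85, 1.40, 0.55]
example_ses = [0.15, 0.20, 0.25, 0.18]
result = random_effects(example_effects, example_ses)
print({name: round(value, 3) for name, value in result.items()})
```

The I2 value expresses the share of total variation in effect sizes that is attributable to between-study differences rather than sampling error, which is why it is reported alongside each pooled estimate.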
Results
Out of 1,174 studies identified through systematic search of predefined electronic databases, 1,067 (91%) remained after removing duplicates. Following screening of titles and abstracts, 734 (68.79%) studies were excluded due to irrelevant topics or lack of predefined outcomes. The remaining 333 (31.2%) studies underwent full-text evaluation. Of these, 323 (97%) were excluded for various reasons, leaving 10 (3%) studies included in the final meta-analysis.
PRISMA flow diagram of identifying and including studies.
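The study-selection counts reported above can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the text; the variable names are illustrative.

```python
# Counts as reported in the study-selection paragraph.
identified = 1174
after_dedup = 1067
excluded_screening = 734
excluded_full_text = 323

duplicates_removed = identified - after_dedup
full_text = after_dedup - excluded_screening
included = full_text - excluded_full_text


def pct(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100 * part / whole


print(f"Duplicates removed: {duplicates_removed}")
print(f"After de-duplication: {after_dedup} ({pct(after_dedup, identified):.0f}%)")
print(f"Excluded at screening: {excluded_screening} "
      f"({pct(excluded_screening, after_dedup):.2f}%)")
print(f"Full-text evaluation: {full_text} ({pct(full_text, after_dedup):.1f}%)")
print(f"Excluded at full text: {excluded_full_text} "
      f"({pct(excluded_full_text, full_text):.0f}%)")
print(f"Included: {included} ({pct(included, full_text):.0f}%)")
```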
In total, the 10 studies included 8,776 participants with over 20 different ML models and different IoT devices (Table 1).
Quality assessment of included studies
The evaluation findings using the QUADAS-2 tool indicated that 30% of the studies included did not provide detailed reporting of patient selection criteria, resulting in substandard patient selection quality.
Statistical analysis
Machine learning models for predicting blood glucose levels
In our meta-analysis evaluating the performance of machine learning (ML) models at a 15-minute prediction horizon, we observed significant heterogeneity across the included studies. This analysis incorporated data from 4 studies [22, 23, 25, 27], collectively examining 5 distinct ML models. The mean RMSE was 15.02 (SD 1.45) mg/dL. The omnibus test of model coefficients yielded a statistically significant result (Q
Baseline characteristics from included studies of predicting BG levels
RF, Random Forest; SVR, Support Vector Regression; SVM, Support Vector Machine; BRNN, Bayesian Regularized Neural Networks; RMSE, Root Mean Square Error;
Assessment of study quality. Graph (A) depicting risk of bias and concerns about applicability, and Summary (B) showing risk of bias and applicability concerns.
heterogeneity on the meta-analysis results. Furthermore, regression testing for funnel plot asymmetry using Egger’s test detected significant asymmetry (
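For readers unfamiliar with Egger's test, the regression behind it can be sketched as follows. This is a minimal illustration under stated assumptions: it uses an ordinary least-squares fit of the standardized effect (effect divided by its standard error) on precision (one over the standard error), the function name and example inputs are hypothetical, and it returns only the intercept and its t statistic; a p-value would come from a t distribution with n - 2 degrees of freedom.

```python
import math


def egger_intercept(effects, ses):
    """Egger's regression: fit (effect / SE) = a + b * (1 / SE).

    Returns the intercept a, its standard error, and the t statistic.
    An intercept far from zero suggests funnel plot asymmetry.
    """
    n = len(effects)
    if n < 3:
        raise ValueError("need at least 3 studies")
    x = [1 / se for se in ses]                    # precision
    y = [e / se for e, se in zip(effects, ses)]   # standardized effect
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r ** 2 for r in residuals) / (n - 2)  # residual variance
    se_intercept = math.sqrt(s2 * (1 / n + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t_stat


# Hypothetical effect sizes and standard errors for five studies.
a, se_a, t = egger_intercept(
    [0.90, 0.70, 0.55, 0.40, 0.35], [0.40, 0.30, 0.22, 0.15, 0.10]
)
print(f"intercept = {a:.3f}, SE = {se_a:.3f}, t = {t:.2f}")
```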
Forest Plot for comparing ML models at a PH 
For PH
For PH
Forest Plot for comparing ML models at a PH 
For PH
Funnel Plot for studies comparing ML models at a PH 
For PH
Forest Plot for comparing ML models at a PH 
Studies without a specific prediction horizon were included in the analysis to assess the performance of ML models in diabetes management irrespective of time-based forecasting; these comprised 3 studies [28, 30, 31] with 13 different ML models. The omnibus test of model coefficients yielded a statistically significant result (Q
Key findings
This meta-analysis evaluated the effectiveness of various ML models in improving blood glucose management among patients with diabetes mellitus (DM), drawing on 10 eligible studies. Through a thorough literature search, we obtained evidence to assess the collective predictive capacity of ML models for BG level prediction in diabetes management.
Included studies comparison
The RMSE of ML models in predicting blood glucose levels increased as the PH extended from 15 to 60 minutes, suggesting that longer PHs are associated with greater prediction error. At a prediction horizon of 15 minutes, the Random Forest (RF) model consistently demonstrated superior performance compared with the other models (SVR, SVM, ARISES) across studies. RF may therefore be considered the best-performing model for predicting BG levels at this specific prediction horizon based on the available data. In our analysis of the 15-minute prediction horizon, we examined multiple studies with Cohen’s d values ranging from
In the investigation of a 30-minute prediction horizon the results from Rodríguez-Rodríguez et al. consistently demonstrated that Random Forest exhibited superior performance compared to Support Vector Machine and Bidirectional Recurrent Neural Network models, with Cohen’s d values ranging from
For PH
In our comparative analysis of predictive models with no specific prediction horizon, the ensemble ML model consistently emerged as the most effective. It demonstrated a substantial advantage over Linear Regression (LR), Random Forest (RF), and Gradient Boosting (GB), with Cohen’s d effect sizes ranging from
In the study by Zahedani et al. [29], among the evaluated models (CGP, XGBoost, RF), CGP performed best, with the lowest RMSE (13.4), highest correlation (0.71), and lowest percent error (10.3%). These results highlight CGP’s suitability for accurate predictions in similar datasets, underscoring the impact of advanced machine learning techniques on predictive accuracy.
Strengths and limitations
The study is subject to several limitations. Despite employing a comprehensive search strategy, there is a possibility of missing relevant studies. To enhance literature retrieval, major medical databases such as PubMed, CINAHL, and Embase were included, and baseline models from relevant studies were screened to minimize omissions. Additionally, significant heterogeneity was observed across all subgroups due to various factors, including different types of diabetes mellitus, machine learning models, data sources, reference indices, and the timing and settings of data collection. To address this, meta-regression analyses were conducted within subgroups to explore potential sources of heterogeneity. Moreover, some studies lacked the required outcome measures or had inconsistent ones, necessitating the use of estimation methods for calculating indicators, which may have introduced some estimation error. However, this error was considered acceptable due to the use of appropriate estimation methods, enriching the study’s findings. Nonetheless, future studies should report all relevant outcome measures for comprehensive evaluation.
Future directions
Improved ML models could enhance BG management for patients with DM, reducing adverse BG events and improving quality of life. Future studies should prioritize enhancing ML model performance at longer prediction horizons (e.g., 60 minutes) and address imbalanced CGM data to improve model accuracy. Integrating factors such as meal intake and exercise into ML models, optimizing ensemble structures, and validating models in clinical settings are crucial steps for advancing BG management to support real-time feedback and medical intervention. Additionally, leveraging IoT benefits such as continuous monitoring and data integration could further enhance the effectiveness of these ML models in managing blood glucose levels.
Conclusion
In summary, as the prediction horizon (PH) extends, the RMSE for blood glucose level prediction models increases, with Random Forest (RF) demonstrating the most robust performance among the ML models assessed. Future research should prioritize improving predictive accuracy and implementing ML models effectively in clinical settings. Additionally, exploring enhanced approaches for integrating data from IoT devices could further optimize glucose management strategies.
Funding
The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.
Author contributions
All authors contributed to the study’s conception and design. Material preparation, data extraction and analysis were performed by YH and YK. The first draft of the manuscript was written by YK. All authors read and approved the final manuscript.
Data availability
The original contributions presented in the study are included in the article/Supplementary materials; further inquiries can be directed to the corresponding author.
Supplementary data
The supplementary files are available to download from https://dx-doi-org.web.bisu.edu.cn/10.3233/THC-241403.
Footnotes
Acknowledgments
We would like to thank the senior management of Delhi Technological University for their constant support and guidance.
Conflict of interest
The authors declare no conflicts of interest or competing interests.
