Abstract
Background:
The use of continuous glucose monitoring (CGM) devices has been associated with improved glycemic control in individuals with type 1 diabetes (T1D). A key challenge for CGMs, however, is achieving accuracy, particularly under conditions where glucose levels may fluctuate rapidly, such as during exercise. Another factor contributing to blood glucose variability is the menstrual cycle, during which hormonal fluctuations affect insulin sensitivity, leading to variable glucose levels. This study aimed to assess the accuracy of FreeStyle Libre-3 (FSL3) during continuous moderate-intensity aerobic exercise (CONT) performed in the follicular and luteal phases of the menstrual cycle in females with T1D.
Methods:
Participants underwent CONT sessions on a cycle ergometer, one in the follicular phase and one in the luteal phase of the menstrual cycle, at the Research Laboratory of the Faculty of Physiotherapy. Glucose levels were measured every 10 min using FSL3 and the YSI 2500 as a gold standard. Measurements began 20 min before CONT and continued for 20 min after exercise.
Results:
A total of 26 females (mean age 32.2 ± 6.1 years and mean duration of diabetes 16.4 ± 8.4 years) participated in this study. FSL3 showed significant differences compared with YSI glucose data for both phases of the menstrual cycle (about 16 mg/dL higher in FSL3). There were no differences in mean absolute relative differences (MARDs) between the follicular (16.06%) and luteal (16.43%) phases. Moreover, exercise did not affect MARDs, which were 14.21% pre-exercise and 17.63% postexercise for the follicular phase and 14.95% pre-exercise and 17.71% postexercise for the luteal phase.
Conclusions:
The findings suggest that the accuracy of FSL3 is not affected by CONT, showing good accuracy levels in both phases of the menstrual cycle. To our knowledge, this study is the first to examine the influence of the menstrual cycle and exercise on the accuracy of a CGM device.
The study was also prospectively registered at clinicaltrials.gov (NCT06086067).
Introduction
Technology in the management of type 1 diabetes (T1D) is rapidly advancing, with the goal of providing various tools and treatments to support patients in managing their condition. 1 Among these tools, continuous glucose monitoring (CGM) systems are among the most rapidly evolving. 2,3 A notable example is the FreeStyle Libre (FSL) system (Abbott). The first version, FSL1, functioned as an intermittent scanning CGM that required active scanning with a mobile phone or reader to capture glucose data. The latest version, FSL3, has advanced to a real-time CGM that records glucose levels every 5 min—an improvement over the 15-min intervals of its predecessor—and also features a smaller sensor size. 4 In addition, CGM devices have paved the way for innovations like integrated sensor-pump systems, artificial pancreas technology, and smart insulin pens. 5
The use of CGMs has been associated with improved glycemic control in individuals with T1D, including reductions in glycemic variability, hypoglycemic episodes, and glycated hemoglobin (HbA1c) levels. 6,7 A key challenge for CGMs, however, is achieving accuracy, particularly under conditions where glucose levels may fluctuate rapidly, such as during exercise. 8–10 Many CGMs lack accuracy when glucose values reach extreme levels or fluctuate quickly, as often happens during physical activity. 11–13
Another factor contributing to blood glucose variability is the menstrual cycle, during which hormonal fluctuations, primarily in estrogen and progesterone, affect insulin sensitivity, leading to variable glucose levels. 14–16 The combination of two factors known to influence blood glucose in females with T1D, physical activity and the menstrual cycle, may further impact CGM accuracy. 17 However, to our knowledge, no study has evaluated CGM accuracy during physical activity across different menstrual cycle phases.
Continuous moderate-intensity aerobic exercise (CONT) is the most commonly performed exercise type among individuals with T1D. 6,18–20 Prior studies indicate that CONT can affect the accuracy of CGMs like the Dexcom G6 and FSL2. 21 Moreover, CONT has been shown to have varying impacts on glycemia during the follicular versus luteal menstrual phases. 22 Given the differing glycemic variability associated with exercise, it is of particular interest to investigate whether the accuracy of CGMs, specifically FSL3, differs during CONT performed in these phases. Therefore, the present study aimed to assess the accuracy of FSL3 during CONT performed in the follicular and luteal phases of the menstrual cycle.
Research Design and Methods
Patients and experimental design
Twenty-six females with T1D from the Diabetes Reference Unit at Clinic University Hospital of Valencia, Spain, were recruited for this study. The inclusion criteria were as follows: (1) female participants with a diagnosis of T1D for at least 2 years; (2) age between 18 and 42 years; (3) HbA1c ≤8.5% (≤69 mmol/mol); (4) maintaining a stable insulin regimen with less than a 20% change in the total daily insulin dose over the past 6 months; and (5) a minimum of 90 min of physical activity per week, assessed by the International Physical Activity Questionnaire (IPAQ). Exclusion criteria included the following: (1) medical conditions or medications (excluding insulin) known to affect glycemic control (e.g., metformin or oral or injectable steroids); (2) use of an insulin pump; (3) use of oral contraceptives or any hormonal contraceptive methods; and (4) pregnancy or breastfeeding. Before signing informed consent, each participant was briefed on the potential risks, benefits, and objectives of the study. All procedures adhered to the principles of the Helsinki Declaration, and the experimental protocol was approved by the Ethics Committee of the University of Valencia (Spain) (1586990).
Study measures
Interstitial glucose levels were monitored using the FSL3 device (Abbott Diabetes Care, Alameda, CA, USA). The sensor was placed on the first day of menstruation, 48 h before the initial aerobic exercise session, and remained in place for at least 48 h following the final session. After 15 days, a new sensor replaced the initial one (Fig. 1A). Both sensors were positioned on the upper back part of the arms, and participants followed the protocol by wearing the FSL3 devices as instructed. 23

Study design with
Plasma glucose was measured every 10 min during both aerobic sessions by a nurse who used a forearm catheter to obtain venous blood samples, which were centrifuged immediately after extraction. Blood glucose measurements were taken 20 min before exercise, during the 30-min exercise session, and 20 min postexercise, using a YSI 2500 STAT Plus Analyzer (Yellow Springs, OH) (Fig. 1B). The YSI 2500 is considered the gold standard for blood glucose measurements, as its accuracy is maintained between tests through calibration with a solution containing 2.5 g/L of dextrose. The first blood sample of the luteal session was also used for subsequent hormone analysis to determine whether the cycle was ovulatory (progesterone >5 ng/mL).
Throughout the study, participants maintained their usual dietary and insulin regimens. To minimize the effect of circulating insulin during exercise, they were advised to avoid fast-acting insulin for at least 3 h before each session. In addition, participants were instructed to refrain from eating during this period if their glucose levels remained within a safe range. All exercise sessions took place at around 7 p.m., under the supervision of a physiotherapist specialized in exercise, at the Research Laboratory of the Faculty of Physiotherapy, University of Valencia, Spain.
Exercise intervention
Before the two CONT sessions, all participants completed an incremental exercise test on a cycle ergometer to determine the target power for these sessions. This incremental exercise test was conducted 1 week before each participant’s expected menstruation date. In addition, anthropometric and sociodemographic data were collected, and physical activity levels were assessed using the IPAQ.
The incremental exercise test began with participants resting for 3 min at 0 W, followed by a 3-min warm-up at 40 W. Afterward, the workload increased by 20 W every 3 min until participants reached volitional exhaustion. Participants were encouraged to give maximal effort throughout the test. The test concluded with a 3-min cooldown at 40 W and a passive 3-min recovery at 0 W on the cycle ergometer. Heart rate was monitored using a Polar H10, and lactate turn point 1 (LTP1) and its corresponding power output were determined during the test to establish exercise intensity in the CONT sessions. The wattage increments were set by piloting with 10 healthy female participants to ensure a minimum of four to five lactate measurements.
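The stepwise protocol described above can be sketched as a small lookup function. This is only an illustration: the function name and parameters are hypothetical, and it assumes that the first 20 W increment is applied at the end of the warm-up (minute 6), which the text does not state explicitly. The cooldown and recovery stages are omitted because their timing depends on when each participant reaches exhaustion.

```python
def incremental_test_workload(minute: float, step_w: int = 20, stage_min: int = 3) -> int:
    """Target workload (W) at a given minute of the incremental test, before exhaustion.

    Assumed timeline (hypothetical reading of the protocol):
      0-3 min  rest at 0 W
      3-6 min  warm-up at 40 W
      6+ min   +20 W at the start of every 3-min stage
    """
    if minute < 0:
        raise ValueError("minute must be non-negative")
    if minute < 3:
        return 0   # seated rest
    if minute < 6:
        return 40  # warm-up
    # Number of ramp stages started so far (the first one starts at minute 6).
    stages_started = int((minute - 6) // stage_min) + 1
    return 40 + step_w * stages_started


# Example: workload at the start of each 3-min stage up to minute 15
schedule = [(m, incremental_test_workload(m)) for m in range(0, 16, 3)]
# schedule == [(0, 0), (3, 40), (6, 60), (9, 80), (12, 100), (15, 120)]
```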
The first CONT session was conducted on day 3 of each participant’s menstruation, coinciding with the follicular phase of the menstrual cycle. For the second session, participants performed CONT on day 21 following the onset of menstruation, during the luteal phase of the menstrual cycle.
Each CONT session started with participants sitting for 3 min on the cycle ergometer at 0 W, followed by a 3-min warm-up at 40 W. After the warm-up, power increased by 20 W/min until reaching the target power determined by LTP1 in the incremental exercise test. The CONT session itself lasted 30 min, followed by a 3-min cooldown at 20 W and a passive recovery at 0 W while seated on the cycle ergometer. This was common to all participants.
To prevent hypoglycemic episodes, both the incremental test and CONT sessions were interrupted, or not started, if participants’ blood glucose levels fell below 60 mg/dL. For those with blood glucose levels between 60 and 70 mg/dL, 20–25 g of carbohydrates were administered as 200 mL of juice, and blood glucose was rechecked at 10-min intervals until it returned to normal levels.
Data and statistical analyses
Participant characteristics were reported as means ± standard deviations (SDs). All statistical analyses were conducted with a significance level of α = 0.05. Analyses were performed using MATLAB R2022a, version 9.12.0.2009381 (MathWorks, Inc., Natick, MA, USA), along with the Statistics and Machine Learning Toolbox version 12.3.
Accuracy metrics included the mean absolute relative difference (MARD) and the 15/15 metric, representing the percentage of CGM values within 15% of the YSI reference for glucose levels above 70 mg/dL or within 15 mg/dL for glucose levels at or below 70 mg/dL. In addition, the 20/20, 30/30, and 40/40 agreement rates, which follow a similar comparison, were evaluated. 24 Clarke error grid analysis was also used to help visualize the accuracy of the sensor. 25
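As an illustration of the two accuracy metrics defined above, the following minimal sketch computes MARD and an x/x agreement rate from paired sensor and reference readings. It is not the authors' MATLAB code: the function names and the sample values are hypothetical, and the 70 mg/dL cut-off follows the definition given in the text.

```python
def mard(cgm, ref):
    """Mean absolute relative difference (%) of CGM readings against the reference."""
    if len(cgm) != len(ref) or not ref:
        raise ValueError("cgm and ref must be non-empty and of equal length")
    return 100.0 * sum(abs(c - r) / r for c, r in zip(cgm, ref)) / len(ref)


def agreement_rate(cgm, ref, limit=15.0, threshold=70.0):
    """Percentage of pairs within `limit` mg/dL of the reference when the reference
    is at or below `threshold`, or within `limit` % of it when above."""
    if len(cgm) != len(ref) or not ref:
        raise ValueError("cgm and ref must be non-empty and of equal length")
    hits = 0
    for c, r in zip(cgm, ref):
        if r <= threshold:
            hits += abs(c - r) <= limit              # absolute criterion (mg/dL)
        else:
            hits += 100.0 * abs(c - r) / r <= limit  # relative criterion (%)
    return 100.0 * hits / len(ref)


# Made-up paired readings in mg/dL (reference first, then sensor)
ysi = [60, 100, 150, 200]
fsl3 = [70, 125, 160, 210]

example_mard = mard(fsl3, ysi)                     # about 13.33
rate_15 = agreement_rate(fsl3, ysi, limit=15)      # 75.0
rate_30 = agreement_rate(fsl3, ysi, limit=30)      # 100.0
```

The same `agreement_rate` function covers the 20/20, 30/30, and 40/40 criteria by changing `limit`.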
Following previous research approaches, 26–28 delays between signals were identified through cross-correlation of the data. To simulate various delays, one data stream was shifted in time relative to the other. The delay between the two signals was indicated by the time shift yielding the highest cross-correlation. 29
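The shift-and-correlate idea can be sketched as follows. This is a simplified illustration rather than the authors' implementation: the function names are hypothetical, both series are assumed to be sampled on the same 10-min grid, and the Pearson correlation of the overlapping samples is used as the cross-correlation score at each lag.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0


def estimate_delay(ref, cgm, max_lag, step_min=10):
    """Delay (min) by which `cgm` lags `ref`; the lag with the highest correlation wins.

    A positive result means the sensor trace trails the reference.
    """
    best_lag, best_r = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            x, y = ref[:len(ref) - lag], cgm[lag:]   # compare ref[t] with cgm[t + lag]
        else:
            x, y = ref[-lag:], cgm[:len(cgm) + lag]  # compare ref[t - lag] with cgm[t]
        if len(x) < 3:
            continue  # too few overlapping samples to correlate
        r = pearson(x, y)
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag * step_min


# Made-up traces: the sensor repeats the reference one sample (10 min) later
ysi = [180, 175, 160, 140, 125, 120, 130, 150]
fsl3 = [182, 180, 175, 160, 140, 125, 120, 130]
delay = estimate_delay(ysi, fsl3, max_lag=2)  # 10
```

With only a handful of samples per session, the correlation at each lag rests on very few points, so the estimated delay should be read with caution.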
For multiple testing, reported P values were unadjusted and two-sided. A two-sample t-test was applied to analyze aerobic exercise data across the follicular and luteal phases, unless otherwise specified. To compare results before and after CONT, a paired-sample t-test was performed.
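For readers unfamiliar with the two tests, the sketch below computes the t statistics and degrees of freedom only. It is an illustration with made-up numbers, not the authors' MATLAB analysis: the function names are hypothetical, and the two-sample version assumes equal variances (pooled), which the text does not specify. Obtaining P values requires the Student t distribution, which the Python standard library does not provide; in practice, `scipy.stats.ttest_rel` and `scipy.stats.ttest_ind` return both the statistic and the P value.

```python
from math import sqrt
from statistics import mean, stdev, variance


def paired_t(before, after):
    """Paired-sample t statistic and degrees of freedom for after minus before."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


def two_sample_t(x, y):
    """Two-sample t statistic and degrees of freedom, assuming equal variances."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var * (1 / nx + 1 / ny)), nx + ny - 2


# Made-up absolute relative differences (%) for four participants
pre_exercise = [10, 12, 14, 16]
post_exercise = [12, 15, 15, 20]
t_paired, df_paired = paired_t(pre_exercise, post_exercise)  # t = sqrt(15), df = 3

# Made-up values for two phases
follicular = [1, 2, 3, 4, 5]
luteal = [2, 3, 4, 5, 6]
t_two, df_two = two_sample_t(follicular, luteal)  # t = -1.0, df = 8
```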
The sample size calculation was based on the MARD differences between before and after exercise found in the previous study by Cuerda et al. (2024). 21 Thus, accepting an alpha risk of 0.05 and a power of 0.8 in a one-tailed test, 25 participants are needed to detect as statistically significant a difference greater than or equal to 9%, assuming an SD of the difference of 18% and a dropout rate of 0%.
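The stated figure can be checked with the usual normal-approximation formula for paired differences, n = ((z_alpha + z_beta) × SD / difference)². The sketch below is an assumption about how such a number can be obtained, not a description of the calculator the authors used; the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def paired_sample_size(delta, sd_diff, alpha=0.05, power=0.8,
                       one_tailed=True, dropout=0.0):
    """Participants needed to detect a mean paired difference `delta`
    given the SD of the differences (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha if one_tailed else 1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    n = ((z_alpha + z_beta) * sd_diff / delta) ** 2
    return ceil(n / (1 - dropout))  # inflate for expected dropouts


n_required = paired_sample_size(delta=9, sd_diff=18)  # 25
```

With a difference of 9% and an SD of 18%, the unrounded value is about 24.7, which rounds up to the 25 participants reported.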
Results
Table 1 presents the baseline characteristics of the study participants. Of the 26 participants recruited, 25 females with T1D completed both CONT sessions; one participant was excluded because her HbA1c rose above 8.5% during the study period. The mean age of participants was 32.2 years, with an average duration of diabetes since diagnosis of 16.4 years. No hypoglycemic events were recorded during exercise. At the end, the overall incidence of hypoglycemia was very low, with four participants experiencing values below 69 mg/dL in the first session and two in the second, which were quickly corrected with 200 mL of orange juice. In addition, all participants showed progesterone levels above 5 ng/mL in the luteal phase session.
Table 1. Characteristics of Patients at Baseline
Values are reported as means and SD or as absolute and relative frequencies (%). HbA1c, glycated hemoglobin; SD, standard deviation.
Table 2 displays the mean glucose values and statistical comparisons from YSI-FSL3 data pairs during the two CONT sessions conducted in different phases of the menstrual cycle. The FSL3 device showed statistically significant differences in glucose values compared with those obtained by the YSI during CONT in both menstrual phases, with mean glucose values in the follicular phase being 15.4 mg/dL higher on the FSL3 than on the YSI and 16.71 mg/dL higher in the luteal phase.
Table 2. Mean and Statistical Comparison of YSI-FreeStyle Libre-3 Glucose Values (mg/dL) for Each Phase of the Menstrual Cycle
FSL3, FreeStyle Libre-3.
Table 3 displays the MARD values of the FSL3 device for each phase of the menstrual cycle, along with the corresponding P value for the comparison between the two phases. The performance of the FSL3 device was moderate for a CGM device, showing similar results during the two CONT sessions conducted in the follicular and luteal phases, with MARD values of 16.08% and 16.43%, respectively.
Table 3. Comparison between Measurement Errors (Mean Absolute Relative Difference) of FreeStyle Libre-3 for Each Phase of the Menstrual Cycle
MARD, mean absolute relative difference.
Table 4 presents the MARD values obtained from the FSL3 device during the CONT sessions for each phase of the menstrual cycle, specifically measured before and after exercise. Sample measurement errors were found to be higher after exercise compared with those obtained before exercise. However, the differences in MARD values for the FSL3 device were not statistically significant in either phase of the menstrual cycle.
Table 4. Measurement Errors of the Samples Before and After Each Continuous Moderate-Intensity Aerobic Exercise Session
Table 5 presents the performance assessment of the FSL3 device according to the integrated CGM guidelines, specifically the 15/15, 20/20, 30/30, and 40/40 criteria. According to the 15/15 criteria, the FSL3 demonstrated similar accuracy during CONT sessions in the follicular and luteal phases, with agreement rates of 58.16% and 60.96%, respectively, for glucose values between 70 and 180 mg/dL. For glucose values exceeding 180 mg/dL, FSL3 performance was also comparable in both phases (54.55% in the follicular phase vs. 52.17% in the luteal phase). However, for glucose values below 70 mg/dL, the FSL3 was more accurate during CONT in the luteal phase, achieving 75% compared with 50% in the follicular phase. As expected, agreement was slightly higher under the 20/20 criteria, notably in the luteal phase, where values across all glucose ranges exceeded those of the follicular phase. Under the 30/30 criteria, FSL3 agreement exceeded 95% for all glucose ranges except values above 180 mg/dL, where the sensor behaved very similarly in both phases (84.09% follicular, 78.26% luteal). Under the 40/40 criteria, the FSL3 sensor achieved agreement greater than 95% across all glucose ranges in both phases. Notably, in the follicular phase, the FSL3 achieved 100% agreement across all glucose ranges during CONT.
Table 5. Performance Assessment of FreeStyle Libre-3 Using the Integrated Continuous Glucose Monitoring 15/15, 20/20, 30/30, and 40/40 Guidelines
Figure 2 and Supplementary Table S1 summarize the Clarke error grid analysis for both phases, separating the performance of FSL3 before and after exercise onset. More samples are allocated in the B area after the exercise than before, which is true for both phases of the menstrual cycle. It is notable that very few samples fall in the C, D, or E areas of worst performance.

Figure 2. Clarke error grid analysis for both CONT sessions. The follicular phase is plotted in the left panel and the luteal phase in the right panel. Hollow diamonds represent the three samples taken before the beginning of exercise for each patient, and black dots represent samples taken during and after exercise.
Figure 3 illustrates the glucose trends observed during the CONT sessions for both phases of the menstrual cycle. In each phase, glucose levels measured by the FSL3 were consistently higher than those obtained by the YSI, indicating that the FSL3 tended to overestimate glucose concentrations in comparison with the YSI. Despite this difference, the overall trends of both FSL3 and YSI were similar, showing a decline in glucose levels during CONT sessions. This similarity suggests that the FSL3 is capable of capturing the dynamics of blood glucose, following the trend established by the YSI.

Figure 3. The top panel illustrates the CONT sessions in the follicular phase of the menstrual cycle, whereas the bottom panel depicts those in the luteal phase. YSI measurements are represented by the orange line and band, whereas FSL3 measurements are shown in blue. The horizontal axis indicates the minutes at which YSI and FSL3 data were collected, whereas the vertical axis represents glucose levels during the CONT sessions of both menstrual phases. Time index 0 corresponds to the baseline glucose level at the beginning of CONT, and time index 30 reflects the glucose level at the conclusion of the exercise. Solid lines are medians across the patient cohort, and the dotted lines represent the 25th and 75th percentiles. FSL3, FreeStyle Libre-3.
In the follicular phase, the shaded areas representing the interquartile ranges indicate greater variability in glucose measurements from the FSL3 compared with the YSI, with the blue shaded area (FSL3) being wider than the orange shaded area (YSI). This increased variability in FSL3 readings may suggest lower accuracy compared with the YSI, which appears to be more consistent during this phase of the menstrual cycle.
Discussion
The aim of this study was to examine the accuracy of the FSL3 during CONT in both the follicular and luteal phases of the menstrual cycle. Our findings reveal two main results as follows: (1) the accuracy of the FSL3 was similar across both menstrual phases during CONT, and (2) CONT did not significantly affect the accuracy of the FSL3 in either phase.
To our knowledge, this is the first study to assess the accuracy of a CGM system across different phases of the menstrual cycle while accounting for physical activity. Our results indicate that the accuracy of the FSL3 is consistent in both phases, with MARDs averaging around 16.2%. These MARD levels during physical activity are comparable to or lower than those observed with other CGM systems in similar CONT settings. For instance, Lindemose et al. analyzed the accuracy of three CGM systems (FSL2, Dexcom G6, and Guardian 4) during CONT and found no significant differences in MARD values, which were 17.2%, 12.6%, and 10.7%, respectively. These values are in line with our findings. 30 In addition, the Enlite-2 system demonstrated a MARD of 16.5% during CONT, which is also consistent with our results. 6 Conversely, a study by Da Prato et al. reported higher MARD values for sensors such as Dexcom G5, G6, Eversense, Guardian 3, and Enlite, ranging from 27.8% to 44.9%. 31
In addition, the 15/15 agreement rates were similar in both phases, around 60% for values between 70 and 180 mg/dL. As expected, this percentage improved for the 20/20 and 30/30 criteria, to around 80% and close to 100%, respectively. While the 15/15 agreement for values between 70 and 180 mg/dL is superior to that reported in previous studies of CGM accuracy during CONT (18.66% for Dexcom G6 and 46.40% for FSL2) 21 or in nonexercise situations (48% for FSL1), 13 60% can be considered insufficient given the 70% target set by the FDA’s iCGM requirements. 24 In contrast, the >99% requirement for the 40/40 criteria would be met in the follicular phase and nearly met (98%) in the luteal phase. In addition, the recent Dexcom G7 and FSL3 have shown 74.4% and 90.4% compliance, respectively, with the 20/20 criterion under basal conditions, 32 versus close to 80% in our study during CONT. Therefore, although the 15/15 agreement falls short of the FDA’s iCGM requirements, FSL3 appears to perform better than other sensors during both nonexercise and exercise conditions. Meeting these requirements during exercise should be a target for future sensor improvements.
Another significant finding of our study is that the accuracy of the FSL3 was not significantly altered by CONT in either phase, with MARDs ranging from approximately 14.5% to 16.7%. These MARDs are lower than those reported for the Dexcom G6 (which varied from a MARD of 21.9% to 36.1%) and the FSL2 (from a MARD of 17.9% to 26.3%) during CONT in males with T1D. 21 Previous studies have shown that CONT can influence CGM accuracy. For example, the FSL1 demonstrated a MARD of 13.7% at rest compared with 22% during exercise, 33 whereas the Enlite-2 reported a MARD increase from 9.5% to 16.5% during exercise. 6 Moreover, the Dexcom G4 and Enlite exhibited MARD values of 22.5% during exercise versus 13.8% at rest and 20.4% during exercise compared with 12.4% at rest, respectively. 8
Overall, these findings suggest that the FSL3 maintains a consistent level of accuracy during CONT across different phases of the menstrual cycle, which is promising for individuals with T1D engaging in physical activity. Further research is warranted to explore the implications of these findings on glucose management strategies in this population.
To the best of our knowledge, this study is the first to analyze the accuracy of a CGM system such as the FSL3 sensor during CONT while comparing the follicular and luteal phases of the menstrual cycle. Therefore, our MARD values by menstrual phase cannot be compared with results for other sensors from previous studies. Elhenawy et al. analyzed the performance of the MiniMed 780G system in relation to the glycemic variability occurring during the menstrual cycle, 34 and Li et al. studied glycemic changes during different phases of the menstrual cycle, but neither considered the performance of a CGM sensor when acute exercise is combined with the menstrual cycle. 35
This study has several strengths as follows: (1) To our knowledge, it is the first to analyze the accuracy of a CGM sensor during CONT across the different phases of the menstrual cycle. While previous studies have documented glucose fluctuations during the menstrual cycle in females with T1D using CGM devices, 35,36 none has evaluated the accuracy of these devices in the context of a physically demanding exercise such as CONT. (2) The YSI 2500 glucose analyzer, which is considered the gold standard for plasma glucose measurement, was used as a reference to assess FSL3 performance. (3) CONT performed during both phases was standardized and tailored to each participant based on the results from the prior incremental exercise test. 21 (4) We conducted progesterone analyses for each participant before the luteal session to confirm this phase of the menstrual cycle.
However, this study also has limitations. First, we analyzed the accuracy of only one CGM sensor during CONT across different menstrual phases. Previous studies, such as those by Moser et al., have investigated the accuracy of a single CGM sensor during various types of exercise. 37 More recent studies, such as those by Da Prato et al. and Hanson et al., have compared multiple widely used CGM systems. 31,32 Consequently, our performance results cannot be generalized to other sensors, as MARD values were only obtained for the FSL3 under aerobic exercise conditions. In addition, the postexercise measurement window was limited to the 20 min following the end of CONT. A broader postexercise observation window could yield different results. Notably, we did not observe significant differences in MARD values before and after exercise in either menstrual phase. While Guillot et al. restricted their observation window to 30-min postexercise, similar to our study, 12 Biagi et al. extended their observation to 1 h and found a decrease in CGM precision. 6
In conclusion, this study is the first to analyze the performance of a CGM device during CONT in both phases of the menstrual cycle, demonstrating good accuracy for the FSL3 sensor compared with plasma glucose measurements from the YSI. Further research is necessary to assess the accuracy of other CGM sensors and various types of exercise across each phase of the menstrual cycle in individuals with T1D.
Footnotes
Acknowledgments
The authors would like to thank the patients who participated in this study.
Authors’ Contributions
Conceptualization, R.M.-S.A., J.-L.D., P.R., F.J.A.-B., and J.B.; Methodology, R.M.-S.A., P.R., F.J.A.-B., and J.B.; Data curation, A.C.-d.P., R.M.-S.A., and C.M.R.; Software, A.J.L.S.; Formal analysis, A.J.L.S.; Writing—original draft preparation, R.M.-S.A. and A.C.-d.P.; Writing—review and editing, J.-L.D., P.R., F.J.A.-B., and J.B.; Project administration, J.-L.D. and J.B.; Funding acquisition, J.-L.D. and J.B.
Author Disclosure Statement
F.J.A.B. has served as a consultant/advisor for Abbott Diabetes Care, AstraZeneca, Boehringer Ingelheim, Eli Lilly, GlaxoSmithKline, LifeScan, MannKind Co., Medtronic, Menarini, Merck, Novartis, Novo Nordisk, and Sanofi and as a speaker for Abbott Diabetes Care, AstraZeneca, Boehringer Ingelheim, GlaxoSmithKline, LifeScan, Eli Lilly, Madaus, Medtronic, Menarini, Merck, Novartis, Novo Nordisk, and Sanofi and has received grant support from Novo Nordisk and Sanofi. No other conflicts of interest relevant to this article are reported.
Funding Information
This work received support from grants PID2019-107722RB-C21 and PID2022-137723OB-C21 funded by MCIN/AEI/10.13039/501100011033 and CIBER (Consorcio Centro de Investigación Biomédica en Red), group number CB17/08/00004, Instituto de Salud Carlos III, Ministerio de Ciencia e Innovación, and from the European Union–European Regional Development Fund. Cuerda-del Pino is supported by the University Teaching Training Program (FPU) of the Ministry of Science, Innovation and Universities of Spain. Grant number: FPU20/07337.
Supplementary Material
Supplementary Table S1
