Abstract
Background:
There is a dearth of comparative accuracy studies of continuous glucose monitoring (CGM) devices in the home-use setting, and none with the Eversense implantable CGM.
Methods:
We evaluated the accuracy of the Dexcom G5, Abbott Freestyle Libre Pro, and Senseonics Eversense during a 6-week free-living home-use bionic pancreas study involving 23 subjects with type 1 diabetes who wore all three devices concurrently. The primary outcome was the mean absolute relative difference (MARD) between CGM readings and point-of-care (POC) plasma-glucose (PG) values obtained approximately twice daily by the subjects. We compared PG values with CGM readings when available from all three CGMs in the 5 min preceding the PG values (n = 829 sets). Since the Libre Pro records readings every 15 min, we also did a two-way comparison between the G5 and the Eversense with a higher number of comparisons (n = 2277 sets).
Results:
All three CGM systems produced higher average MARDs than during in-clinic studies. However, since all three CGM systems were worn by the same individuals and used the same meter for comparator PG measurements, we could directly compare their performances. In the three-way comparison, Eversense achieved the lowest nominal MARD (14.8%) followed by Dexcom G5 (16.3%) and Libre Pro (18.0%) (Eversense vs. Libre Pro P = 0.004, other comparisons P = NS). There was a statistically significant difference (P = 0.008) in the two-way comparison of the MARDs for Eversense (15.1%) and G5 (16.9%).
Conclusions:
The point accuracy of the Eversense was better than two other CGMs when compared with POC PG values.
Introduction
As continuous glucose monitoring (CGM) systems become available, there is an ongoing need to evaluate their relative accuracy and reliability. Previous studies have directly compared available systems in inpatient or otherwise controlled settings. 1–13 Some of these studies that compared CGM devices worn concurrently by the same subject have suffered from the limitations of being short in duration (ranging from several hours to 1–2 weeks) 11 and/or involving small cohorts of participants, 12 using the CGM beyond the manufacturer-specified lifetime along with an unapproved calibration schedule, 5 or not including large variations or rapid rates of change in glucose levels. 12,13 Moreover, accuracy metrics obtained by device manufacturers cannot be directly compared across devices due to differences in patient characteristics and experimental protocols.
Here we report the results of a study conducted entirely under free-living conditions that directly compared the accuracy of three CGM devices: the Dexcom G5 (G5), Abbott FreeStyle Libre Pro (Libre Pro), and Senseonics Eversense (Eversense). The Eversense is unique in being the only CGM system in which the sensor is fully implanted subcutaneously, with a 3-month in-use period in the United States and a 6-month in-use period in other markets.
Methods
Subjects
The experiments were conducted in subjects with type 1 diabetes who were participating in a free-living home-use study of the bionic pancreas (BP). The clinical protocol was approved by the human research committee at Massachusetts General Hospital (MGH). Subjects were at least 18 years old, had type 1 diabetes for at least 1 year, used an insulin pump for at least 6 months, and had used a CGM for at least 1 cumulative month in the previous year. Exclusion criteria included an inability or unwillingness to avoid acetaminophen, ascorbic acid (vitamin C), and salicylic acid for the duration of the study. As the Eversense sensor is implanted under the skin, potential subjects were excluded if they had any condition preventing or complicating the placement, operation, or removal of the sensor, or wearing of the transmitter over the skin, including upper extremity deformities or skin conditions. Potential subjects with any condition requiring, or likely requiring, magnetic resonance imaging, computed tomography scan, or high-frequency electrical heat (diathermy) were also excluded.
We planned to have 24 subjects complete the study.
Experimental protocol
Subjects wore all three CGMs for a total of ∼6 weeks. During this time, subjects were also participating in a study involving two different configurations of the BP that studied the effect of remote telemetric monitoring on hypoglycemia. Subjects completed three arms, each 2 weeks in duration, in random order: usual care (UC), insulin-only BP (IOBP), and bihormonal BP (BHBP). One week of each of the three arms included remote monitoring for severe biochemical hypoglycemia (CGM glucose <50 mg/dL for >15 min) and the other week did not include monitoring, also in random order, within each 2-week arm. Since the G5 CGM provided the input signal to the algorithm of the BP, subjects were also remotely monitored for signal loss from the G5 throughout the study period.
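The remote-monitoring criterion above (CGM glucose <50 mg/dL for >15 min) can be illustrated as a scan over a time-ordered CGM trace. The following Python is a minimal sketch and not the study's monitoring software; the function name, the (minutes, mg/dL) layout of the readings, and the choice to measure duration from the first to the last consecutive below-threshold sample are all assumptions.

```python
from typing import List, Tuple

def hypo_episodes(readings: List[Tuple[float, float]],
                  threshold: float = 50.0,
                  min_minutes: float = 15.0) -> List[Tuple[float, float]]:
    """Return (start, end) times, in minutes, of runs below `threshold`
    whose duration exceeds `min_minutes`.

    readings: time-ordered (minutes, glucose in mg/dL) pairs (assumed layout).
    """
    episodes = []
    start = None  # time of the first below-threshold sample in the current run
    last = None   # time of the most recent below-threshold sample
    for t, g in readings:
        if g < threshold:
            if start is None:
                start = t
            last = t
        else:
            # The run has ended; keep it only if it lasted long enough.
            if start is not None and last - start > min_minutes:
                episodes.append((start, last))
            start = None
    # Handle a run that is still open at the end of the trace.
    if start is not None and last - start > min_minutes:
        episodes.append((start, last))
    return episodes
```

With readings every 5 min, a run must span at least five consecutive samples below 50 mg/dL (20 min from first to last sample) to qualify under this sketch.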
The Eversense sensor was inserted in the subcutaneous tissue of the upper arm. The insertion was an office procedure under local anesthesia that took ∼15 min and was done 1–10 days before the start of data collection. At the time of the study, the Eversense was approved in Europe for up to 90-day use, but it was not approved in the United States, although it was later approved for sale in the United States. The G5 and Libre Pro sensors were inserted on the first day of the study. The glucose information from the Libre Pro and Eversense were blinded during this study.
Participants were instructed to calibrate the G5 and Eversense twice daily as per the manufacturer's instructions using the Nova Biomedical StatStrip Xpress meter that was provided to them. We had previously shown that the StatStrip Xpress meter is one of the most accurate point-of-care (POC) glucometers, with a mean absolute relative difference (MARD) versus YSI 2300 Stat Plus Glucose and Lactate Analyzer (YSI) of 6.3%. 14 The Libre Pro is factory calibrated and does not require or allow calibrations. Neither the Libre Pro nor the Eversense interacted with the BP in any way. In the event of a premature G5 sensor failure, participants were provided with spare G5 sensors to place at home. If the Libre Pro sensor fell off prematurely, participants were asked to wait until their next study visit (every 2 weeks) to replace the sensor. No premature sensor failures occurred with the Eversense. Participants were not asked to keep a record of unscheduled sensor insertions.
In addition to calibrations, subjects were asked to check POC plasma-glucose (PG) values with the StatStrip Xpress meter before each meal and anytime they thought the glucose value reported by the unblinded G5 was not consistent with how they felt. They were asked to calibrate the G5 if the reported value was >20% from the reference value and if it was a “good” time to calibrate, that is, if the glucose trend arrow was flat (rate of change <1 mg/dL/min) and the subject had not eaten in the past 30 min.
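The extra-calibration instruction given to subjects amounts to a three-part check. The sketch below is only an illustration of that rule as worded above; the function and parameter names are hypothetical, the trend is assumed to be expressed in mg/dL per minute, and treating "not eaten in the past 30 min" as 30 min or more since the last meal is an assumption.

```python
def should_calibrate(cgm_mgdl: float,
                     meter_mgdl: float,
                     trend_mgdl_per_min: float,
                     minutes_since_meal: float) -> bool:
    """Return True when the protocol's conditions for an extra G5 calibration hold.

    All parameter names and units are illustrative assumptions.
    """
    # CGM value differs from the meter (reference) value by more than 20%.
    relative_error = abs(cgm_mgdl - meter_mgdl) / meter_mgdl
    # "Flat" trend arrow: magnitude of the rate of change below 1 per minute.
    flat_trend = abs(trend_mgdl_per_min) < 1.0
    # No meal in the past 30 min.
    fasted = minutes_since_meal >= 30
    return relative_error > 0.20 and flat_trend and fasted
```

For example, a CGM value of 150 mg/dL against a meter value of 100 mg/dL with a flat trend and no recent meal would trigger a calibration, whereas the same discrepancy 10 min after eating would not.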
A 24-h telephone support service was available for any clinical or technical issues that arose during the study. User manuals and troubleshooting literature for all three CGMs were provided to all participants. Data from all three sensors were compared with reference POC PG values downloaded from the StatStrip Xpress. We compared PG values with CGM values that were obtained up to 5 min before the PG measurement. This allowed the PG values used for CGM calibration to also be used for accuracy measurements, but was expected to increase the calculated MARD by exaggerating the effect of physiological lag between changes in PG and the interstitial glucose measured by CGM sensors. 15
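The matching of each PG value to CGM readings obtained up to 5 min earlier can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the function names, the (minutes, mg/dL) tuple layout, and the choice of the most recent reading within the window are hypothetical.

```python
from bisect import bisect_right

def preceding_reading(cgm, t_pg, window_min=5.0):
    """Return the latest CGM value at or before t_pg and within window_min, else None.

    cgm: time-sorted list of (minutes, mg/dL) pairs (assumed layout).
    """
    # Rebuilding the time list on each call keeps the sketch short;
    # a real analysis would precompute it once per device.
    times = [t for t, _ in cgm]
    i = bisect_right(times, t_pg) - 1  # index of the last reading at or before t_pg
    if i >= 0 and t_pg - times[i] <= window_min:
        return cgm[i][1]
    return None

def matched_sets(pg, devices, window_min=5.0):
    """Keep a PG value only when every device has a reading in the preceding window.

    pg: list of (minutes, mg/dL); devices: dict mapping a device name to its readings.
    Returns a list of (pg_value, {device_name: cgm_value}).
    """
    sets = []
    for t_pg, v_pg in pg:
        found = {name: preceding_reading(cgm, t_pg, window_min)
                 for name, cgm in devices.items()}
        if all(v is not None for v in found.values()):
            sets.append((v_pg, found))
    return sets
```

Requiring a reading from every device is what makes a device that records every 15 min the limiting factor for the number of matched sets, while two devices that record every 5 min can be matched far more often.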
Statistical analysis
The primary outcome was a three-way comparison of MARDs, including the G5, Libre Pro, and Eversense. The comparison was made between the CGMs and the POC meter when readings were available from all three CGMs in the 5 min preceding the meter value. This used readings from the Libre Pro measured every 15 min and from the G5 and Eversense measured every 5 min. We also calculated the MARD for each CGM in the following ranges: <70, 70–120, and >180 mg/dL. P-values reported are nominal and have not been corrected for multiple comparisons. Since the G5 and Eversense transmitters store readings every 5 min, there was a larger data set for comparison of these two CGMs with reference POC PG values. The proportion of CGM values that were within (±)15%, 20%, or 30% of the relative difference from the reference value at glucose levels >80 mg/dL and within (±)15, 20, or 30 mg/dL of absolute difference at glucose levels ≤80 mg/dL (hereafter referred to as 15/15%, 20/20%, and 30/30%) were used to evaluate the overall accuracy of the CGM devices.
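The two accuracy metrics described above can be written compactly. The Python below is a hedged sketch rather than the SAS analysis used in the study: the function names are hypothetical, pairs are assumed to be (reference PG, CGM) values in mg/dL, and treating the ± bounds as inclusive is an assumption.

```python
def mard(pairs):
    """Mean absolute relative difference, in percent.

    pairs: iterable of (reference_pg, cgm) values in mg/dL (assumed layout).
    """
    pairs = list(pairs)
    return 100.0 * sum(abs(c - r) / r for r, c in pairs) / len(pairs)

def within_criterion(ref, cgm, x):
    """x/x% criterion: within ±x mg/dL when ref <= 80 mg/dL, else within ±x% of ref."""
    if ref <= 80:
        return abs(cgm - ref) <= x
    return abs(cgm - ref) <= ref * x / 100.0

def agreement_rate(pairs, x):
    """Percentage of pairs meeting the x/x% criterion (x = 15, 20, or 30)."""
    pairs = list(pairs)
    return 100.0 * sum(within_criterion(r, c, x) for r, c in pairs) / len(pairs)
```

Switching from a relative to an absolute bound at 80 mg/dL avoids penalizing small absolute errors at low glucose, where a difference of a few mg/dL is a large percentage of the reference value.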
Statistical analyses were performed using SAS 9.4 software (SAS Institute Inc. Cary, NC). Repeated-measurements models were used to account for within-subject correlations when analyzing the differences between the paired measurements. The repeated-measurements models were fitted with the generalized estimating equation method. For Figure 1, P-values testing the hypothesis that the slopes of the relative deviation (RD) versus the reference glucose were nonzero were based on mixed effects models with subject-level random intercept and slope.

Figure 1. RD (top row) and ARD (bottom row) for the Dexcom G5, Senseonics Eversense, and Abbott FreeStyle Libre Pro for a range of POC blood sugar measurements. 15,16 The P-values for the slopes in the top row being nonzero are G5 P < 0.0001, Eversense P < 0.0001, Libre Pro P = 0.0464. ARD, absolute relative deviation; POC, point-of-care; RD, relative deviation.
Results
Between April 2017 and May 2017, 23 subjects (17 females) with a mean age of 38 ± 14 years and T1D duration of 25 ± 9 years completed the study. They had a mean HbA1c of 7.2% ± 1.0% and body mass index of 28 ± 6.0 kg/m2 (Table 1). Each participant completed a 6-week-long experiment with 2 weeks on UC, 2 weeks on the IOBP, and 2 weeks on the BHBP.
Table 1. Baseline Characteristics of Study Participants
Values are presented as mean ± SD (range) unless otherwise specified.
BMI, body mass index; CGM, continuous glucose monitoring; SD, standard deviation.
The scheduled calibrations of the Eversense and G5 sensors occurred approximately every 12 h. The data set included a mean of 2.4 PG values per participant per day, suggesting that most PG values were used as calibrations. The average percentage of time each CGM captured data through the duration of the study was 94.5% for the G5, 82.9% for the Libre Pro, and 69.6% for the Eversense. Of the expected ∼138 weeks of wear for each CGM by all subjects, some CGM data were available for all 138 weeks for the G5, for 126 weeks for the Libre Pro, and for 121 weeks for the Eversense.
As expected, all three CGM systems produced higher average MARDs in this study than have been reported in in-clinic studies. For the three-way comparison (829 sensor-meter glucose pairs), the aggregate MARD obtained with the G5 was 16.3% ± 15.4% compared to 18.0% ± 17.9% obtained with the Libre Pro (P = 0.09 vs. G5) and 14.8% ± 14.8% obtained with the Eversense (P = NS G5 vs. Eversense and P = 0.004 Libre Pro vs. Eversense).
In the three-way analysis (Table 2), the G5 and Eversense had similar MARDs in the hypoglycemic range (PG <70 mg/dL), 23.6% versus 24.9%, respectively, and both had lower MARDs than the Libre Pro, 36.1% (P = 0.004 G5 vs. Libre Pro and P = 0.04 Libre Pro vs. Eversense). In the normoglycemic range (PG 70–120 mg/dL), all three sensors had similar MARDs (G5 18.5%, Libre Pro 17.8%, Eversense 16.1%, and all P-values NS). In the hyperglycemic range (PG >180 mg/dL) the G5 and Eversense had similar MARDs of 13.3% and 12.8%, respectively (P = NS) and both had lower MARDs than the Libre Pro 17.3% (P = 0.001 G5 vs. Libre Pro and P = 0.003 Libre Pro vs. Eversense).
Table 2. Three-Way Comparison of Mean Absolute Relative Differences Between the G5, Libre Pro, and Eversense Sensors in the Hypoglycemic, Normoglycemic, and Hyperglycemic Ranges (n = 829)
P-values are uncorrected for multiple comparisons.
PG, plasma glucose.
In the two-way analysis (Table 3) using the larger data set (n = 2277), the MARD for the Eversense was lower than that for the G5, 15.1% versus 16.9%, respectively (P = 0.008). For PG <70 mg/dL, the MARDs for the G5 and Eversense were similar (29.1% vs. 29.6%, P = 0.81). For PG 70–120 mg/dL, the MARD for the Eversense was lower (15.9%) than for the G5 (18.5%), P = 0.0001. For PG >180 mg/dL, the MARD for the Eversense (12.1%) was again lower than for the G5 (13.8%, P = 0.001 G5 vs. Eversense).
Table 3. Two-Way Comparison of Mean Absolute Relative Differences Between the G5 and Eversense Sensors in the Hypoglycemic, Normoglycemic, and Hyperglycemic Ranges (n = 2277)
P-values are uncorrected for multiple comparisons.
Overall, the Eversense was found to be more accurate than the Libre Pro for the whole glucose range in terms of the 15/15% (P < 0.0001), 20/20% (P = 0.0001), and 30/30% (P = 0.0301) criteria (Table 4). The G5 was found to be more accurate than the Libre Pro for the whole glucose range in terms of the 15/15% (P = 0.0079) and 20/20% (P = 0.0271) criteria, and the Eversense was found to be more accurate than the G5 for the whole glucose range in terms of the 30/30% (P = 0.0092) criterion (Table 4). Figure 1 displays the RD and absolute relative deviation of the G5, Eversense, and Libre Pro for a range of POC blood sugars. The average bias of the G5 and the Eversense was slightly negative in the hypoglycemic range, was minimal in the normoglycemic range, and became slightly positive as the comparator glucose increased above normoglycemia. The Libre Pro had a negative bias across the reference glucose range. Supplementary Figure S1 displays the deviation (D) and absolute deviation (AD) for all three sensors.
Table 4. Sensor Accuracy for the Whole Glucose Range and Stratified According to Point-of-Care Plasma Glucose in Terms of 15/15%, 20/20%, and 30/30% Criteria
P-values are uncorrected for multiple comparisons.
Discussion
With the improved accuracy of CGMs, their role in diabetes care has evolved from adjunctive use to fingerstick replacement with insulin dosing claims. 17–19 All three of the CGMs included in this study have FDA approval for nonadjunctive use, meaning that they can be used to determine insulin dosing without confirmatory POC PG measurement. Accuracy metrics published by manufacturers cannot be directly compared between different CGMs due to differences in the characteristics of participants and differences in the testing protocols, so comparative accuracy can only be assessed in studies where multiple CGMs are compared in the same study. In addition to a head-to-head comparison of these devices, it is important to test them in free-living conditions so that the accuracy under those conditions can be assessed.
We compared the G5 (the latest Dexcom system available at the time of the study) to the Abbott FreeStyle Libre Pro (the only available Abbott FreeStyle system at the time of the study) and the Senseonics Eversense CGM (which was investigational at the time of the study and has since been cleared by the FDA for sale in the United States). To our knowledge, this is the longest study directly comparing the accuracy of multiple CGMs conducted entirely in the home-use setting. To avoid interference in the subjects' daily routine and for practicality, we chose to use the meter readings as reference values. We chose one of the most accurate meters on the market, namely the Nova Biomedical StatStrip Xpress meter (MARD 6.3% vs. YSI), with accuracy similar to the Ascensia Contour Next. 14 The StatStrip Xpress has no bias relative to the YSI. 14 Therefore, although the POC PG meter is less accurate than reference methods, this should affect the apparent accuracy of all the CGM systems equally since, to our knowledge, factory calibrations and calibration algorithms of CGM devices use the YSI as the standard.
As expected, in this setting all three CGMs produced higher average MARDs than are found in in-clinic studies. In previous in-clinic studies using reference-quality PG measurements as the comparator, the G5 was shown to have a MARD of 9% in adults and 10% in the pediatric population. 20 The Abbott FreeStyle Libre Pro, which is factory calibrated and requires no further calibrations by the user for its 14-day wear period, was previously shown to have a MARD of 10.1% by the manufacturer 21 and 12.3% in another study. 22 Several studies, including Abbott's own report, have shown that the MARD for the FreeStyle Libre CGM is higher in the hypoglycemic range, 7,21,23–25 with one study suggesting that there was no bias. 23 One study found underdetection of hypoglycemia, 24 whereas another found overdetection of hypoglycemia. 26 The Eversense was shown to have a MARD of 8.8% for a 90-day wear period when tested in subjects with type 1 and type 2 diabetes. 27
Welsh et al. compared the MARDs of Dexcom G5 and G6 from separate studies using a propensity score method to balance cohort characteristics. 17 They found that the G5 MARD (9.0%) was lower than the G6 (9.9%), and the percentages of values within ±20%/20 mg/dL were 93.1% and 92.5%, respectively. This implies that the difference between the G6 and Eversense might be greater than the difference we found for the G5 and Eversense, but a direct comparison study would be required to test this hypothesis.
Differences in CGM accuracy between the in-clinic and at-home settings have been observed in a recently published study that compared two CGMs. 28 The apparently lower accuracy of the CGM systems tested in this study relative to in-clinic studies could be due to (1) inherently lower accuracy of the POC PG meter relative to reference-quality methods (e.g., YSI), (2) the use of capillary versus arterialized venous blood, (3) the potential for dirty fingers while performing blood collections by fingerstick, and (4) other differences in technique.
In addition to these factors, there is another that might explain the relatively lower accuracy of the CGM systems tested in this study relative to in-clinic studies. We compared PG values with CGM values that were obtained up to 5 min before the PG measurement. This allowed the PG values used for CGM calibration to also be used for accuracy measurements; however, it also likely exaggerates error due to lag between PG and CGM measurements. Furthermore, despite our request that subjects check POC PG before meals, they performed an average of 2.4 POC PG measurements per day, meaning that most measurements were used as calibrations. This meant that any error due to sensor drift since the previous calibration was maximal because at the time of the PG measurements used for calculating the MARD the last calibration was, on average, 10 h old. This makes the MARDs we obtained a likely upper limit on the MARDs that might be obtained if POC PG measurements were taken more frequently (e.g., sooner after the last calibration). Owing to the design of the study, all the aforementioned factors should have affected all CGMs equally, so relative accuracy comparisons between systems are valid even if the absolute measures of accuracy represent an upper bound on the MARD.
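The 10 h figure quoted above follows from the measurement frequency. As a rough worked estimate, assuming the POC PG measurements were approximately evenly spaced over the day and that nearly all of them served as calibrations (the symbol \Delta t_{\text{cal}} is introduced here only for illustration):

```latex
\[
\Delta t_{\text{cal}} \approx \frac{24\ \text{h/day}}{2.4\ \text{measurements/day}} = 10\ \text{h}
\]
```

Under the same assumptions, more frequent measurements would shorten this interval proportionally, which is why the reported MARDs are described as a likely upper limit.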
In conclusion, we have performed a direct comparison of three CGMs in free-living settings to determine relative accuracies of the different systems under typical-use conditions. We have previously shown that safe and effective glycemic control is achievable when the G5 is used to provide the input signal to the algorithm of the BP. 29,30 Results from this study imply that the Eversense is also sufficiently accurate to provide the input signal to the BP system, and possibly other automated glucose control systems as well.
Authors' Contributions
S.J.R., F.E.-K., and E.R.D. designed the study, performed the analysis, and interpreted the data. F.E.-K. and E.R.D. designed and built the BP. S.J.R. supervised the human studies. R.Z.J. enrolled and screened subjects for the study, conducted study visits, interpreted data, and wrote the first draft of the article. S.J.R., F.E.-K., and E.R.D. participated in revision of the article for important intellectual content. H.Z. and F.E.-K. performed the analysis, interpreted the data, and contributed to drafting and reviewing the article. S.J.R. had full access to the data and takes full responsibility for this study as a whole, including the study design and the decision to submit and publish the article.
Acknowledgments
The authors thank the volunteers for their time and enthusiasm; the diabetes care providers who referred potential subjects for the study; Mary Larkin, Camille Staco-Targete, Nancy Kingori, Stephanie Dimodica, Khadija Tlaiti, Diabetes Research Center, MGH, for organizational and logistical support; Nancy Wei, MD, Kerry Grennan, NP, and Takara Stanley, MD for serving on the data safety and monitoring board for the study; and the members of the Partners Human Research Committee.
Author Disclosure Statement
E.R.D., F.E.-K., and R.S. are inventors on patents and patents pending related to the BP technology. E.R.D. and F.E.-K. are employees, cofounders, and equity holders in Beta Bionics, Inc. R.S. is an employee in, and holds options to purchase stock in, Beta Bionics, Inc. E.R.D. is on the board of directors of Beta Bionics, Inc. S.J.R. is an inventor on a patent and patents pending on aspects of the BP that are assigned to MGH and are licensed to Beta Bionics, Inc., has received honoraria and/or travel expenses for lectures from Novo Nordisk, Roche, and Ascensia, serves on the scientific advisory boards of Unomedical and Companion Medical, has received consulting fees from Beta Bionics, Novo Nordisk, Senseonics, and Flexion Therapeutics, has received grant support from Zealand Pharma, Novo Nordisk, and Beta Bionics, and has received in-kind support in the form of technical support and/or donation of materials from Zealand Pharma, Ascensia, Senseonics, Adocia, and Tandem Diabetes. No other potential conflicts of interest relevant to this article were reported.
Funding Information
Partial support for this study was provided by a grant from the Leona M. and Harry B. Helmsley Charitable Trust (2017PG-T1D025 to S.J.R.).
Supplementary Material
Supplementary Figure S1
