Abstract
Introduction:
To assess patient experiences of telemedicine, researchers and administrators use the net promoter score (NPS), based on a likelihood to recommend (LTR) question. However, there is reason to doubt the validity of this metric for this purpose. We assessed the degree to which the LTR question reflects actual patient preferences about telemedicine.
Methods:
Using data from a patient experience survey collected in Spring 2020, we compared LTR responses to open comments. Through content analysis, we transformed comments into categorical variables and used those variables in a multiple logistic regression model to predict LTR responses. We also thematically analyzed comments to further elucidate our results.
Results:
Only about half the comments mentioned telemedicine at all. Around 6% of comments were wholly incongruent with LTR responses. In many comments, ideas about telemedicine were semantically entangled with ideas about providers. Our logistic regression found strong associations between sentiments expressed in comments and LTR responses. However, comments about telemedicine were relatively poor predictors for LTR compared to comments about the provider.
Discussion:
NPS, which is included on many patient experience surveys used by health systems across the United States, has limitations for use as a measure of the acceptability of telemedicine for patients. Patients have more than telemedicine in mind when responding to the LTR question, and ratings conflate attitudes about providers, office policies, and staff with the telemedicine modality. More direct measures are necessary for meaningful research on the acceptability and usability of telemedicine for patients.
Introduction
After the COVID-19 pandemic accelerated telemedicine use across health systems, 1,2 health systems sought to evaluate how its continued use impacts patient experience. 3 The net promoter score (NPS) is a leading performance metric used by health care executives for strategy development and quality improvement initiatives. Researchers have used NPS to evaluate the acceptability of a variety of health care interventions, including e-Health 4–8 and telemedicine. 9–18 One review found that NPS was the most widely used measure of satisfaction for m-health. 19 However, there is cause to question its validity.
Prior to use in health care settings, NPS was developed in business as an alternative to measuring customer/patient satisfaction. 20,21 Customers respond to a likelihood to recommend (LTR) question on a scale from 0 to 10. People responding with a 9 or 10 are defined as “promoters,” 7 or 8 are “passives,” and 0–6 are “detractors.” NPS is calculated across groups by subtracting the percentage of detractors from the percentage of promoters.
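The calculation described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the example ratings are hypothetical and do not come from the study data.

```python
from typing import Iterable


def classify(rating: int) -> str:
    """Map a 0-10 LTR rating to the standard NPS category."""
    if not 0 <= rating <= 10:
        raise ValueError(f"LTR rating must be between 0 and 10, got {rating}")
    if rating >= 9:
        return "promoter"
    if rating >= 7:
        return "passive"
    return "detractor"


def net_promoter_score(ratings: Iterable[int]) -> float:
    """Return NPS: percentage of promoters minus percentage of detractors."""
    categories = [classify(r) for r in ratings]
    if not categories:
        raise ValueError("at least one rating is required")
    promoters = categories.count("promoter")
    detractors = categories.count("detractor")
    return 100.0 * (promoters - detractors) / len(categories)


# Hypothetical ratings (not study data): 5 promoters, 2 passives, 3 detractors.
example_ratings = [10, 9, 9, 8, 7, 6, 3, 10, 10, 5]
print(net_promoter_score(example_ratings))  # 50% promoters - 30% detractors
```

Note that passives count toward the denominator but toward neither percentage, so two groups with quite different rating distributions can share the same NPS.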
Use of NPS is debated among business scholars, 22–24 but it may be even less relevant in health care where lack of choices and urgent need constrain customers. 25–27 Validity studies find that NPS is less informative and reliable than multi-item satisfaction measures. 28 The measure is overly sensitive to patient panel compositions and modes of administration. 25,29,30 The interpretability of responses is uncertain, particularly across cultures and settings, 31 and the NPS has been shown to lack specificity. 22,32,33
Before using NPS to evaluate telemedicine acceptability, we wanted to assess the extent to which responses to an LTR question on a patient experience survey following a telemedicine encounter reflect attitudes about telemedicine. We reasoned that comments would reflect a patient’s state of mind when answering the LTR question, so these comments could serve as a secondary measure of patient opinions that could be compared with ratings.
Methods
We used a mixed methods design to understand what LTR is measuring. The approach involved triangulating LTR ratings and qualitative comments. We used content analysis to transform content from comments into categorical data, which could be used in regression models to predict LTR ratings. To better understand regression results, we used inductive analysis to identify themes across comments, including references to telemedicine, understandings of the survey, and entanglements between provider and telemedicine comments.
The Maine Medical Center Institutional Review Board determined this study did not constitute human subjects research, and survey respondents did not provide informed consent.
DATA COLLECTION
The LTR question was included on a survey (see Appendix 1) conducted for our health system by a third party (NRC Health) to get feedback relevant to telemedicine. Patients completed the survey by phone (automated), email, or text message following a telemedicine encounter. They answered nine questions about their experience and then rated the following LTR question on a 0–10 scale: “How likely are you to recommend this service to your family and friends?” A 10 was “extremely likely,” whereas a 0 was “not at all likely.” Finally, patients were given the opportunity to leave a comment.
NRC Patient Satisfaction Survey for Telehealth Encounters
SAMPLE
Our dataset included 5,148 encounters, from April until June 2020, representing 5,022 unique patients. For analyses, we excluded 2,874 encounters without comments. For patients with multiple encounters, we included only the patient’s most recent encounter, excluding 43 encounters. The sample for these analyses (n = 2,231) is compared with the larger sample in Table 1. The sample with comments is slightly older, with a greater proportion of women, compared with the larger sample. We do not believe these differences affect our goal to understand the relationship between responses to the LTR question and comments.
Table 1. Characteristics of Sample, n (%)
CONTENT ANALYSIS
We used content analysis, with MAXQDA™ (Version 22.8.0), to transform comments into categorical data. 34 We developed a codebook following review of 100 comments (see Table 2 for codebook). One code was used to identify when telemedicine was mentioned in a comment, even implicitly. Other codes captured instances in which participants either directed comments at the survey itself or left comments that included no interpretable information. Finally, we used object-valence codes to capture positive or negative (e.g., reporting problems) valence and the object of the valence, where applicable. Throughout the coding process, M.K. used memoing to note broader patterns (themes) not captured through other codes.
To avoid issues with unitization, 35 we applied codes to entire responses, which varied in length. Responses often included mixed commentary on more than one topic, so responses were given as many codes as were applicable. To ensure codes were applied consistently and nonarbitrarily, 36,37 we employed a codebook and checked intercoder reliability. A second, novice coder (R.A.) was trained to use the codebook and then blind-coded a subset of 249 comments. Coding decisions were compared using Cohen’s Kappa, with a preset threshold of kappa = 0.7. For all codes, except the Provider Negative code, kappa exceeded the threshold, the minimum being kappa = 0.78. Given the infrequency of Provider Negative code application (used 34 times across 2,231 comments), we used the negotiated coding procedure to review this specific code. 35,38
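For readers unfamiliar with the statistic, the following Python sketch shows how Cohen's kappa can be computed for one code applied by two coders. It is a minimal illustration, not the procedure or software used in this study; the function name, the two coding vectors, and the variable names are hypothetical.

```python
from typing import Sequence


def cohens_kappa(coder_a: Sequence[int], coder_b: Sequence[int]) -> float:
    """Cohen's kappa for two coders rating the same items.

    kappa = (p_observed - p_expected) / (1 - p_expected), where p_expected
    is the agreement expected by chance from each coder's marginal rates.
    """
    if len(coder_a) != len(coder_b) or len(coder_a) == 0:
        raise ValueError("both coders must rate the same non-empty set of items")
    n = len(coder_a)
    labels = set(coder_a) | set(coder_b)
    p_observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    p_expected = sum(
        (list(coder_a).count(label) / n) * (list(coder_b).count(label) / n)
        for label in labels
    )
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical data: 1 = code applied to the comment, 0 = not applied.
first_coder = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
second_coder = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

KAPPA_THRESHOLD = 0.7  # preset threshold reported in the text
kappa = cohens_kappa(first_coder, second_coder)
print(round(kappa, 2), kappa >= KAPPA_THRESHOLD)
```

Because kappa corrects for chance agreement, a code that is applied very rarely can yield a low kappa even when the coders disagree on only a few comments, which is one reason a rarely used code may need a separate review procedure.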
Table 2. Codebook Used for Content Analysis
MULTIPLE LOGISTIC REGRESSION
We assigned binary predictor variables based on applications of each content analysis code. The variables were Telehealth Positive, Provider Positive, Staff/Office Positive, Telehealth Negative, Provider Negative, and Staff/Office Negative.
To test whether positive comments about telemedicine would predict promoter ratings better than other comments, we used a logistic regression to produce odds ratios. All six individual variables were significant with bivariate modeling, so all six variables were included in the final multiple logistic model.
We used LTR responses to create a binary outcome variable. Because the LTR ratings were heavily skewed toward recommending (1,813 patients gave ratings of 9 or 10, 266 gave ratings of 7 or 8, and 152 gave ratings of 6 or lower), we combined “passives” and “detractors,” converting LTR ratings into a binary outcome: “promoter” (rating 9–10) or “nonpromoter” (rating 0–8).
The quantitative analysis was conducted using R Version 3.5.1.
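The binary outcome and the bivariate step can be illustrated with a short Python sketch. This is not the R code used for the analysis. It relies on the fact that, for a single binary predictor, the odds ratio from a logistic regression equals the cross-product ratio of the corresponding 2 × 2 table. The function names and the records are hypothetical, and the sketch does not cover the multiple logistic model, which requires a model-fitting library.

```python
from typing import Iterable, Tuple


def is_promoter(ltr_rating: int) -> int:
    """Collapse a 0-10 LTR rating to 1 (promoter, 9-10) or 0 (nonpromoter, 0-8)."""
    if not 0 <= ltr_rating <= 10:
        raise ValueError(f"LTR rating must be between 0 and 10, got {ltr_rating}")
    return 1 if ltr_rating >= 9 else 0


def bivariate_odds_ratio(records: Iterable[Tuple[bool, int]]) -> float:
    """Odds ratio of being a promoter when a code is applied vs. not applied.

    Each record is (code_applied, ltr_rating). The result is the
    cross-product ratio (a * d) / (b * c) of the 2 x 2 table.
    """
    counts = {(True, 1): 0, (True, 0): 0, (False, 1): 0, (False, 0): 0}
    for code_applied, rating in records:
        counts[(bool(code_applied), is_promoter(rating))] += 1
    a = counts[(True, 1)]   # code applied, promoter
    b = counts[(True, 0)]   # code applied, nonpromoter
    c = counts[(False, 1)]  # code not applied, promoter
    d = counts[(False, 0)]  # code not applied, nonpromoter
    if min(a, b, c, d) == 0:
        raise ValueError("odds ratio is undefined when a cell count is zero")
    return (a * d) / (b * c)


# Hypothetical records (not study data) for one negative-valence code.
records = (
    [(True, 10)] * 2 + [(True, 4)] * 6 + [(False, 9)] * 9 + [(False, 7)] * 3
)
odds_ratio = bivariate_odds_ratio(records)
print(round(odds_ratio, 3))      # below 1: code application lowers the odds
print(round(1 / odds_ratio, 1))  # the same association, stated for nonpromoters
```

An odds ratio below 1 indicates that applying the code is associated with lower odds of being a promoter; its reciprocal expresses the same association as the odds of being a nonpromoter.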
Results
Of 2,231 comments, 64 comprised uninterpretable or no information. Fourteen were solely about the survey, including to report errors or confusion. The remaining 2,153 comments included information related to patient experience.
PATIENT COMMENTARY DOES NOT FOCUS ON TELEMEDICINE
Though the survey was conceived to elicit patient feedback on telemedicine, only 50.0% of comments mentioned telemedicine (explicitly or implicitly), and only 46.3% included a positive or negative comment about telemedicine. In contrast, 53.1% of comments included either a positive or a negative comment about providers. In some cases, comments suggested that patients believed the survey was about the provider:
This was my first experience with [Doctor] and she was exceptional. I find it disturbing that you do not have enough faith in your own medical staff and you think I do not have the tools to find a way to communicate my feelings if I found I had a bad experience.
Others indicated confusion about what opinions the survey was trying to elicit:
What kind of question is this? “How likely would you be to recommend this service to your family and friends?” you had questions regarding [Doctor], is this about using telemed?
These comments, and the relative infrequency of comments about telemedicine, suggest that many participants were thinking about topics other than telemedicine when they responded to the LTR question.
INCONGRUENCE BETWEEN THE LTR AND COMMENTS
LTR ratings were sometimes incongruent with telemedicine-related comments. About 7.1% of promoters expressed a problem with or distaste for telemedicine without saying anything positive about it. The following comment was left after responding with “10”:
I understand the reason for this Zoom call and I’m fine with this appointment. I would prefer to actually be in the doctor’s office and I hope Zoom does not become the norm for doctor’s appointments.
Meanwhile, 5.3% of “Detractors” were positive about telemedicine and said nothing negative about it. For example, a respondent who gave a “5” on LTR said:
Grateful for the opportunity to visit a provider without taking any risks at this time. Also the convenience of not dealing with the 45 minute drive helps to relieve additional stress.
Altogether, 6.1% of respondents gave opinions of telemedicine incongruent with their LTR ratings—either promoters expressing problems with telemedicine (and nothing positive) or detractors only expressing positivity about telemedicine. For comparison, LTR ratings were only incongruent with 1.3% of commentary about providers.
ENTANGLED IDEAS ABOUT PROVIDERS AND TELEMEDICINE
Patient telemedicine comments often overlapped with comments about providers. Many patients qualified their assessment of telemedicine based on the provider:
[Doctor]… does an excellent job balancing the use of technology with connecting with the patient.
Familiarity with a provider prior to telemedicine made the technology more acceptable to some patients:
I have been with [Doctor] for many years so I am comfortable with him on line. A new doctor maybe not so much.
Some providers were able to overcome reluctance through attention to patients’ discomfort:
It was a little weird but [Doctor] made it very comfortable.
Other patients referenced provider skills and proficiency using telemedicine technology:
I realize that there might be circumstances in which a virtual visit with a medical professional might not work, but in my case, it worked very well. [Doctor] made very good use of the Zoom platform.
Patients with positive experiences mentioned the provider’s competence and ability to navigate telemedicine effectively. Conversely, some negative experiences with telemedicine were attributed to providers being unfamiliar with the technology:
My suggestion is to get more organized before you do this again to an unknowing person. THE WHOLE meeting was very frustrating.
These comments suggest that patients’ perceptions of telemedicine are difficult to disentangle from perceptions of the providers and offices using it.
DO COMMENTS PREDICT PROMOTER RESPONSES? RESULTS OF MULTIPLE LOGISTIC REGRESSION
If the LTR question is a reliable measure of patient opinions about telemedicine, we would expect that the strongest predictor of LTR rating would be positive or negative comments about telemedicine and that comments about other topics would be less predictive.
Our analysis found that there were significant associations between content in comments and the patient LTR rating. Positive comments, whether about the provider, staff, or telemedicine, correlated with being a promoter, while negative comments correlated with lower likelihoods of being a promoter, in both the bivariate and multivariate logistic models. Fig. 1 plots odds ratios on a logarithmic scale to allow readers to visually compare strength of associations, both positive and negative.

Fig. 1. Odds ratios for code applications to predict an LTR rating of 9 or 10 (promoter status), shown on a log scale. Significance (α = 0.001) is denoted with ***.
Overall, negative comments were more strongly associated with promoter status than positive comments. The strongest predictor of promoter status was whether a respondent made negative comments about the provider. Our bivariate logistic regression yielded an odds ratio of 0.036, meaning that a patient who made a negative comment about a provider had nearly 28 times the odds of being a nonpromoter (p < 0.001). A multivariate analysis, which controlled for all six comment variables, yielded an identical odds ratio of 0.036.
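The “nearly 28 times” figure is the reciprocal of the reported odds ratio. As a worked step (an odds ratio compares odds, not probabilities):

```latex
\[
\mathrm{OR}_{\text{promoter}} = 0.036
\quad\Longrightarrow\quad
\mathrm{OR}_{\text{nonpromoter}} = \frac{1}{0.036} \approx 27.8
\]
```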
The weakest association was between positive comments about telemedicine and being a promoter. The bivariate analysis odds ratio of 1.738 indicated that a patient who made a positive comment about telemedicine in general had about 1.7 times the odds of being a promoter (p < 0.001). The multivariate analysis produced an odds ratio of 1.975.
Discussion
Because the open comment question immediately follows the LTR question, patient comments should be consistent with LTR ratings, and yet we identified some incongruence. We found that NPS reflects experiences during medical encounters, but responses to the LTR question are nonspecific, biased toward providers and away from the telemedicine modality. Based on the qualitative comments, some patients understood the question to be focused on telemedicine, while others may have understood it to be about the provider or medical office.
Thus, patients may be miscategorized as promoters while harboring negative attitudes about telemedicine, or detractors while feeling positive about it. Finally, LTR ratings were better predicted by comments about providers or staff than comments about telemedicine.
Previous research has also identified similar issues in terms of specificity and interpretability of NPS. 25,28,29,33 Providers appear to play as strong a role for LTR ratings as with other satisfaction measures. 39 We agree with previous assessments that found NPS to be helpful only in combination with deeper qualitative data. 40
These results suggest that NPS scores should be interpreted cautiously, understanding that LTR responses are based on multiple considerations that vary across patients, clinical settings, and points in time. Respondents will interpret the LTR prompt differently and may not consider telemedicine in their ratings. We echo previous research in observing that telemedicine as a modality is difficult to separate from the providers and staff who employ the technology. 6 Patient comments about telemedicine were nearly always framed in relation to its skillful or inept deployment. Thus, poor patient experience with telemedicine may be improved with strategic and effective telemedicine usage coupled with robust training and support for health care professionals in optimizing telemedicine encounters.
LIMITATIONS
Many patients referenced the COVID-19 pandemic in their comments, noting that telemedicine was preferable for avoiding infection. While recognizing the impact of the pandemic context on the acceptability of telemedicine, we see no reason to believe it impacted the relationship between NPS responses and comments.
Elements of the survey design—for example, switching from a 4-point to a 10-point scale—made respondent errors on LTR rating more likely. Comments confirmed that some “detractors” were misidentified based on such errors (e.g., accidentally entering 1 rather than 10). Notably, the LTR prompt ambiguously uses the term “this service” to refer to the telemedicine modality, which may have contributed to respondent confusion.
As elsewhere observed, 10 the vast majority of respondents on the survey were promoters. This pattern suggests a positivity bias, 41 and creates a highly skewed distribution, violating the assumptions of many statistical models. This distribution limited possible analyses and should be considered a more general limitation of LTR response data.
RECOMMENDATIONS
These results imply several recommendations to improve the utility of patient surveys with regard to telemedicine. First, improvements to the design of the survey noted above would likely have produced more reliable results. Second, a separate measure of likelihood of recommending the provider could accompany this question to allow for contrast between opinions about the service modality and the provider. Finally, qualitative responses, while more time-consuming to analyze, offer more direct expression of patient experience than NPS and should be utilized when possible.
Conclusions
NPS, which is included on many patient experience surveys used by health systems across the world, has limitations for use as a measure of acceptability of telemedicine for patients. Patients have more than telemedicine in mind when responding to the LTR question, and ratings conflate attitudes about providers, office policies, and staff with the telemedicine modality. More direct measures are necessary for meaningful research on the acceptability and usability of telemedicine for patients.
Footnotes
Acknowledgments
We would like to thank Adam Ouellette for assistance with data collection and Lee Lucas for feedback throughout the analysis.
Authors’ Contributions
M.K. conducted the qualitative analysis, conceptualized the article, wrote significant portions of the methods and results, and edited the article. T.J. wrote most of the first draft of the article. J.D. conducted the logistic regression, wrote significant portions of the methods and results, created tables and figures, and edited the article. R.A. assisted with qualitative analysis and reviewed the article. E.A. and J.B. reviewed the article.
Data Availability
To protect the privacy and confidentiality of patients, we cannot share our full dataset.
Disclosure Statement
The authors declare that there is no conflict of interest.
Funding Information
This work was completed without external funding.
