Abstract
Using Apple’s Screen Time application to obtain reported actual iPhone and social media (SM) use, we examined the accuracy of retrospective estimates of usage, how inaccuracies bias associations between use and psychosocial well-being (depression, loneliness, and life satisfaction), and the degree to which inaccuracies were predicted by levels of well-being. Among a sample of 325 iPhone users, we found that (a) participants misestimated their weekly overall iPhone and SM use by 19.1 and 12.2 hours, respectively; (b) correlations between estimated use and well-being variables were consistently stronger than the correlations between reported actual use and well-being variables; and (c) the degree of inaccuracy in estimated use was associated with levels of participant well-being and amount of use. These findings suggest that retrospective estimates of digital technology use may be systematically biased by factors that are fundamental to the associations under investigation. We propose that retrospective estimates of digital technology use may be capturing the construct of perceived use rather than actual use, and discuss how the antecedents, correlates, and consequences of perceived use may be distinct from those of actual use. Implications of these findings are discussed in view of the ongoing debate surrounding the effects of digital technology use on well-being.
Introduction
The rapid rise in digital technology use (i.e., mobile devices and social media [SM]; Hitlin, 2018), combined with increasing rates of depression among youth in the US (Mojtabai, Olfson, & Han, 2016), has led many to question whether these technologies are to blame. Concerns about the effects of digital technology use on well-being have fueled a considerable amount of research in the past decade attempting to answer this question (for a systematic map of reviews, see Dickson et al., 2018). However, despite the plethora of research in this area, the prevalence of inconsistent findings has made it difficult to conclude whether or how digital technology use is associated with well-being.
For instance, recent longitudinal studies have found that more time spent on digital technology predicts higher levels of depression (Twenge, Joiner, Rogers, & Martin, 2018), that higher levels of depression predict more time spent on digital technology (Zink, Belcher, Kechter, Stone, & Leventhal, 2019), that the two reciprocally influence each other (Houghton et al., 2018), or that there is no prospective relationship (George, Russell, Piontak, & Odgers, 2018). Although these findings are incongruent, a common limitation to most research in this area—including the studies just cited—is the reliance upon participant retrospective estimates to measure digital technology use.
Relying upon estimates of use to make inferences about the relationship between digital technology and well-being raises concerns about the validity and reliability of the findings. Self-report measures of media use have consistently been shown to be inaccurate when compared against more objective measures (e.g., passive sensing, usage meters, or provider logs; Scharkow, 2016). Importantly, studies have shown that inaccuracies in estimates of media use are not solely due to random error, but are often related to variables that are fundamental to the relationship being investigated, such as amount of usage (Araujo, Wonneberger, Neijens, & de Vreese, 2017; Kahn, Ratan, & Williams, 2014; Kobayashi & Boase, 2012; Vanden Abeele, Beullens, & Roe, 2013). Errors in estimation can have significant consequences when attempting to detect associations and can lead to Type I and/or Type II errors (Kobayashi & Boase, 2012; Scharkow, 2016).
Though numerous studies have investigated the relationship between digital technology use and well-being, to our knowledge, no study has examined how levels of well-being may influence the accuracy of participants’ estimates of use. The present study uses Apple’s Screen Time application, a suite of tools that automatically tracks users’ iPhone usage and is included by default on all iPhones running iOS Version 12 or later, to obtain data on overall iPhone and SM use. We compare participants’ estimates of their iPhone and SM use with their actual use and investigate whether the amount of inaccuracy is influenced by levels of depression, loneliness, life satisfaction, and usage amount.
Digital technology use and psychosocial well-being
For the purposes of this study, we elected to focus on three psychosocial variables linked to well-being—depression, loneliness, and life satisfaction—as these variables are frequently examined when investigating the effects of digital technology use. According to a recent systematic map of reviews by Dickson et al. (2018), there have been over 41 systematic reviews or meta-analyses published in the last 10 years that focused on the relationship between screen-based activities and at least one of these psychosocial variables.
Although considered separate constructs, depression, loneliness, and life satisfaction share substantial conceptual and psychosocial overlap. Research indicates that these three constructs are strongly correlated with one another and share comparable patterns of associations with various psychosocial variables (Heinrich & Gullone, 2006; Proctor, Linley, & Maltby, 2009; Rubenstein & Shaver, 1982). The overlap between these constructs suggests that the cognitive, behavioral, and affective processes that are at play when providing retrospective estimates of media use may be similar to those for depression, loneliness, and life satisfaction.
The abundance of research in this area, however, has produced more noise than signal regarding the association between well-being and time spent on digital technology. Numerous studies have found a negative relationship between digital technology use and well-being (Booker, Kelly, & Sacker, 2018; Demirci, Akgonul, & Akpinar, 2015; Kelly, Zilanawala, Booker, & Sacker, 2018; Kross et al., 2013; Lin et al., 2016; Twenge et al., 2018), whereas others have found either a positive or null relationship (Berryman, Ferguson, & Negy, 2018; Hampton, 2019; Jensen, George, Russell, & Odgers, 2019; Orben & Przybylski, 2019a, 2019b). Despite these mixed findings, evidence from three meta-analyses examining 220 effects across 157 studies (Huang, 2017; D. Liu & Baumeister, 2016; M. Liu, Wu, & Yao, 2016) suggests that there is a slight negative association between digital technology use and well-being. Therefore, we hypothesized that (H1) estimated use and actual use will be positively correlated with loneliness and depression, and negatively correlated with life satisfaction.
However, to our knowledge, practically all the studies included in these meta-analyses relied upon retrospective estimates to measure digital technology use. Prior studies testing the accuracy of retrospective estimates of use have found these types of measures to be inaccurate (Araujo et al., 2017; Scharkow, 2016), which may lead to inflated correlations with outcomes when compared against more objective measures of media use (Kobayashi & Boase, 2012). This was illustrated in recent research by Orben and Przybylski (2019a), who found that retrospective estimates of digital technology use produced stronger associations with well-being compared to reports of technology use from ecological momentary assessment (EMA). Although EMA data are not equivalent to objective tracking data, they are generally more accurate than retrospective estimates due to their ability to drastically reduce recall bias (Shiffman, Stone, & Hufford, 2008). Therefore, we hypothesized that (H2) the actual use variables will exhibit weaker correlations with the well-being variables than the estimated use variables.
Systematic error and response inaccuracy
According to the cognitive response model (Tourangeau, 1984), responding to survey questions requires navigating a complex series of processes that involve comprehending the question, retrieving the appropriate information, estimating or judging the answer, and then reporting the answer. Measures requiring retrospective time estimation (e.g., “How much time have you spent this week using social media?”)—which are often used in research on the effects of digital technology use—are directly influenced by respondents’ memory, attention, perception, and emotional state (Grondin, 2010), and indirectly influenced by any characteristics or conditions that might impact these processes. Providing accurate retrospective estimates of digital technology use may be particularly challenging, as respondents are more prone to inaccuracy when attempting to estimate behaviors that are difficult to recall or are very integrated into their daily lives (Kahn et al., 2014). For these as well as a variety of other reasons (see Schwarz & Oyserman, 2001), retrospective estimates of digital technology use are susceptible to multiple sources of measurement error.
Random measurement error is generally of less concern, as the random error in responses is expected to cancel out. Systematic error, by contrast, poses a substantial threat to the reliability of measurement, because responses are consistently biased by factors relating to the individual respondent or another aspect of the survey. Notably, studies have found that errors in retrospective estimates of general Internet use (Araujo et al., 2017; Scharkow, 2016), television use (Wonneberger & Irazoqui, 2017), video game use (Kahn et al., 2014), and mobile phone use (Vanden Abeele et al., 2013) are systematically related to various individual factors. For instance, multiple studies (Boase & Ling, 2013; Scharkow, 2016; Wonneberger & Irazoqui, 2017) found that males were significantly more likely than females to overestimate their media use.
Crucially, research has shown that inaccuracy in retrospective estimates of media use was directly related to actual use, with heavier users reporting less accurate estimates. This finding has held across samples from multiple countries examining different types of media use, including mobile devices (Araujo et al., 2017; Boase & Ling, 2013; Scharkow, 2016; Vanden Abeele et al., 2013; Wonneberger & Irazoqui, 2017).
These findings strongly suggest that retrospective estimates of digital technology use may be systematically biased by factors that depend on respondents’ actual usage. The ubiquity of digital technology usage and the degree to which these behaviors have become integrated into daily experiences indicate that many respondents may have difficulty accurately estimating how much time they spent using digital technology over any considerable length of time. The task of estimating the duration of digital technology use depends heavily on attention, perception, recall, and estimation abilities, which represents a significant challenge to respondents—especially those with the greatest amounts of use. Given the evidence presented here, as well as recent work that showed a tendency for respondents to overestimate their mobile device use (Boase & Ling, 2013; Kobayashi & Boase, 2012), we hypothesized that (H3) on average, participants will significantly overestimate their overall iPhone use and SM use; and (H4) participants with higher amounts of reported actual usage will have greater discrepancies between estimated and reported actual use.
In addition to the amount of actual digital technology usage, retrospective estimates of use may also be systematically biased by factors related to psychosocial characteristics. However, to date, there has been limited research into the psychosocial characteristics that may predict inaccuracies in estimates of digital technology use. Kobayashi and Boase (2012) found that participants with higher levels of in-person social engagement were more likely to overestimate their mobile phone use. A study by Kahn et al. (2014) found that level of enjoyment and perceived social integration predicted the degree of misestimation among a sample of video game users. These findings suggest that the accuracy of retrospective estimates of digital technology use may depend, at least in part, on cognitive and affective factors that differ across individuals. Still, despite the dozens of studies evaluating the link between digital technology use and well-being over the past several years, to our knowledge, no study has investigated how levels of loneliness, depression, and life satisfaction may affect the accuracy of respondents’ retrospective estimates of use.
There are numerous ways that the level of respondent well-being may impact the complex task of estimating digital technology use. First, respondents with greater depression and loneliness, or lower life satisfaction, may have an impaired ability to accurately perceive time. Evidence suggests that the perception of time is related to our emotional states and subjective well-being (Wittmann & Paulus, 2008). For instance, research has shown that cognitive time passes more slowly in stressful situations (Fraisse, 1984), and respondents who are boredom-prone or depressed perceive time passing at a slower pace, leading to overestimations in time-estimation tasks (Wittmann & Paulus, 2008).
Second, while many people may be “cognitive misers” when answering survey questions (Fiske & Taylor, 1984), respondents experiencing lower well-being may be particularly vulnerable to satisficing behaviors. Those lower in well-being may have a decreased ability to accurately recall and estimate their digital technology usage due to possible impairments in motivation, concentration, and recollection—especially in the case of depression (Bschor et al., 2004). Given the cognitive burden of estimating a potentially high-frequency behavior, those with lower well-being may be more likely to opt for easy answers (i.e., guessing) or retrieve inaccurate information, thus increasing the likelihood of misestimating their use.
Third, the level of respondent well-being may influence the types and severity of cognitive biases that impact the accuracy of retrospective estimates of use. For instance, research shows that sensitivity to social information is directly tied to level of current loneliness (Gardner, Pickett, & Brewer, 2000). This may have implications for estimates of SM usage in particular, as respondents experiencing loneliness may misestimate their SM usage due to higher levels of sensitivity to social information. Furthermore, people experiencing higher levels of negative affect, especially sadness, are more likely to display negative cognitive biases (Beevers et al., 2019). This suggests that level of well-being may influence the type of information that respondents attend to and how they attend to it. The salience and valence of the information most easily retrieved by respondents when retrospectively estimating their digital technology use may bias the accuracy of those estimates.
Fourth and finally, level of respondent well-being and amount of digital technology usage may interact in a way that leads to biased estimates of use. Respondents with lower well-being may be more likely to have higher amounts of digital technology use, and this higher usage amount then may increase the likelihood that they misestimate their use.
Therefore, we hypothesized that (H5) participants who report more depression and loneliness, and lower life satisfaction, will have greater discrepancies between their estimated and reported actual use.
Studies relying upon retrospective estimates to examine the relationship between digital technology use and well-being may be prone to systematic bias from factors related to both the dependent and independent variables under investigation. Given the substantial public and academic interest in this area of research, as well as the potential significance to public health, understanding how psychosocial characteristics may impact the accuracy of people’s estimates of digital technology use is crucial, particularly when studies are attempting to examine how these behaviors may relate to, or cause, those very psychosocial characteristics.
Current study
Our study addresses two important gaps in the literature. First, by examining how estimated and reported actual smartphone use differ in their associations with levels of psychosocial well-being, this study will help to identify how potential measurement error associated with estimated smartphone use may impact findings. Second, examining which variables predict the discrepancy between estimated and reported actual use will provide insight into the types of individual characteristics—such as well-being or amount of use—that may affect the accuracy of estimates provided by participants.
Method
Procedure
Participants were recruited from January 30, 2019 to February 2, 2019 from Amazon Mechanical Turk (MTurk). We posted a task on MTurk describing the research study as a 10- to 15-minute research study of iPhone use and well-being, offering $1.00 compensation for completing the survey. Inclusion criteria comprised the following: participants must use an iPhone with iOS Version 12 or later, speak English, reside in the United States, and be at least 18 years old.
Participants were routed to an online Qualtrics survey hosted by the University of Pittsburgh, where they were presented with the consent statement. Those who agreed to participate continued to the questionnaire where they provided estimates of their iPhone and SM use over the past week. Next, participants were guided on how to report their latest screen time data from the Screen Time application. Lastly, participants completed scales on depression, loneliness, and life satisfaction, as well as several demographics items. This study was approved by the University of Pittsburgh’s Institutional Review Board.
Participants
A total of 399 participants completed the online survey. To help ensure credibility of responses, we implemented eligibility and screening procedures. First, only MTurk users with a task acceptance rate of 95% or higher were allowed to take the survey (Peer, Vosgerau, & Acquisti, 2014). Second, we excluded 57 participants who failed an attention check (i.e., reported actual SM use that was greater than reported actual overall use). Third, participants reporting usage values more than 3 SDs from the mean were excluded (n = 17), yielding a final sample of N = 325.
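The two data-based exclusions described above can be sketched in a few lines. This is an illustration only, not the authors' procedure: the field names (`actual_overall`, `actual_sm`) and the function name are hypothetical, and the sketch assumes that means and SDs are computed after the attention check, which the text does not specify.

```python
from statistics import mean, stdev

def screen_participants(rows, keys=("actual_overall", "actual_sm"), sd_cutoff=3.0):
    """Apply the attention check, then the SD-based outlier rule.

    rows: list of dicts holding weekly hours under hypothetical field names.
    Returns (kept_rows, n_failed_attention_check, n_outliers).
    """
    # Attention check: SM use on the phone cannot exceed overall phone use.
    passed = [r for r in rows if r["actual_sm"] <= r["actual_overall"]]
    n_failed = len(rows) - len(passed)

    # Outlier rule: bounds are mean +/- sd_cutoff * SD for each usage variable.
    # Assumption: statistics are computed on those who passed the check.
    bounds = {}
    for k in keys:
        values = [r[k] for r in passed]
        m, s = mean(values), stdev(values)
        bounds[k] = (m - sd_cutoff * s, m + sd_cutoff * s)

    kept = [
        r for r in passed
        if all(bounds[k][0] <= r[k] <= bounds[k][1] for k in keys)
    ]
    return kept, n_failed, len(passed) - len(kept)
```

Passing a longer `keys` tuple would extend the outlier rule to the estimated use variables as well, if those were also screened.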
Among the 325 participants, 57.5% identified as male, 41.9% identified as female, and 0.6% identified as nonbinary. The majority of participants identified as White (79.1%) and 16.6% as Hispanic (not mutually exclusive from other race/ethnicity categories). The average age was 33 (SD = 9.6). Most participants had completed a bachelor’s degree (60%).
Measures
Estimated iPhone usage
Participants estimated their weekly use (over the past 7 days) and their daily average (based on the last 7 days) in two categories: overall iPhone use and SM use (on their iPhone). Before providing their estimates, participants were instructed to not consult any applications that track their iPhone usage.
To measure weekly overall estimated use, participants were presented the item: “As accurately as possible, estimate the total amount of time you spent using your iPhone over the past 7 days, including today. Count all uses except listening to audio (e.g., music, podcasts) in the background,” and were instructed to fill in a blank number field with the total number of hours (values constrained at 0 and 168). To measure overall daily estimated iPhone usage, participants were presented the item “Over the past 7 days, including today, about how much time per day (on average) did you spend using your iPhone? Please include all uses except listening to audio (e.g., music, podcasts) in the background,” and were provided with response items ranging from “0 hours” to “12 or more hours” with half-hour intervals.
Estimated weekly SM use was measured with the item: “Considering your use of social media (e.g., Snapchat, Instagram, Facebook, Twitter, instant messaging, etc.) over the past week, please estimate the total amount of time you spent using social media on your iPhone over the past 7 days, including today.”
Estimated daily SM use was measured using the item: “Still considering your use of social media, about how much time per day (on average) did you spend using social media on your iPhone over the past 7 days, including today.”
Actual iPhone usage
After completing the estimated use section, participants were provided detailed instructions (including visual aids) first directing them to the Screen Time application, then directing them to each category to be reported. The Screen Time application automatically tracks the duration of time that the iPhone is actively engaged, excluding time spent when the iPhone is in use but on the lock screen (e.g., listening to audio in the background; Gower & Moreno, 2018; Hunt, Marx, Lipson, & Young, 2018). Furthermore, Screen Time tracks the duration of time that specific applications are used and groups applications by category (i.e., social networking, productivity, entertainment, etc.; Ceres, 2018). For instance, the social networking category includes popular applications like Instagram, Facebook, Snapchat, and Twitter. Only time spent using the SM application directly is counted toward the social networking category. Time spent accessing SM platforms via Internet browsers (e.g., Safari or Chrome) is counted toward the productivity category rather than the social networking category.
Participants were asked to report the data captured in the Screen Time application for reported actual weekly overall use and reported actual weekly SM use. For weekly overall and weekly SM use, participants were instructed to fill in blank fields corresponding to the number of hours (constrained at 0 and 168) and number of minutes (constrained at 0 and 59) shown in the application for the respective sections. We then derived daily averages for both categories by dividing the weekly values provided by 7. Since the estimations for the daily overall use and daily SM use variables were capped at “12 or more” in the survey, all values greater than 12 were adjusted down to 12 for the daily overall and daily SM reported actual use variables.
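The derivation of the reported actual daily variables amounts to three steps: combine the hours and minutes fields, divide by 7, and top-code at the survey ceiling of 12. A minimal sketch follows; the function names are hypothetical and chosen only for illustration.

```python
def weekly_hours(hours, minutes):
    """Combine the Screen Time hours and minutes fields into decimal hours."""
    if not (0 <= hours <= 168 and 0 <= minutes <= 59):
        raise ValueError("hours must be 0-168 and minutes must be 0-59")
    return hours + minutes / 60

def daily_average(weekly, cap=12.0):
    """Divide the weekly total by 7 and top-code at the survey ceiling."""
    return min(weekly / 7, cap)

# Example: 31 hours 30 minutes in a week is 4.5 hours per day.
example = daily_average(weekly_hours(31, 30))
```

Top-coding keeps the reported actual daily values on the same 0 to 12 range as the estimated daily items, so the two can be compared directly.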
Depression
Depression was measured using the 10-item Center for Epidemiologic Studies Depression Scale Revised (CESD-R-10; Andresen, Malmgren, Carter, & Patrick, 1994). The CESD-R-10 includes 10 items corresponding to symptoms of depression, such as “I felt depressed,” “I felt hopeful about the future” (reverse-scored), and “My sleep was restless.” Respondents rate how often they experienced each symptom over the past week using a Likert-style scale (0 = rarely or none of the time, 3 = all of the time). Total scores are derived by summing across the 10 items (Items 5 and 8 are reverse-scored) and can range from 0 to 30. Higher scores suggest greater severity of depressive symptoms. The CESD-R-10 has displayed high internal reliability and convergent validity against other depression measures (Andresen et al., 1994). The coefficient alpha for the present sample was .88.
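Sum scoring with reverse-scored items can be illustrated as follows. This is a sketch under the scoring rules stated above (responses coded 0 to 3, Items 5 and 8 reversed); the helper names are hypothetical, and the same helper would apply to any summed Likert scale by changing the range and the reversed item numbers.

```python
def score_scale(responses, low, high, reverse_items=()):
    """Sum a Likert scale, flipping the 1-based item numbers in reverse_items."""
    total = 0
    for item_no, value in enumerate(responses, start=1):
        if not low <= value <= high:
            raise ValueError(f"item {item_no}: {value} is outside {low}-{high}")
        # A reversed item maps low -> high and high -> low.
        total += (low + high - value) if item_no in reverse_items else value
    return total

def score_cesd10(responses):
    """Total score for a 10-item scale coded 0-3 with Items 5 and 8 reversed."""
    if len(responses) != 10:
        raise ValueError("expected 10 item responses")
    return score_scale(responses, low=0, high=3, reverse_items=(5, 8))
```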
Loneliness
Loneliness was measured using the eight-item UCLA Loneliness Scale (ULS-8; Hays & DiMatteo, 1987). The ULS-8 includes eight items corresponding to characteristics of loneliness, such as “I lack companionship,” “There is no one I can turn to,” and “I feel isolated from others.” Respondents rate how often they have felt or experienced each characteristic over the past week using a Likert-style scale (1 = never, 4 = always). Total scores are derived by summing across the eight items (Items 3 and 6 are reverse-scored) and can range from 8 to 32. Higher scores suggest greater loneliness. Coefficient alpha for the ULS-8 has been reported as .84 with strong convergent and discriminant validity (Hays & DiMatteo, 1987). The coefficient alpha for the present sample was .86.
Life satisfaction
Life satisfaction was measured using the Satisfaction With Life Scale (SWLS; Diener, Emmons, Larsen, & Griffin, 1985). The SWLS includes five items corresponding to characteristics of life satisfaction, such as “In most ways my life is close to my ideal,” “I am satisfied with my life,” and “So far I have gotten the important things I want in life.” Respondents rate their degree of agreement with each item using a Likert-style rating (1 = strongly disagree, 7 = strongly agree). Total scores are derived by summing across the five items and can range from 5 to 35. Higher scores indicate greater life satisfaction. The coefficient alpha has been reported as .87 with strong convergent and discriminant validity (Diener et al., 1985). The coefficient alpha for the present sample was .90.
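The coefficient alpha values reported for the three scales follow the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The sketch below shows that computation; the function name and data layout are assumptions made for illustration, and items are assumed to be reverse-scored already where needed.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Coefficient alpha for a respondents-by-items table.

    item_scores: list of respondents, each a list of k item responses.
    """
    k = len(item_scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([resp[j] for resp in item_scores]) for j in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(resp) for resp in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)
```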
Analyses
The distributions of the estimated and reported actual use variables were right-skewed and kurtotic. Therefore, similar to prior studies (Boase & Ling, 2013; Vanden Abeele et al., 2013), we log-transformed the usage variables before computing paired-sample t tests and Pearson correlations.
Summary statistics of the raw (untransformed) variables were calculated for the estimated and reported actual use variables as well as the well-being variables. To examine how estimated and reported actual use variables correlate with each other and with the well-being measures (H1 and H2), we conducted zero-order Pearson correlations. To test for differences between estimated and reported actual use variables (H3), we conducted paired-sample t tests.
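The transformation and the two tests can be sketched with the standard library alone. Two caveats: the text does not say how zero values were handled before taking logs, so the `log(x + 1)` form below is an assumption; and the sketch returns only the t statistic and degrees of freedom, not a p value. All function names are hypothetical.

```python
import math

def log_transform(values):
    # Assumption: log(x + 1) keeps zero hours defined; the source only
    # states that the usage variables were log-transformed.
    return [math.log(v + 1) for v in values]

def pearson_r(x, y):
    """Zero-order Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def paired_t(x, y):
    """Paired-sample t statistic and degrees of freedom for x minus y."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    mean_d = sum(d) / n
    sd_d = math.sqrt(sum((v - mean_d) ** 2 for v in d) / (n - 1))
    return mean_d / (sd_d / math.sqrt(n)), n - 1
```

In use, both the estimated and reported actual values would be passed through `log_transform` before calling `pearson_r` or `paired_t`.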
Finally, to test Hypotheses 4 and 5, we created two additional variables, one for each weekly use category, by taking the absolute value of the difference between estimated and reported actual use (referred to as “discrepancy scores”). We then conducted multiple regression analyses with weekly overall iPhone use discrepancy scores and weekly SM use discrepancy scores as dependent variables in separate models. In order to meet the assumption of homoscedasticity, the discrepancy score variables were log-transformed. The primary predictors in each of the models were the three well-being variables and the weekly reported actual use variables. Each of the well-being predictor variables was first entered separately in each model in order to obtain unique estimates for its association with discrepancies. Age (continuous), gender (dichotomous, 1 = female), race (dichotomous, 1 = non-White), and education (dichotomous, 1 = bachelor’s degree or higher) were entered as covariates in every model. Since the “nonbinary” category of the gender variable only had two observations, these cases were dropped in the regression analyses.
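To make the discrepancy-score models concrete, the sketch below computes a log-transformed absolute discrepancy and fits an ordinary least squares model by solving the normal equations. It is illustrative only: the analyses in the study were run in Stata, the `+ 1` inside the log is an assumption (the text does not say how zero discrepancies were handled), the function names are hypothetical, and no standard errors or p values are computed.

```python
import math

def log_discrepancy(estimated, actual):
    # Absolute difference, then log; the +1 offset is an assumption.
    return math.log(abs(estimated - actual) + 1)

def ols(X, y):
    """Least squares coefficients [intercept, b1, ...] via (X'X) b = X'y.

    X: list of predictor rows without the intercept column.
    """
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    # Gaussian elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(c + 1, p):
            f = A[k][c] / A[c][c]
            for j in range(c, p + 1):
                A[k][j] -= f * A[c][j]
    # Back substitution.
    b = [0.0] * p
    for i in reversed(range(p)):
        b[i] = (A[i][p] - sum(A[i][j] * b[j] for j in range(i + 1, p))) / A[i][i]
    return b
```

Each predictor row would hold one well-being score, the reported actual weekly use, and the four covariates, with `log_discrepancy` supplying the dependent variable.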
Lastly, to examine the strongest predictors of the discrepancy between estimated and reported actual use, we included all the well-being and usage variables in a combined model. Due to model instability caused by multicollinearity, we used least absolute shrinkage and selection operator (lasso) regression to simultaneously perform variable selection and shrinkage of regression coefficients among all the predictors (Tibshirani, 1996). Briefly, the lasso is a modified form of ordinary least squares regression that penalizes the sum of the absolute values of the coefficients via a regularization parameter. The penalty shrinks the magnitude of predictor coefficients toward zero, and the coefficients of less important predictors shrink exactly to zero. In doing so, variable selection is implicitly performed, as less important predictors are removed from the model without the biases of other variable selection techniques, such as multiple comparisons and collinearity between predictor variables. The optimal regularization parameter was selected using tenfold cross-validation and the one-standard-error rule (Hastie, Tibshirani, & Friedman, 2009). Analyses were completed using Stata Version 15.1 (StataCorp, 2017).
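The two ideas in this paragraph, shrinkage to exactly zero and the one-standard-error rule, can be sketched briefly. This is not the Stata routine used in the study: it is a textbook coordinate descent lasso that assumes centered predictors and outcome (no intercept), and the one-standard-error helper takes cross-validation results as given rather than computing them. All names are hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam, returning exactly 0 inside [-lam, lam]."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/2n) * ||y - Xb||^2 + lam * ||b||_1 by coordinate descent.

    Assumes the columns of X and the outcome y are centered and that no
    column of X is constant.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    col_ms = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of predictor j with the residual that excludes j.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * b[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            b[j] = soft_threshold(rho, lam) / col_ms[j]
    return b

def one_se_rule(lams, cv_mean, cv_se):
    """Largest penalty whose CV error is within one SE of the minimum."""
    best = min(range(len(lams)), key=lambda i: cv_mean[i])
    threshold = cv_mean[best] + cv_se[best]
    return max(l for l, m in zip(lams, cv_mean) if m <= threshold)
```

Choosing the largest penalty within one standard error of the minimum favors a sparser model, which is why only a few predictors remain in a combined model of this kind.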
Results
Table 1 provides the descriptive statistics for estimated use, reported actual use, discrepancy scores, and psychosocial well-being variables. For ease of interpretation, we present the descriptive statistics for the variables in raw form. On average, participants overestimated their weekly overall use, weekly SM use, and daily SM use, and underestimated their daily overall use. For weekly overall use, overestimators were, on average, more inaccurate and varied (M = 28.28, SD = 33.69) than underestimators (M = −13.09, SD = 18.27). This pattern was also true for weekly SM use; overestimators were more inaccurate and varied (M = 17.57, SD = 25.15) than underestimators (M = −5.42, SD = 11.37).
Table 1. Summary statistics for primary variables (N = 325).
Note. SM = social media; discrepancy scores = absolute difference between actual and estimated use. ᵃMeasured with the 10-item Center for Epidemiologic Studies Depression Scale Revised (CESD-R-10). ᵇMeasured with the UCLA Loneliness Scale (ULS-8). ᶜMeasured with the Satisfaction With Life Scale (SWLS).
Results of the paired-sample t tests for the log-transformed use variables are presented in Table 2. Results show that participants significantly overestimated their weekly and daily SM use. The differences between estimated and reported actual use for the overall use variables were not significant. Table 3 presents the zero-order Pearson correlations for the primary study variables (all usage variables log-transformed). All estimated use variables were significantly correlated with their corresponding reported actual use variables, with magnitudes ranging from r = .39 to r = .52.
Table 2. Paired t tests comparing estimated versus reported actual use measures (N = 325).
Note. SM = social media. Variables log-transformed.
Table 3. Zero-order correlation matrix for estimated and reported actual use variables (N = 325).
Note. SM = social media. †Log-transformed.
*p < .05. **p < .01. ***p < .001.
Of the 12 correlations between the four estimated use variables and three well-being measures, eight were statistically significant (life satisfaction exhibited no significant correlations with any of the usage variables). Specifically, estimated daily SM use consistently exhibited the strongest relationships with the well-being measures—loneliness (r = .28, p < .001), depression (r = .41, p < .001), and life satisfaction (r = .09, ns). On the other hand, of the 12 correlations between the actual use variables and the well-being measures, only two were significant, with weekly and daily SM use exhibiting a weak association with depression.
The multiple regression results for weekly overall use discrepancies are presented in Table 4a, and the results for weekly SM use discrepancies are presented in Table 4b. Results of the usage variables indicate that reported actual weekly overall use was significantly positively related to discrepancies across all models. Reported actual weekly SM use was significantly positively related to discrepancies in five of the six models, with stronger effects when predicting weekly SM use discrepancies. Results of the well-being variables indicate that depression was positively associated with discrepancy scores for both overall and SM use, whereas loneliness was only positively associated with overall use discrepancies. Of the four models that included life satisfaction, only one exhibited a significant association with discrepancy scores (Model 6a).
Table 4a. Predictors of discrepancy between estimated and reported actual overall use (N = 323).
Note. SM = social media. Gender, age, race, and education were entered as covariates in each model.
The dependent variable (discrepancy) was log-transformed for this analysis.
*p < .05. **p < .01. ***p < .001.
Table 4b. Predictors of discrepancy between estimated and reported actual SM use (N = 323).
Note. SM = social media. Gender, age, race, and education were entered as covariates in each model.
The dependent variable (discrepancy) was log-transformed for this analysis.
*p < .05. **p < .01. ***p < .001.
In the combined model, lasso results show that reported actual overall use and depression were the strongest predictors of overall use discrepancies. Reported actual overall use, depression, reported actual SM use, and life satisfaction were the strongest predictors of SM use discrepancies (see Table 5).
Table 5. Predictors of overall and SM use discrepancies in combined model (N = 323).
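To illustrate why a lasso model can be read as identifying the strongest predictors, the following minimal sketch fits a lasso by coordinate descent on hypothetical data with one strong and one weak predictor. The data, the penalty value, and the function names are assumptions for illustration only; this section does not describe the software or tuning procedure used in the study.

```python
def soft_threshold(value: float, penalty: float) -> float:
    """Shrink value toward zero by penalty; return 0.0 inside the dead zone."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(columns, y, penalty, n_iter=100):
    """Minimise (1/2n)*||y - Xb||^2 + penalty*||b||_1 by coordinate descent.

    Assumption: predictors (given as a list of columns) and the outcome
    are already mean-centred, so no intercept is fitted.
    """
    n = len(y)
    coefs = [0.0] * len(columns)
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            # Partial residual: outcome minus the fit from all other predictors.
            partial = [
                y[i] - sum(coefs[k] * columns[k][i]
                           for k in range(len(columns)) if k != j)
                for i in range(n)
            ]
            rho = sum(col[i] * partial[i] for i in range(n)) / n
            scale = sum(v * v for v in col) / n
            coefs[j] = soft_threshold(rho, penalty) / scale
    return coefs


# Hypothetical, already-centred predictors: a strong one and a weak one.
strong = [1, 1, 1, 1, -1, -1, -1, -1]
weak = [1, 1, -1, -1, 1, 1, -1, -1]
outcome = [2.0 * s + 0.1 * w for s, w in zip(strong, weak)]

coefs = lasso_coordinate_descent([strong, weak], outcome, penalty=0.5)
print(coefs)
```

In this toy case the penalty shrinks the strong predictor's coefficient and sets the weak predictor's coefficient to exactly zero, which is the behavior that lets a lasso act as a variable-selection step.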
Discussion
The current study examined the accuracy of retrospective estimates of overall iPhone and SM use, as well as the psychosocial and usage factors that are associated with the degree of inaccuracy. Our study is distinctive in two respects: it used Apple’s new Screen Time application as a convenient way to gather reported actual usage data, and it examined how depression, loneliness, and life satisfaction may bias retrospective estimates of iPhone and SM use. Overall, the results of this study show that estimates of iPhone and SM use may be inaccurate, that these inaccuracies can lead to inflated correlations, and that estimates of use may be systematically biased by levels of actual usage and psychosocial well-being.
At first glance, our sample of MTurk participants appears to be relatively heavy iPhone and SM users. Using the raw means, participants used their iPhones for an average of 31.5 hours over the past week (about 4.5 hours per day) and used SM on their iPhones for an average of 17.6 hours over the past week (about 2.5 hours per day). However, this amount of smartphone use is similar to that reported in prior studies that used passive sensing applications to track actual usage. For instance, Elhai et al. (2018) and Andrews, Ellis, Shaw, and Piwek (2015) found that their samples of college students used their phones for an average of 4 and 5 hours per day, respectively. Therefore, it can be reasonably assumed that our sample of MTurk users does not represent an extreme selection of iPhone and SM users.
We found moderate associations between retrospective estimates and reported actual use. While these associations were statistically significant, given that estimates of use are designed to measure actual use, the magnitudes of the correlations are relatively weak. With Pearson correlations ranging from r = .39 to r = .52, the amount of variance in estimated use that can be explained by reported actual use is as low as 15% for weekly overall use and only as high as 27% for weekly SM use. The relatively weak correlations between estimated and reported actual use measures are in line with findings from similar media use studies (Araujo et al., 2017; Kahn et al., 2014; Scharkow, 2016), and suggest that these measures may be tapping into different constructs.
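The variance-explained figures above follow from squaring the Pearson correlation. The short sketch below reproduces that arithmetic for the two endpoints of the reported range; the function name is illustrative, and the pairing of each r with its usage variable follows the sentence above.

```python
def variance_explained(r: float) -> float:
    """Proportion of variance shared by two variables with Pearson correlation r."""
    return r * r


# Endpoints of the reported range of correlations between estimated and
# reported actual use.
for label, r in [("weekly overall use", 0.39), ("weekly SM use", 0.52)]:
    print(f"{label}: r = {r:.2f}, r^2 = {variance_explained(r):.1%}")
```

Even at the upper end of the range, roughly three quarters of the variance in estimated use is left unexplained by reported actual use.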
To provide an estimate of iPhone/SM use, respondents must navigate a series of complex steps that rely upon cognitive processes related to attention and perception (Tourangeau, 1984). This process is considerably different from reporting a number from a passive sensing application. Therefore, we suggest that retrospective estimates of iPhone/SM use are measuring a construct more closely related to perceived use than actual use. The gap between these related yet disparate constructs engenders two primary concerns, both of which are borne out in the findings of our study.
First, if estimated and actual use reflect distinct constructs, then it is highly unlikely that they will exhibit identical relationships with various psychosocial variables. For instance, research on social support has shown that perceived social support and actual social support are distinct constructs and it is the perception of social support that is more strongly related to well-being (Wethington & Kessler, 1986). We observed a similar phenomenon in our study, as the correlations between estimated use and the well-being outcomes were almost uniformly greater than the correlations between reported actual use and the same outcomes. The difference was most evident with depression, where the correlation with estimated overall weekly use (r = .25) was more than four times as large as the correlation with reported actual overall weekly use (r = .06). Such differences between estimated and actual use can alter the observed correlation from a moderate and statistically significant effect worth noting to a small and nonsignificant effect warranting less concern.
Aside from altering the size and statistical significance of observed effects, it is likely that the causes, consequences, and correlates of estimated digital technology use are different from those of actual use. For example, it may be that increases in actual SM use are prospectively associated with decreases in well-being, thus implicating actual SM use as a potential risk factor. Alternatively, decreases in well-being may be associated with increases in perceived SM use, which leads to higher estimates of use—even though the amount of actual SM use has not changed. This latter scenario is particularly concerning, as we would erroneously conclude that increased SM use is related to decreased well-being, even though the “true” association between actual SM use and well-being is null.
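The latter scenario can be made concrete with a small simulation. In the sketch below, actual use is generated independently of depression, while estimated use is actual use plus a depression-related perception bias and random recall error. All variables are in standardized units, and the bias size (0.5), the error spread, the sample size, and the seed are arbitrary assumptions chosen for illustration; none are estimates from the study.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 2000

depression = [rng.gauss(0, 1) for _ in range(n)]
# Actual use is independent of depression by construction (true association is null).
actual_use = [rng.gauss(0, 1) for _ in range(n)]
# Estimated use = actual use + depression-related perception bias + recall error.
estimated_use = [
    a + 0.5 * d + rng.gauss(0, 0.5) for a, d in zip(actual_use, depression)
]

r_actual = pearson_r(actual_use, depression)
r_estimated = pearson_r(estimated_use, depression)
print(f"actual use vs. depression:    r = {r_actual:.2f}")
print(f"estimated use vs. depression: r = {r_estimated:.2f}")
```

Under these assumptions, estimated use correlates with depression even though actual use does not, which is the pattern that would lead to an erroneous conclusion if estimates were treated as measures of actual use.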
The notion that perceived and actual use may be implicated in different causal processes relates to the second primary concern. Since estimates of use rely upon numerous cognitive and affective processes (Grondin, 2010), the accuracy of estimated use measures may depend, at least in part, on factors that impact those cognitive and affective processes. In our study, we found that the accuracy of estimated iPhone and SM use was directly related to the level of respondent well-being and how much they used digital technology.
Higher levels of reported actual use were associated with greater inaccuracies of estimated use for both overall iPhone use and SM use. This finding supports the assertion that high-frequency behaviors are difficult to accurately recall and estimate (Schwarz & Oyserman, 2001). The high amounts of daily use observed in our sample suggest that iPhone and SM use are highly integrated into respondents’ daily lives. This high degree of integration, coupled with the typically sporadic nature of mobile device use, may place a high cognitive burden on respondents—particularly the heaviest of users—when tasked with estimating their average or weekly usage. As a result, respondents may resort to guessing or other error-prone response behaviors.
Importantly, we also found that levels of respondent well-being were associated with the amount of inaccuracy in estimated digital technology use. When analyzing the well-being variables in separate models, we found that higher levels of respondent depression were associated with greater inaccuracies in estimated overall use and estimated SM use. Higher levels of loneliness were associated with greater inaccuracies in estimated overall use, but not estimated SM use, while higher levels of life satisfaction were associated with greater inaccuracies in estimated SM use, but not overall use. In the combined lasso model, depression emerged as a psychosocial predictor of both overall use and SM use inaccuracies, while life satisfaction was a predictor only of SM use inaccuracies.
These findings have two important implications. First, when using retrospective estimates to examine the relationship between digital technology use and well-being, the accuracy of those estimates may be impacted by the respondent’s level of psychosocial well-being. This is consistent with the reasoning behind H5 discussed earlier, as the processes involved in reporting estimates of digital technology use are likely differentially impaired depending on the respondent’s level of depression, loneliness, and/or life satisfaction. For instance, respondents with higher levels of depression may experience impaired concentration and motivation, as well as decreased cognitive speed, which may inhibit their ability to accurately recall and estimate their usage.
Second, the differing results for overall versus SM use inaccuracies suggest that the level of inaccuracy in estimates of use, and the predictors of that inaccuracy, may differ across digital technology platforms. For instance, we found that respondents significantly overestimated their SM use, but not their overall iPhone use. Due to the social importance of and expectations involved with SM, it is possible that SM use is salient to respondents in a way that other iPhone use is not, thus leading to higher perceived use and inflated estimates of use. Alternatively, given the sporadic nature of SM use and the different ways that people define SM, it may be that time spent on SM is particularly difficult to estimate.
Furthermore, we found that the amount of inaccuracy in overall use estimates was predicted by depression, while the amount of inaccuracy in SM use estimates was predicted by depression and life satisfaction. This suggests that additional mechanisms may be at work when estimating SM use that are not at work when estimating overall use. The positive relationship between life satisfaction and inaccuracy of SM use estimates may indicate that aspects of greater well-being can adversely affect the accuracy of estimates of use. Those higher in life satisfaction may derive greater gratification and enjoyment while using SM, leading to a misestimation of the amount of time spent on SM (i.e., the “time flies when you’re having fun” effect; Hornik, 1984).
Limitations
This study should be considered within the context of the following limitations. First, our data were drawn from a convenience sample of MTurk workers with an iPhone. Both of these factors may have biased the data, because MTurk workers and/or iPhone users may differ from the general population. For instance, prior research found that MTurk participants were more likely to screen positive for depression than participants from a nationally representative sample (Walters, Christakis, & Wright, 2018). Furthermore, one study found that iPhone users were more likely than Android users to be female, younger, and have higher levels of emotionality (Shaw, Ellis, Kendrick, Ziegler, & Wiseman, 2016). Therefore, the extent to which these findings may replicate in non-MTurk and/or non-iPhone-based samples is unclear.
Second, we were unable to independently verify that the data provided by participants for their reported actual use matched exactly what was contained in the Screen Time application. Although we implemented screening and eligibility procedures, it is possible that some participants misreported or provided fictitious usage data. Since these are reported data, the accuracy of participants’ reported actual use may have been influenced by social desirability. For instance, some participants may have deflated their reported actual use due to embarrassment about how much they use their iPhone, or may have adjusted the reported actual use to be closer to their estimated use so as to appear more competent at estimating their usage. Therefore, these findings should be replicated in future research that uses a method for objectively recording actual usage data (e.g., screenshots or independent verification).
Third, the items we used to measure estimated weekly and daily use may have been impacted by question effects. Asking participants to provide a continuous estimate of their use may lead to more inaccuracies compared to asking participants to select a categorical response (e.g., “1–2 hours per day”). Research by Boase and Ling (2013) showed that people were generally more accurate when providing categorical rather than continuous estimates.
Fourth, participants who access SM platforms largely via a web browser rather than via the SM application itself may have appeared to overreport their total SM usage, because the Screen Time application counts time spent in web browsers toward the productivity category, regardless of the particular website that is being accessed, and therefore omits browser-based SM use from the SM total. However, the vast majority of mobile device users access SM platforms through the application rather than mobile browsers (“Global Digital Future,” 2018), so this issue may have been limited in our data. Still, future research using the Screen Time application to track SM usage should query participants about whether they access these platforms predominantly via web browsers or directly via SM applications.
Fifth, since Screen Time only tracks time spent when the device is actively open, factors relating to how long the iPhone remains open before going to the lock screen may have influenced the amount of misestimation. To illustrate, certain applications (e.g., navigation) usually stay open on the screen while in use and, in general, iPhone users can determine how long their screens stay open before going to the lock screen. One way to address this issue in future research is to have participants upload a screenshot of their most used applications, which are tracked by the Screen Time application. This would allow researchers to see whether certain applications that participants may not consider “active use” are inflating the total amount of screen time tracked by the application.
Finally, the three psychosocial well-being constructs we examined in this study—depression, loneliness, and life satisfaction—are not a complete representation of the construct of well-being. Thus, it is unclear how other aspects related to well-being, such as self-esteem, stress, and positive/negative affect, may be associated with the accuracy of retrospectively reported estimates of digital technology use. This is an important area for future research.
Conclusions
This study has important implications for the field of digital technology research. We found that retrospective estimates of iPhone/SM use are highly inaccurate, and that the degree of inaccuracy is directly related to the amount of time respondents use their phone and their level of well-being. Based on our findings, we suggest that retrospective estimates of use capture factors more closely related to the construct of perceived use rather than actual use. The differences between these constructs are crucial, as perceived use depends upon various cognitive and affective processes that actual use does not. Therefore, the antecedents, correlates, and consequences of perceived use are likely distinct from those of actual use.
However, most studies in this area rely upon retrospective estimates to measure digital technology use, implicitly assuming that they are satisfactorily capturing actual use. It is therefore unclear if the findings linking well-being to time spent on digital technology would replicate if reexamined using objective measures of use. Given the strong public and academic interest in this topic, as well as potential implications that findings in this area may have on policy and practice recommendations, researchers should carefully consider the measures they use to capture digital technology use and the potential impact this may have on the reliability of findings.
Footnotes
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by the Robert and Sally Schwartz Endowed Resource Fund, an internal University of Pittsburgh School of Social Work award. The funding source was not involved in the study design or the collection, analysis, or interpretation of data.
