Abstract
Research on digital inequality has found that aging adults are often at risk of digital exclusion. Understanding the validity of survey measures assessing Internet skills in this population is critical to providing the high-quality data needed for effective digital inclusion policy interventions. This cross-validation study examines the structural validity and measurement invariance (across age, gender, and education groups) of the Web-Use Skills scale (WUS), which is commonly used as a proxy measure of Internet skills. We tested the 14-item version of the WUS. The scale was translated into the Slovenian language and pretested with older Internet users. Data were collected from two independent samples of Internet users aged 50+ years (N1 = 259 and N2 = 256) drawn from an online opt-in panel in Slovenia. The examination of structural validity confirmed that the WUS adequately reflects the one-factor structure of the web-use skills construct, although in a shorter six-item form. Moreover, the analysis confirmed strict measurement invariance between the two samples and, at least, scalar invariance between age, gender, and education groups. The results support the applicability of WUS in cross-group comparisons of Internet skills in the population of aging Internet users and point to several opportunities for future work.
Introduction
A recent increase in Internet access and use in the general population is mainly due to the growth of users in older age groups, 1 but research indicates that the risk of digital exclusion in the aging population remains high. 2 Thus, it is vital to understand the validity of measures of Internet skills for this population because digital inclusion policies and interventions should not only be conceptually grounded but also empirically supported by valid, large-scale survey data.
In this sense, the Web-Use Skills scale (WUS)3–5 seems useful for self-assessment of Internet skills among Internet users in later life because of its wide use in prior research and favorable properties related to scale comprehension and interpretation (e.g., scale brevity, item complexity, response option format).
The WUS is defined as a proxy measure of Internet skills; it was designed to assess a person's self-reported understanding of various Internet-related terms. It assumes that the more knowledgeable a person is about various Internet-related technologies, the better the person's Internet skills will be. 3 Importantly, the WUS tackles only one aspect of Internet skills, namely web-use skills, a unidimensional construct that represents a user's ability to locate content on the web effectively and efficiently. 3
Hargittai 3 first proposed a 7-item scale (derived from a 38-item list) and confirmed its criterion validity by demonstrating the hypothesized correlations between responses to the WUS and an individual's observed online browsing behavior. She 4 extended the original scale to 27 items in 2009, while Hargittai and Hsieh 5 showed that 6-, 10-, and 15-item versions have adequate reliability and can be administered to the general population and—in adapted forms—to low-skilled Internet users.
Despite noting that longer scales have higher reliability, they suggested that scholars should always consider the trade-off between scale length and respondent burden when using the WUS. They also argued that due to the fast-changing nature of Internet tools and services, it is important to update such an instrument over time. 5
In later years, many Internet researchers have followed their suggestion. In fact, the importance of the WUS is illustrated by the large number of articles in Google Scholar citing Hargittai and Hsieh's adaptations of the WUS. 5 A September 2021 search retrieved 60 peer-reviewed articles that empirically used the WUS, with the number of items ranging from 4 to 38. Ninety-four different items formed the inventories in various combinations.
Most articles included 6 (14 articles) or 27 items (7 articles). In 40 articles, respondents answered on a scale from 1 (do not understand) to 5 (fully understand), but other response scales were also found (1–4, 0–11, and dichotomous). Ten articles applied the WUS to older adults.6–15
While a handful of studies addressed the measurement properties of the WUS, only one of the 60 identified articles 10 assessed its structural validity. Structural validity refers to the extent to which a scale's score adequately reflects the dimensionality of the construct being measured. 16 Although Hofer et al. 10 confirmed the one-factor structure of the 6-item WUS among older adults, they did not validate the 10- and 15-item scales developed by Hargittai and Hsieh. 5
In addition, none of the retrieved articles reported on the assessment of measurement invariance of the WUS in the aging population. Measurement invariance assumes that scale items have the same meaning and are used in the same way by different respondent groups. 17 For instance, invariance is required when researchers want to obtain unbiased comparisons of correlations between Internet skills and covariates (e.g., Internet outcomes) for middle-aged and older adults or when their aim is to compare latent means of Internet skills between age groups. 18
Hence, the original contribution of this study is not only in testing the structural validity of the longer version of the WUS using a cross-validation design (two samples) but also in validation of its measurement invariance on the population of aging adults.
The assessments conducted here are important for two reasons. First, an assessment of structural validity is needed to determine whether the scale items truly reflect the dimensionality of the construct. Second, an assessment of measurement invariance is warranted because the aging population is highly heterogeneous7,9,19–23 and generally has lower levels of Internet skills.6,9,24,25
Indeed, there is a large difference between the lifestyles of those aged 50–64 years (largely employed) and those aged 65+ years (largely retired), leading to an increasing risk of social and digital exclusion in the older group.21,26–28 Differences in Internet skills also relate to gender, with women reporting lower levels of Internet skills,6,29–32 and education; here, Internet skills are shown to increase with the level of education.1,30,31
Materials and Methods
Data collection and procedure
As in many recent studies,10,31,33,34 respondents were recruited from an online opt-in panel, operated by a leading professional market research agency, using a quota sampling design based on region, education, and age and gender combined. The target population was Slovenian Internet users aged 50+ years. According to the rules of the University of Ljubljana and the funding agency, the methods and subjects involved in this study were classified under research categories that are considered exempt from the Ethics Committee's oversight.
For the purpose of cross-validation, in July 2021, two independent samples were obtained as Sample 1 (N1 = 259, participation rate: 65 percent) and Sample 2 (N2 = 256, participation rate: 62 percent). Demographic characteristics are shown in Table 1. The results of chi-square tests showed no significant differences in the demographic structure of the two samples.
Table 1. Characteristics of Data Samples
The chi-square test included only units with no missing values per variable. The share of missing values per variable was very small (≤2%).
M1 = 64.3 and M2 = 63.6.
Educational level reported according to the ISCED 2011. 38
ISCED, International Standard Classification of Education.
The Web-Use Skills scale
We tested a 14-item WUS (Table 2), with 10 items taken from Hargittai and Hsieh's scale for the general population 5 and 4 items added from previous adaptations of the WUS in the target population to improve its content validity.4–6,9,35,36 Notably, Hashtag was included because it is commonly used in social media, 36 while Operating System, Cookie, and Application embrace the contemporary experience of everyday (mobile) Internet use. 6
Table 2. Items Included in Validation of the Web-Use Skills Scale
Because the scale had never been used in Slovenia before, we prepared a Slovenian translation of the instrument following the translation, review, adjudication, pretesting, and documentation procedure. 37 The translated scale was pretested on three older Internet users with the use of cognitive interviews.
In accordance with pretest results, we modified the translation of several terms and decided to keep the English words alongside Slovenian translations for five items (i.e., Cache, Tagging, Malware, Phishing, and Hashtag) because these original English terms are widely used in Slovenian everyday communication.
Respondents answered the following question: “Listed below are some items related to the Internet. How well would you say you understand each item, on a scale from 1 to 5 (where 1 means ‘I don't understand’ and 5 means ‘I fully understand’)?”
Analytic strategy
The analyses consisted of several sequential steps. First, the structural validity of the one-factor 14-item WUS was assessed through confirmatory factor analysis (CFA) using data from Sample 1. Because of the model's poor fit, we repeated the CFA with the 10 items from the study by Hargittai and Hsieh. 5 Since this model also fit poorly, we ran an exploratory factor analysis (EFA) on all 14 items. The EFA suggested a meaningful one-factor solution with six items. We cross-validated the six-item one-factor model with CFA using data from Sample 2.
Since the model had an adequate fit, its measurement invariance was assessed between the two samples. After confirming strict invariance between them, we merged both samples and assessed the measurement invariance of the six-item WUS across age (50–64 vs. 65+ years), gender (male vs. female), and education (secondary school or lower vs. higher education) groups. As (at least) scalar invariance was established in all cases, we compared latent mean differences between group members.
For EFA, we used maximum likelihood (ML) estimation with oblimin rotation. 17 The number of factors to retain was determined based on the scree plot and parallel analysis. 17 We retained items that had a high standardized loading on only one factor (>0.50) and communality >0.50. 17 Since we found deviations from normality for some WUS items, we used the robust maximum likelihood (MLR) estimator in the CFA.39,40 Standardized loadings had to be >0.50. 17
The model fit was assessed with the Satorra–Bentler scaled chi-square test statistic (SBχ²), which was interpreted with caution because of its sensitivity to sample size. 17 We also examined the root mean square error of approximation (RMSEA), where a value <0.08 indicates a reasonable fit and <0.10 indicates acceptable fit.41,42 The values of the comparative fit index (CFI) >0.95 and standardized root mean square residual (SRMR) <0.08 were considered acceptable.17,41
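As an illustration only, the cutoffs above can be written as a small check. The sketch below is in Python rather than the R code used for the analyses, the function names are hypothetical, and combining the three indices into a single yes/no rule is an assumption made here for clarity; the fit values are the ones reported in the Results section.

```python
def rmsea_label(rmsea):
    """Label an RMSEA value using the two cutoffs quoted in the text."""
    if rmsea < 0.08:
        return "reasonable"
    if rmsea < 0.10:
        return "acceptable"
    return "poor"


def fit_is_acceptable(rmsea, srmr, cfi):
    """Assumed joint rule: RMSEA < 0.10, SRMR < 0.08, and CFI > 0.95."""
    return rmsea < 0.10 and srmr < 0.08 and cfi > 0.95


# Fit indices as reported in the Results section
wus14_sample1 = {"rmsea": 0.155, "srmr": 0.086, "cfi": 0.816}
wus6_sample2 = {"rmsea": 0.085, "srmr": 0.028, "cfi": 0.981}

print(fit_is_acceptable(**wus14_sample1))  # False
print(fit_is_acceptable(**wus6_sample2))   # True
print(rmsea_label(wus6_sample2["rmsea"]))  # acceptable
```

The SBχ² statistic is left out of the rule on purpose, since the text treats it with caution because of its sensitivity to sample size.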
Composite reliability (CR; also known as McDonald's omega) and Cronbach's alpha (α) were used to assess internal consistency, with values >0.70 defined as acceptable. 17 Convergent validity was evaluated by the average variance extracted (AVE), with AVE >0.50 considered as acceptable. 17
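For readers less familiar with these indices, the following sketch shows how CR and AVE are commonly computed from standardized loadings in a one-factor model with uncorrelated residuals. It is written in Python for illustration (the study itself used R), and the loadings are hypothetical values chosen to fall within the reported range, not the estimates from Table 3.

```python
def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of residual variances).

    Assumes standardized loadings, so each residual variance is 1 - loading^2.
    """
    total = sum(loadings)
    residual = sum(1.0 - l * l for l in loadings)
    return total * total / (total * total + residual)


def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)


# Hypothetical standardized loadings for six items (NOT the study's estimates)
hypothetical_loadings = [0.60, 0.75, 0.70, 0.90, 0.85, 0.80]

cr = composite_reliability(hypothetical_loadings)
ave = average_variance_extracted(hypothetical_loadings)

# Apply the thresholds stated in the text
print(cr > 0.70, ave > 0.50)
```

With these made-up loadings both thresholds are met; a single weak item lowers AVE faster than CR, which is why the two indices are reported together.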
Measurement invariance was assessed with a multigroup CFA using a series of increasingly stringent models in which between-group restrictions constrain different elements of the measurement model. 17 We tested configural (equivalence of model form), metric (equivalence of factor loadings), scalar (equivalence of item intercepts), and strict (equivalence of items' residuals) invariance. 43 In such testing, if the fit of the more constrained model is not worse than the fit of the less constrained model, then measurement invariance is supported.
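The four levels can be summarized as nested equality constraints. The notation below is the conventional multigroup CFA notation and is introduced here only for illustration; it does not appear in the original analysis.

```latex
% Measurement model for respondent i in group g (one factor, p items)
\mathbf{x}_{ig} = \boldsymbol{\tau}_g + \boldsymbol{\lambda}_g \, \xi_{ig} + \boldsymbol{\varepsilon}_{ig},
\qquad \operatorname{Cov}(\boldsymbol{\varepsilon}_{ig}) = \boldsymbol{\Theta}_g

% Nested invariance levels, each adding constraints to the previous one
\begin{aligned}
\text{configural:} &\quad \text{same one-factor pattern in every group} \\
\text{metric:}     &\quad \boldsymbol{\lambda}_g = \boldsymbol{\lambda} \\
\text{scalar:}     &\quad \boldsymbol{\lambda}_g = \boldsymbol{\lambda},\ \boldsymbol{\tau}_g = \boldsymbol{\tau} \\
\text{strict:}     &\quad \boldsymbol{\lambda}_g = \boldsymbol{\lambda},\ \boldsymbol{\tau}_g = \boldsymbol{\tau},\ \boldsymbol{\Theta}_g = \boldsymbol{\Theta}
\end{aligned}
```

Here τ denotes item intercepts, λ factor loadings, ξ the latent web-use skills score, and Θ the residual covariance matrix. Under scalar invariance, differences in observed item means between groups can be attributed to differences in latent means, which is why that level is the prerequisite for latent mean comparisons.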
To assess the change in model fit, we conducted the SBχ² difference test (ΔSBχ²). However, because this test is sensitive to sample size and violations of the normality assumption, 44 we relied primarily on the change in CFI, taking a decrease smaller than 0.010 (ΔCFI > −0.010) as support for invariance, supplemented by an increase in SRMR of <0.030 for metric and <0.010 for scalar and strict invariance. 43 All analyses were conducted in R 45 using the psych 46 and lavaan 47 packages.
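The decision rule for comparing two nested models can be sketched as follows. This is an illustrative Python helper, not the authors' R code; the function name, the dictionary layout, the rounding to three decimals, and the handling of values exactly at a threshold are all assumptions, and the fit values are hypothetical.

```python
# Maximum tolerated increase in SRMR at each invariance step (from the text)
SRMR_LIMIT = {"metric": 0.030, "scalar": 0.010, "strict": 0.010}
# Maximum tolerated decrease in CFI at any step (from the text)
CFI_DROP_LIMIT = 0.010


def compare_nested(level, less_constrained, more_constrained):
    """Compare a more constrained model with the less constrained one.

    Both models are dicts with 'cfi' and 'srmr'. Deltas are rounded to
    three decimals because fit indices are reported at that precision.
    """
    delta_cfi = round(more_constrained["cfi"] - less_constrained["cfi"], 3)
    delta_srmr = round(more_constrained["srmr"] - less_constrained["srmr"], 3)
    return {
        "delta_cfi": delta_cfi,
        "delta_srmr": delta_srmr,
        "cfi_supports": delta_cfi > -CFI_DROP_LIMIT,    # primary criterion
        "srmr_supports": delta_srmr < SRMR_LIMIT[level],  # supplementary
    }


# Hypothetical fit indices for three nested models (NOT the study's results)
configural = {"cfi": 0.985, "srmr": 0.025}
metric = {"cfi": 0.981, "srmr": 0.040}
scalar = {"cfi": 0.960, "srmr": 0.045}

metric_step = compare_nested("metric", configural, metric)
scalar_step = compare_nested("scalar", metric, scalar)

print(metric_step["cfi_supports"])  # True: CFI drops by only 0.004
print(scalar_step["cfi_supports"])  # False: CFI drops by 0.021
```

The two criteria are reported separately rather than combined, because the text treats ΔCFI as the primary indicator and ΔSRMR as a supplement.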
Results
Sample 1
Before conducting the CFA, we analyzed correlations among the items in Sample 1. They ranged from 0.331 to 0.784 and were all significant at p < 0.05 (Appendix Table A1). Results of the CFA showed that all items had acceptable standardized factor loadings (0.604–0.850). However, the structural validity of the model was not confirmed because of its poor fit [SBχ²(77) = 482.214, RMSEA = 0.155, SRMR = 0.086, and CFI = 0.816].
The model–data fit of the 10-item WUS 5 was not acceptable either [SBχ²(35) = 237.397, RMSEA = 0.167, SRMR = 0.071, and CFI = 0.841]. Thus, we performed an EFA on the 14 items. The scree plot suggested a one-factor solution, whereas the parallel analysis indicated a two-factor solution.
We tested a two-factor model, and results of the EFA showed that six items (i.e., PDF, Computer Virus, Blog, Application, Cookie, and Operating System) loaded highly on the first factor (standardized loadings between 0.615 and 0.911), whereas the remaining items (i.e., Cache, Tagging, Wiki, Malware, Phishing, and Hashtag) loaded highly on the second factor (standardized loadings between 0.558 and 0.895), except for Advanced Search and JPG, which cross-loaded (Table 3).
Table 3. Results of Exploratory and Confirmatory Factor Analyses on Samples 1 and 2
N1 = 259.
N2 = 256.
Items in bold are included in the proposed one-factor 6-item Web-Use Skills scale.
CFA, confirmatory factor analysis; EFA, exploratory factor analysis.
We found that items that load on the first factor relate to general computer and Internet use, while items loading highly on the second factor relate to specific and varied Internet domains.
In the next analyses, we focused on six items of the first factor for three reasons. First, web-use skills have been conceptually and empirically defined as a unidimensional construct.3–5 Second, it has been suggested that context-specific items referring to specific digital platforms or online activities should be avoided when measuring Internet skills. In fact, Hargittai and Hsieh 5 note that some individuals might be more familiar with certain terms not because they possess higher skills, but because these terms are related to their specific online interests or activities. In addition, van Deursen et al. 31 underscored that to ensure applicability of a set of items for a longer period of time, items that do not depend on currently popular Internet activities or platforms should be used in Internet skills survey inventories.
Third, the meaning of the second factor could not be unambiguously interpreted since the items with the highest standardized loading on this factor (>0.85; Cache and Tagging) have little common content. DeVellis 48 recommends that such factors should not be considered as indicators of a latent variable. When we ran the EFA with only six items of the first factor, the results were satisfactory. The items had standardized loadings between 0.713 and 0.883, with all items having communalities >0.50 (Table 3) and α = 0.923.
Sample 2
For Sample 2, we first analyzed correlations between the six items. They ranged from 0.409 to 0.753 and were all significant at p < 0.05 (Appendix Table A2). Next, the cross-validation based on the CFA showed that the six-item one-factor structure was acceptable. The standardized factor loadings ranged from 0.597 to 0.902, with SBχ²(9) = 20.402, RMSEA = 0.085, SRMR = 0.028, and CFI = 0.981 (Table 3). Internal consistency and convergent validity were also adequate, with the following values: AVE = 0.599, CR = 0.898, and α = 0.878.
Measurement invariance
To assess the measurement invariance of the six-item WUS between Samples 1 and 2, we first fitted the model to each sample separately. The results indicated acceptable fit for both samples [Sample 1: SBχ²(9) = 15.815, RMSEA = 0.071, SRMR = 0.021, and CFI = 0.990; and Sample 2: SBχ²(9) = 20.402, RMSEA = 0.085, SRMR = 0.028, and CFI = 0.981]. Furthermore, we tested for configural invariance, and the model exhibited an acceptable fit (Table 4).
Table 4. Results of Measurement Invariance Testing of the Six-Item Web-Use Skills Scale in Samples 1 and 2
Note: Ntotal = 515 (N1 = 259 and N2 = 256).
CFI, comparative fit index; CI, confidence interval; RMSEA, root mean square error of approximation; SBχ², Satorra–Bentler scaled chi-square test statistic; SRMR, standardized root mean square residual.
We then introduced constraints across samples to test for metric, scalar, and strict invariance. At each step, we evaluated the decrease in fit. While the ΔSBχ² was significant at p < 0.05 in all cases, the changes in CFI and SRMR were small enough to provide support for invariance: the decrease in CFI was smaller than 0.01 for all models, and SRMR increased by 0.031 in the case of the metric model, which is just at the threshold, whereas the increase for the scalar and strict models was <0.01. Thus, we concluded that the six-item WUS demonstrated full, strict measurement invariance between samples.
After confirming measurement invariance across the two samples, we could merge them to have a larger sample for analysis of measurement invariance across selected sociodemographic characteristics. Applying the same procedure to the merged dataset (N = 515), the results showed full scalar measurement invariance across gender (Table 5) and age (Table 6), as well as full strict invariance across educational groups (Table 7).
Table 5. Results of Measurement Invariance Testing of the Six-Item Web-Use Skills Scale by Gender (Male vs. Female)
Note: Ntotal = 515 (Nmale = 246 and Nfemale = 269).
Table 6. Results of Measurement Invariance Testing of the Six-Item Web-Use Skills Scale by Age Group (50–64 and 65+ years)
Note: Ntotal = 515 (N50–64 = 243 and N65+ = 272).
Table 7. Results of Measurement Invariance Testing of the Six-Item Web-Use Skills Scale by Education Level Groups (Secondary School or Lower vs. Higher Education)
Note: Ntotal = 513 (NupToSecEdu = 286 and NhighEdu = 227).
As scalar invariance was confirmed in all cases, we compared the WUS latent mean values across age, gender, and educational groups. We expected females, older individuals, and less educated individuals to have lower Internet skills than males, younger individuals, and more educated individuals.32,49 If these differences are confirmed, this provides additional evidence of scale validity (i.e., known-group validity). 50
We constrained the latent means to be zero for females and older (65+ years) and less educated individuals (secondary school or lower), while freely estimating them for males, younger individuals (50–64 years), and individuals with higher education. The analysis showed that the WUS latent mean values were significantly higher for males and younger individuals (p ≤ 0.010 in both cases).
However, the difference was not significant in the case of education (p = 0.697), although more educated individuals had slightly higher WUS scores.
Discussion
This is the first study to comprehensively investigate the structural validity and measurement invariance of the WUS in a population of aging Internet users. Results show that the WUS—although in a short six-item form—is a valid and reliable proxy measure of the ability of Internet users, aged 50+ years, to find information efficiently and effectively on the web. Yet, while our findings provide convincing evidence for the future use of the six-item WUS in this population group, they also raise several conceptual and methodological issues for discussion.
First, we could not confirm the one-factor structure of either the 14-item or the original 10-item WUS. 5 On the one hand, this could be due to rapid changes in digital technologies, in that some Internet-related terms from the original scale could be outdated. 49 On the other hand, the 14- and 10-item scales included terms related to various computer and Internet domains, 51 but there is no guarantee that a person knowledgeable in one domain will be knowledgeable in another. 5
Hence, our finding that only the six-item scale demonstrates adequate structural validity is consistent with Hargittai and Hsieh's definition of Internet skills and their goal of proposing scales that have fewer components, but still optimally capture differences among respondents' web-use skills.3,5 Moreover, this is consistent with current research 20 showing that Internet proficiency decreases with age, which might explain why the general nature of the six-item WUS fits well in this population group.
In fact, Hargittai and Hsieh 5 proposed (shorter) scales with more general items for low-skilled groups of Internet users. However, our EFA results highlight the possibility that the concept of web-use skills could be extended to capture not only general but also more conceptually distinct and domain-specific dimensions of Internet use in later life.
Second, compared with the original 10-item WUS, 5 our proposed 6-item scale includes 3 of the originally proposed items (PDF, Computer Virus, and Blog), whereas 3 items come from adapted versions of the WUS (Table 2). This supports our decision to add items to the scale, confirming prior work arguing that Hargittai and Hsieh's inventory needs to be adapted after almost 10 years, in the sense of what comprises general Internet skills.4,5,35,52,53
The newly added items relate to basic understanding of computer software (i.e., Operating System), important contemporary web privacy issues (i.e., Cookie), and widespread use of mobile Internet (i.e., Application). They are also general enough not to be platform specific, which is important because Internet skills scales should be cognizant of large-scale changes in Internet technology while avoiding context-specific items related to short-lived services and/or transitory online activities. 31
In this sense, the proposed set of items provides a good balance between generalizability and temporality, which are needed in general Internet skills scales. 5
Third, the concise scale used here is particularly welcome in general social survey research. Measures of Internet skills are included as covariates in almost every study of digital engagement of any population. 49 A shorter inventory is also appropriate for use in large-scale studies5,49 because it reduces the cognitive load, which is critical in research with aging adults. It may also help reduce measurement bias related to a high response burden. 5 This is likely to be the case for self-administered questionnaires where respondents have very limited or no external memory support. 54
Fourth, given the heterogeneity of the aging population,9,19–23,27,28 assessment of measurement invariance is essential. Confirmation of (at least) scalar measurement invariance across age, gender, and educational groups suggests that the six-item WUS measures the same concept in the same way within these groups and that users with the same level of the latent trait have the same scores on the manifest (observed) variables.
These results also imply that both comparisons of structural relationships between the WUS and other constructs and comparisons of the latent means across age, gender, and education groups are valid and reliable, provided that the WUS is restricted to items related to proficiency in general Internet use. In fact, the WUS latent mean comparisons confirmed the assumption that older users (65+ years) and females have lower WUS scores.
The nonsignificant difference with respect to education could be a direct consequence of highly understood items (i.e., items with high averages) and might suggest that both educational groups share similar experiences with the fairly general Internet use related to the terms in the six-item list.
Limitations and future work
While a quota sample can be considered representative of aging Internet users in Slovenia in terms of sociodemographic characteristics, future research is invited to use probability-based sampling designs. This could further eliminate potential biases related to the fact that participants in opt-in Internet panels may have higher Internet proficiency when compared with the general population. 55 Accordingly, their mean scores on the WUS may be higher than those of the general population of aging Internet users.
In addition, further examination of the multidimensionality of the WUS is warranted. Future research could use more or even all of the original WUS items 5 to examine to what extent they fit multiple factor or bifactor models. 56 As a next step, research focusing on measurement invariance of the WUS could highlight other factors of digital inequality, such as occupation and income, 2 and investigate whether the WUS is also invariant in cross-country settings. 30 Moreover, it may be interesting to test the convergent validity of the WUS with other survey measures of Internet skills. In particular, this scale could be compared with those based on multidimensional measurement models to assess which theoretical dimensions of Internet skills besides web-use skills are captured by the WUS.
Footnotes
Author Disclosure Statement
No competing interests exist.
Funding Information
This research received public financial support from research grants (nos. L5-9337, Z5-8234, J5-2558, and P5-0399) administered through the Slovenian Research Agency and a Young Researcher Fellowship, cofunded by the Slovenian Research Agency from the national budget, awarded to the fourth author.
Appendix
Appendix Table A2. Descriptive Statistics and Correlations of Web-Use Skills Scale Items in Sample 2
| Item | M | SD | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|---|
| PDF (1) | 3.91 | 1.311 | — | |||||
| Computer Virus (2) | 4.33 | 0.808 | 0.497 | — | ||||
| Blog (3) | 4.11 | 0.966 | 0.447 | 0.538 | — | |||
| Application (4) | 4.30 | 0.831 | 0.506 | 0.686 | 0.638 | — | ||
| Cookie (5) | 4.33 | 0.727 | 0.409 | 0.628 | 0.542 | 0.753 | — | |
| Operating System (6) | 4.18 | 1.018 | 0.579 | 0.656 | 0.514 | 0.744 | 0.680 | — |
Note: N2 = 256. All correlations are significant at p < 0.05.
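As a consistency check, Cronbach's alpha for the six items can be recomputed from the standard deviations and correlations in the table above. The sketch below is in Python for illustration (the analyses themselves were run in R), and the function name and data layout are assumptions. It rebuilds each covariance as r × SD_i × SD_j and applies the usual variance-based formula for alpha.

```python
# Standard deviations of items 1..6, copied from the table above
sd = [1.311, 0.808, 0.966, 0.831, 0.727, 1.018]

# Lower triangle of the correlation matrix; row n holds the correlations
# of item n + 2 with all preceding items
lower = [
    [0.497],
    [0.447, 0.538],
    [0.506, 0.686, 0.638],
    [0.409, 0.628, 0.542, 0.753],
    [0.579, 0.656, 0.514, 0.744, 0.680],
]


def cronbach_alpha(sd, lower):
    """Cronbach's alpha from item SDs and lower-triangle correlations."""
    k = len(sd)
    item_var = sum(s * s for s in sd)
    cov_sum = 0.0
    for i, row in enumerate(lower, start=1):
        for j, r in enumerate(row):
            cov_sum += r * sd[i] * sd[j]  # covariance of items i and j
    total_var = item_var + 2.0 * cov_sum  # variance of the sum score
    return k / (k - 1) * (1.0 - item_var / total_var)


alpha = cronbach_alpha(sd, lower)
print(round(alpha, 3))
```

The result comes out at roughly 0.88, in line with the α = 0.878 reported for Sample 2; small deviations are to be expected because the table entries are rounded to three decimals.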
