Measurement of Internet Addiction: An Item Response Analysis Approach

Abstract

Two widely used scales of Internet addiction (IA), the Internet Addiction Test (IAT) and the Chen Internet Addiction Scale (CIAS), were compared, and a new scale of IA was assembled from their items with improved reliability in terms of classification consistency. A total of 467 Chinese college students participated in the study. Items were calibrated using Muraki's Generalized Partial Credit Model. Most items had higher item information at medium levels of addiction, but much lower item information at the two ends of the latent trait continuum. The average item information of the CIAS was significantly larger than that of the IAT at most latent trait levels. A new scale assembled using the cutoff points of the IAT had a larger classification consistency than the original IAT. It was shown that the classification consistency of the IA measurement could be improved by selecting items to optimize test information around cutoff points. Implications for test and item development of IA were discussed.

Introduction

The Internet has become an indispensable part of civilized life with an ever-increasing number of users. Reflecting that trend, Mainland China, home to nearly one fifth of the world population, has experienced rapid growth of Internet usage in recent years. Data released by the China Internet Network Information Center indicated that in June 2011 there were 480 million Internet users in Mainland China; the bulk of them were youth between 20 and 29 years old.¹ The misuse of the Internet has become a serious issue for a considerable number of people, and college students are especially at risk given their ease of access and flexible time schedules.^2,3

The term Internet addiction (IA), originally coined by Goldberg,⁴ was adopted by many researchers, including Young, to refer to problematic Internet use associated with significant social, psychological, and occupational impairment.⁵ Alternative terms include pathological Internet use,⁶ Internet pathological use,⁷ problematic internet use,⁸ Internet dependency,⁹ and Internet behavior dependence.¹⁰ This article uses the term IA because of its prevalence and acceptance.

Measurement is central to IA research. One of the first instruments is a short inventory with eight questions adapted from the criteria for pathological gambling in the Diagnostic and Statistical Manual of Mental Disorders, 4th edition (DSM-IV), known as the 8-item Diagnostic Questionnaire of Internet Addiction.¹¹ After the questionnaire was criticized for its lack of reliability,¹² Young expanded it to a 20-item Internet Addiction Test (IAT).¹³ Respondents are typically classified into two or three categories based on their sum scores. For example, for the 20-item IAT, Young¹³ suggested two cutoff scores at 50% and 80% of the total score. Taiwanese scholars developed an alternative scale, the Chen Internet Addiction Scale (CIAS), which has been widely used in China.¹⁴ The two scales share three characteristics: partial-credit scoring, a self-report questionnaire format, and classification based on the total score. They were chosen in this study because of their reported good internal consistency and concurrent validity as well as their popularity.^15–17

The psychometric analysis of IA scales has some problems. First, the standard statistics used to compare measurement instruments, including correlation coefficients and internal consistency, offer little information about why one scale outperforms another.^15–18 Second, most of the current measurement instruments were developed years ago; given the rapid transformation of the Internet, the psychometric properties of IA scales need to be reinvestigated. Last, but not least, the scales of IA are often used as criterion-referenced tests since diagnosis is of interest, but they are generally treated as norm-referenced tests in psychometric analysis. For one thing, the items in a scale usually have no relationship with its cutoff scores, which are typically set at 50%, 80%, or 95% of the total score. For another, there has been a lack of reliability evidence concerning the classification of IA scales.

This study is an attempt to address the above problems using item response theory (IRT).^19,20 The main purpose is to improve the measurement of IA using calibrated items selected from IAT and CIAS. Another goal is to compare the performance of the two scales.

Methods

Participants

An on-line survey was constructed. Internet users were recruited through popular social networking sites. The 467 participants (293 females, 174 males; average age M=23.31 years, SD=3.43) who volunteered to complete the questionnaire received a chance to participate in a small lottery as a gesture of appreciation.

Instruments

This study used a 46-item questionnaire, which included 20 items from Young's IAT and 26 items from CIAS. Participants were asked to rate each item on a 5-point Likert scale (Not at all, Rarely, Occasionally, Often, Always). Upon scrutiny of the items in each instrument, the wording of some particular questions was revised.

Analysis procedures

In this study, the focus lies on the overall notion of IA. Although some instruments were developed with several subscales, the overall concept of IA can be seen as a second-order factor. Besides, the dimensionality of scales is generally ignored when classifying respondents according to their sum scores.^15,21 Therefore, this study applied a unidimensional IRT model to each of the scales.^22,23

Analysis of dimensionality and scalability. Before item response analysis, Mokken tests were conducted for IAT and CIAS, respectively, to examine the dimensionality and scalability using the mokken package in R 2.13.0.^24,25

IRT Analysis. Muraki's Generalized Partial Credit (GPC) Model was chosen as the IRT model¹⁹ and fitted in the computer program IRTPRO beta.²⁶ There are three types of item parameters: a_i is the discrimination parameter of item i, which indicates how well an item uncovers the examinees' ability or trait; b_i is the location parameter of item i, which indicates the general difficulty level of the item; d_h is the step difficulty parameter of item i for a category score of h. Besides item parameters and latent trait scores (θ_j for respondent j), we could also obtain an item information curve (a function of θ) for each item; the sum of the item information curves is the test information curve.¹⁹ Some items were deleted from each scale because of local dependence (LD). The criterion was the standardized LD χ² statistic, which suggests strong residual dependence between a pair of items when its value is extremely large (a value of 10 is often used as a threshold).^27,28 Each time, one item of the pair with the largest standardized LD χ² value was deleted. S-χ² item-fit statistics were also reported; significant results (p<0.01) indicate poor model-data fit.
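As a minimal sketch of the model described above (not the IRTPRO implementation), the category probabilities and item information of a single GPC item could be computed as follows. It assumes the parameterization given in the note to Table 1, with category logit a[k(θ − b) + Σd] and d₁ = 0, and relies on the standard result that GPC item information equals a² times the variance of the category score at θ. The function names are illustrative.

```python
import math

def gpc_category_probs(theta, a, b, d):
    """Category probabilities for one GPC item.

    d lists the category parameters, with d[0] = 0 (d_1 in the note to Table 1).
    """
    logits = []
    running = 0.0
    for k, d_k in enumerate(d, start=1):       # category scores 1..m
        running += d_k                          # cumulative sum of d up to k
        logits.append(a * (k * (theta - b) + running))
    top = max(logits)                           # subtract max for numerical stability
    weights = [math.exp(z - top) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]

def gpc_item_information(theta, a, b, d):
    """Item information: a^2 times the variance of the category score at theta."""
    probs = gpc_category_probs(theta, a, b, d)
    scores = range(1, len(probs) + 1)
    mean = sum(k * p for k, p in zip(scores, probs))
    return a ** 2 * sum((k - mean) ** 2 * p for k, p in zip(scores, probs))

# Item C10 from Table 1: a = 1.75, b = 1.13, d = (0, 0.58, 0.16, -0.75)
c10 = dict(a=1.75, b=1.13, d=[0.0, 0.58, 0.16, -0.75])
for theta in (-1.0, 0.0, 1.6):
    print(theta, round(gpc_item_information(theta, **c10), 3))
```

Summing `gpc_item_information` over all items of a scale at a given θ gives the test information at that θ.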

Test Assembly. The calibrated items from CIAS and IAT form an item pool for IA. Twenty items were selected from the item pool to assemble a new scale. The diagnostic criteria proposed by Young were used.¹³ Ideally, the assembled test would have a test information curve like Test 3 in Figure 1.

FIG. 1.

Schematic plot illustrating the ideal test information curve of a criterion-referenced test in comparison with other tests.

The assembling process involves three steps²⁰: (1) identify the cutoff points of raw scores, which are 50% and 80% of the total score. After deleting three items and collapsing categories, the total score of the revised IAT in this study is 68, so the two cutoff scores are 34 and 54; (2) use maximum likelihood to calculate the latent trait scores corresponding to the raw cutoff scores with the item parameters of the IRT model. In this study, the corresponding latent cutoff scores are 0 and 1.60; (3) select items from the item pool that optimize the test information at the latent cutoff scores. First, the ten items that provide the most information around the latent cutoff score of 1.60 were selected; then, ten other items were selected for the other cutoff score with the same criterion.
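The selection logic of these steps can be sketched as follows. This is an illustrative greedy selection under stated assumptions, not the authors' exact procedure: the latent cutoffs 0 and 1.60 are taken as given from the text, information is evaluated exactly at each cutoff (the text says "around" it), the raw cutoff of 54 is obtained by truncating 54.4, the pool holds only six Table 1 items, and two items are picked per cutoff instead of ten. All function and variable names are hypothetical.

```python
import math

def gpc_item_information(theta, a, b, d):
    """GPC item information: a^2 times the variance of the category score."""
    logits, running = [], 0.0
    for k, d_k in enumerate(d, start=1):
        running += d_k
        logits.append(a * (k * (theta - b) + running))
    top = max(logits)
    weights = [math.exp(z - top) for z in logits]
    total = sum(weights)
    probs = [w / total for w in weights]
    mean = sum(k * p for k, p in enumerate(probs, start=1))
    return a ** 2 * sum((k - mean) ** 2 * p for k, p in enumerate(probs, start=1))

# Step 1: raw cutoffs at 50% and 80% of the total score (17 items x 4 points).
total_score = 17 * 4
raw_cutoffs = [int(0.5 * total_score), int(0.8 * total_score)]  # truncation assumed

# Step 2 (taken from the text, not recomputed here): latent cutoffs.
latent_cutoffs = {"lower": 0.0, "upper": 1.60}

# Toy pool: (a, b, d) for six items copied from Table 1, with d_1 = 0.
pool = {
    "Y2":  (0.90, -0.81, [0.0, 1.21, 0.00, -1.21]),
    "Y15": (1.52, 1.72, [0.0, 0.35, 0.29, -0.65]),
    "C10": (1.75, 1.13, [0.0, 0.58, 0.16, -0.75]),
    "C18": (1.23, 0.15, [0.0, 0.59, 0.08, -0.66]),
    "C19": (1.29, 0.34, [0.0, 0.49, -0.16, -0.33]),
    "C24": (1.69, 1.17, [0.0, 0.63, 0.15, -0.79]),
}

def select_items(pool, theta_cut, n_items, exclude=()):
    """Step 3: pick the n_items most informative items at theta_cut."""
    candidates = [name for name in pool if name not in exclude]
    candidates.sort(
        key=lambda name: gpc_item_information(theta_cut, *pool[name]),
        reverse=True,
    )
    return candidates[:n_items]

upper_items = select_items(pool, latent_cutoffs["upper"], n_items=2)
lower_items = select_items(pool, latent_cutoffs["lower"], n_items=2,
                           exclude=upper_items)
print(raw_cutoffs, upper_items, lower_items)
```

Selecting for the upper cutoff first and excluding those items from the second pass mirrors the order described in the text; reversing the order could yield a different scale.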

Classification consistency. Classification consistency, which reflects the reliability of criterion-referenced tests, was compared between the new scale and the IAT.²⁹ It was calculated with IRT-CLASS v2.0 for PC.²⁹
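To make the idea concrete, the sketch below computes a classification consistency index and κ in the spirit of the IRT-based approach cited above. It is not IRT-CLASS itself, and several choices are assumptions: the latent trait is treated as standard normal and integrated over equally spaced nodes, a sum score at or above a cutoff moves a respondent up one class, and the three items and the cutoffs in the example are toy values.

```python
import math

def gpc_category_probs(theta, a, b, d):
    """GPC category probabilities with logit a[k(theta - b) + cumulative d]."""
    logits, running = [], 0.0
    for k, d_k in enumerate(d, start=1):
        running += d_k
        logits.append(a * (k * (theta - b) + running))
    top = max(logits)
    weights = [math.exp(z - top) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]

def sum_score_distribution(theta, items):
    """P(sum score = s | theta), built by convolving item score distributions."""
    dist = [1.0]                                   # zero items: score 0 for sure
    for a, b, d in items:
        probs = [0.0] + gpc_category_probs(theta, a, b, d)  # index = score 1..m
        new = [0.0] * (len(dist) + len(probs) - 1)
        for s, p_s in enumerate(dist):
            for k, p_k in enumerate(probs):
                new[s + k] += p_s * p_k
        dist = new
    return dist

def class_probs(theta, items, cutoffs):
    """Probability of each class; score >= cutoff moves up a class (assumed)."""
    out = [0.0] * (len(cutoffs) + 1)
    for score, p in enumerate(sum_score_distribution(theta, items)):
        out[sum(score >= c for c in cutoffs)] += p
    return out

def classification_consistency(items, cutoffs, n_nodes=81):
    """Return (phi, kappa) under a standard normal latent trait (assumed)."""
    nodes = [-4.0 + 8.0 * i / (n_nodes - 1) for i in range(n_nodes)]
    weights = [math.exp(-0.5 * t * t) for t in nodes]
    total = sum(weights)
    weights = [w / total for w in weights]
    phi = 0.0
    marginal = [0.0] * (len(cutoffs) + 1)
    for t, w in zip(nodes, weights):
        p = class_probs(t, items, cutoffs)
        phi += w * sum(x * x for x in p)           # same class on two parallel forms
        marginal = [m + w * x for m, x in zip(marginal, p)]
    chance = sum(m * m for m in marginal)          # agreement expected by chance
    return phi, (phi - chance) / (1.0 - chance)

# Toy example: three IAT items from Table 1, scored 1..4 (sum range 3..12).
# The cutoffs 6 and 10 are illustrative only, not values from the study.
toy_items = [
    (0.90, -0.81, [0.0, 1.21, 0.00, -1.21]),  # Y2
    (0.98, 0.86, [0.0, 0.56, 0.39, -0.95]),   # Y5
    (1.52, 1.72, [0.0, 0.35, 0.29, -0.65]),   # Y15
]
phi, kappa = classification_consistency(toy_items, cutoffs=[6, 10])
print(f"phi={phi:.3f}, kappa={kappa:.3f}")
```

Because κ corrects the raw agreement rate for chance agreement, it is lower than the uncorrected index and is the quantity comparable to the coefficients reported in the Results.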

Results

Model fit

The initial model calibration used a polytomous model with five ordered categories. For 43 out of 46 items, no IRT model could fit the observed data according to S-χ² item-fit statistics (p<0.05). After joining the 4th and the 5th points on the Likert scale, as suggested by Embretson and Reise,¹⁹ the GPC model could fit the modified data with four categories well (p>0.05). The following analyses were based on the results of joined categories (1 to 4).

After collapsing categories, Mokken analysis was conducted again and the overall H coefficient is 0.35 for IAT and 0.44 for CIAS. Both scales can be regarded as moderate scales according to Mokken's criterion.²⁴ Finally, three items (Y1, Y8, and Y17) were deleted from IAT, and five items (C3, C5, C12, C21, and C26) were deleted from CIAS because of LD. The RMSEA values for the final models are satisfactory (0.04 for IAT and 0.07 for CIAS), and none of the S-χ² item-level diagnostic statistics is significant (p>0.01).

Item analysis and test information

The items from the two scales were calibrated separately, but the responses were from the same group of respondents and the scale of the latent trait was set with a mean of 0 and a standard deviation of 1. Therefore, the item parameters from the two scales can be compared directly without the need for equating.³⁰ The parameters of each item are presented in Table 1.

Table 1.

GPC Model Item Parameter Estimates of Items from IAT and CIAS

Item	a	s.e.	b	s.e.	d₂	s.e.	d₃	s.e.	d₄	s.e.
Y2	0.90	0.10	−0.81	0.12	1.21	0.18	0.00	0.15	−1.21	0.14
Y3	0.79	0.09	0.88	0.12	1.04	0.15	−0.16	0.19	−0.88	0.20
Y4	0.61	0.09	2.22	0.27	0.18	0.26	0.86	0.35	−1.04	0.39
Y5	0.98	0.11	0.86	0.11	0.56	0.12	0.39	0.16	−0.95	0.16
Y6	0.99	0.11	0.65	0.10	0.68	0.12	0.17	0.15	−0.85	0.14
Y7	0.85	0.10	−0.41	0.10	0.61	0.16	0.04	0.17	−0.65	0.13
Y9	0.75	0.09	0.93	0.12	0.28	0.17	0.57	0.21	−0.85	0.20
Y10	0.81	0.09	−0.02	0.10	1.12	0.15	0.06	0.16	−1.19	0.15
Y11	0.86	0.10	0.43	0.09	0.58	0.13	−0.02	0.17	−0.56	0.16
Y12	0.44	0.06	−0.07	0.13	0.45	0.27	0.19	0.31	−0.64	0.26
Y13	1.02	0.13	1.51	0.14	0.15	0.17	0.69	0.21	−0.84	0.21
Y14	0.66	0.08	0.72	0.11	0.14	0.19	0.37	0.23	−0.5	0.21
Y15	1.52	0.18	1.72	0.14	0.35	0.13	0.29	0.17	−0.65	0.20
Y16	0.71	0.09	0.63	0.1	0.02	0.18	0.07	0.22	−0.09	0.21
Y18	1.10	0.13	1.18	0.11	0.28	0.14	0.41	0.17	−0.69	0.17
Y19	0.74	0.09	1.06	0.12	0.33	0.17	0.46	0.22	−0.79	0.22
Y20	1.15	0.14	1.55	0.13	0.21	0.16	0.44	0.19	−0.65	0.21
C1	1.05	0.11	1.12	0.11	0.33	0.13	0.51	0.17	−0.84	0.17
C2	0.98	0.10	0.52	0.08	0.79	0.11	−0.23	0.15	−0.55	0.15
C4	1.01	0.10	0.29	0.07	0.65	0.11	−0.13	0.14	−0.52	0.13
C6	0.91	0.10	−0.28	0.08	0.76	0.14	−0.28	0.15	−0.48	0.13
C7	1.21	0.13	0.94	0.09	0.26	0.11	−0.07	0.15	−0.19	0.15
C8	0.84	0.12	2.03	0.20	−0.52	0.28	0.92	0.32	−0.40	0.31
C9	0.73	0.08	0.79	0.10	0.26	0.16	−0.19	0.22	−0.07	0.21
C10	1.75	0.19	1.13	0.09	0.58	0.09	0.16	0.11	−0.75	0.12
C11	1.54	0.15	0.67	0.07	0.64	0.08	−0.13	0.1	−0.51	0.11
C13	0.52	0.07	0.07	0.11	1.01	0.22	−0.35	0.26	−0.66	0.23
C14	0.81	0.09	0.86	0.10	−0.11	0.17	0.06	0.21	0.05	0.21
C15	1.30	0.13	0.66	0.08	0.77	0.1	0.24	0.11	−1.01	0.13
C16	1.11	0.11	0.34	0.07	0.52	0.11	−0.05	0.13	−0.47	0.12
C17	1.09	0.12	1.00	0.10	0.55	0.12	0.29	0.15	−0.83	0.16
C18	1.23	0.12	0.15	0.07	0.59	0.10	0.08	0.11	−0.66	0.11
C19	1.29	0.13	0.34	0.06	0.49	0.09	−0.16	0.12	−0.33	0.11
C20	0.95	0.11	1.01	0.10	0.54	0.12	−0.07	0.17	−0.47	0.18
C22	1.60	0.16	0.54	0.07	0.71	0.08	−0.11	0.09	−0.6	0.1
C23	1.35	0.14	1.24	0.1	0.53	0.11	0.25	0.14	−0.78	0.16
C24	1.69	0.17	1.17	0.09	0.63	0.09	0.15	0.11	−0.79	0.14
C25	1.12	0.12	0.76	0.08	0.60	0.11	0.20	0.14	−0.80	0.14

Note: logit = a[k(θ−b) + Σd_k], where a is the item discrimination parameter, b is the item location parameter, and d_k (k=1, 2, 3, 4) is the category parameter. According to the model setting, d₁ is 0 for each item. C stands for items from CIAS; Y indicates IAT items. Items 1, 8, and 17 were deleted from IAT, and Items 3, 5, 12, 21, and 26 were deleted from CIAS. GPC, Muraki's Generalized Partial Credit; IAT, Internet Addiction Test; CIAS, Chen Internet Addiction Scale.

The test information curves of the two scales are presented in Figure 2A. The 21-item CIAS has much larger test information than the 17-item IAT at most levels of IA. The dotted line in Figure 2A indicates test information of 6.25, which corresponds to a standard error of measurement (SEM) of 0.4 standard units. The SEM for middle and higher latent trait levels is below 0.4, which is the minimum requirement of precision for a psychological test.³¹
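The link between the two numbers on the dotted line is the usual IRT relation SEM(θ) = 1/√I(θ). A short check, with illustrative function names:

```python
import math

def sem_from_information(information):
    """Standard error of measurement implied by a test information value."""
    return 1.0 / math.sqrt(information)

def information_for_sem(sem):
    """Test information needed to reach a target standard error."""
    return 1.0 / sem ** 2

print(sem_from_information(6.25))  # 0.4
```

Any θ range where the test information curve lies above 6.25 therefore has an SEM below 0.4.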

FIG. 2.

Test information curves (A) and average item information curves (B) of the Internet Addiction Test (IAT) and the Chen Internet Addiction Scale (CIAS).

Since IAT and CIAS have different lengths, comparison was also conducted based on average item information. It can be seen in Figure 2B that the average item information for IAT is lower than CIAS throughout the whole range; and the advantage of CIAS over IAT on the upper range of the trait scale is even more prominent.

Comparison between the new scale and the old ones

The new scale of 20 items is presented in Table 2. The average item information of the new scale is compared to those of IAT and CIAS in Figure 3. Only a little improvement can be seen around the cutoff point of 1.6. The classification consistency coefficient κ for IAT is 0.65. The new scale, as expected, has a higher κ coefficient of 0.79 indicating improved reliability for a criterion-referenced test.

FIG. 3.

Average item information curves for IAT, CIAS, and the new scale.

Table 2.

Items of the New Scale

		Question
1	C2	Do you feel uneasy if you are away from the Internet for a while?
2	C4	Do you find yourself becoming upset when there is no network connection?
3	C6	Do you find yourself stay online much longer than you plan to?
4	C7	Do you keep online even if the Internet has negatively affected your interpersonal relationships?
5	C10	Do you feel depressed during a period of time without Internet?
6	C11	Do you find it impossible to keep yourself offline?
7	C15	Does your academic or job performance suffer from negative influences because of using the Internet?
8	C16	Do you feel like you miss something if being offline for a while?
9	C17	Do you decrease interaction with family members because of using the Internet?
10	C18	Do you decrease the time for other kinds of entertainment because of online activities?
11	C19	Do you fail to keep yourself back online after logging off even if you have other things to do?
12	C22	Do you try to curb your impulse to log in and fail?
13	C23	Do you cut your sleeping time to have more time online?
14	C24	Do you find yourself spending more and more time online to feel satisfied?
15	C25	Do you fail to eat on time because of the Internet?
16	Y6	Do your grades or school work suffer because of the amount of time you spend on-line?
17	Y13	Do you snap, yell, or act annoyed if someone bothers you while you are on-line?
18	Y15	Do you feel preoccupied with the Internet when off-line, or fantasize about being on-line?
19	Y18	Do you try to hide how long you've been on-line?
20	Y20	Do you feel depressed, moody or nervous when you are off-line, which goes away once you are back on-line?

Discussion

Because the Internet has gone through drastic changes since the first IA instruments were developed, renewed evidence of psychometric properties is always needed. It is necessary that the content of some items be revised before use in research. E-mail, for instance, has become an indispensable communication tool both at work and in the home. Typically, checking one's e-mail is the first thing one does after logging on to the Internet.³² Therefore, one item from IAT which says, “Do you check your e-mail before something else that you need to do?” may now be deemed a description of a common habit rather than a disorder. While some practices become common in daily life, new behaviors are emerging, such as the use of online social networking sites and e-shopping. Future researchers must continually check the relevance of the questions.

Researchers should make an informed decision about which scale to select rather than rely on personal preference. According to the current study, CIAS outperformed IAT in terms of average item information as well as test information, so the advantage of CIAS should be attributed to its high-quality items rather than its greater length. The disparity between the two instruments in terms of test information turns out to be quite large, yet the two scales have been used interchangeably in many studies. Although it might be premature to conclude that IAT should no longer be used, the CIAS is a more reliable instrument than the Chinese version of IAT according to the findings of this study.

In this study, an item selection approach is proposed to assemble a new scale with existing items. Using this approach, the cutoff scores and the test items become closely related. It turns out that the new scale has average item information improved only slightly because the items available in this study are very limited. With a larger item pool, this method is very likely to greatly improve test information around the cutting scores. A direct benefit of this method is increased classification consistency.

It should be noted though that the item selection approach only works if the cutting scores are valid and reasonable. The cutting scores used in this study are 50% and 80% of the total scores suggested by Young. This assignment of cutting scores appears subjective and casual. How to set valid cutting scores remains an interesting topic.

Footnotes

Author Disclosure Statement

No competing financial interests exist.

References

1. Chinese Internet Network Information Center. 2011. 28th China Internet development statistics report. www.cnnic.net.cn/dtygg/dtgg/201101/t20110118_20250.html. 2011 Jul.

2. Chou, Hsiao. Internet addiction, usage, gratification, and pleasure experience: the Taiwan college students' case. Computers and Education, 2000; 35:65–80.

3. Moore. 1995. The emperor's virtual clothes: the naked truth about the Internet culture. Chapel Hill, NC: Algonquin.

4. Goldberg. 1996. Internet Addiction Disorder. www.rider.edu/∼suler/psycyber/supportgp.html. 1996 Aug.

5. Young. A case that breaks the stereotype. Psychological Reports, 1996; 79:899–902.

6. Young. 1999. Internet addiction: symptoms, evaluation, and treatment. In VandeCreek, Jackson, eds. Innovations in clinical practice: a source book, 17. Sarasota, FL: Professional Resource Press, 19–31.

7. Davis. A cognitive-behavioral model of pathological Internet use. Computers in Human Behavior, 2001; 17:187–195.

8. Beard, Wolf. Modification in the proposed diagnostic criteria for internet addiction. Cyberpsychology and Behavior, 2001; 4:377–383.

9. Scherer. College life online: healthy and unhealthy Internet use. Journal of College Student Development, 1997; 38:655–665.

10. Hall, Parsons. Internet addiction: college student case study using best practices in cognitive behavior therapy. Journal of Mental Health Counseling, 2001; 23:312–327.

11. Young. 1996. Internet addiction: the emergence of a new clinical disorder. Paper presented at the 104th American Psychological Association Annual Convention, Toronto, Canada.

12. Grohol. Too much time online: Internet addiction or healthy social interactions? Cyberpsychology and Behavior, 1999; 2:395–401.

13. Young. 1998. Caught in the Net: how to recognize the signs of Internet addiction—and a winning strategy for recovery. New York: Wiley, 257–258.

14. Chen, Weng, Su, et al. Development of a Chinese Internet addiction scale and its psychometric study. Chinese Journal of Psychology, 2003; 45:279–294.

15. Bai, Fan. A Study on the Internet dependence of college students: the revising and applying of a measurement. Psychological Development and Education, 2005; 4:99–104.

16. Widyanto, McMurran. The psychometric properties of the Internet addiction test. Cyberpsychology and Behavior, 2004; 7:443–450.

17. Widyanto, Griffiths, Brunsden. A psychometric comparison of the Internet addiction test, the Internet-related problem scale, and self-diagnosis. Cyberpsychology, Behavior, and Social Networking, 2011; 14:141–149.

18. Canan, Ataoglu, Nichols, et al. Evaluation of psychometric properties of the internet addiction scale in a sample of Turkish high school students. Cyberpsychology, Behavior, and Social Networking, 2010; 13:317–320.

19. Embretson, Reise. 2000. Item response theory for psychologists. Mahwah, NJ: Erlbaum.

20. Hambleton, Gruijter DNM. Application of item response models to criterion-referenced test item selection. Journal of Educational Measurement, 1983; 20:355–367.

21. Christakis, Moreno, Jelenchick, et al. Problematic internet usage in US college students: a pilot study. BMC Medicine, 2011; 9:77–82.

22. Fletcher, Hattie. An examination of the psychometric properties of the physical self-description questionnaire using a polytomous item response model. Psychology of Sport and Exercise, 2004; 5:423–446.

23. Fraley, Waller, Brennan. An item response theory analysis of self-report measures of adult attachment. Journal of Personality and Social Psychology, 2000; 78:350–365.

24. Mokken. 1971. A theory and procedure of scale analysis. Berlin, Germany: De Gruyter.

25. van der Ark. Mokken scale analysis in R. Journal of Statistical Software, 2007; 20:1–19.

26. Cai, du Toit, Thissen. 2009. IRTPRO: Flexible, multidimensional, multiple categorical IRT modeling [Computer software]. Chicago, IL: Scientific Software International.

27. Orlando, Thissen. Further investigation of the performance of SX2: an item fit index for use with dichotomous item response theory models. Applied Psychological Measurement, 2003; 27:289–298.

28. Orlando, Thissen. Likelihood-based item fit indices for dichotomous item response theory models. Applied Psychological Measurement, 2000; 24:50–64.

29. Lee. Classification consistency and accuracy for complex assessments using item response theory. Journal of Educational Measurement, 2010; 47:1–17.

30. Hambleton, Swaminathan. 1985. Item response theory: principles and applications. Dordrecht: Kluwer.

31. Irwin, Stucky, Langer, et al. An item response analysis of the pediatric PROMIS anxiety and depressive symptoms scales. Quality of Life Research, 2010; 19:595–607.

32. Weiser. The functions of Internet use and their social and psychological consequences. Cyberpsychology and Behavior, 2001; 4:723–743.