Abstract
Objective:
The present study examined whether implementing recommendations of Web accessibility guidelines would have different effects on nondisabled users than on users with visual impairments.
Background:
The predominant approach for making Web sites accessible for users with disabilities is to apply accessibility guidelines. However, it has hardly been examined whether this approach has side effects for nondisabled users. A comparison of the effects on both user groups would contribute to a better understanding of possible advantages and drawbacks of applying accessibility guidelines.
Method:
Participants from two matched samples, comprising 55 participants with visual impairments and 55 without impairments, took part in a synchronous remote testing of a Web site. Each participant was randomly assigned to one of three Web sites, which differed in the level of accessibility (very low, low, and high) according to recommendations of the well-established Web Content Accessibility Guidelines 2.0 (WCAG 2.0). Performance (i.e., task completion rate and task completion time) and a range of subjective variables (i.e., perceived usability, positive affect, negative affect, perceived aesthetics, perceived workload, and user experience) were measured.
Results:
Higher conformance to Web accessibility guidelines resulted in increased performance and more positive user ratings (e.g., perceived usability or aesthetics) for both user groups. There was no interaction between user group and accessibility level.
Conclusion:
Higher conformance to WCAG 2.0 may result in benefits for nondisabled users and users with visual impairments alike.
Application:
Practitioners may use the present findings as a basis for deciding whether and how best to implement accessibility.
Introduction
In modern society, a large proportion of the population uses the Web, but these users differ considerably in their competencies, characteristics, and needs. Therefore, research on Web site design for specific user groups (e.g., children, elderly people, inexperienced users) has become an increasingly popular issue in the field of human-computer interaction (e.g., Jacko, 2012). People with disabilities also represent such a specific user group, which is important to consider in Web site design (e.g., Vu & Proctor, 2011). About one in six persons has some type of disability (World Health Organization, 2011), such as a visual impairment, hearing impairment, or motor impairment. Such impairments may result in various barriers when using Web sites. For instance, a person with impaired eyesight may have difficulty reading content on a Web site because the text-to-background contrast is too low, a person with a hearing impairment cannot access audio information, or a user with a motor impairment cannot access a button on a form because he or she cannot click it with the mouse. It is obvious that these kinds of barriers lead to disadvantages in a society that relies heavily on Web services. Therefore, Web content accessibility guidelines aim to reduce barriers for users with disabilities by providing recommendations on disability-friendly Web site design. These recommendations include appropriate thresholds for text-to-background contrasts or the use of captions for audio content (e.g., Caldwell, Cooper, Reid, & Vanderheiden, 2008). Accessibility practitioners as well as researchers endorse the application of such accessibility guidelines, which makes it the prevailing approach in Web site design for supporting users with disabilities (e.g., Cooper, Kirkpatrick, & O Connor, 2014; Jacko, 2012; Thatcher et al., 2006; Vu & Proctor, 2011; Yesilada, Brajnik, Vigo, & Harper, 2012).
However, there is some controversy about whether implementing accessibility guidelines may result in negative side effects for nondisabled users. While practitioners often assume such negative consequences for nondisabled users (e.g., disability-friendly Web sites are ugly, dull, or boring; Ellcessor, 2014; Petrie, Hamilton, & King, 2004; Thatcher et al., 2006), recent research has shown positive effects of applying accessibility guidelines for nondisabled users (Schmutz, Sonderegger, & Sauer, 2016). Investigating such side effects is of importance because it is very rare that only people with disabilities use a Web site. Instead, nondisabled users usually constitute the vast majority of users. The consequences for nondisabled users are thus of particular significance for practitioners, not least for economic reasons (e.g., Farrelly, 2011). Since only very few studies have considered effects of implementing accessibility guidelines for nondisabled users (Yesilada et al., 2013), the present work aims to examine this by comparing the effects of implementing accessibility guidelines for nondisabled users with the effects for users with visual impairments.
Web Content Accessibility Guidelines
While several Web accessibility guidelines exist (e.g., U.S. Section 508 regulations, https://www.section508.gov; IBM accessibility checklist, http://www-03.ibm.com/able/guidelines/ci162/accessibility_checklist_web.html), the Web Content Accessibility Guidelines 2.0 (WCAG 2.0) (Caldwell et al., 2008) is the most widely used standard among researchers and practitioners. The WCAG 2.0 is the basis of legal requirements for Web accessibility in many countries (e.g., Australia, Canada, Germany, United Kingdom, Switzerland) (Rogers, 2015; Thatcher et al., 2006) and is also the International Organization for Standardization’s (2012) yardstick for accessibility. The WCAG 2.0 comprises a list of 61 success criteria and enables the classification of a Web site into one of four accessibility levels: no accessibility (NA), low accessibility (A), high accessibility (AA), and very high accessibility (AAA). Put simply, the more accessibility criteria a Web site meets, the higher the accessibility level of the Web site will be (see Caldwell et al., 2008). Given that legislation often requires the implementation of WCAG 2.0, it is astonishing that so little is known about possible side effects for nondisabled users.
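The classification logic described above can be illustrated with a minimal sketch. It assumes the cumulative reading of WCAG 2.0 conformance, in which a level is reached only when all success criteria of that level and of all lower levels are satisfied. The four criteria listed are a small illustrative subset of the 61, and the names CRITERION_LEVEL and conformance_level are hypothetical; the sketch is not the procedure used to build or validate the study Web sites.

```python
from typing import Dict, Set

# Illustrative subset of WCAG 2.0 success criteria and their levels.
# ASSUMPTION: only four of the 61 criteria are listed, for demonstration.
CRITERION_LEVEL: Dict[str, str] = {
    "1.1.1": "A",    # Non-text Content
    "2.4.4": "A",    # Link Purpose (In Context)
    "1.4.3": "AA",   # Contrast (Minimum)
    "1.4.6": "AAA",  # Contrast (Enhanced)
}

LEVEL_ORDER = ["A", "AA", "AAA"]


def conformance_level(satisfied: Set[str]) -> str:
    """Return the highest level whose criteria, and those of all lower
    levels, are contained in `satisfied`; "NA" if level A is not reached."""
    achieved = "NA"
    for level in LEVEL_ORDER:
        required = {c for c, lvl in CRITERION_LEVEL.items() if lvl == level}
        if not required <= satisfied:
            break  # a criterion of this level is missing, so stop climbing
        achieved = level
    return achieved


print(conformance_level({"1.1.1", "2.4.4"}))           # A
print(conformance_level({"1.1.1", "2.4.4", "1.4.3"}))  # AA
print(conformance_level({"1.1.1", "1.4.3"}))           # NA (a level A criterion is missing)
```

Under this reading, a Web site that meets some AA criteria but misses a single level A criterion is still classified as NA, which is why the levels are checked in ascending order.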
Effects of WCAG 2.0 on Nondisabled Users: Research Evidence
Although this study focuses on WCAG 2.0, there are also a few studies that examined the influence of accessible Web site design on nondisabled users prior to the release of WCAG 2.0 (e.g., Disability Rights Commission, 2004; Huber & Vitouch, 2008). These studies focused on WCAG 1.0 (cf. http://www.w3.org/TR/WCAG10/) as the preceding guideline. Studies based on WCAG 1.0 mainly indicated positive consequences for nondisabled users, such as shorter task completion time (Disability Rights Commission, 2004), higher perceived usability (Huber & Vitouch, 2008), and higher scores in automated usability testing (Sullivan & Matson, 2000). Since WCAG 1.0 and WCAG 2.0 differ considerably (Reid & Snow-Weaver, 2008, 2009), results from these earlier studies may need to be treated with caution.
To our knowledge, there are only two studies that investigated the effects of WCAG 2.0 on nondisabled users. In the first study, nondisabled users tested three versions of a municipal Web site with different levels of accessibility (NA, A, and AA) (Schmutz et al., 2016). Aside from the differences in WCAG 2.0 compliance, the Web sites contained the same content (e.g., text and images). Sixty-one nondisabled users solved tasks on the three Web sites. The results revealed many benefits of the AA Web site compared to the NA Web site. This included faster task completion time; higher task completion rate; higher ratings in usability, trust, and aesthetics; and lower ratings in workload for the AA Web site. The A Web site did not differ from the two other Web sites for any of the variables. In regard to emotional state, the three Web sites received the same ratings. Although the study used a promising approach by investigating a rather large sample, the sample was quite homogeneous and comprised students only.
The second study used a similar approach but tested nondisabled users and users with visual impairments (Pascual, Ribera, Granollers, & Coiduras, 2014). Nine users with visual impairments and five nondisabled users took part in a usability test, assessing two Web sites presenting tourist information. These Web sites differed in WCAG 2.0 compliance but contained the same content (i.e., text and pictures). One Web site corresponded to level NA and the other Web site to level A. All participants tested both Web sites by solving a set of tasks. Due to the small sample size, only descriptive results were reported. These results indicated that nondisabled users showed shorter task completion times when interacting with an A Web site compared to an NA Web site. Furthermore, nondisabled users reported higher satisfaction and more positive affect for the A Web site than the NA Web site. Task completion rate did not differ between the Web sites. For the users with visual impairments, the results showed shorter task completion time, higher task completion rate, and higher satisfaction when using the A Web site compared to the NA Web site. They also reported more positive emotion when using the A Web site. It should be noted that a comparison of effects of accessible Web site design on nondisabled users and users with disabilities is of particular interest for practice. Such a comparison allows us to develop a better understanding of advantages and drawbacks of accessible Web site design for different user groups. Based on this information, practitioners may be able to decide whether WCAG 2.0 should be implemented or not. Pascual et al. (2014) provided important first insights into the effects of WCAG 2.0 on nondisabled users and people with visual impairments. However, the small sample size and the fact that only descriptive data were reported have to be taken into consideration when interpreting the results.
Consequently, future studies should make use of larger samples of nondisabled users and users with disabilities (e.g., Schmutz et al., 2016; Yesilada et al., 2012, 2013).
The Present Study
The present work aims to build on previous research by comparing effects of accessible Web site design on two matched samples of nondisabled users and users with visual impairments in a controlled quasi-experimental setting. A Web site’s accessibility was manipulated as an independent variable using three WCAG 2.0 levels: NA, A, and AA. Each participant solved five standardized tasks on one of the Web sites. As dependent variables, performance was measured (i.e., task completion time and task completion rate) as well as several subjective measures (i.e., perceived usability, positive and negative affect, aesthetics, workload, and user experience).
Based on previous findings, two hypotheses were formulated. First, higher accessibility levels would lead to higher performance and more positive subjective evaluations for nondisabled people and people with visual impairments. Second, the advantages of higher accessibility levels were expected to be greater for users with visual impairments than for users without, which would result in a significant interaction between the independent factors accessibility level and user group.
Method
Design and Participants
In this experiment, a 2 × 3 between-subjects design was employed, with user group (unimpaired eyesight vs. impaired eyesight) and accessibility level (NA = very low conformance, A = low conformance, AA = high conformance) representing the independent variables.
A total of 110 participants (i.e., n = 55 for each user group) took part in the study. In a first step, participants with visual impairments were recruited in Switzerland, Germany, and Austria. The participants were required to have maximum eyesight of 20% in the better eye, some experience in using the Web, and a minimum age of 18 years. Their ages ranged from 22 to 73 years. Of people with impaired eyesight, 69% were considered to be blind (i.e., eyesight <2% in the better eye), and 31% had reduced eyesight (eyesight between 2% and 20% in the better eye). In a second step, a matched group of 55 users without visual impairments was recruited. These participants were required to have unimpaired acuity (with or without correction) and color vision (i.e., no visual impairment diagnosed by a physician or registered by health insurance). We did not conduct a vision test and relied on self-reported data. Their ages ranged from 22 to 71 years. Matching variables included age, gender, and perceived experience in using the Web and computers (combined subjective rating). These variables were statistically tested for possible differences between user groups to reduce the influence of confounding factors. The results showed that there were no differences between the two user groups (p > .05 for all three matching variables). Each participant was randomly assigned to one of the Web site conditions. Table 1 gives an overview of demographic variables as a function of eyesight and Web site condition. This research complied with the American Psychological Association Code of Ethics and was approved by the Institutional Review Board at the University of Fribourg. Informed consent was obtained from each participant.
Table 1. Matching Variables Divided Into Conditions User Group and Accessibility Level
Note. Web experience = Likert scale from 1 to 5; NA = very low conformance to Web Content Accessibility Guidelines 2.0 (WCAG 2.0); A = low conformance to WCAG 2.0; AA = high conformance to WCAG 2.0.
The Web Sites
We created three versions of a municipal Web site based on existing content (see Figure 1). Each version of the Web site corresponded to a different accessibility level (i.e., NA, A, and AA). While only design aspects required in WCAG 2.0 were changed (e.g., contrast and link descriptions), other Web site characteristics (e.g., text content or pictures) remained the same for the three Web sites. Schmutz et al. (2016) provide a detailed description of how the Web sites were made and validated. In total, 13 criteria were manipulated (see Appendix for a complete list of criteria).

Figure 1. Screenshots of the home page of the Web sites used for testing (top, level AA, high conformance; bottom, level NA, very low conformance).
The Web site examined contained information about a municipality in the country of Liechtenstein (Europe). This included, for instance, information about administrative issues, education, local politics, or leisure services. The design of the Web site was static (based on HTML and CSS), which means that information was presented primarily as text and images, without sound or animations (see Schmutz et al., 2016, for detailed information about the Web sites).
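Text-to-background contrast, one of the design aspects mentioned above, is among the few WCAG 2.0 criteria that can be checked numerically. The following sketch computes the contrast ratio from the relative-luminance definition given in WCAG 2.0 and compares it with the 4.5:1 threshold that level AA sets for normal-sized text. The function names are hypothetical, and the sketch is not the tool used to validate the study Web sites.

```python
from typing import Tuple

RGB = Tuple[int, int, int]  # 8-bit sRGB color, e.g. (255, 255, 255) for white


def _linearize(channel: int) -> float:
    """Convert one 8-bit sRGB channel to its linear value (WCAG 2.0 definition)."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: RGB) -> float:
    """Relative luminance: 0.0 for black, 1.0 for white."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground: RGB, background: RGB) -> float:
    """Contrast ratio (L_lighter + 0.05) / (L_darker + 0.05), from 1 to 21."""
    l1, l2 = relative_luminance(foreground), relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def meets_aa_normal_text(foreground: RGB, background: RGB) -> bool:
    """Level AA threshold for normal-sized text is a ratio of at least 4.5:1."""
    return contrast_ratio(foreground, background) >= 4.5


# Mid-gray text on a white background sits close to the AA threshold.
print(round(contrast_ratio((118, 118, 118), (255, 255, 255)), 2))
print(meets_aa_normal_text((119, 119, 119), (255, 255, 255)))
```

Because the ratio depends only on the two colors, a check of this kind can be run over a style sheet without involving users, which is one reason contrast is often among the first criteria to be addressed.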
Assistive Tools
We asked participants with visual impairments to report the assistive tools they normally use for navigating the Web. To increase external validity, we then asked them to use the test Web site in the same way. The 55 users with visual impairments used the following assistive tools: a screen reader (14 participants), a screen magnifier (10 participants), a screen reader combined with a braille board (23 participants), and a screen reader combined with a screen magnifier (5 participants). Three participants used no assistive tool. After the randomized assignment of participants to the three Web sites, there was no significant difference (p = .425) in the use of assistive tools between these conditions (NA: 17 participants used assistive tools; A: 17 participants used assistive tools, 1 participant did not use assistive tools; AA: 18 participants used assistive tools, 2 participants did not use assistive tools).
Measures
Performance measures
Performance was assessed by measuring task completion rate (%) and task completion time (seconds). The measurement of task completion time began when the participants started searching the Web site and ended when they found the information requested. Users without visual impairments marked the appropriate content with the cursor when they found the requested content, whereas participants who were visually impaired indicated this orally. If participants did not complete a task within six minutes, they were asked to move on to the next one. In this case, a failed attempt was recorded without explicitly mentioning this to the participants. They were not informed about the time limit to avoid time pressure being induced.
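The two performance measures can be derived from per-task timings as in the following sketch. The text does not specify how failed attempts entered the time measure; the sketch assumes that they are counted at the six-minute limit (360 s), and this assumption should be replaced if the original scoring differed. The function name and the input format are hypothetical.

```python
from typing import List, Optional, Tuple

TIME_LIMIT_S = 360.0  # six-minute limit per task


def performance(times_s: List[Optional[float]]) -> Tuple[float, float]:
    """Compute one participant's performance measures.

    times_s holds one entry per task: the time in seconds until the requested
    information was found, or None if the task was not solved within the limit.
    Returns (task completion rate in percent, mean time per task in seconds).
    """
    if not times_s:
        raise ValueError("at least one task is required")
    solved = [t for t in times_s if t is not None and t <= TIME_LIMIT_S]
    completion_rate = 100.0 * len(solved) / len(times_s)
    # ASSUMPTION: a failed attempt contributes the full time limit to the mean.
    capped = [t if t is not None and t <= TIME_LIMIT_S else TIME_LIMIT_S
              for t in times_s]
    mean_time = sum(capped) / len(capped)
    return completion_rate, mean_time


# Hypothetical participant: four of five tasks solved, the third one failed.
rate, mean_time = performance([95.0, 210.0, None, 140.0, 60.0])
print(rate, mean_time)  # 80.0 173.0
```

Counting failures at the limit keeps every participant in the analysis of task completion time, at the cost of compressing differences among the slowest attempts.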
Subjective measures
Six subjective variables were measured: usability, positive and negative affect, aesthetics, workload, and user experience. These variables were assessed with German versions of well-established questionnaires. All items were rated on 5-point Likert scales. (a) To measure perceived usability, the Website Analysis and MeasureMent Inventory (WAMMI; Kirakowski, Claridge, & Whitehand, 1998) was used. (b) Positive and negative affect were assessed with the Positive and Negative Affect Schedule (PANAS; Watson, Clark, & Tellegen, 1988). (c) Perceived aesthetics was assessed by the classical aesthetics subscale of the User Experience Scale (UES; Lavie & Tractinsky, 2004). The scale was chosen for this study because its items make little reference to visual capabilities (e.g., “The Web site is clean, pleasant, or clear” rather than “The colors are appealing”), which made it suitable for participants who were visually impaired. (d) Workload was measured by the NASA Task Load Index (NASA TLX; Hart & Staveland, 1988). Since participants were not pressed for time, the item measuring time pressure was excluded. This resulted in a five-item version of the NASA TLX including the dimensions mental demand, physical demand, effort, performance, and frustration. (e) Finally, the User Experience Questionnaire (UEQ; Laugwitz, Held, & Schrepp, 2008) was used to assess user experience. In its original form, the UEQ comprises six subscales (i.e., attractiveness, perspicuity, efficiency, dependability, stimulation, and novelty). Since aesthetics is already assessed by the UES’s classical aesthetics scale, the UEQ’s subscale attractiveness was not used.
Matching variables
In addition to these dependent measures, we used two items to assess the matching variable Web and computer experience (How experienced are you in using computers? How experienced are you in using the Web?), each rated on a 5-point Likert scale ranging from not at all experienced to very experienced. In line with previous work (e.g., Chadwick-Dias, Tedesco, & Tullis, 2004), the two items were averaged into a single Web and computer experience score.
Covariates
Although the two samples (i.e., unimpaired eyesight and impaired eyesight) were matched regarding Web and computer experience, age, and gender, the differences in Web experience and age between the experimental conditions were still considerable. Therefore, age and Web experience were used as covariates in all statistical analyses. For the analysis of positive and negative affect, a baseline measure was taken prior to testing and used as an additional covariate. Results for covariates are reported only when they showed a significant relationship with the respective dependent variable.
Procedure
We used the synchronous remote testing method in the present experiment. To prepare participants for testing, they received an e-mail explaining the experimental procedure one week prior to the testing session. This e-mail also included a request to install the screen-sharing software TeamViewer (www.teamviewer.com), which was required for remote testing (cf. Andreasen, Nielsen, Schrøder, & Stage, 2007; Dray & Siegel, 2004). Prior to the beginning of the testing session, participants were contacted by phone. The telephone line was kept open throughout the entire testing session to provide support in case of difficulties. After the introduction by phone, the participants’ screen became visible to the experimenter via TeamViewer for the entire testing session. The experimenter also employed TeamViewer’s integrated screen recording feature to capture all the interactions of the participants with the Web site. Participants were then instructed to open two tabs in their browser window, which allowed them to switch easily between the questionnaires and the test Web site. Before completing the tasks, participants filled in the PANAS as a baseline measure of affect. Afterward, five tasks were completed (see Table 2). When a task had been solved or the time limit of 6 minutes had been exceeded, participants were asked to move on to the next task. After the fifth task, participants completed the PANAS for a second time, followed by the other questionnaires (i.e., WAMMI, NASA TLX, UEQ, aesthetics, and several demographic items).
Table 2. Tasks to Be Completed on the Web Site
Data Analysis
A two-factorial analysis of covariance was conducted with Web accessibility (i.e., NA, A, and AA) and eyesight (i.e., unimpaired eyesight and impaired eyesight) as fixed factors, together with age and Web and computer experience as covariates. All requirements for an ANCOVA were met. For the factor accessibility level, post hoc tests were adjusted using the Bonferroni method and only reported in case of significant differences between groups. To analyze the task completion rate, the percentage of successfully completed tasks per participant was used. To analyze the task completion time, the mean time per task was used. These performance measures were chosen to match previous work in this field (e.g., Schmutz et al., 2016) to allow comparability of results.
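Two quantities that recur in the reported statistics follow directly from this design and can be checked by hand. The sketch below assumes the standard ANCOVA bookkeeping: the error degrees of freedom equal the number of participants minus the number of design cells minus the number of covariates, and partial eta squared can be recovered from an F value and its degrees of freedom. The function names are hypothetical, and the sketch does not reproduce the analysis itself.

```python
def error_df(n_participants: int, n_cells: int, n_covariates: int) -> int:
    """Error degrees of freedom of a between-subjects ANCOVA."""
    return n_participants - n_cells - n_covariates


def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from F: (F * df1) / (F * df1 + df2)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# 110 participants, 2 x 3 = 6 cells, covariates age and experience.
print(error_df(110, 6, 2))  # 102
# The affect analyses add the baseline measure as a third covariate.
print(error_df(110, 6, 3))  # 101

# Effect of user group on task completion time as reported in the Results.
print(round(partial_eta_squared(102.79, 1, 102), 2))  # 0.5
```

Because rounded F values are used as input, the recovered effect size can differ from a reported value in the second decimal place; larger deviations usually point to a transcription error in F, the degrees of freedom, or the effect size.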
Results
Performance Measures
Task completion rate
In line with our prediction, Table 3 shows that the mean task completion rate was lowest in the NA condition, followed by A, and highest in the AA condition. The overall effect was significant, F(2, 102) = 4.12, p = .019, partial-η2 = .08. Post hoc tests revealed a significant difference between NA and AA (p = .027). The means further indicate that participants with impaired eyesight completed fewer tasks than participants with unimpaired eyesight (see Table 3). This difference was also statistically significant, F(1, 102) = 33.27, p < .001, partial-η2 = .25. No significant interaction between accessibility level and user group was found, F(2, 102) = 1.35, p = .264, partial-η2 = .03. The covariate age was negatively related to task completion rate, F(1, 102) = 14.51, p < .001, partial-η2 = .13.
Table 3. Performance Measures as a Function of Accessibility Level and User Group: Means and Standard Errors
Note. Task completion rate is expressed as a percentage; task completion time is expressed in seconds. NA = very low conformance to Web Content Accessibility Guidelines 2.0 (WCAG 2.0); A = low conformance to WCAG 2.0; AA = high conformance to WCAG 2.0.
Task completion time
In line with our expectations, participants showed the highest mean task completion time when using the NA Web site, followed by the A Web site, and the lowest task completion time in the AA condition (see Table 3). This effect of accessibility level on task completion time was statistically significant, F(2, 102) = 5.44, p = .006, partial-η2 = .10. The post hoc analysis showed that the difference between NA and AA was significant (p = .007). The results also indicated that participants with visual impairments were slower than participants without visual impairments (see Table 3). This effect of user group on task completion time was also significant, F(1, 102) = 102.79, p < .001, partial-η2 = .50. No interaction between user group and accessibility level was found, F(2, 102) = 1.06, p = .349, partial-η2 = .02. The covariate age, F(1, 102) = 11.89, p = .001, partial-η2 = .10, was positively related to task completion time, whereas experience, F(1, 102) = 6.50, p = .012, partial-η2 = .06, was negatively related to task completion time.
Subjective Measures
Perceived usability
In line with our hypotheses, usability ratings were lowest for the NA condition and highest for the AA condition (see Table 4). This main effect of accessibility level on perceived usability was statistically significant, F(2, 102) = 4.46, p = .014, partial-η2 = .08. Post hoc tests confirmed a significant difference between level NA and AA (p = .012). No significant main effect of user group was found, F(1, 102) = 0.39, p = .533, partial-η2 = .00, and there was no significant interaction between user group and accessibility level, F(2, 102) = 0.51, p = .598, partial-η2 = .01. The covariate age was negatively associated with perceived usability, F(1, 102) = 5.53, p = .021, partial-η2 = .05.
Table 4. Subjective Measures as a Function of Accessibility Level and User Group: Means and Standard Errors
Note. All subjective variables were measured by a Likert scale from 1 to 5. NA = very low conformance to Web Content Accessibility Guidelines 2.0 (WCAG 2.0); A = low conformance to WCAG 2.0; AA = high conformance to WCAG 2.0.
Positive affect
As presented in Table 4, condition NA led to the lowest positive affect, whereas condition AA showed the highest positive affect, which corresponds to our assumption. This main effect was statistically significant, F(2, 101) = 5.70, p = .005, partial-η2 = .10. Post hoc tests revealed a significant difference between NA and AA (p = .005). There was no main effect of user group on positive affect, F(1, 101) = 0.07, p = .791, partial-η2 = .00, and no significant interaction between user group and accessibility level, F(2, 101) = 1.08, p = .345, partial-η2 = .02.
Negative affect
As the data in Table 4 show, there was no significant main effect on negative affect of either accessibility level, F(2, 101) = 1.31, p = .268, partial-η2 = .03, or user group, F(1, 101) = 0.35, p = .558, partial-η2 = .00. There was also no significant interaction between user group and accessibility level, F(2, 101) = 1.36, p = .261, partial-η2 = .03.
Perceived aesthetics
Confirming our hypothesis, results showed the lowest aesthetic ratings for condition NA and the highest ratings for AA (see Table 4). This effect of accessibility level was significant, F(2, 102) = 3.62, p = .030, partial-η2 = .07. Post hoc analysis showed a significant difference between condition NA and AA (p = .029). No significant effect of user group was found, F(1, 102) = 0.26, p = .873, partial-η2 = .00, and no interaction between user group and accessibility level occurred, F(2, 102) = 2.03, p = .137, partial-η2 = .04.
Perceived workload
The analysis revealed the lowest ratings of workload in the AA condition, whereas ratings in the A and NA conditions were higher (see Table 4). This main effect of accessibility level was significant, F(2, 102) = 3.22, p = .044, partial-η2 = .06. However, the post hoc analysis did not confirm any significant differences between any of the three conditions (all p > .05). In regard to the variable user group, the results showed that participants with impaired eyesight reported higher perceived workload than users with unimpaired eyesight. This effect was also significant, F(1, 102) = 4.79, p = .031, partial-η2 = .05. No interaction between user group and accessibility level occurred for perceived workload, F(2, 102) = 1.05, p = .355, partial-η2 = .02. The covariates age, F(1, 102) = 8.80, p = .004, partial-η2 = .08, and Web and computer experience, F(1, 102) = 4.00, p = .049, partial-η2 = .04, were significantly related to perceived workload.
User experience
Consistent with our hypothesis, condition NA was associated with the lowest and condition AA with the highest ratings. The overall effect of accessibility level on user experience was significant, F(2, 102) = 3.12, p = .048, partial-η2 = .06. The post hoc analysis demonstrated that level NA and level AA were significantly different (p = .042). Eyesight did not influence ratings of user experience, F(1, 102) = 3.31, p = .072, partial-η2 = .03, and there was no interaction between user group and accessibility level, F(2, 102) = 0.15, p = .862, partial-η2 = .00.
Discussion
This study compared effects of a Web site’s accessibility level on two matched samples of nondisabled users and users with visual impairments. The first hypothesis stated that all users (i.e., people with and without visual impairments) would benefit from higher accessibility levels. The second hypothesis stated that users with visual impairments would benefit more from higher accessibility levels than users without impairments. While the results confirmed the first hypothesis, the second hypothesis was rejected.
The findings are in line with previous work that indicated positive effects of WCAG 2.0-compliant Web site design on users with and without impairments (e.g., Huber & Vitouch, 2008; Pascual et al., 2014; Schmutz et al., 2016). For all performance and subjective variables (except negative affect), there was a significant main effect of accessibility in the expected direction. This confirms the effectiveness of WCAG 2.0 in improving user performance and users’ subjective experience. More surprising was that there was no interaction between accessibility and user group. This suggests that nondisabled users and users with visual impairments profited from higher accessibility to the same extent. This was unexpected because the main objective of accessibility guidelines is to support users with disabilities (Caldwell et al., 2008). A possible explanation for the missing interaction between user group and accessibility may be that the needs of the two user groups are more similar than previously thought. At first sight, users with visual impairments (especially people who are blind) seem to be different from nondisabled users in terms of navigation behavior (Power et al., 2013; Takagi, Saito, Fukuda, & Asakawa, 2007), mental models (Baumgartner et al., 2010), and general perception (Chiang, Cole, Gupta, Kaiser, & Starren, 2005; Vu & Proctor, 2011). However, this does not necessarily imply that the underlying needs of the two user groups are different. Several authors have emphasized the overlap between design requirements for users with and without disabilities (e.g., Huber & Vitouch, 2008; Mbipom & Harper, 2011; Petrie et al., 2004; Thatcher et al., 2006).
Similarly, there have been suggestions that there is considerable overlap between WCAG 2.0 criteria and usability recommendations on Web site design for nondisabled users (e.g., Farkas & Farkas, 2000; Nielsen, Tahir, & Tahir, 2002; Shneiderman, Plaisant, Cohen, & Jacobs, 2013; Spool, 1999; Spyridakis, 2000; Williams, 2000). The present work provides first empirical evidence in support of these considerations.
Considering the main effect of accessibility on the dependent variables, it should be noted that for most of the variables, the post hoc analyses revealed a significant difference between condition NA and AA but not between any of the other conditions. This pattern implies that WCAG 2.0 level AA is to be aimed for when designing a Web site. Given that the criteria manipulated in the present study are very concrete and easy to implement (see Appendix), practitioners can improve the usability of Web sites for a wide range of users with little effort. Implementing the criteria may have considerable effects because the effect sizes found in the present study were substantial. For example, the improvement of the Web site from NA to AA resulted in an increase in task completion rate of 11.5% (i.e., about 1 out of 10 tasks is solved or not depending on a Web site’s accordance to WCAG 2.0). This result is to be interpreted by taking into consideration the tasks that the participants had to solve. Even participants without impairments showed an imperfect completion rate (mean = 86.4%). Tasks 1 and 4 especially were often not completed (i.e., 76% of all incomplete attempts), which might be due to the necessity of clicking several links (Task 1) and screening longer text passages to find the correct answer (Task 4). Although we aimed to choose representative tasks for this particular type of Web site (i.e., municipality), this does not rule out the possibility that the findings could have been different (e.g., task completion rate) if an alternative set of tasks had been employed. Similarly, for task completion time, the changes from NA to AA resulted in a decrease in the average task completion time of 43 seconds (22.7%), which is also a substantial reduction given that users leave a page on average after 10 to 20 seconds (Liu, White, & Dumais, 2010).
Finally, the subjective measures also consistently showed medium to high effect sizes and confirmed level AA as the standard to strive for. Furthermore, it should be considered that most of the participants with visual impairments (i.e., 76%) were using assistive tools such as a screen reader (25%), a braille board (42%), or a screen magnifier (9%) because their eyesight was severely limited. As pointed out in previous work, the interaction of such users with the Web is “conceptually most different from that of nondisabled, sighted users” (Petrie & Kheir, 2007, p. 405). We can therefore assume that effects of accessibility on people with other types of disabilities (e.g., hearing impairment or motor impairment) would be even more similar to those on nondisabled users.
Overall, the present results are in line with previous findings. Due to the very small number of studies in the field, further research is needed to see whether these findings can be corroborated.
Limitations and Future Research
There are a few limitations to consider in regard to the present work. Since the Web sites used in the study were mainly static, complex dynamic features (e.g., animations, maps, changing text, sounds) were not addressed. As the accessible design of such features may result in different consequences for nondisabled users, future research should examine more complex and dynamic Web sites.
This study focused on a comparison of nondisabled users with users with visual impairments. Future research should also examine other types of impairments (e.g., hearing, cognitive, or motor impairments). In particular, knowing more about similarities between user groups might lead to inclusive design solutions that could support different user groups at the same time. Furthermore, this work shows that visual impairment comes in many forms, lies on a continuum (i.e., it is not a dichotomous variable), and may or may not lead to reading difficulties. Such heterogeneity in samples of people with impairments is an issue that future research needs to address.
Although we present important first evidence on the effects of accessible Web site design on different user groups, the present study is only a first contribution to an emerging field. As in any other study, there are many possible factors that may have influenced the results. First, recruiting people with visual impairments is very difficult. Therefore, previous studies conducting experiments with people with visual impairments tested rather small numbers of participants, such as N = 9 (Pascual et al., 2014) or N = 3 (Rømen & Svanæs, 2012). Although we tested a comparatively large sample of people with visual impairments (i.e., 55), the sample is still rather small for computing statistical interactions between independent factors. Future research may aim to examine even larger samples to increase statistical power. Second, in experiments, the kind and strength of the manipulation influence the pattern of results found. Future research may aim to replicate the present findings by using other stimuli. Furthermore, differences between inaccessible and accessible interfaces should be explored in more detail to identify the elements that caused the effects on performance and subjective evaluations and their levels of influence.
Implications for Practitioners
The present results have important implications for practitioners. Up to now, practitioners have mainly considered the WCAG 2.0 as a tool for supporting users with disabilities (e.g., Caldwell et al., 2008; Ruth-Janneck, 2011). They thus tended to omit its recommendations because “there are too many instances where the audience is specifically not those with a disability” (Farrelly, 2011, p. 227). However, the present findings suggest that applying the WCAG 2.0 will benefit users with and without visual impairments alike. If practitioners use WCAG 2.0 as a tool for designing user-friendly Web sites for both user groups, this is likely to result in market benefits since the needs of a wider range of users are met.
Conclusion
In contrast to the general assumption that WCAG 2.0 is an instrument for supporting users with disabilities only, the present results showed that WCAG 2.0 can support users with and without visual impairments alike and should also be recognized as an instrument offering such qualities. We believe that WCAG 2.0 should not be labeled only as a tool that “will make content accessible to a wider range of people with disabilities” (Caldwell et al., 2008) but rather as a tool that makes content user friendly for people with and without impairments. Emphasizing the advantages of the guidelines for a wide range of users may change the perception of practitioners in a positive way, moving from an “accessibility for users with disabilities” approach to an “inclusive-design” approach (e.g., Benyon, Crerar, & Wilkinson, 2001; Clarkson, Coleman, Keates, & Lebbon, 2013; Newell & Gregor, 2000). The joint consideration of users with and without disabilities is in our opinion economically promising and morally necessary.
Key Points
Implementing Web Content Accessibility Guidelines 2.0 (WCAG 2.0) showed benefits in terms of performance and subjective user experience.
The benefits of WCAG 2.0 were found for nondisabled users and users with visual impairments alike.
Practitioners may profit from implementing WCAG 2.0 by providing usable Web site design to a wider range of users, including people with and without visual impairments.
Appendix
Overview of Web Site Characteristics Manipulated
| Success Criterion (WCAG 2.0 Level) | AA Web Site | A Web Site | NA Web Site | Comments (With Reference to the Document Understanding WCAG 2.0) |
|---|---|---|---|---|
| 1.1.1 Non-text content (A) | Every image on the Web site has an appropriate text alternative | Every image on the Web site has an appropriate text alternative | For every image the text alternative “image” was used | The manipulation is based on common failures F30 and F39 (Manipulated for further studies—did not affect nondisabled users) |
| 1.3.1 Info and relationships (A) | Required fields in the contact form were labeled with bold text and with an asterisk whose text alternative says, “required” | Required fields in the contact form were labeled with bold text and with an asterisk whose text alternative says, “required” | Required fields were labeled only with bold text, whose text alternative did not say, “required” a | The manipulation represents a violation of the sufficient technique G117 |
| 1.4.3 Contrast (minimum) (AA) | The contrast between headings and background was 4.5:1 (#007FAF / #FFFFFF); the contrast between text and background was 21.0:1 (#FFFFFF / #000000) | The contrast between headings and background was 3.9:1 (#007FEF / #EFFFFF); the contrast between text and background was 4.0:1 (#FFFFFF / #7F7F7F) b | The contrast between headings and background was 3.9:1 (#007FEF / #EFFFFF); the contrast between text and background was 4.0:1 (#FFFFFF / #7F7F7F) b | The manipulation represents a violation of the sufficient technique G18. The chosen contrasts seem to be realistic since the screening revealed that there are plenty of Web sites containing contrasts of about 3.0:1 or lower |
| 1.4.4 Resize text (AA) | Text can be resized without assistive technology up to 200% without loss of content or functionality | Resizing text to 200% caused text passages to be truncated or obscured | Resizing text to 200% caused text passages to be truncated or obscured | The manipulation is based on common failure F69 (Manipulated for further studies—did not affect nondisabled users because nobody resized text) |
| 1.4.8 Visual presentation (AAA) | Text blocks had a maximum width of 80 characters and were left aligned | Text blocks had a maximum width of 90 characters and were justified b | Text blocks had a maximum width of 90 characters and were justified b | The manipulation is based on common failure F88 as well as a violation of sufficient technique C20 |
| 2.4.3 Focus order (A) | Focusable components receive focus in an order that preserves meaning and operability | Focusable components receive focus in an order that preserves meaning and operability | Some fields in the form did not receive focus in a typical order via tabbing (i.e., skips between fields in different sections of the form; focus moved from the name field to a checkbox above, then to the street address) a | The manipulation is based on common failure F44 as well as Example 5 of understanding SC 2.4.3 |
| 2.4.4 Link purpose (in context) (A) | Links for sending an e-mail to a certain person were presented as mail address (e.g., | Links for sending an e-mail to a certain person were labeled “contact” within the same paragraph as the description of the respective person (the purpose can be determined from the link text together with its context) b | Links for sending an e-mail to a certain person were labeled “link” within the same paragraph as the description of the respective person (the purpose cannot be determined with certainty from the link text together with its context) a | The manipulation is based on common failure F88 as well as a violation of sufficient technique C20 |
| 2.4.6 Headings and labels (AA) | Headings and labels describe topic and purpose | Some headings were shortened to be less descriptive (e.g., from “information about the town Eschen” to “general information”) b | Some headings were shortened to be less descriptive (e.g., from “information about the town Eschen” to “general information”) b | The manipulation represents a violation of the sufficient technique G117 |
| 2.4.7 Focus visible (AA) | Keyboard focus indicator was visible | Keyboard focus indicator was not visible b | Keyboard focus indicator was not visible b | The manipulation is based on common failure F78 |
| 2.4.10 Section headings (AAA) | Section headings were used to organize the content | Some section headings were removed b | Some section headings were removed b | The manipulation represents a violation of the sufficient technique G141 and H69 |
| 3.2.3 Consistent navigation (AA) | Navigational mechanisms occurred in the same relative order | Some navigation links were not presented in the same order on some webpages but only in the HTML file (i.e., noticeable when using a screen reader); the change in position was not visible because the position was held via CSS | Some navigation links were not presented in the same order on some webpages but only in the HTML file (i.e., noticeable when using a screen reader); the change in position was not visible because the position was held via CSS | Not noticeable without using screen reading software; the manipulation is based on common failure F66 |
| 3.2.4 Consistent identification (AA) | Links are designed consistently bold, in blue color, and underlined | Links differ in design: links were either blue and not underlined or underlined and in the same color as text; links were also not consistently bold b | Links differ in design: links were either blue and not underlined or underlined and in the same color as text; links were also not consistently bold b | The problem frequently occurred in the screening |
| 3.3.1 Error identification (A) and 3.3.3 Error suggestion (AA) | If an input error in the form was automatically detected, the item that was in error was identified and described to the user in text (i.e., the field was marked with a red square and a textual suggestion on how to complete the field) | If an input error in the form was automatically detected, the item that was in error was identified and described to the user in text (i.e., the field was marked with a red square and a textual suggestion on how to complete the field) | There was no error identification used in the form a | The manipulation is based on a violation of sufficient technique G83 |
Note. Common failures and sufficient techniques are mentioned referring to the document Understanding WCAG 2.0 (Cooper, Kirkpatrick, & O Connor, 2014). NA = very low conformance to Web Content Accessibility Guidelines 2.0 (WCAG 2.0); A = low conformance to WCAG 2.0; AA = high conformance to WCAG 2.0; AAA = very high conformance to WCAG 2.0.
a Modifications from Level A to Level NA.
b Modifications from Level AA to Level A.
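The contrast values reported for Success Criterion 1.4.3 in the table above can be checked against the WCAG 2.0 definitions of relative luminance and contrast ratio. The following is a minimal sketch of that calculation, not the procedure used in the study; the function names (`relative_luminance`, `contrast_ratio`, `meets_aa_normal_text`) are hypothetical helpers introduced for illustration, and the 4.5:1 threshold is the WCAG 2.0 level AA minimum for normal-size text.

```python
def _linearize(channel_8bit: int) -> float:
    """Convert one 8-bit sRGB channel to its linear value (WCAG 2.0 definition)."""
    c = channel_8bit / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a color written as '#RRGGBB' (0 = black, 1 = white)."""
    value = hex_color.lstrip("#")
    if len(value) != 6:
        raise ValueError(f"expected '#RRGGBB', got {hex_color!r}")
    r, g, b = (int(value[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(color_a: str, color_b: str) -> float:
    """Contrast ratio (L1 + 0.05) / (L2 + 0.05), where L1 is the lighter color."""
    lum_a = relative_luminance(color_a)
    lum_b = relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


def meets_aa_normal_text(color_a: str, color_b: str) -> bool:
    """True if the pair reaches the 4.5:1 level AA minimum for normal-size text."""
    return contrast_ratio(color_a, color_b) >= 4.5


if __name__ == "__main__":
    # Color pairs taken from the 1.4.3 row of the appendix table.
    pairs = {
        "AA headings": ("#007FAF", "#FFFFFF"),
        "AA text": ("#FFFFFF", "#000000"),
        "A/NA headings": ("#007FEF", "#EFFFFF"),
        "A/NA text": ("#FFFFFF", "#7F7F7F"),
    }
    for label, (first, second) in pairs.items():
        ratio = contrast_ratio(first, second)
        print(f"{label}: {ratio:.1f}:1, AA minimum met: {meets_aa_normal_text(first, second)}")
```

Because the ratio always divides the lighter luminance by the darker one, the order in which the two colors of a pair are passed does not matter.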
Sven Schmutz is a PhD student at the University of Fribourg, Switzerland. He received his MSc in work and organizational psychology from the University of Bern, Switzerland.
Andreas Sonderegger received his PhD in psychology from the University of Fribourg, Switzerland, in 2010 and is a lecturer in the Department of Psychology, University of Fribourg and UX Analyst at EPFL+Ecal-Lab, EPFL Lausanne, Switzerland.
Juergen Sauer received an MSc in occupational psychology from the University of Sheffield, UK, in 1990 and a PhD in psychology from the University of Hull, UK, in 1997. He is a professor of cognitive ergonomics in the Department of Psychology, University of Fribourg, Switzerland.
