Abstract
To ensure a certain degree of usability, a library website should be carefully designed, especially since end users constitute a multitude of people with different needs and demands. The focal objective of this research was to investigate how different types of end users (i.e. pupils, students, the working population, seniors and researchers) respond to a library website in terms of its effectiveness, efficiency and satisfaction, which together represent its usability. The answers were obtained by performing formal usability testing, including think-aloud protocol, log analysis and questionnaires. The results of the statistical analysis show that different groups of end users achieve different levels of effectiveness and efficiency, while there is no significant difference between groups in satisfaction level. The results also indicate that participants did not achieve the threshold for a usable website. Based on the identified weaknesses, researchers present recommendations for improving a website’s usefulness, especially for non-experienced users. This research has two main contributions: (1) the connection between the theoretical definition and practical use of ISO 9241-11 attributes and (2) a usability testing procedure with a measurement framework applicable for different types of users in a specific domain, which could be applied to other domains.
Introduction
Nowadays, around 51% of the world’s population has an Internet connection, and for millions of these people websites are an important resource for searching for and retrieving information. Website users generally get sought-after information easily and in a timely fashion (Roy et al., 2014). These actions depend not only on the technical characteristics of the Internet, the Web and website design, but also on users and their characteristics, such as demographics (e.g. gender, age range, culture and language), educational level, experience, computer literacy and occupation (Yoon et al., 2016). Based on a combination of these characteristics, users develop a unique set of needs and expectations. Because end users represent a multitude of people with different characteristics, their needs and expectations are correspondingly diverse. The existing literature frequently compares web interaction between age groups. On the one hand, there are younger users who have grown up with information technologies and who expect websites to have a high level of functionality. On the other hand, there are senior users dealing with age-related physical and cognitive changes. They commonly demand a functional website that is simple to learn and understand. Additionally, a website for an ageing population should minimize distractions and offer larger fonts and sound signals within certain frequency ranges (Djamasbi et al., 2010; Wagner et al., 2010). Increasingly, all groups of web end users expect better website functionality (Kılıç Delice and Güngör, 2009), user-friendliness and a high level of usability.
Together with reliability and security (Fernandez et al., 2011), usability represents an important criterion for evaluating websites from a user’s perspective (Djamasbi et al., 2010; Okhovati et al., 2017; Xie, 2006). Usability testing is therefore an appropriate method for identifying user interface weaknesses before the production phase (Sauer and Sonderegger, 2011). The beginnings of usability testing date back to the early 1990s, when Nielsen and Rubin developed usability engineering techniques for computer software design and applied them to web design. Their aim was to ensure that a website operates effectively and meets users’ needs (Battleson et al., 2001; Cockrell and Jayne, 2002). Since that time, usability testing has been extended to a variety of domains and represents an essential technique for building effective user interfaces. Several studies have investigated usability, including in the library domain, which is the main research domain of this paper. In general, the methods and techniques for web usability evaluations are classified into two groups: with and without user participation (Okhovati et al., 2017). This research focuses on usability evaluations of a library website with end user participation.
The end users of a library website are usually the people for whom libraries are intended (i.e. pupils, students, researchers, senior users, library staff, etc.). The existing research on usability testing in the library domain usually investigates the usage of university library websites with end users who are often members of the university community, such as professors, researchers and students. Therefore, the participants included in these types of research collectively represent homogeneous users with similar needs and interests. Obtaining an acceptable level of usability requires less effort in the case of homogeneous users than in the case of users with different characteristics. For this reason, this research focuses on evaluating the usability of a redesigned and reorganized segment of a university library website, which includes a service intended for library members, who may be any citizens over 15 years of age. Thus, the members are not restricted to university staff and students, but also include other citizens, constituting a multitude of people with different characteristics. Likewise, they have different expectations and needs regarding the website itself and its associated usage. Despite this, all of these end users expect that the library website will be easy to use, efficient in performing a specific task and satisfying to use (Okhovati et al., 2017).
The research question addressed in this paper is:
RQ: What is the level of the usability of the library website in light of different types of end users?
To ensure the inclusion of different types of end users in research, participants were divided into five groups: pupils, students, the working population, seniors and researchers. The construct of usability was treated as ‘the extent to which a product can be used by a specified user to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use’ (ISO/IEC, 1998). Therefore, it was composed of three underlying concepts: effectiveness, efficiency and satisfaction. In addition, the research also verified if different types of end users achieve the threshold for a usable website. The answer to the research question was obtained by performing formal usability testing, think-aloud protocol, log analyses and questionnaires.
Literature review
Usability testing
In general, website usability can be defined as a quality characteristic that describes how easily a user can navigate across a website (Roy et al., 2014). According to existing literature (Abran et al., 2003; Brinck et al., 2002; Furtado et al., 2003; Kılıç Delice and Güngör, 2009; Nielsen, 1993; Tsakonas and Papatheodorou, 2006), the term ‘usability’ represents a combination of several properties and attributes. Regardless of the variety of definitions, Jeng (2005) states that Nielsen and ISO 9241-11 definitions are the most widely cited. ISO 9241-11 defines usability as ‘the extent to which a product can be used by a specified user to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use’ (ISO/IEC, 1998), while Nielsen (1993) defines usability as an aggregation of five attributes: learnability, efficiency, memorability, errors and satisfaction.
A usability evaluation method is defined as ‘a procedure, composed of a set of activities for collecting usage data related to end user interaction with a software product and/or how the specific properties of this software product contribute to achieving a certain degree of usability’ (Fernandez et al., 2011). Usability evaluation methods are classified into two general categories: empirical methods and inspection methods. Battleson et al. (2001) further divide empirical methods into inquiry (e.g. focus groups, interviews, questionnaires and surveys) and formal usability testing (i.e. interactions with a website by performing tasks). Empirical methods involve real users, while inspection methods (such as heuristic evaluation, cognitive walk-through, pluralistic walk-through and formal inspection) are based on reviewing the usability aspects of web artefacts against established guidelines and are performed by expert evaluators or designers (Fernandez et al., 2011).
Fernandez et al. (2011) found that 59% of reviewed papers include end-user-based usability testing based on the think-aloud protocol, question-asking protocol, performance measurement, log analysis and remote testing. Inspection methods were used in 43% of the reviewed papers, where the most common testing approaches are heuristic evaluation, cognitive walk-through (Albert et al., 2010), perspective-based inspection and guideline review (Fernandez et al., 2011). Nielsen (as quoted in Battleson et al., 2001; Fry and Rich, 2011) claims that formal usability testing with the end user, in combination with the think-aloud protocol, is the most valuable usability engineering method.
Usability testing in the library domain
I. Xie (2008) reports that the majority of digital library evaluation studies are related to usability testing. According to Fry and Rich (2011), 85% of libraries had conducted usability testing on some part of their websites. Chowdhury et al. (2006) explored the state of the art of usability and its impact on digital libraries. There are also several recommendations (Albert and Tullis, 2013; Albert et al., 2010; Brinck et al., 2002) and studies which investigated the usability testing of digital libraries. The selection of appropriate research was based on inclusion and exclusion criteria. Existing research was included if it provided information on: the number of participants and/or groups, the measurable concepts/attributes of usability and the metrics used, the protocol for performing usability testing including the description of methods, and quantitative and/or qualitative results. The results of the reviewed papers are summarized in Table 1.
Table 1. Overview of related literature.
The results (Table 1) show that the usability testing in the reviewed papers is based on an evaluation with the papers’ target users, as recommended by experts (Battleson et al., 2001; Chowdhury et al., 2006). Most research focuses on university library websites, so the target users are primarily students and university staff. The existing research therefore did not investigate and compare the usability of a library website simultaneously with other user groups, such as pupils, the working population and seniors. In the reviewed papers, usability is treated with different sets of attributes, but none of them is based on existing standardized definitions (e.g. ISO, Nielsen). These findings are consistent with the finding of Fernandez et al. (2011) that the usability evaluation methods employed rely on standardized definitions of usability in only 18% of the papers they reviewed. The results also indicate that formal usability testing with the think-aloud protocol is the most commonly used method in the library domain.
This research attempts to fill the gaps discussed above in two ways: (1) the term usability is treated comprehensively by including effectiveness, efficiency and satisfaction as recommended by ISO 9241-11 (ISO/IEC, 1998); (2) the research includes different groups of users in accordance with their profile. It is based on established usability testing methods such as formal usability testing, think-aloud protocol, log analysis and questionnaires.
The research
Research goals
This research investigated a website’s use with respect to the different types of users, through the concept of usability as defined in ISO 9241-11 (ISO/IEC, 1998). Thus, the research explored the effectiveness, efficiency and satisfaction experiences during website use by different types of users. ISO 9241 defines the aforementioned concepts as follows (ISO/IEC, 1998):
(1) effectiveness as ‘accuracy and completeness with which specified users can achieve specified goals in particular environments’;
(2) efficiency as ‘resources spent by a user in order to ensure accurate and complete achievement of the goals’;
(3) satisfaction as ‘the comfort and acceptability of the work system to its users and other people affected by its use’.
Therefore, the main goals of this paper are:
(1) To verify if different types of users report different levels of experiences in effectiveness, efficiency and satisfaction;
(2) To verify if the different types of end users achieve the threshold for a usable website. A website is denoted as usable if at least 75% of participants are able to complete the tasks successfully by themselves, as recommended by experts (Şengel and Öncü, 2010; Stephan et al., 2006).
Object of the research
The area of the library website which was the subject of the evaluation is a service that allows end users to access different databases (e.g. union bibliographic/catalogue database, database on local libraries, etc.). The service is available to more than one million members of different library types (e.g. the national library, university and academic libraries, special libraries, public libraries and school libraries) that represent half of the whole population of the country. The university library members constitute 5% of all active library members. The university library is primarily intended for students and university staff, but library membership is open to all citizens above 15 years of age. In terms of identity, the members are divided into the following categories: secondary school students (4.7% of active university members), students (undergraduate/postgraduate: 61.9% of active university members), employees (of the university: 5.4% of active university members, employed outside the university: 15.9% of active university members), retirees (1.6% of active university members), foreign nationals (0.3% of active university members) and other members (10.4% of active university members).
The part of the library website under investigation is composed of four tabs shown in Figure 1 (1 – entry page, 2 – search forms, 3 – personal portal). The service enables a ‘search’ (No. 2 in Figure 1) of relevant information in different databases by using a basic (i.e. single search field), advanced (i.e. multiple search fields) or expert (i.e. by specifying a query) search, which results in a numbered list of relevant items. For each item, detailed information is displayed, such as author, title, type of material, language, availability information and e-access. The last tab on the entry webpage (No. 3 in Figure 1) includes the personal portal ‘My library’, which allows members to manage their library materials (e.g. overview of borrowed and reserved items and other materials, history of borrowing, etc.).

Figure 1. Investigated tabs.
Participants
According to existing literature (Albert and Tullis, 2013; Albert et al., 2010; Lazar et al., 2010), it is important to identify real users and include them in the research. In this research, the real users are library members above 15 years of age, classified in the library system into one of the categories based on their identity. To ensure an inclusion of different categories of users in a usability evaluation, as recommended by Marchionini, Plaisant and Komlodi (as quoted in (B. Xie, 2008)), five groups were identified based on the primary classification of the library system:
(1) The Pupils group consists of secondary school students. They constitute new members, ‘freshmen’ according to Ranganathan’s classification, who rarely visit the university library, because they usually borrow library materials for school obligations at their school’s library or the public library (Kumar and Phil, 2009; Prabha, 2013; Singh, 2015).
(2) The Students group consists of undergraduate and postgraduate university students and is the largest group of university library users. Most of them use the library frequently for their academic obligations and are comparable to the group of specialist users based on Ranganathan’s classification (Kumar and Phil, 2009; Prabha, 2013; Singh, 2015).
(3) The Working population group consists of employees outside the university, who visit the library irregularly and according to Ranganathan’s classification are comparable to the group of ordinary or specialist users, depending on their individual reading interests (Kumar and Phil, 2009; Prabha, 2013; Singh, 2015).
(4) The Seniors group consists of those who have finished all full-time employment. They visit the library irregularly and based on Ranganathan’s classification, are comparable to the group of ordinary users (Kumar and Phil, 2009; Prabha, 2013; Singh, 2015).
(5) The Researchers group consists of university employees, who use the library frequently for research and work obligations and are comparable to the group of specialist users, based on Ranganathan’s classification (Kumar and Phil, 2009; Prabha, 2013; Singh, 2015).
The first four groups were age restricted (i.e. pupils, students, working population and seniors), while one group, composed of university researchers, had no age restrictions. The demographic characteristics of the included groups are presented in Table 2.
Table 2. Demographic characteristics of groups.
Included in the research were 25 participants (36% females, 64% males), between 12 and 60 years old. All of them were first-time users of the reorganized and redesigned library website, but 19 of them (76%) had experience using an old version of the library website (except one pupil, one student, one member of the working population and three seniors). The typical participant involved in this research was an experienced user who works with computers between 21 and 40 hours per week and borrows between 11 and 15 books per year.
Methods and metrics
The usability evaluation in this research is based on the following methods: (1) Formal usability testing, (2) Think-aloud protocol, (3) Log analysis and (4) Questionnaires. The evaluation was divided into three sections:
(1) Questionnaire 1 – based on a questionnaire with six close-ended questions, which acquired the participant’s demographic information.
(2) Performing tasks – based on formal usability testing, which consisted of ten tasks and presents the information for the measurement of effectiveness and efficiency (Appendix 1). Effectiveness was measured by successful task completion (ISO/TS, 2013), while efficiency was measured by the time needed to complete the tasks (Chowdhury et al., 2006; Fry and Rich, 2011; ISO/TS, 2013).
(3) Questionnaire 2 – based on the standardized System Usability Scale (SUS) questionnaire developed by the Digital Equipment Corporation, which provides the information for the measurement of satisfaction. Participants’ satisfaction was measured on a five-point Likert scale and evaluated with the SUS protocol (Sauro, 2011).
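The SUS protocol mentioned in item (3) can be illustrated with a minimal sketch of the standard SUS scoring rule: odd-numbered items contribute the rating minus 1, even-numbered items contribute 5 minus the rating, and the sum is multiplied by 2.5 to give a score between 0 and 100. The function name `sus_score` and the example ratings are hypothetical and are not taken from the study’s data or tooling.

```python
def sus_score(responses):
    """Return the SUS score (0-100) for one participant.

    `responses` holds the ten item ratings in questionnaire order,
    each an integer from 1 (strongly disagree) to 5 (strongly agree).
    """
    if len(responses) != 10:
        raise ValueError("SUS has exactly ten items")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("ratings must be integers from 1 to 5")
    total = 0
    for index, rating in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded: contribute rating - 1.
            total += rating - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded: contribute 5 - rating.
            total += 5 - rating
    return total * 2.5


# Hypothetical ratings for a single participant (not data from this study).
example = [4, 2, 4, 1, 5, 2, 4, 2, 4, 2]
print(sus_score(example))  # 80.0
```

A group-level result such as those reported per group would then be the mean and standard deviation of the individual scores.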
Table 3 presents the measurement framework with detailed connections between methods, metrics, measurements and values.
Table 3. Measurement framework.
Execution of the research
The usability test was created and performed with an observational coding system called Morae. Morae is classified as a semi-automatic tool that enables data collection and records the user’s behaviours and facial reactions via video (Seffah and Habieb-Mammar, 2009). In this research, all three components of the system were used: (1) Morae Manager, used to manage data; (2) Morae Recorder, used for the preparation and execution of usability testing; (3) Morae Observer, used for monitoring events on the participant’s screen and for the management of observational data. Participants used the Morae Recorder for the entire usability testing process, including Questionnaire 1 and Questionnaire 2. Although the task instructions were presented in the Morae Recorder and could be displayed during the execution of the task, they were also given to participants on paper. The printed sheets also included other important information, such as individual data for access to a personal portal within the library website and the descriptive levels of the Likert scale. Because participants in the senior group may have had accessibility limitations (e.g. visual impairments), the font on the printed sheets was 14-point Arial.
Usability evaluation was conducted individually for each participant in a controlled environment in the presence of two researchers: the modeller and the observer. The modeller led the whole testing process and communicated with participants, while the observer monitored the session and took notes on all of the participants’ comments when they were thinking aloud. The working environment was adapted to the performance of the usability testing. The room was comfortable and free from distractions. It was large enough to accommodate the modeller, the observer and the participant, and the participant had enough privacy for the execution of tasks. There were two computers in the room, one for the participant and the other for the observer. Researchers assumed that the whole usability testing process, including a welcome speech and the detailed instructions to participants, would take, on average, 30 minutes per participant.
Before beginning the usability evaluation, a pilot test with five volunteers (between 14 and 55 years old) was performed. The main goal of the pilot test was to check for the technical and usability weaknesses of the planned usability testing. With the technical checks, researchers investigated if the test was working properly, while the usability checks reviewed if the test was easy enough to understand and if it provided the required answers. The technical checks did not show any deficiencies or weaknesses, while an analysis of the usability checks’ data showed that the task descriptions were understandable for all participants, but that they had not been classified properly according to the degree of difficulty. When researchers observed the participants during the task execution, they found a decline in motivation and participants’ frustration with the second task due to the higher degree of difficulty (the second task had the highest degree of difficulty for all five users). Based on the analysis, researchers changed the order of tasks according to the pilot data (starting with the easiest one and then slowly increasing the level of difficulty).
After the pilot research, potential participants were invited via email to participate in the research. When a participant responded to the invitation, the researchers and the participant agreed on an appropriate time for testing. On the scheduled day, the modeller welcomed the participant and presented the structure and instructions for testing. Each participant was informed that the usability testing was anonymous, that it was testing not them and their skills but the usability of the library website, and that the data was intended solely for research purposes. To ensure consistency in testing, the participant was asked to reset the browser to ‘home’ after completing each task. If a participant did not have any questions, he/she started the testing procedure. After completing the testing procedure, each participant received a ‘thank you’ gift.
At the end, the library’s expert was invited to participate in the usability evaluation. The expert was a person who had an intimate understanding of the library website’s functionalities and had the knowledge to get the correct answer in an optimal way. These results were taken as a reference value for the time needed for each task.
Data collection protocol
The quantitative data (i.e. successful task completion, task time, satisfaction level) were collected using the questionnaires and formal usability testing with the Morae Recorder. The qualitative data were collected using the think-aloud protocol (i.e. participants’ comments) and log analysis (i.e. observer’s findings). In order to facilitate a more detailed analysis of task performance, the observer followed and recorded the action on the participants’ screens with the Morae Observer. The observer took notes of the participants’ behaviour, reactions and comments during usability testing. To ensure the anonymity of individual data, recordings were saved with an identification number (i.e. SerialGroupNo_SerialParticipantNo).
Data analysis protocol
The data were analysed using quantitative and qualitative methods. The quantitative method was employed to conduct descriptive data analysis (such as frequency and mean) using the statistical tool SPSS version 24. To determine any statistically significant difference between the five independent groups, the nonparametric Kruskal-Wallis H test was used as the analytical technique. The qualitative analysis was executed by reviewing, coding, analysing and interpreting the participants’ comments obtained during usability testing. Data coding and analysis were performed using the qualitative data analysis tool QDA Miner version 5.
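As an illustration of the analytical technique named above, the following minimal sketch computes the Kruskal-Wallis H statistic with the standard correction for ties, using only the Python standard library. It is not the SPSS procedure used in this research; the function names and the sample task times are hypothetical.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """Return (H, degrees of freedom) for a list of independent samples."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    offset = 0
    for g in groups:
        rank_sum = sum(ranks[offset:offset + len(g)])
        weighted += rank_sum ** 2 / len(g)
        offset += len(g)
    h = 12.0 / (n * (n + 1)) * weighted - 3 * (n + 1)
    # Tie correction: divide H by 1 - sum(t^3 - t) / (n^3 - n).
    correction = 1 - sum(t ** 3 - t for t in Counter(pooled).values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction, len(groups) - 1


# Hypothetical task times in seconds for three groups (not study data).
times = [[62, 70, 58, 91], [105, 98, 120, 88], [150, 171, 140, 199]]
h, df = kruskal_wallis_h(times)
print(f"H = {h:.3f}, df = {df}")
```

With five groups, as in this research, the statistic is compared against a chi-square distribution with four degrees of freedom.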
Results
Quantitative results
Task completion
Figure 2 shows the distribution of task completion levels by group. None of the groups achieved a 75% success rate, which is the threshold for a usable website as recommended by experts (Stephan et al., 2006). A comparison between the groups showed that the students achieved the highest success rate (62% success, 31/50), although they also needed assistance from the modeller on some tasks. In 22% of cases (11/50), the students needed some help from the modeller in order to perform the task and in four cases (8%, 4/50) they completed the task with piecemeal guidance. In four cases (8%, 4/50), they did not find the right answer or gave up before completing the task. The lowest level of performance was achieved by the group of seniors (26% success, 13/50). They needed the most help during the search for the right answer. More specifically, in 16 cases (32%, 16/50) the seniors completed the task with some help from the modeller and in 16 cases (32%, 16/50) they completed the task with piecemeal guidance. In five cases (10%, 5/50), the seniors did not find the right answer or gave up before completing the task.

Figure 2. Relationship between the levels of success by groups.
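The comparison against the 75% threshold can be reproduced with a short sketch that uses the unaided success counts reported above for the two extreme groups (31 of 50 attempts for students, 13 of 50 for seniors). The names `success_rate` and `USABILITY_THRESHOLD` are illustrative only, and applying the threshold to task attempts per group is an assumption that follows the way the rates are reported in the text.

```python
USABILITY_THRESHOLD = 0.75  # share of tasks that must be completed unaided
ATTEMPTS_PER_GROUP = 50     # task attempts per group, as reported in the text

# Unaided successes reported for the best and worst performing groups.
unaided_successes = {"students": 31, "seniors": 13}


def success_rate(successes, attempts):
    """Share of task attempts completed without help from the modeller."""
    return successes / attempts


for group, successes in unaided_successes.items():
    rate = success_rate(successes, ATTEMPTS_PER_GROUP)
    verdict = "meets" if rate >= USABILITY_THRESHOLD else "is below"
    print(f"{group}: {rate:.0%} {verdict} the 75% threshold")
```

Even the best performing group falls 13 percentage points short of the threshold.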
A Kruskal-Wallis H test showed that there was a statistically significant difference in task completion between the groups (χ2(4)=16.763, p=0.002), with a mean rank task completion of 138.07 for pupils, 102.65 for students, 117.80 for the working population, 152.48 for seniors and 116.50 for researchers. The pairwise comparison revealed a significant difference between students and seniors (p=0.002).
Task time
Figure 3 illustrates the average task time for each task per group and reference value. The reference value shows the results based on testing by expert.

Figure 3. The average results and reference values of task time for each task.
As shown in Figure 3, the expert’s minimum task time was less than seven seconds (Task 4), while the maximum task time was 63 seconds (Task 9). The optimal average time for the completion of one task was 36.45 seconds (average reference value). On average, the seniors (169.50 ± 114.1 s) needed longer to complete a task than the students (67.35 ± 41.3 s), pupils (100.31 ± 58.0 s), the working population (106.0 ± 72.6 s) and researchers (106.3 ± 81 s).
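To put the group means in relation to the expert’s reference value, the following sketch divides each reported mean task time by the average reference value of 36.45 seconds. This ratio is an illustrative calculation on the figures above and is not a metric defined in the measurement framework of this research; the name `time_ratio` is hypothetical.

```python
REFERENCE_MEAN_S = 36.45  # expert's average time per task, from the text

# Mean task time per group in seconds, as reported in the text.
mean_task_time_s = {
    "students": 67.35,
    "pupils": 100.31,
    "working population": 106.0,
    "researchers": 106.3,
    "seniors": 169.50,
}


def time_ratio(group_mean_s, reference_s=REFERENCE_MEAN_S):
    """How many times longer than the expert a group needed on average."""
    return group_mean_s / reference_s


for group, mean_s in sorted(mean_task_time_s.items(), key=lambda kv: kv[1]):
    print(f"{group}: {time_ratio(mean_s):.2f}x the expert's time")
```

On these figures the students needed slightly less than twice the expert’s time, while the seniors needed more than four and a half times as long.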

Figure 4. The average results of satisfaction per group.
A Kruskal-Wallis H test showed that there was a statistically significant difference in task time between the groups (χ2(4)=34.509, p<0.001), with a mean rank task time of 125.45 for pupils, 85.60 for students, 123.35 for the working population, 170.28 for seniors and 122.82 for researchers. The pairwise comparisons revealed significant differences between pupils and seniors (p=0.019), students and seniors (p<0.001), the working population and seniors (p=0.012) and researchers and seniors (p=0.010).
Satisfaction
Figure 4 presents the average satisfaction results per group. As previously mentioned, satisfaction was measured and calculated using the SUS. The results are interpreted according to Sauro’s reference limits (Sauro, 2011).
In accordance with Sauro’s interpretation (2011) of the SUS results, the working population (77.5 ± 10.752), students (69 ± 8.023) and pupils (68.5 ± 14.641) expressed above-average satisfaction with using the website, while the satisfaction of researchers (55.0 ± 20.766) and seniors (48.5 ± 26.374) was below average.
A Kruskal-Wallis H test showed that there was no statistically significant difference in satisfaction levels between the groups (χ2(4)=5.869, p=0.209).
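The p-values reported for the three Kruskal-Wallis tests can be checked from the test statistics alone. For four degrees of freedom, the chi-square survival function has the closed form exp(-x/2)(1 + x/2), so no statistical library is needed. The sketch below is only a consistency check on the reported figures, with a hypothetical function name, and it is not part of the original SPSS analysis.

```python
import math


def chi2_sf_df4(x):
    """P(X >= x) for a chi-square variable with 4 degrees of freedom.

    Closed form for df = 4: exp(-x/2) * (1 + x/2).
    """
    if x < 0:
        raise ValueError("the test statistic cannot be negative")
    return math.exp(-x / 2) * (1 + x / 2)


# H statistics reported in the text, each with 4 degrees of freedom.
reported_h = {
    "task completion": 16.763,
    "task time": 34.509,
    "satisfaction": 5.869,
}

for measure, h in reported_h.items():
    print(f"{measure}: H = {h}, p = {chi2_sf_df4(h):.3f}")
```

The check gives 0.002 for task completion and 0.209 for satisfaction, in line with the reported values. For task time the probability is far below 0.001, so it rounds to zero at three decimals.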
Qualitative results
Table 4 shows the results of analysing the participants’ comments and their frequency of occurrence.
Table 4. Participants’ comments.
The most common problems were related to warning messages in case of an error: 60% (15/25) of participants had a problem with reading the warning message, because the font size was too small and the contrast between font and background colour was not high enough; 48% (12/25) of participants had a problem with understanding the meaning of the warning message, because the content was unclear. The third most commonly reported problem was related to personal accounts: 44% (11/25) of participants stressed their confusion when they had to log in to their personal account. In addition, 36% (9/25) of participants had a problem with understanding the general information and with too much information being presented on one page, and 36% (9/25) of participants said that too much unfamiliar library terminology was used. Less than 30% of participants reported the lack of easier sorting by columns with a simple click (28% of participants), problems with understanding graphic elements (24% of participants), unclear information about borrowing a book (16% of participants) and unclear basic information (ISBN, year of publication) about a book (16% of participants).
Discussion
The concepts of usability, i.e. effectiveness, efficiency and satisfaction, were measured with a combination of formal usability testing and the SUS questionnaire. The research results show a statistically significant difference in task completion between the groups of students and seniors, with students achieving a significantly higher effectiveness rate (the first construct of usability) than seniors. The existing literature offers different explanations for the lower performance of older adults. Some of the literature strongly indicates that there is a negative relationship between age and performance, while other researchers report that performance relates to contextual factors such as experience and cognitive abilities (Wagner et al., 2010). According to the references in Wagner et al. (2010) and Charness et al. (2001), age and experience trade off with roughly equal weight, while some researchers found that experience accounted for the largest portion of variance (Wagner et al., 2010). Sjölinder et al. (2003) report that Internet experience has a stronger impact on performance for easy tasks, while age, spatial visualisation ability and working memory have the strongest impact on performance for complex tasks. In this research, the potential reasons for the presence or absence of deviations in effectiveness between the groups appeared in the participants’ comments during testing. The frequencies of the comments given by pupils, the working population and researchers deviate little from one another, and there is no statistically significant difference between those groups. They usually pointed out the same problems at a similar rate, except for the problem related to small fonts and the problem with logging into an account.
The major deviations appeared in comments given by students and seniors, between whom there is also a statistically significant difference: 80% of seniors and 40% of students pointed out that there was not enough contrast between the background colour and text colour; 20% of students and 100% of seniors reported that they needed help due to the small fonts. As reported by Fukuda and Bubb (2003), seniors’ comments may stem from a decline in visual function, which can be related to age-based cognitive changes. However, in this research both problems were also identified by other participants, and thus cannot be attributed solely to age-related cognitive changes. Regardless of the participants’ own reasons for their comments, both problems were the most frequently reported deficiencies in the research (60% of participants) and made the website more difficult to read, which can have a negative impact on usability. While the students did not mention any problems with the displayed information, the seniors pointed out unclear information (80% of seniors) and too much information presented on one page (60% of seniors). This led seniors to experience feelings of confusion, disorientation and/or even a feeling of being lost, which contributed to poorer performance (Wagner et al., 2014) and affected the efficiency rate (the second construct of usability), based on task time. The results demonstrate that seniors took significantly longer to perform their tasks than the other groups. Some of the existing research indicates that efficiency decreases with age (Charness et al., 2001; Sjölinder et al., 2003, 2005; Wagner et al., 2014), while many researchers confirm that when experience decreases (Cockrell and Jayne, 2002; Crowley et al., 2002), the time needed to complete a task increases (as quoted by Wagner et al., 2010).
In addition to the aforementioned reason, lower experience may also be one of the possible explanations for seniors’ increased task time in this research. On average, seniors are considered to be users with medium computer skills and represent the group with the lowest level of computer skills in the research. Compared to other groups, they had the least experience with using an older version of the library website (80% of seniors had never used the library website) and they borrowed the fewest books per year (1–5 books). The last evaluated construct is satisfaction. Satisfaction levels showed no significant difference between groups.
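The satisfaction construct was measured with an SUS questionnaire, but the scoring procedure is not shown in this section. The following is a minimal sketch of the standard SUS scoring rule (ten items on a 1 to 5 scale, odd-numbered items positively worded, even-numbered items negatively worded), assuming the study used the unmodified ten-item questionnaire. The function name `sus_score` and the sample responses are hypothetical and do not come from the study data.

```python
def sus_score(responses):
    """Convert ten SUS item responses (each 1-5) into a 0-100 score.

    Assumes the standard, unmodified ten-item SUS questionnaire.
    """
    if len(responses) != 10:
        raise ValueError("SUS uses exactly ten items")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for index, response in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded: contribution = response - 1
            total += response - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded: contribution = 5 - response
            total += 5 - response
    # The summed contributions (0-40) are scaled to the 0-100 range.
    return total * 2.5


# Hypothetical responses from one participant (not taken from the study).
example_responses = [4, 2, 4, 2, 3, 3, 4, 2, 4, 2]
print(sus_score(example_responses))
```

Per-participant scores computed this way could then be averaged within each group before the groups are compared.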
The results of formal usability testing also indicate that none of the participant groups achieved a 75% success rate, which is the threshold for a usable website as recommended by experts (Stephan et al., 2006). Students achieved a 62% success rate, which is comparable to the findings in Zimmerman and Paschal (2009), followed by the working population (50% success rate) and researchers (48% success rate). The biggest deviations from the threshold were observed for the groups of pupils (34% success rate) and seniors (26% success rate). Both groups needed the most assistance from the moderator, but the pupils completed the tasks more quickly than the seniors, by 69 seconds on average.
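The comparison against the 75% threshold can be reproduced with simple arithmetic. The sketch below uses only the success rates reported in the paragraph above; the variable and function names are illustrative assumptions and not part of the study's measurement framework.

```python
# Success rates (%) per group, as reported in the text.
success_rate = {
    "students": 62,
    "working population": 50,
    "researchers": 48,
    "pupils": 34,
    "seniors": 26,
}

# Threshold for a usable website, as cited from Stephan et al. (2006).
THRESHOLD = 75


def gap_to_threshold(rates, threshold):
    """Return how many percentage points each group falls short (0 if the threshold is met)."""
    return {group: max(0, threshold - rate) for group, rate in rates.items()}


gaps = gap_to_threshold(success_rate, THRESHOLD)
groups_meeting_threshold = [g for g, r in success_rate.items() if r >= THRESHOLD]

# List the groups from the smallest to the largest shortfall.
for group, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{group}: {success_rate[group]}% ({gap} points below the threshold)")
print("Groups meeting the threshold:", groups_meeting_threshold)
```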
Based on a qualitative analysis of participants’ comments, the most common deficiencies were identified. These comments add value, because they help to explain the quantitative results. Most comments were made by seniors (26 comments), followed by the working population (21 comments), researchers (20 comments), pupils (18 comments) and students (12 comments). In general, the qualitative responses indicate that the participants had several problems using the website because of unsuitable design and incomprehensible content. The most common problems concerned warning/error messages. In terms of semantics, the message content was unclear and incomprehensible, because it included words, phrases and concepts that were unfamiliar to users. Problems with interpreting and understanding library terminology have also been revealed in many examples of existing research (Augustine and Greene, 2002; Cockrell and Jayne, 2002; Crowley et al., 2002; Van den Haak et al., 2003). The participants in this research also reported problems with how messages were displayed and with reading them, because the messages did not appear close enough to where participants expected them. This finding could be related to unintuitive website design, which is also identified as a reason for unsuccessful task performance in existing research (Augustine and Greene, 2002; Crowley et al., 2002). As pointed out by Crowley et al. (2002), web design should be intuitive, because an intuitive library website is vital in helping users find the information they are searching for. Therefore, the search process should match participants’ expectations. In addition, the participants in this research reported that the fonts used were too small and did not contrast enough with the background colour.
As reported in Zimmerman and Paschal (2009), visual presentation is key to encouraging users to explore the content of a website, and users’ positive reactions contribute to return visits to the website. The next problem revealed by participants was the amount of information displayed on one webpage, which also affected reading and finding the right information on the library website. Other problems revealed by participants were the following: (1) having to sign in again to the personal portal ‘My Library’ was disruptive, (2) a lack of advanced search engine options that would suggest similar possibilities and (3) the need for easier sorting of information by columns. All of the identified problems mentioned above could have had an impact on the level of effectiveness, efficiency and satisfaction. These imperfections can dampen the motivation to use the library website, because the users cannot reach the desired information effectively and efficiently.
To summarize, the quantitative results of the research indicate that seniors have a statistically significantly lower effectiveness rate compared to students and that they have statistically significantly lower levels of efficiency when compared to other groups (pupils, students, the working population and researchers). No significant difference was found between the groups in terms of satisfaction level. The quantitative results could indicate that the division into the aforementioned groups was less important than anticipated. However, in light of the qualitative data (i.e. the analysis of subjects’ comments reported alongside performed tasks as summarized in Table 4), the results indicate differences between the groups in how frequently some comments occurred, such as those on the inappropriate size of fonts, unfamiliar library terminology, unclear general information, webpage information overload and complex sorting mechanisms. Most comments were made by seniors, followed by the working population, researchers, pupils and students. In general, the comments given by pupils, the working population and researchers deviate less from one another, while the major deviations appeared in comments given by students and seniors. These findings indicate that the inclusion of representative groups of end users was reasonable in order to obtain a realistic picture of usability. If one or more groups had not been included in the research, the results could have led to different findings. For example, if the evaluation of the current library website had only been conducted with students and researchers, the results would not have shown the problems encountered by other groups (pupils, working population and seniors), such as unfamiliar library terminology, unclear general information and webpage information overload. Therefore, the identification of real end users and their inclusion in the research was rational, as recommended by experts (Albert and Tullis, 2013; Albert et al., 2010; Battleson et al., 2001; Lazar et al., 2010).
Limitations and future work
Some limitations need to be addressed in this research. Primarily, the small sample of participants makes it hard to generalize the findings. The research included 25 participants, with five participants in each group. Further research will be conducted with more participants per individual group. In addition, 76% of participants had experience with using an older version of the library website, which may have affected the results. Also, the library website included in the research has its own design and structure. Therefore, the results cannot be generalized to all library websites, but they can be applied to other library websites with a similar design and structure. Nevertheless, the general recommendations for improving the library website’s usefulness, based on these results, could lead to the improved usability of any library website.
This research opens up many possibilities for future work. To expand the breadth of this research, the results based on the identification of the causes of failure for each task will be explored in separate papers as complementary work to this research. The research will also be extended by using a heuristic evaluation method (Hasan et al., 2012; Okhovati et al., 2017). Existing research has revealed that user testing and heuristic evaluation methods should be used together to obtain a comprehensive identification of usability problems, because the heuristic evaluators explore the overall interface, while users have to focus on problems with specific tasks during their interaction with the website (Hasan et al., 2012). By involving more participants, future work will also focus on investigating the correlation between effectiveness, efficiency and satisfaction, especially the impact of effectiveness and efficiency on participants’ satisfaction levels. Additionally, future study will be directed at investigating the impact of personal characteristics (e.g. computer skills and library experience) on the concepts of usability in the library domain.
Conclusions
Usability is an important criterion for evaluating websites from a user’s perspective (Djamasbi et al., 2010; Okhovati et al., 2017; Xie, 2006). Website usability is more difficult to guarantee if end users are represented by a diverse multitude of people, because all of them expect that the website will be designed according to their own needs and expectations. In order to create a website that will be usable for as many end users as possible, it is necessary to identify and analyse real end users (Albert and Tullis, 2013; Albert et al., 2010; Battleson et al., 2001). Based on these recommendations and the structure of real library end users, the participants in this study were divided into five groups: pupils, students, the working population, seniors and researchers. By combining different usability evaluation methods, quantitative and qualitative results were obtained. The quantitative results of the research indicate that seniors have a statistically significantly lower effectiveness rate in comparison with students, and they have statistically significantly lower levels of efficiency than other groups (pupils, students, the working population and researchers), while there is no significant difference between groups in terms of satisfaction level. Contrary to assumptions, none of the groups of end users achieved the acceptable threshold for a usable website.
Based on the participants’ comments, the most common deficiencies were identified, defining general recommendations for improving library website usefulness. The recommendations are helpful to library web designers, especially when they design websites for a range of users with different characteristics, focusing on creating a more user-friendly and usable website.
The paper also presents a measurement framework for usability testing that connects, in detail, the definition of usability recommended by ISO 9241-11 with methods, metrics, measurements and values. The measurement framework was developed for a specific domain, the library website, but it could apply to other domains and could help other usability researchers define their own measurement framework in the research planning phase.
Recommendations for improvement of the library website
Based on the identified problems, the researchers defined general recommendations for the improvement of the usability of library websites. The recommendations could be helpful to library web designers, especially when they design a website for a multitude of people with different characteristics.
The first category of recommendations is related to graphic design. The redesign of the user interface in accordance with web design standards, especially those for people with disabilities and an ageing population, would be effective in solving most of the identified problems. The main recommendations include: using larger fonts, using a suitable contrast between the background colour and font colour, eliminating excessive distractions, using minimalist and understandable graphic elements and adopting a simpler layout of graphic elements on each webpage. Additionally, researchers recommend that the designers cross-check the existing interface against the recommendations given by ISO/IEC 40500:2012 (ISO/IEC, 2012) and eliminate inconsistencies with this standard.
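The contrast recommendation can be checked numerically. ISO/IEC 40500:2012 corresponds to WCAG 2.0, which defines a contrast ratio based on the relative luminance of the text and background colours and requires, at Level AA, at least 4.5:1 for normal text and 3:1 for large text. The following is a minimal sketch of that calculation; the function names and the example colours are hypothetical and are not the colours of the evaluated library website.

```python
def _linearize(channel_8bit):
    """Convert an 8-bit sRGB channel value (0-255) to its linear value, per WCAG 2.0."""
    c = channel_8bit / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) colour with 8-bit channels."""
    r, g, b = (_linearize(value) for value in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """WCAG 2.0 contrast ratio, from 1.0 (identical colours) up to about 21.0 (black on white)."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground, background, large_text=False):
    """Check the Level AA minimum: 4.5:1 for normal text, 3:1 for large text."""
    minimum = 3.0 if large_text else 4.5
    return contrast_ratio(foreground, background) >= minimum


# Hypothetical example: mid-grey text on a white background.
grey_text = (119, 119, 119)
white_background = (255, 255, 255)
print(round(contrast_ratio(grey_text, white_background), 2))
print(passes_aa(grey_text, white_background))
```

A designer could run such a check over every text and background colour pair in the stylesheet and flag the pairs that fall below the minimum.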
The second category of recommendations is related to technical and functional properties. The major problem is the non-intuitive process for using a personal account. Therefore, researchers recommend reprogramming the personal portal, because it contains the most important content elements on the library website (Mierzecka and Suminas, 2016). It has to be intuitive to navigate, easy to use, simple and understandable for all users, especially for users with lower computer literacy.
The last, but no less important, category of recommendations is related to the website content and its semantic meaning. The library website should use the users’ language rather than system-orientated and library-vocabulary terms (Okhovati et al., 2017). As recommended by Ahmed et al. (2006), all error messages should be specific, constructive and uncritical of users, without too many distractions and unimportant information. Therefore, researchers recommend revising the wording of the error messages and replacing library-vocabulary terms with user-friendly language and/or adding explanations of library terminology.
Acknowledgements
The authors are grateful to all participants for their precious time and their participation in this study.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
