Abstract
Users commonly report problems in refinding important websites. To address this, some create Bookmarks (also called web Favorites) to improve the refinding of these websites. Previous research notes that these Bookmarks are rarely used. However, these prior studies did not systematically observe and quantify the frequency and success of Bookmark retrievals, or compare their success with other retrieval methods. The authors address those questions in this study using the elicited personal information retrieval technique, in which they asked their 50 participants to retrieve target URLs (uniform resource locators). Each participant received 21 targets, of which five were taken from that participant’s Bookmarks; the targets were presented in random order to avoid raising suspicions. Although most of the participants created Bookmarks, they rarely used them for retrieval. Across all of the participants, only 41 (16%) of these 250 bookmarked retrieval targets were actually retrieved using the Bookmark facility. Of these 41 instances when Bookmarks were used, only 9 (4% of the 250 bookmarked targets) were retrieved using the Bookmarks menu hierarchy, while the remaining 32 were located in the browser’s upper bar, which was in full view of the participants. The results suggest that unless the Bookmarks were highly visible, the participants did not use them. Furthermore, for websites that the users had visited, bookmarked websites were not better retrieved than those that had not been bookmarked. The authors conclude by discussing possible explanations, as well as design and theory implications.
Introduction
The Internet looms large in the lives of technology users, and we all spend a significant amount of time online searching for, and consuming, web information. In common with other personal data, when we encounter web information, we make complex judgements about the anticipated future value of that information (Aula et al., 2005; Bergman and Whittaker, 2016; Bruce, 2005; Bruce et al., 2004a; Jones et al., 2002, 2017; Obendorf et al., 2007). These judgements are important because our web behavior usually involves refinding; between 41% and 81% of web retrievals concern information that we have previously encountered (Cockburn and Greenberg, 2000; Obendorf et al., 2007; Tauscher and Greenberg, 1997). Refinding is a key process in personal information management (PIM), a research area that describes how people organize personal data such as files, emails, photographs, and contacts in order to retrieve it successfully (Bergman and Whittaker, 2016; Jones et al., 2017; Jones and Teevan, 2007; Whittaker, 2011).
How, then, do people successfully refind websites? One obvious strategy is to proactively prepare for future retrieval by bookmarking websites, creating direct links to information that we expect to need again. For information that we do not anticipate needing again, we might eschew Bookmarks. Instead, we might rely on reconstructive strategies devised at the time of retrieval—for example, regenerating a prior successful search query, guessing a plausible URL, or scanning our History list to relocate the information. This article explores the prevalence and success of retrieval using Bookmarks and compares it with other reconstructive refinding strategies. These are important problems to address; users are often unsuccessful in refinding valuable websites (Wen, 2003) and there is no consensus about optimal refinding strategies (Aula et al., 2005; Jones et al., 2017; Morris et al., 2008; Obendorf et al., 2007).
The limitations of bookmarking
Creating Bookmarks is common, with the majority of people doing so (Abrams et al., 1998; Aula et al., 2005; Boardman and Sasse, 2004; Bruce et al., 2004b). There are also psychological reasons why we might expect Bookmarks to be effective for web retrieval. Research from cognitive science shows that proactively annotating and organizing information that we expect to retrieve at a later date makes such information more memorable and more easily refindable (Craik and Lockhart, 1972). In the PIM context, other research demonstrates that actively organizing other types of personal information, such as personal files, into subjectively meaningful categories also promotes successful retrieval (Bergman et al., 2014). Despite these arguments, however, proactive organization using Bookmarks is problematic. First, people create Bookmarks that remain unused. For example, Tauscher and Greenberg (1997) show that 58% of Bookmarks are created but never revisited. A possible reason is that Bookmarks fail to preserve the context of the original retrieval (Alhenshiri et al., 2012). A second, different type of problem is that people fail to bookmark web information they later need. Wen (2003) found that users are often unable to retrieve past valuable web information, failing to reaccess 80% of websites they found useful in the past. Aula et al. (2005) suggest a reason for this: users are often reluctant to create new Bookmarks, fearing that too many irrelevant Bookmarks will create clutter and compromise the utility of valuable Bookmarks.
Alternative retrieval strategies
Given these problems with Bookmarks, other work has explored alternative strategies for retrieving web information. Unlike Bookmarks, these strategies are not proactive. Instead, they depend on information that is generated at the time of retrieval (for example, regenerating original queries, accessing the History list, or directly typing a website URL). A very influential paper by Teevan et al. (2004) used diary study methods to explore these strategies. The authors make a distinction between teleporting and orienteering strategies. Teleporting strategies involve users accessing their target website directly in one shot. In contrast, orienteering is a more incremental retrieval process where the target is approached by a series of gradual steps. For example, if a person needs to find a colleague’s phone number knowing that they work at a certain organization, they can teleport to it in a single search using the name of the organization, the colleague’s name, and the word “phone,” or they can orienteer to it by first searching for the organization and then browsing to the employees’ section of the organization’s webpage, recognizing the person’s picture and their webpage, before finding the relevant phone number. Teevan et al. (2004) found that teleporting is infrequent; even when users possess enough information to do so, they do not generally access target information directly. Instead, users prefer orienteering; they use a combination of search, links, browsing, and local search (within a page) to incrementally approach their target via small steps. Consistent with other PIM research (Bergman et al., 2008), Teevan et al. (2004) offer three arguments as to why orienteering is preferred: (1) it is cognitively simpler than keyword search generation as it relies on memory or simple inferences about the target; (2) it provides feedback to users about the progress of their retrieval, allowing them to check whether they are following a promising path; and (3) it allows participants to better evaluate the results of their searches, which are often hard to assess when directly teleporting.
Other studies of website refinding confirm user preferences for incremental orienteering rather than teleporting using a search or typing direct URLs (Tauscher and Greenberg, 1997; Wen, 2003). Obendorf et al. (2007) examined the log files of retrieval sessions. They discovered that the most common strategy for refinding information was to follow links (50%), with the back button being the next most common strategy (31%). Teleporting strategies accounted for just 13% of retrievals. Simple retracing using the back button was also found in other studies, and this is especially common for recently accessed websites (Tauscher and Greenberg, 1997).
Given this strong preference for retracing prior actions, it seems that using the History list should be a helpful website-refinding strategy. Nevertheless, the History list is seldom used, and people experience great difficulties in exploiting it (Aula et al., 2005; Bruce et al., 2004a; Jones et al., 2002; Morris et al., 2008). There are two reasons for this. First, page titles in the History list are automatically generated and often unrelated to the actual search terms used. More significantly, however, the History list intermingles successful with unsuccessful results. When scanning the History list to reaccess a useful target site, users are often unable to distinguish successful prior retrievals from failures. A further challenge to retracing prior actions is that users often cannot recall the exact method that they used for access and, in trying to replicate a previously successful strategy, such confusion might lead them to attempt to search for information for which they had originally successfully browsed (Aula et al., 2005; Bruce et al., 2004a; Jones et al., 2002; Wen, 2003).
However, one crucial limitation of this prior work on refinding web information is that it has been primarily observational, relying on interviews, diary studies, or log file analysis. Such approaches allow rich descriptions of key web-access phenomena (Bruce et al., 2004b; Teevan et al., 2004) or determinations of the frequency of specific retrieval strategies (Obendorf et al., 2007; Tauscher and Greenberg, 1997). Critically, however, such methods do not enable the controlled evaluation of the success and efficiency of various refinding strategies, or assessment of the precise situational characteristics eliciting those strategies. Furthermore, in the case of log file analysis, it is hard to infer users’ goals, making it difficult to determine whether retrieval is successful.
The goal of the current study is to address these well-documented problems of website refinding, but by using quantitative controlled methods. We systematically examine retrieval behaviors to determine how frequently people actually exploit Bookmarks that they have created, and whether using Bookmarks is efficient and successful. We also explore the possible reasons why people fail to use the Bookmarks they have created. We compare the prevalence, success, and efficiency of Bookmark retrievals with other refinding strategies, such as History lists and regenerating original search terms and URLs. These have previously been studied using observational methods. Prior work also suggests that users find it hard to recall the details of previous retrievals (Aula et al., 2005; Jones et al., 2002). We therefore compare methods for refinding previously encountered web information with strategies for finding novel information. Finally, we investigate individual differences. It is well known that people have very different strategies for creating and managing Bookmarks (Abrams et al., 1998; Aula et al., 2005). We therefore wanted to see whether retrieval success and efficiency related to the number of Bookmarks created, as well as demographic information such as age, gender, and personality, which have all been proposed to influence PIM (Bergman et al., 2019b; Massey et al., 2014).
In order to address these research questions, we evaluated retrieval in a naturalistic setting while still controlling the retrieval tasks. To do this, we used the elicited personal information retrieval (EPIR) technique (Bergman et al., 2010). A defining property of PIM is that users employ highly subjective schemes for managing and retrieving their own data, leading us to avoid approaches in which participants access exclusively experimenter-provided materials (see, for example, Civan et al., 2008; Fitchett et al., 2013; Gao, 2011). The EPIR method allows controlled comparisons of the success and efficiency of different retrieval strategies in a naturalistic setting where the experimenter asks the participant to retrieve their own personal materials.
Research questions (RQs)
Bookmarking prevalence and success
RQ1.1: What percentage of bookmarked websites is retrieved by using Bookmarks?
RQ1.2: Are websites that are retrieved using Bookmarks retrieved more successfully and efficiently than bookmarked websites retrieved using other methods?
RQ1.3: Is there a difference in the success and efficiency of retrieval between bookmarked websites and websites sampled from the participants’ History that were not bookmarked?
RQ1.4: Does the number of Bookmarks affect their retrieval use, success, and efficiency?
RQ1.5: Why do the participants use Bookmarks?
Retrieval-methods distribution
RQ2: What is the overall distribution of retrieval methods?
Personal versus public targets
RQ3: Are people better at retrieving previously encountered websites versus those that they have not accessed before?
Individual differences
RQ4: Do age, gender, and personality traits affect retrieval success and efficiency?
Research method
The study was reviewed and approved under Institutional Review Board application no. 2767, granted by the UC Santa Cruz review board.
Participants
The participants were 50 students recruited at UC Santa Cruz from the psychology department’s research pool. Of these, 35 (70%) were women. Their ages ranged from 18 to 62 (M = 20.6, SD = 6.18). Of the participants, 30 (60%) used Chrome, 14 (28%) used Safari, and 6 (12%) used the Explorer browser. As we were interested in bookmarking behavior, we required the participants to have at least five Bookmarks, and this information was obtained by a screening survey. The participants had a range of 6 to 651 Bookmarks (M = 57.18, SD = 100.85). The participants were first asked to show us their computers so we could access their browser History and count their Bookmarks. We informed them both before sign-up and again after the study that no sensitive information would be recorded from their computers. We also requested permission to record their screens during the experiment. Despite warnings before sign-up about the nature of the data that we were collecting, three participants did not agree. The data from these three participants is not included in the above sample; all 50 of the participants we analyze completed the procedure and allowed recordings.
Materials
Our procedure used a modified version of the EPIR method (Bergman et al., 2010), which allows the collection of quantitative data about personal information retrieval in a naturalistic context. In the current study, the participants were asked to retrieve four types of target websites when requested by the experimenter. To prompt retrieval, we showed each user 21 websites, one at a time, on a separate laptop with the URLs hidden, and asked them to retrieve that website on their own computer. The URLs were hidden to prevent the users from simply copying these into their search bar. For each target website, we analyzed the time taken for the retrieval, whether it was successful, and the retrieval method the user employed. This method has been successfully employed in multiple studies of personal file retrieval (Benn et al., 2015; Bergman et al., 2010, 2019a).
In order to address our research questions, we presented the users with four types of target websites: Bookmarks, Unmarked History, Hard, and Easy. We mixed bookmarked websites with other target websites as we did not want to bias the users towards exclusively relying on Bookmarks for retrieval. Overall, these Bookmarks websites made up 5 (24%) of the 21 targets. To understand the benefits of Bookmarks for retrieval, we compared retrieval for websites that users had visited and bookmarked with websites in their History list that were not bookmarked. We call these six previously visited personal websites that were not bookmarked Unmarked History websites. We also compared retrieval for these two types of previously visited personal websites with public websites that users had not visited. The public websites were operationalized as being easy or hard to retrieve, as we define below.
Five Bookmarks retrieval targets were chosen from the participants’ own computers. We wrote a program that sampled their Bookmarks and then manually filtered them to remove Bookmarks that referred to compromising information (pornography, internal banking websites, medical, psychological, or legal help sites, etc.), replacing them with non-sensitive targets. This was done to avoid presenting the participants with sensitive tasks.
Six Unmarked History items were next chosen after excluding bookmarked items from the History list. Again, we used the automatic program to select these from the History list, and they were sampled from different time periods. As with the Bookmarks websites, we manually filtered to remove sensitive websites.
Next, we selected 10 public websites—that is, items that users had not previously visited. These websites were divided into five Easy and five Hard websites. The Easy websites were chosen as being frequently accessed by the general public (assessed using alexa.com, a service that ranks website access frequency) and having an easily visible name, with that name appearing in their URL. The five Easy websites we chose were: twitch.tv, bbc.com, youtube.com, ebay.com, and imdb.com. Hard websites were defined as being infrequently accessed and having names that were not easily identifiable on their website or did not appear in their URL. The Hard public websites we chose were: yale.art.edu, jamilin.com, omfgdogs.com, fallingfalling.com, and drbroners.com. A pilot study with 12 participants verified that people were slower and less successful in accessing Hard compared to Easy websites.
Procedure
Having explained our procedure, we first accessed the participants’ machines, where our program identified Bookmarks and Unmarked History websites. Following the EPIR method, we asked the participants to retrieve the set of 21 websites constituted by the five Bookmarks, six Unmarked History, five Easy public, and five Hard public websites. We refer to these 21 websites as target websites, and these were presented one at a time in a randomized order.
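The composition of each participant's target list can be illustrated with a minimal sketch. It is not the program used in the study: the function name `build_target_list`, the placeholder personal URLs, and the seeded shuffle are assumptions made purely for illustration, while the public site names are those given in the Materials section.

```python
import random

# Public targets as listed in the Materials section.
EASY_SITES = ["twitch.tv", "bbc.com", "youtube.com", "ebay.com", "imdb.com"]
HARD_SITES = ["yale.art.edu", "jamilin.com", "omfgdogs.com",
              "fallingfalling.com", "drbroners.com"]


def build_target_list(bookmarks, unmarked_history, seed=None):
    """Combine the four target types and shuffle them (hypothetical helper)."""
    if len(bookmarks) != 5 or len(unmarked_history) != 6:
        raise ValueError("expected 5 bookmarked and 6 Unmarked History targets")
    targets = (
        [(url, "Bookmarks") for url in bookmarks]
        + [(url, "Unmarked History") for url in unmarked_history]
        + [(url, "Easy") for url in EASY_SITES]
        + [(url, "Hard") for url in HARD_SITES]
    )
    rng = random.Random(seed)  # seeded only so that the sketch is reproducible
    rng.shuffle(targets)
    return targets


# Placeholder personal targets; real ones came from each participant's machine.
example_targets = build_target_list(
    [f"bookmark-{i}.example" for i in range(5)],
    [f"history-{i}.example" for i in range(6)],
    seed=0,
)
```

Mixing the four types in a single shuffled list reflects the stated aim of not biasing the participants towards relying exclusively on Bookmarks.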
For each target website retrieval, we used the following procedure, which was screen-recorded using Flashback Express 5 recording software for Windows and Quicktime for Macintosh. The recording software was located on a USB (universal serial bus) stick plugged into the participant’s laptop to avoid having to install new software on their computer. We opened each target website in a browser on a separate computer but hid the URL bar using a curl script. Following the EPIR method, the participants were given the following instructions to retrieve that website: “Please find this site. When done, close the window.” The participants were not given a time constraint. They were told to close the browser window when they were done to signal on the screen recording that the search was successfully completed or that they had given up. After completing all 21 target retrievals, the participants completed a short survey. The screen recordings were uploaded from the USB stick to a server for subsequent analysis of the retrieval methods. Finally, we administered the 44-item Big Five Inventory personality trait questionnaire—a standard personality survey (John and Srivastava, 1999).
Analysis
To classify the retrieval methods (see Table 1), we examined the screen recording of each retrieval. The retrieval categories we identified were based on prior research defining different strategies (Bruce et al., 2004a; Jones et al., 2002; Teevan et al., 2004). We also classified the methods into orienteering versus teleporting. Following Teevan et al. (2004), instances of teleporting were defined as an attempt to reach the target destination directly by using a single method (typing in the URL or finding a target item in a single successful search). Orienteering cases involved combinations of actions, such as search and browsing.
Table 1. Definitions of retrieval methods.
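The teleporting versus orienteering distinction described above can be sketched as a simple rule over the ordered actions observed in one retrieval. The action labels (`"url"`, `"search"`, `"browse"`, `"bookmark"`), the function name, and the `"other"` fallback are hypothetical; the study's own categories are those defined in Table 1, and the coding was done from screen recordings rather than by a program.

```python
# Hypothetical action labels; the study's own categories are given in Table 1.
DIRECT_METHODS = {"url", "search"}


def classify_strategy(actions):
    """Label one retrieval from its ordered list of observed actions.

    Assumes the listed actions ended at the target website.
    """
    if not actions:
        raise ValueError("a retrieval needs at least one action")
    # Teleporting: the target is reached directly with a single method,
    # either a typed URL or one successful search.
    if len(actions) == 1 and actions[0] in DIRECT_METHODS:
        return "teleporting"
    # Orienteering: a combination of actions, such as search and browsing.
    if len(actions) > 1:
        return "orienteering"
    # A single action of another kind (e.g. one Bookmark click) is left
    # unlabeled here; treating it this way is an assumption of the sketch.
    return "other"


print(classify_strategy(["url"]))
print(classify_strategy(["search", "browse", "browse"]))
```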
Results
Overall, the participants succeeded in retrieving 86% of the target websites (SD = 17%) and did so in 38 seconds on average, with a large variance (SD = 109 seconds). Our analyses used t-tests, which are parametric comparisons, although the distributions were skewed. This decision was motivated by the central limit theorem, which establishes that for large enough sample sizes (n > 30), the distribution of means (constructed by resampling and computing the mean an infinite number of times) is normal even if the original sample is not normally distributed. When the t-test inferentially tests the difference between means, its appropriate deployment depends on the assumption of normality. However, this normality assumption concerns the normal distribution of means and not the specific distribution sampled in the test (Gravetter et al., 2020). Recall that for their personal data we asked the participants to retrieve target websites from different time periods. However, we found no time effects and so do not discuss this data further.
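The central limit theorem argument can be illustrated with a small simulation. This is a sketch under stated assumptions rather than a reanalysis of the study data: exponentially distributed values with a mean of 38 seconds stand in for skewed retrieval times, and the sample size of 50 and the 2,000 simulated samples are arbitrary choices. The sketch compares the skewness of the raw values with the skewness of the sample means.

```python
import random
import statistics


def skewness(values):
    """Population skewness: mean cubed deviation divided by the cubed SD."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (len(values) * sd ** 3)


rng = random.Random(42)   # fixed seed so that the sketch is reproducible
SAMPLE_SIZE = 50          # observations per simulated sample (assumption)
N_SAMPLES = 2000          # number of simulated samples (assumption)

# Exponential draws stand in for right-skewed retrieval times (mean 38 s).
samples = [
    [rng.expovariate(1 / 38) for _ in range(SAMPLE_SIZE)]
    for _ in range(N_SAMPLES)
]
raw_values = [v for sample in samples for v in sample]
sample_means = [statistics.fmean(sample) for sample in samples]

raw_skew = skewness(raw_values)
mean_skew = skewness(sample_means)
print(f"skewness of raw values:   {raw_skew:.2f}")
print(f"skewness of sample means: {mean_skew:.2f}")
```

In such a simulation the sample means are far less skewed than the raw values, which is the property the t-test relies on. How quickly the means approach normality depends on how heavy-tailed the underlying distribution is, so the sketch does not by itself establish that a given sample size is sufficient for the observed retrieval times.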
Bookmark prevalence and success: Bookmarks are seldom used
RQ1.1: What percentage of bookmarked websites is retrieved by using Bookmarks?
Creating Bookmarks was relatively common and, on average, the users created 57.18 Bookmarks (SD = 100.85). Nevertheless, Bookmarks were not frequently used for retrieval. For the combined set of 250 bookmarked target websites we asked the 50 participants to retrieve, only 41 (16%) were actually retrieved using Bookmarks. Instead, the vast majority (84%) of bookmarked target websites were retrieved using other methods. Moreover, Bookmark usage critically depended on the participants being visually reminded that they had made a Bookmark. Of these 41 bookmarked target websites retrieved using Bookmarks, only 9 (4% of the bookmarked target websites) were taken from the Bookmarks menu hierarchy. The other 32 target websites were retrieved using Bookmarks located in the browser’s continuously visible upper bar.
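The proportions reported above follow directly from the counts. A short check, with illustrative variable names, makes the two different denominators explicit: all 250 bookmarked targets versus the 41 Bookmark-based retrievals.

```python
BOOKMARKED_TARGETS = 50 * 5   # 50 participants x 5 bookmarked targets each
VIA_BOOKMARKS = 41            # retrieved using the Bookmark facility
VIA_MENU = 9                  # of those, found via the Bookmarks menu hierarchy
VIA_BAR = VIA_BOOKMARKS - VIA_MENU   # found via the visible upper bar


def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Shares of all bookmarked targets.
share_via_bookmarks = pct(VIA_BOOKMARKS, BOOKMARKED_TARGETS)
share_other_methods = pct(BOOKMARKED_TARGETS - VIA_BOOKMARKS, BOOKMARKED_TARGETS)
share_menu_of_all = pct(VIA_MENU, BOOKMARKED_TARGETS)

# Shares of the Bookmark-based retrievals only.
share_bar_of_used = pct(VIA_BAR, VIA_BOOKMARKS)
share_menu_of_used = pct(VIA_MENU, VIA_BOOKMARKS)

print(share_via_bookmarks, share_other_methods, share_menu_of_all)
print(share_bar_of_used, share_menu_of_used)
```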
RQ1.2: Are websites that are retrieved using Bookmarks retrieved more successfully and efficiently than bookmarked websites retrieved using other methods?
A Bookmark is a direct link to a target website, which meant that there were no retrieval failures when Bookmarks were used (compared to failures of 13% when using other options). As expected, retrieval was significantly faster when using Bookmarks (M = 10 seconds, SD = 11 seconds) than for bookmarked target websites retrieved using other methods (M = 28 seconds, SD = 35 seconds) (t(237) = 17.56, p < 0.001). As there were only nine retrievals from the (less visible) Bookmark menu hierarchy, we could not statistically compare their success and efficiency with other retrieval methods.
RQ1.3: Is there a difference in the success and efficiency of retrieval between bookmarked websites and websites sampled from the participants’ History that were not bookmarked?
Even if Bookmarks were not actively used at retrieval, we nevertheless expected that actively creating Bookmarks would make bookmarked websites more memorable, increasing their ease of retrieval using other methods. However, we did not observe an advantage for bookmarked websites. There were no significant differences between the failure percentage for bookmarked (11%) and Unmarked History (10%) targets, or between the retrieval time for bookmarked (M = 25 seconds, SD = 33 seconds) and Unmarked History targets (M = 23 seconds, SD = 27 seconds).
RQ1.4: Does the number of Bookmarks affect their retrieval use, success, and efficiency?
We could not test whether the number of Bookmarks affected their retrieval use, success, and efficiency as only nine Bookmarks were retrieved from the Bookmarks menu hierarchy.
RQ1.5: Why do the participants use Bookmarks?
Finally, we explored motivations for Bookmark usage. Thirty-eight participants answered the question “Why do you use Bookmarks?” Twenty-five (66%) stated “for websites I visit often”; seven (18%) answered “for websites I don’t think that I can search for”; and six (16%) mentioned that they bookmarked websites they did not currently have time to read. Overall, these results indicate that the vast majority (84%) of the participants who answered used Bookmarks to facilitate their future retrieval.
Given that Bookmark usage was not prevalent, we now turn to other retrieval methods.
Retrieval-methods distribution
RQ2: What is the overall distribution of retrieval methods?
Table 2 presents the retrieval method usage distribution (for definitions of the different retrieval methods, see Table 1). Note that a method can be used more than once per retrieval (for example, using two different searches when the first search failed to retrieve the target website). As a result, frequencies can exceed 1.
Table 2. Retrieval method usage distribution when performing the retrieval tasks.
Table 2 indicates that on average the participants made 3.745 attempts to reach each target website. Search and browsing were by far the most commonly used retrieval methods. They were often combined—in 467 retrievals, search and browsing were combined, and only 103 retrievals involved search only. These results confirm that direct teleporting is less common than orienteering involving combinations of actions—for example, search and browsing.
As already noted, using Bookmarks was an infrequent retrieval method. Although 24% of the target websites appeared in users’ Bookmarks collection, Bookmarks were used just 0.07 times per retrieval on average. The participants also made very little use of the History option for retrieval. This result is surprising because the Unmarked History websites, which made up 29% of the target websites, came from their browser History, and a third of these websites had been visited less than three days before the experiment.
Personal versus public websites
RQ3: Are people better at retrieving previously encountered websites versus those that they have not accessed before?
We explored whether people were better at retrieving previously encountered websites by comparing success and efficiency for public versus personal target websites. Personal websites are targets taken from the participant’s History, made up of Bookmarks and Unmarked History websites. Public websites are previously unencountered websites, made up of Easy and Hard websites. An independent samples t-test indicated that the average retrieval failure percentage for public target websites (M = 16%, SD = 37%) was significantly higher than for personal target websites (M = 11%, SD = 30%) (t(1020) = 2.68, p < 0.01). Another independent samples t-test indicated that the retrieval time for public target websites (M = 57 seconds, SD = 158 seconds) was significantly longer than for personal target websites (M = 24 seconds, SD = 30 seconds) (t(973) = 4.62, p < 0.001).
Individual differences
RQ4: Do age, gender, and personality traits affect retrieval success and efficiency?
We tested whether age, gender, and personality traits affected the retrieval success percentage and retrieval time by using correlations (for age and personality trait effects) and t-tests (for gender effects), but found no significant results.
Discussion
Our four research questions concern Bookmarks, general website retrieval methods, retrieving personal versus public websites, and individual differences in retrieval. We discuss each research question, then characterize the relations between our results and general PIM theory, before turning to design implications.
RQ1 addressed the usage, success, and reasons for using Bookmarks. Eighty-four percent of the participants stated that their main reason for creating Bookmarks was to facilitate future retrieval of already accessed websites. We found that although Bookmarks were more successful and efficient than other retrieval methods, they were used infrequently, for just 16% of bookmarked websites. This finding contradicts prior work suggesting that refinding problems arise because people fail to create Bookmarks for valued websites (Wen, 2003). Confirming earlier work (Tauscher and Greenberg, 1997), we saw that people created Bookmarks but then did not use them.
Our results extend prior work in identifying reasons for this lack of usage. We observed far greater exploitation of Bookmarks when these were salient in the upper part of the browser. For Bookmark-based retrievals, the majority (78%) exploited the browser’s visible upper menu, with only a small proportion (22%) using the Bookmarks menu hierarchy. This suggests that people forget that they have bookmarked specific websites, relying on other methods to retrieve those websites if Bookmarks are not in immediate sight. This forgetting hypothesis is consistent with Aula et al. (2005), who found that participants who created frequently visited Bookmark collections reported being better able to exploit these successfully for retrieval.
Another reason why Bookmarks may not be used is that people create Bookmarks for websites that they can retrieve using other methods. Two-thirds (66%) of the participants who answered stated that they bookmarked frequently accessed websites, but it may be that such frequent websites are well remembered and therefore easier to retrieve using alternative methods. In other words, for a frequently accessed site, people may easily remember a link, a search term, or a URL, without the need to use a Bookmark. This suggests that rather than bookmarking frequently accessed websites, users might better employ Bookmarks for websites that are hard to retrieve by other means. We return to this point in our design suggestions.
Another possible explanation for unused Bookmarks is the “first impressions” hypothesis, which states that “the method of re-finding follows from previously successful retrievals of the information and, ultimately, from an initial encounter when the information is created or otherwise experienced for the first time” (Jones et al., 2014: 560). People tend to store their personal files in folders and therefore tend to retrieve them in the same way, by folder navigation (Bergman et al., 2008; Bergman and Yanai, 2017; Fitchett and Cockburn, 2015). Because foldering is usually successful in allowing people to access their files (Bergman et al., 2010, 2020a), they may think that this method will also be useful for websites, and so bookmark websites. However, at the time of retrieval, they may follow the procedure used when they first accessed the website, which is most often orienteering (Teevan et al., 2004).
Even if people do not actually use Bookmarks for retrieval, we would still expect bookmarked websites to be better retrieved. Cognitive psychology suggests that actively selecting websites by bookmarking them should boost memory and hence retrieval for those websites (Craik and Lockhart, 1972). However, we did not find retrieval benefits for bookmarked compared with Unmarked History websites. One possible explanation is that the Bookmarks had not been used for a long time and therefore the benefit of bookmarking was lost over time, but we found no evidence that older bookmarked websites were retrieved less well than more recent ones.
Our other research questions concerned the frequency of different retrieval methods (RQ2), retrieval of public versus personal websites (RQ3), and individual differences in Bookmark usage between the participants (RQ4). For RQ2, we observed the prevalence of hybrid retrieval methods, which is consistent with orienteering, a retrieval process combining search and navigation (Teevan et al., 2004). Our results extend prior descriptive (Jones et al., 2002; Teevan et al., 2004) and log file analyses (Obendorf et al., 2007; Tauscher and Greenberg, 1997) by documenting the relative frequencies of these retrieval methods in a controlled context. We saw very little use of strategies such as using Bookmarks, History, or browser landing pages containing frequently accessed websites. Prior research suggests possible reasons for not using browser History. It can be hard to identify useful information in the History list, as it intertwines both successful and unsuccessful retrievals (Aula et al., 2005; Morris et al., 2008; Wen, 2003). Browser landing pages may also be little used because people can independently remember URLs or search terms for websites that they access frequently. For RQ3, we confirmed that people were better at retrieving previously accessed (personal) than unencountered (public) information. Public targets took more than twice as long to retrieve, with almost 50% more retrieval failures, indicating large effect sizes. These results can be interpreted as a familiarity effect. Finally, for RQ4, although previous studies found that retrieval time was positively correlated with participant age for files (Bergman et al., 2014), this was not confirmed here for web pages. We also found no correlation between retrieval parameters and personality traits, which is consistent with Bergman et al. (2020b).
Implications for PIM
What, then, can these results tell us about more general PIM processes? First, our results about creating unused Bookmarks confirm observations from other areas of PIM showing that people sometimes create resources that they do not use for later retrieval, suggesting general difficulties in anticipating future PIM retrieval needs. For example, in email, people create folders that do not improve future retrieval efficiency (Whittaker et al., 2011; Whittaker and Sidner, 1996); they create contacts that they never revisit (Bergman et al., 2012; Whittaker et al., 2002); and they store photographs that they never later access (Whittaker et al., 2010). As we observed with Bookmarks, those studies indicate that people sometimes fail to use such retrieval resources because they forget that they have created them. However, a small subset (16%) of our users here stated that they bookmarked websites for an alternative reason: to serve as reminders about websites that they did not currently have time to read. Creating such reminders about outstanding tasks is a common task management practice in PIM (Bellotti et al., 2004), which is also noted for email (Whittaker et al., 2011; Whittaker and Sidner, 1996) and document folders (Jones et al., 2005). However, designing effective tools for task management is a considerable challenge (Bellotti et al., 2004).
Our research also contributes to emerging theories of PIM. Prior empirical work has shown that users prefer to refind files using manual navigation rather than search (Bergman et al., 2008), which seems to result from the reduced cognitive load for navigation (Benn et al., 2015; Bergman et al., 2013). This is consistent with our current findings that orienteering was far more frequent than pure search. However, one contrast between our results and other areas of PIM is that active organization of personal files promotes effective retrieval (Bergman et al., 2010, 2014), whereas active creation of Bookmarks does not seem to improve retrieval. We have argued that this may occur because users forget the Bookmarks they have created (Aula et al., 2005).
Design implications
Our results also suggest possible design implications, both for Bookmarks and more general website refinding. One set of suggestions addresses reminders about Bookmarks. We have seen that users create Bookmarks but seem to forget that they have done so. There are multiple design approaches that could help users remember their Bookmarks. First, simple reminding methods, like a Bookmark “highlights reel,” might be effective. Highlights could be shown to users at login or when they start their browser, so this does not interfere during retrieval tasks. An alternative might be to add Bookmarks to the set of frequently accessed websites displayed on the browser’s landing page. A final way to remind users about their Bookmarks might be to rank bookmarked websites so that they appear at the top of search results, or to indicate bookmarked websites in search results.
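As an illustration of the last suggestion, the following is a minimal sketch, not an implementation that we built or evaluated. It assumes that a browser can supply the ordered list of result URLs and the user's Bookmark URLs; the function names normalize and promote_bookmarked are hypothetical, and the URL matching rule (ignoring the scheme, a leading "www.", and a trailing slash) is an assumption made for the example only.

```python
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Reduce a URL to host + path so trivial variants compare equal.

    Assumption for this sketch: scheme, a leading "www.", and a
    trailing slash are ignored when matching results to Bookmarks.
    """
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def promote_bookmarked(results, bookmarks):
    """Return (url, is_bookmarked) pairs with bookmarked results first.

    The flag lets an interface mark bookmarked entries visually, and
    the stable sort keeps the search engine's order within each group.
    """
    marked = {normalize(b) for b in bookmarks}
    flagged = [(url, normalize(url) in marked) for url in results]
    # False sorts before True, so bookmarked items (key False) come first.
    return sorted(flagged, key=lambda item: not item[1])


if __name__ == "__main__":
    results = ["https://a.com/x", "https://www.b.org/", "http://c.net/page"]
    bookmarks = ["http://b.org"]
    for url, is_bookmarked in promote_bookmarked(results, bookmarks):
        print(("* " if is_bookmarked else "  ") + url)
```

Because the reordering is a stable partition rather than a new relevance score, it would leave the relative order of non-bookmarked results untouched. The flag alone could also be used to indicate bookmarked websites without changing their rank.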
A second class of design implications could address different user motivations for bookmarking websites. A subset of our users reported bookmarking websites that were hard to find. Such websites could be identified from users’ browsing History, using machine learning methods to analyze which websites induced extended or effortful searches. Finally, it might be possible to improve general refinding strategies such as teleporting using machine learning to make active suggestions. Fitchett et al. (2014) showed that hints based on learning prior navigation paths can be successful in improving file retrieval, and similar methods might be explored in the context of website refinding.
Limitations and future work
There are also limitations to our study that prompt questions for future work. Our participant pool was homogeneous, and more diverse populations should be explored, given that age and job characteristics may influence PIM retrieval (Bergman et al., 2019a; Capra and Perez-Quinones, 2006; Dinneen and Julien, 2020; Khoo et al., 2007; Zhang and Hu, 2014). For example, knowledge workers have larger file collections than students, with different distributions of file types and different organizational strategies (Dinneen et al., 2019; Dinneen and Julien, 2019, 2020). We also did not collect data regarding the time the Bookmarks were created. Another limitation relates to our method. While EPIR generates quantitative data and offers experimental control, it does not capture some aspects of participants’ naturalistic retrieval behaviors. Other qualitative methods, such as diaries, could provide additional insights into users’ motivations for creating Bookmarks and retrieval strategies (Teevan et al., 2004). They could also be combined with EPIR to test the “first impressions” hypothesis (Jones et al., 2014). Diaries could record how a user initially accessed a given website and EPIR studies might then test whether these initial strategies are repeated on subsequent refindings.
Future work could also examine our hypothesis that the participants failed to use Bookmarks because they forgot about them. We could first test the participants’ memory for their Bookmarks by asking users to enumerate these. Actual Bookmark usage could be compared against these memory scores to test the prediction that those who remember specific Bookmarks would be more likely to use them. Another factor that we have argued affects website retrieval strategy is usage frequency. Future work could assess whether Bookmarks are used less often for refinding frequently accessed websites, as users are more likely to remember detailed information about such websites. Finally, there were individual differences in bookmarking practices (Gottlieb and Dilevko, 2003), and future work could build on prior research (Abrams et al., 1998; Aula et al., 2005; Tauscher and Greenberg, 1997) exploring how these practices influence retrieval methods, which we were unable to test directly here.
Conclusion
Our study is the first to quantify the frequency and success of Bookmark retrievals, comparing Bookmarks with other access methods. In general, Bookmark retrieval was infrequent, with Bookmarks being used for just 16% of retrievals of bookmarked websites, and the majority of retrievals being limited to Bookmarks that were permanently visible in the browser bar. The participants’ efforts to organize Bookmarks beyond the limited space of the bar did not help their retrieval. We have discussed possible reasons for this and suggested design implications.
Footnotes
Acknowledgements
The authors thank their participants, Chad Tubbs, Camile Otillio, Riley Hillman and their RAs: Nicholas Santer, Ayla Reed, Mira Anand, Kelly Rogos, Jeremy Del Carpio and Catherine Vileisis.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors were supported by a Google Faculty Research Award 2014_R2_79.1.
