Abstract
We spend a lot of time searching for things. If we know what we are looking for in advance, a memory representation of the target will be created to guide search. But if the identity of the search target is revealed simultaneously with the presentation of the search array, is a similar memory representation formed? In the present study, 96 observers determined whether a central target was present in a peripheral search array. The results revealed that as long as the central target remained available for inspection (even if only in iconic memory), observers reinspected it after each distractor was checked, apparently forgoing consolidation of the target into working memory. The present findings challenge the assumption that evaluating items in a search array must involve comparison with a template in working memory.
Palmer (1995) proposed that visual search involves four processing stages: preselection, selection, postselection, and decision. Whereas many efforts have been made to examine the mechanisms of the first three stages (e.g., Lavie, Hirst, De Fockert, & Viding, 2004; Treisman & Gelade, 1980; Wolfe, 1994), little is known about the decision stage. The decision stage involves comparisons between the attentional template (a mental representation of the target) and items in the search array (to determine whether the item is a target). Standard theories of attention assume that the template comparison occurs in working memory (Bundesen, 1990; Duncan & Humphreys, 1989; Treisman & Gelade, 1980). Consistent with this assumption, evidence demonstrates that the attentional template may be stored in working memory during the initial configuration of attentional control settings (Carlisle, Arita, Pardo, & Woodman, 2011; Woodman & Arita, 2011).
However, consolidating information into working memory is time consuming (Jolicœur & Dell’Acqua, 1998; Vogel, Woodman, & Luck, 2006) and may sometimes be inefficient. It has been suggested that the locations of visited distractors may not be remembered during visual search (Horowitz & Wolfe, 1998) and that the attributes of a just-attended object may not always be retained in working memory (Chen & Wyble, 2015). Moreover, the speed of search is faster than the speed of memory consolidation; for example, searching for a color among a group of distinctive colors can be as fast as 10 ms per item (Olds, Cowan, & Jolicoeur, 1999), but consolidating color information into working memory takes about 50 ms per item (Vogel et al., 2006). These findings suggest that properties of the search array need not be consolidated into working memory. Then what about the target itself?
In the present study, we investigated whether target identity is stored in working memory when the identity of the target is both revealed simultaneously with the search array and remains available for inspection during search. Objects are identified and recognized more rapidly than information is consolidated into working memory (Jolicœur & Dell’Acqua, 1998; Potter, 1976; Thorpe, Fize, & Marlot, 1996). If the target identity is available for inspection only while the search array is present, observers may proceed immediately to search without stopping to consolidate the target in working memory. One way to avoid the need for consolidation is to revisit the target during search.
Experiment 1
To test the above hypothesis, we first compared search efficiency in a target-in-memory condition (in which the target identity has to be stored in working memory) with search efficiency in a target-on-screen condition (in which the target identity is shown simultaneously with the search array and remains available during the search). Wolfe, Horowitz, Kenner, Hyle, and Vasan (2004) showed that when the target identity is revealed (as briefly as 50 ms) before the presentation of the search array, an attentional template is formed to guide search and reduce the search time. To control for this top-down guidance in the target-in-memory condition, we included a target-first condition in which the target identity is presented 50 ms before the search array and remains on screen until response.
Method
Observers
To determine the necessary sample size, we conducted an a priori power analysis with the program G*Power (Faul, Erdfelder, Lang, & Buchner, 2007). Because we were looking for a difference in the search slope across different testing conditions (i.e., interaction between condition and set size), we conducted a repeated measures analysis of variance (ANOVA) with six measurements and a moderate effect size (Cohen’s f) of 0.25, an alpha of .05, and a power of .80. This analysis gave a minimum sample size of 19 observers. We decided to use 24 observers to provide sufficient power for the experiment.
Twenty-four students (14 females) from Zhejiang University participated in Experiment 1 in exchange for ¥30. This sample size yielded a prospective power of .91. The observers ranged from 17 to 23 years of age (M = 19 years) and had normal or corrected-to-normal vision. The experimental procedures were approved by the local research ethics committee and were conducted in accordance with the Declaration of Helsinki. Written informed consent was obtained from all observers before they performed the experiment.
Tasks, stimuli, and design
Observers determined whether a central target letter was also presented in a peripheral circular letter array. The set size (i.e., number of letters) of the search array was varied (one, three, or five letters). The threshold stimulus-exposure duration (TSED) for successfully fulfilling this task with an accuracy rate of 79.4% was measured at each set size. The slope of the TSED × Set Size function also measured the efficiency of search (Li, Xin, Li, & Li, 2018). The reason we used TSED (rather than the reaction time) as the dependent variable was that TSED is less affected by processes in the response-selection stage (Li & Lou, 2019).
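The search slope described above can be illustrated with an ordinary least-squares fit of TSED against set size. The following is a minimal sketch, not the authors' analysis code; the function name and the threshold values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def search_slope(set_sizes, tseds):
    """Least-squares slope (ms per item) and intercept (ms) of TSED against set size."""
    n = len(set_sizes)
    mean_x = sum(set_sizes) / n
    mean_y = sum(tseds) / n
    sxx = sum((x - mean_x) ** 2 for x in set_sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(set_sizes, tseds))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical thresholds (ms) for one observer at set sizes 1, 3, and 5.
set_sizes = [1, 3, 5]
example_tseds = [100.0, 160.0, 220.0]
slope, intercept = search_slope(set_sizes, example_tseds)
print(f"slope = {slope:.1f} ms per item, intercept = {intercept:.1f} ms")
```

A shallower slope indicates that each additional letter adds less required exposure time, that is, a more efficient search.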
Three testing conditions (target on screen, target in memory, and target first) were examined (Fig. 1). All the conditions began with a central fixation cross that was presented for 500 ms. In the target-on-screen condition, the central letter and peripheral search array were presented simultaneously after the fixation cross disappeared. (The exposure duration for both the central letter and the peripheral array was adjusted trial by trial according to a staircase procedure.) Then all the letters in the display were masked by “#” symbols. The stimuli in the target-first condition were almost identical to those in the target-on-screen condition, except that the central letter was presented 50 ms earlier, before the presentation of the search array. However, in the target-in-memory condition, the central letter was shown for 200 ms after the fixation cross disappeared and was then replaced by a “#” for 100 ms. After that, a blank screen was shown for 200 ms, followed by the peripheral search array and a task-irrelevant central letter. (The second central letter was used to control the possibility that a central letter, as used in the target-on-screen condition, would inevitably attract attention and increase the search slope.) The presentation time for this final search display was also determined by a staircase procedure. Finally, all the letters in the display were replaced by “#” symbols. In the target-in-memory condition, observers were told that only the first central letter (the one that appeared alone) was the target and that the second (the one that appeared inside the search array) was task irrelevant. In all three conditions, targets could be present or absent (randomly intermixed), but target-absent trials were not analyzed.

Fig. 1. Stimuli and example trial sequences from Experiments 1, 2, and 3. Three conditions were tested in Experiment 1: target on screen, target in memory, and target first. In Experiments 2 and 3, only two conditions were tested: target on screen and target in memory. In Experiment 1, all targets and all letters in the search array were presented lower case and upright. In Experiment 2, target letters were upper case and could be either upright or mirror reversed and tilted, and all letters in the search array were presented lower case and upright. In Experiment 3, there were simple and complex search displays. In the simple display, target letters were upper case, mirror reversed, and tilted, and the letters in the search array were upright and lower case. In the complex search display, this was reversed: Target letters were upright and lower case, and the letters in the search array were upper case, mirror reversed, and tilted. For Experiments 2 and 3, the figure shows only the key screens from the conditions tested.
The letters in the display were always chosen from among the set “b,” “f,” “g,” “h,” “j,” “k,” “m,” “r,” “t,” and “y” and were upright and in lower case. Each letter subtended a visual angle of 1.0° in height and 0.7° in width. Letters in the search array were evenly spaced in an imaginary circle. The center-to-center distance between the central letter and letters in the search array was 2.5°. No two letters in the search array were identical. The central letter varied trial by trial. In about half the trials, the central letter was also shown in a random location of the search array. The three testing conditions were examined within observers but in separate blocks. Different set sizes were randomly mixed in each block. The order of blocks was counterbalanced across observers.
The staircase procedure
A one-up three-down staircase procedure was used to determine the exposure duration of the search display. The initial exposure duration was determined by the number of letters in the search display. Each letter added 60 ms to the initial exposure duration in the target-on-screen condition and added 40 ms to the initial exposure duration in the target-in-memory condition. The exposure duration was decreased by 10 ms after three consecutive correct responses, and it was increased by 10 ms with each incorrect response. A turn was completed when two consecutive changes made to the exposure duration differed (from a decrease to an increase, or vice versa). Ten turns in the staircase were required to complete the trials in each condition. The mean exposure duration at the last four turn points was calculated to determine the TSED in that condition. For the target-absent trials, the exposure duration was synchronized to the concurrent exposure duration for the target-present trials: that is, the exposure duration of the target-absent trials was also determined by the staircase procedure tracking the exposure duration of the target-present trials. We did not track the exposure duration of the target-absent trials because of concerns that the exposure duration used in those trials might be much longer than that in the target-present trials, so that observers could use the exposure duration to judge whether the target was presented.
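The staircase rules above can be sketched as a short simulation. This is an illustrative sketch under stated assumptions rather than the program used in the experiments: the observer model `p_correct`, the lower bound on the exposure duration, and the choice to record a turn point as the duration in effect when the direction reversed are all assumptions introduced here. A one-up three-down rule converges on the duration at which three consecutive correct responses are as likely as not, that is, p³ = .5, which gives the 79.4% accuracy level mentioned earlier.

```python
import random


def run_staircase(initial_ms, p_correct, step_ms=10, n_turns=10, n_avg=4, seed=0):
    """Simulate a one-up three-down staircase and return (threshold, turn_points).

    p_correct is a hypothetical function mapping exposure duration (ms) to the
    probability of a correct response; it stands in for a real observer.
    """
    rng = random.Random(seed)
    duration = initial_ms
    streak = 0            # consecutive correct responses since the last change
    last_direction = 0    # -1 = last change was a decrease, +1 = an increase
    turn_points = []
    while len(turn_points) < n_turns:
        correct = rng.random() < p_correct(duration)
        if correct:
            streak += 1
            if streak < 3:
                continue              # no change until three correct in a row
            direction, streak = -1, 0
        else:
            direction, streak = +1, 0
        if last_direction != 0 and direction != last_direction:
            turn_points.append(duration)  # duration at which the direction reversed
        last_direction = direction
        # Assumption: the duration never drops below one step.
        duration = max(step_ms, duration + direction * step_ms)
    return sum(turn_points[-n_avg:]) / n_avg, turn_points


# Accuracy level targeted by a one-up three-down rule: p**3 = 0.5.
target_accuracy = 0.5 ** (1 / 3)

# Idealized observer: always correct at 100 ms or longer, always wrong below.
threshold, turns = run_staircase(120, lambda ms: 1.0 if ms >= 100 else 0.0)
print(round(target_accuracy, 3), threshold, turns)
```

With the idealized step-function observer, the staircase simply oscillates around the 100-ms boundary, which makes the turn-point logic easy to verify by hand; a real observer would be modeled with a graded psychometric function instead.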
Data analyses
The difference in search slopes across different testing conditions was analyzed using two-way (2 conditions × 3 set sizes) within-subjects ANOVAs with TSEDs as the dependent variable. For all inferential statistics, alpha was set at .05. Where the sphericity assumption was violated, as assessed with Mauchly’s test of sphericity, the Greenhouse-Geisser corrected values are reported.
Results
Figure 2a shows the results of Experiment 1. The slope of the TSED × Set Size function in the target-in-memory condition (M = 30.2 ms per item) was smaller than that in the target-on-screen condition (M = 62.5 ms per item), F(1.58, 36.3) = 29.2, p < .001, ηp² = .559. However, the slope of the TSED × Set Size function in the target-first condition (M = 65.5 ms per item) was not different from that in the target-on-screen condition, F(1.60, 36.7) = 0.580, p = .528, ηp² = .025, which means that revealing the target identity before the search array did not explain the difference in the search slope between the target-on-screen and target-in-memory conditions. Note that the result was replicated using an alternative staircase procedure in Experiment S1 (see the Supplemental Material available online).

Fig. 2. Threshold stimulus-exposure duration (TSED) as a function of set size and condition in (a) Experiment 1, (b) Experiment 2, and (c) Experiment 3. Error bars indicate 95% confidence intervals.
Experiment 2
The results from Experiment 1 are consistent with the target-reidentification hypothesis. The search slope in the target-in-memory condition was 30.2 ms per item, indicating that the time for identifying a letter was 30.2 ms. The slope doubled in the target-on-screen condition (62.5 ms per item) and the target-first condition (65.5 ms per item), which suggests that, in these two conditions, each letter required a processing time sufficient for identifying two letters. A plausible explanation is that, after checking each distractor, observers reinspected the target. To provide convergent evidence for this possibility, we manipulated the form of the central target in Experiment 2: The central target was either upright or mirror reversed and tilted. Letters in the search array were all upright. We anticipated that it might take observers longer to identify the mirror-reversed and tilted letters than the upright letters. This extra target-encoding time should affect the intercept of the search-time function only if the central target is identified once, but it may increase the search slope if the central target is reinspected after each distractor is examined.
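The doubling argument can be checked with simple arithmetic on the slopes reported for Experiment 1. The sketch below assumes, for illustration only, that reinspecting the target costs the same time as identifying one array letter, so that the predicted slope under reidentification is twice the target-in-memory slope; the variable names are hypothetical.

```python
# Mean slopes reported for Experiment 1 (ms per item).
slope_in_memory = 30.2
slope_on_screen = 62.5
slope_target_first = 65.5

# Assumed model: each checked letter is followed by one reinspection of the
# target, and both identifications take the same time.
predicted_reidentification_slope = 2 * slope_in_memory

ratio_on_screen = slope_on_screen / slope_in_memory
ratio_target_first = slope_target_first / slope_in_memory

print(f"predicted slope: {predicted_reidentification_slope:.1f} ms per item")
print(f"observed ratios: {ratio_on_screen:.2f}, {ratio_target_first:.2f}")
```

Both observed ratios fall slightly above 2, which is in line with the account that each letter in these conditions required roughly the time needed to identify two letters.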
Method
Observers
A new group of 24 students (14 females) from Zhejiang University participated in Experiment 2 in exchange for ¥30. This sample size yielded a prospective power of 1.00, which was calculated with an alpha of .05 and an effect size (Cohen’s f) of 1.13, determined by the results from Experiment 1 (i.e., ηp² = .559). The observers ranged from 18 to 24 years of age (M = 20 years). They all had normal or corrected-to-normal vision. Written informed consent was obtained from all observers before they performed the experiment.
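The effect size of 1.13 used here matches what the standard conversion from partial eta squared to Cohen's f, f = √(η² / (1 − η²)), gives for the Experiment 1 value of .559. The following sketch shows that conversion; the function name is hypothetical, and the power calculation itself (performed in G*Power) is not reproduced.

```python
import math


def cohens_f_from_partial_eta_squared(eta_p2):
    """Convert partial eta squared to Cohen's f: f = sqrt(eta^2 / (1 - eta^2))."""
    if not 0.0 <= eta_p2 < 1.0:
        raise ValueError("partial eta squared must lie in [0, 1)")
    return math.sqrt(eta_p2 / (1.0 - eta_p2))


# Partial eta squared for the Condition x Set Size interaction in Experiment 1.
f_exp1 = cohens_f_from_partial_eta_squared(0.559)
print(round(f_exp1, 2))
```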
Tasks, stimuli, and design
In Experiment 2, only the target-on-screen and target-in-memory conditions were tested. In each condition, two forms of the central target (i.e., upright or mirror reversed and tilted) were examined (Fig. 1). In the upright-target condition, the central letter was upright and in upper case. In the mirror-reversed-and-tilted-target condition, the central letter was upper case, mirror reversed, and randomly tilted. In both conditions, letters in the circular array were upright and lower case. All the letters in the display were chosen from among the set “b,” “e,” “f,” “g,” “j,” “k,” and “r.” All the other aspects of the experimental design, stimuli, and tasks were the same as in Experiment 1.
Results
Figure 2b shows the results of Experiment 2. In the target-on-screen condition, the search slope for the mirror-reversed and tilted target (M = 100 ms per item) was significantly steeper than that for the upright target (M = 60.0 ms per item), F(2, 46) = 42.1, p < .001, ηp² = .647. This is consistent with the reidentification hypothesis. In contrast, in the target-in-memory condition, the search slopes for the two forms of targets (i.e., mirror reversed and tilted: M = 31.0 ms per item; upright: M = 31.3 ms per item) were statistically indistinguishable, F(2, 46) = 0.008, p = .99, ηp² = .000. This result suggests that the difference in slopes between the two forms of targets in the target-on-screen condition was not the result of a difference in the memory representation between the two target forms.
Experiment 3
In Experiment 3, there were two types of displays: simple and complex (Fig. 1). In the simple display, the target letter was mirror reversed and tilted, while letters in the search array were upright; in the complex display, the central letter was upright, but letters in the search array were mirror reversed and tilted. In the target-on-screen condition, which display would observers search more quickly? Existing theories predict a faster search in the simple display, but the reidentification hypothesis makes a striking prediction: Search speeds for the simple and complex displays should be similar, because in either display each comparison involves identifying one upright letter and one mirror-reversed and tilted letter. Experiment 3 tested this prediction.
Method
Observers
A new group of 24 students (12 females) from Zhejiang University participated in Experiment 3 in exchange for ¥30. This sample size yielded a prospective power of 1.00, which was calculated with an alpha of .05 and an effect size (Cohen’s f) of 1.13, determined by the results from Experiment 1 (i.e., ηp² = .559). The observers ranged from 19 to 23 years of age (M = 20 years). They all had normal or corrected-to-normal vision. Written informed consent was obtained from all observers before they performed the experiment.
Tasks, stimuli, and design
In Experiment 3, only the target-on-screen and target-in-memory conditions were tested. In each condition, two stimulus displays (simple, complex) were examined (Fig. 1). For the simple display, the letters in the search array were upright and in lower case, while the central target letters were in upper case, mirror reversed, and tilted; for the complex display, the central letters were upright and in lower case, but letters in the search array were in upper case, mirror reversed and tilted. All the letters in the display were chosen from among the set “b,” “e,” “f,” “g,” “j,” “k,” and “r.” The initial exposure duration of the search array was determined by the number of letters it contained. Following the same staircase procedure as in Experiment 1, we added 60 ms to the initial exposure duration in the target-on-screen condition and 40 ms to the initial exposure duration in the target-in-memory condition for each letter. However, for each mirror-reversed and tilted letter, these values were doubled. As in Experiment 1, the exposure duration was decreased by 10 ms after three consecutive correct responses, and it was increased by 10 ms with each incorrect response. All the other aspects of the experimental design, stimuli, and tasks were the same as in Experiment 1.
Results
Figure 2c shows the results of Experiment 3. In the target-on-screen condition, the search slope for the simple display (M = 129 ms per item) was statistically indistinguishable from that for the complex display (M = 121 ms per item), F(1.56, 36) = 1.99, p = .159, ηp² = .080. But in the target-in-memory condition, the search slope for the complex display (M = 68.2 ms per item) was significantly steeper than that for the simple display (M = 38.6 ms per item), F(1.57, 36.1) = 18.2, p < .001, ηp² = .44. These results suggest that when target identity is stored in working memory, search is faster in the simple display than in the complex display, consistent with existing theories of search. By contrast, if the target identity is revealed simultaneously with the search array and remains visible on the screen, search speeds in the simple and complex displays become similar.
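The logic behind the equal-slope prediction can be made concrete with a simple additive sketch. It assumes, purely for illustration, that the target-in-memory slopes estimate the time to identify one upright letter (simple display) and one mirror-reversed and tilted letter (complex display), and that under reidentification each checked array letter is followed by one reinspection of the target. This additive model and the variable names are assumptions introduced here, not part of the reported analysis.

```python
# Target-in-memory slopes from Experiment 3 (ms per item), taken here as
# rough estimates of per-letter identification time.
upright_letter_ms = 38.6       # simple display: upright array letters
transformed_letter_ms = 68.2   # complex display: mirror-reversed, tilted array letters

# Assumed additive model: cost per item = array letter + target reinspection.
predicted_simple = upright_letter_ms + transformed_letter_ms    # upright array, transformed target
predicted_complex = transformed_letter_ms + upright_letter_ms   # transformed array, upright target

# Observed target-on-screen slopes (ms per item).
observed_simple = 129.0
observed_complex = 121.0

print(f"predicted: {predicted_simple:.1f} vs. {predicted_complex:.1f} ms per item")
print(f"observed:  {observed_simple:.1f} vs. {observed_complex:.1f} ms per item")
```

Under this model the two displays are predicted to yield identical slopes because each pairs one upright letter with one transformed letter. The predicted value falls somewhat below the observed slopes, so the sketch should be read as illustrating the direction of the prediction rather than as a quantitative fit.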
Experiment 4
Previous work using eye tracking has suggested that in naturalistic tasks, observers may preferentially refixate parts of the work space rather than committing them to working memory (Ballard, Hayhoe, & Pelz, 1995; Hayhoe, Shrivastava, Mruczek, & Pelz, 2003). Although the search slopes reported here are unlikely to involve eye movements, in Experiment 4, we used a variant of the paradigm in which the target was mostly available in iconic memory and search could be completed with exposure times of less than 200 ms (thus minimizing the risk of eye movements). For this purpose, we switched to numeric digits and used a target preexposure of 30 ms, which was immediately followed either by the search array (immediate condition) or by an annular mask and then (1,000 ms later) by the search array (delayed condition). We reasoned that in the delayed condition, observers would need to consolidate the search target into working memory, but that in the immediate condition, iconic memory might be enough to support reidentification while not causing gaze movements.
Method
Observers
A new group of 24 students (9 females) from Zhejiang University participated in Experiment 4 in exchange for ¥30. This sample size yielded a prospective power of 1.00, which was calculated with an alpha of .05 and an effect size (Cohen’s f) of 1.13, determined by the results from Experiment 1 (i.e., ηp² = .559). The observers ranged from 18 to 26 years of age (M = 21 years). They all had normal or corrected-to-normal vision. Written informed consent was obtained from all observers before they performed the experiment.
Tasks, stimuli, and design
The basic task was similar to that used in Experiment 1. Observers determined whether a central target digit was present in a peripheral digit array. Notably, the target was not presented again in the search array. Two conditions (immediate, delayed) were examined (Fig. 3). Both began with a central fixation cross that was presented for 500 ms, followed by a central target digit for 30 ms. In the immediate condition, a circular digit array was shown immediately after the central target digit disappeared. The exposure time for the digit array was adjusted trial by trial according to a staircase procedure similar to that used in Experiment 1. Finally, all the digits in the display were masked by “#” symbols. In the delayed condition, a mask array consisting of one, three, or five “#” symbols immediately followed the offset of the central target digit. The spatial arrangement of the mask array was similar to that of the search array, but the number of “#” symbols in the mask array was independent of the set size of the subsequent search array.

Fig. 3. Stimuli and example trial sequences for the immediate and delayed conditions in Experiment 4. In the immediate condition, the target number was immediately followed by the search array, whereas in the delayed condition, the target number was followed by an annular mask and then a blank screen before the search array was presented.
This mask array was used to control the potential masking effect of the search array on the central target digit in the immediate condition. The mask array was presented for 100 ms; then, after a 1,000-ms blank period, the circular digit array was shown. The exposure time for the digit array was also determined by a staircase procedure. Finally, all the digits in the display were masked by “#” symbols. The digits in the display were chosen from the set of “3,” “4,” “5,” “6,” “7,” and “8.” The center-to-center distance between the central digit and the digits in the circular array was 2°. The other aspects of the experimental design, stimuli, and tasks were the same as those employed in Experiment 1. The initial exposure duration for set sizes one, three, and five was 80 ms, 120 ms, and 160 ms, respectively, for both the immediate and delayed conditions. The other aspects of the staircase procedure were the same as those used in Experiment 1.
The delayed condition was analogous to the target-in-memory condition of Experiment 1 because the 1-s delay between the presentations of the central digit and the search array forced the target representation to be stored in working memory. The immediate condition was analogous to the target-on-screen condition of Experiment 1 because, even after the target disappeared, iconic memory for it would still make the target identity available for inspection for at least 300 ms. We changed the items in the search array to digits and shrank the radius of the circular array to reduce the search time to less than 200 ms, so that the expected duration of iconic memory would suffice for the search.
Results
Figure 4 shows the results of Experiment 4. The search slope in the immediate condition (M = 27.4 ms per item) was significantly steeper than that in the delayed condition (M = 12.5 ms per item), F(2, 46) = 3.89, p = .028, ηp² = .145. The search slope in the immediate condition was about twice that in the delayed condition, which replicated the results of Experiment 1. The search times (i.e., TSEDs) in all testing conditions were below 200 ms, which rules out explanations related to eye movements. The identical target-exposure time (i.e., 30 ms) in the immediate and delayed conditions also minimized potential contributions from the difference in the quality of target encoding.

Fig. 4. Threshold stimulus-exposure duration (TSED) as a function of set size and condition in Experiment 4. Error bars indicate 95% confidence intervals.
General Discussion
The findings of Experiments 1 and 4 showed that when target identity is revealed simultaneously with the search array and remains available for inspection (even in iconic memory), the search slope is about twice that observed when the target identity is stored in working memory before the search begins. In the no-working-memory (unconsolidated-target) search conditions, observers seemed to jump right into the search, depending on reidentification of the target rather than on first establishing a representation of the search target in working memory. This could explain the drop in search-rate efficiency. In a supplementary study (Experiment S2 in the Supplemental Material), the target-in-memory condition of Experiment 1 was modified so that the previewed target would always reappear at the center of the search array. Search slopes in this hybrid condition remained low, which implies that the observers, rather than being lazy, still preferred a faster search rate, as long as sufficient time was allowed to prepare the target representation (in working memory) to guide attention.
The findings of Experiments 2 and 3 were consistent with the reidentification hypothesis—that is, when target identity is revealed simultaneously with the presentation of the search array and remains available for inspection, observers will reidentify the target after examining each distractor rather than taking the time to first consolidate the target identity into working memory. The present findings cannot be attributed to factors related to eye movements because the reidentification process can occur even in iconic memory and within a duration insufficient for initiating saccadic eye movements. If the reidentification hypothesis is correct, that implies that comparison between the target template and items in the search array does not necessarily occur in working memory and that visual search has an unconsolidated-target (working-memory-free) search mode.
Horowitz and Wolfe (1998) showed that search efficiency does not change when the locations of the items in the search array randomly shift every 111 ms compared with when the items’ locations are fixed. They interpreted their findings as showing that the locations of visited items are not remembered (i.e., sampling with replacement; see also Gibson, Li, Skow, Brown, & Cooke, 2000; Gilchrist & Harvey, 2000; Horowitz & Wolfe, 2001, 2003). In fact, the doubled search slope in the nonconsolidated conditions of the present study is reminiscent of findings of this memory-free search. However, such searches have been reported in cases in which target consolidation has clearly occurred and the searched locations are not being consolidated. Some authors have indeed demonstrated conditions in which search occurs without replacement (Kristjánsson, 2000; Lleras, Rensink, & Enns, 2005; Peterson, Kramer, Wang, Irwin, & McCarley, 2001; Takeda, 2004). In the present case, it could be argued that ongoing attempts at consolidating the target interfered with search-location consolidation. However, the results of Experiment 3 seem to contradict this account. If searches in the target-on-screen condition did not include target reidentification, it is difficult to understand how search speed in the simple display was not faster than that in the complex display.
Although target reidentification results in a steeper search slope, this strategy eliminates the time normally used for consolidating the target identity, which might involve several hundred milliseconds. Therefore, the overall efficiency of the unconsolidated-target search is likely better than that of the consolidated-target search when there is minimal preview time. Moreover, the benefit of this reidentification strategy is actually consistent with the finding that observers miss changes between two simultaneous arrays, just as they do in a flicker paradigm, but that their performance improves significantly in the simultaneous condition (e.g., Scott-Brown, Baker, & Orbach, 2000).
In natural tasks (e.g., making a sandwich), observers may frequently go back to check the model (Ballard et al., 1995; Hayhoe et al., 2003), similar to the target reidentification suggested by the present findings. Although eye movements are clearly involved in those natural tasks, the phenomenon observed in the present study is likely attributable to covert attention, because the findings of Experiment 4 show that the reidentification of the target could occur in iconic memory. It also implies that the comparison between the target template and items in the search array may occur either in iconic memory (Sperling, 1960) or in fragile short-term memory (Sligte, Scholte, & Lamme, 2008, 2009).
Supplemental Material
Supplemental material, Li_Supplemental_Material_rev for Visual Search May Not Require Target Representation in Working Memory or Long-Term Memory by Zhi Li, Keyun Xin, Jiafei Lou and Zeyu Li in Psychological Science
Footnotes
Acknowledgements
We thank Edward Awh, Frank Durgin, and Jeremy Wolfe for their helpful suggestions and comments on an early draft of this article.
Action Editor
Caren Rotello served as action editor for this article.
Author Contributions
Zhi Li provided the theoretical motivation, designed the experiments, and wrote the manuscript. K. Xin, J. Lou, and Zeyu Li designed the experiments, wrote the computer program for the stimuli, performed the experiments, and analyzed the data. All authors approved the final manuscript for submission.
Declaration of Conflicting Interests
The author(s) declared that there were no conflicts of interest with respect to the authorship or the publication of this article.
Funding
This study was supported by a grant from the National Natural Science Foundation of China (31671129).
Open Practices
Data and materials for this study have been made publicly available, and the design and analysis plans were not pre-registered.
