Abstract
The ability to remember and manipulate visual information is pervasive and is associated with many cognitive abilities. Yet despite the importance of visual working memory (VWM), there is little consensus among researchers in the field as to which neural areas are necessary and sufficient and which models best describe its capacity. Here, we propose that an assumption that all people remember visual information in the same way has led to much contention and inconsistency in the field. By accepting that there are multiple cognitive strategies and methods for performing a VWM task, we introduce an individual “precision” approach to the study of memory. We propose that VWM should be redefined, not by the type of stimuli used (e.g., visual) but rather by the specific mental processes (e.g., visual imagery, semantic, propositional, spatial) and the corresponding brain regions used to complete the mnemonic task. We further provide a short how-to guide for measuring different mnemonic strategies used for working memory.
The ability to remember visual information varies substantially from person to person, yet despite these differences and the importance of visual memory in everyday life, little research has investigated why these individual differences occur. It may be that these differences arise because some individuals have better or more efficient neural machinery. Another possibility is that different people use different cognitive strategies and, therefore, different neural machinery to complete the same task with the same stimuli.
Here, we review the empirical evidence that individual differences in cognitive strategies during working memory tasks composed of visual stimuli explain, in part, the observed variation and controversy regarding the results of neuroimaging studies and computational models of memory capacity. We consider how different cognitive strategies and a shift in our definitions and theories of visual working memory (VWM) can help explain these controversies. Throughout this article, we discuss VWM in regard to how it is currently classified: as a mnemonic task that involves visual stimuli to be maintained or manipulated during a delay period, irrespective of the cognitive strategy or mnemonic method employed by the participants. We propose that VWM should be defined not by the type of stimuli used (e.g., visual), but rather by the specific mental processes (e.g., visual imagery, semantic, propositional, spatial) used by a given individual to complete the memory task.
The Current State of VWM Research
One of the most influential theories of working memory is the multicomponent theory of working memory (Baddeley & Hitch, 1974), which posits that working memory is made up of separate components. The three main components of working memory are the central executive, the phonological loop, and the visuospatial sketchpad. The central executive is proposed to be the general task-control system, whereas the phonological loop and visuospatial sketchpad are “slave” systems that store modality-specific information for use during working memory. More recent theories of VWM further break down the visuospatial sketchpad into the visual buffer and the visual cache (Cocchini, Logie, Della Sala, MacPherson, & Baddeley, 2002; see Fig. 1). The visual buffer is thought to create and maintain low-level visual representations and is often referred to as an “active” state of memory, easily perturbed by incoming sensory information because of the common use of the primary visual cortex (Borst, Niven, & Logie, 2012). The visual cache, on the other hand, stores information at a lower resolution, or more abstract form, and is thought to use the posterior parietal cortex and is, therefore, more protected from incoming sensory interference.

Fig. 1. Proposed general neural correlates of the visual cache and visual buffer: posterior parietal and primary visual cortex, respectively.
Although the multicomponent theory is probably the most influential and widely taught theory of working memory, there is still much dissension in the field regarding which neural areas are necessary and sufficient to carry out these tasks and how best to characterize VWM capacity limits.
Neural correlates of the VWM controversy
The neural correlates of VWM have been a contentious issue for over a decade, with numerous researchers reporting conflicting evidence for which brain areas are involved (for more in-depth reviews of the neuroimaging literature, see D’Esposito & Postle, 2015; Sreenivasan, Curtis, & D’Esposito, 2014). Early neuroimaging studies used sustained elevated blood-oxygen-level-dependent (BOLD) signals during the memory-delay period as an indicator that a given brain region was involved in information retention. Many studies using these methods found evidence of only high-level areas (such as prefrontal and parietal cortices) being involved in VWM tasks (Curtis & D’Esposito, 2003; Todd & Marois, 2005). More recent research has indicated that a network of high- and low-level visual areas is important for creating, maintaining, and manipulating visual information during VWM tasks. Recent research has shown that memory content can be decoded from visual areas during memory retention, even in the absence of sustained BOLD activity, using multivoxel pattern analysis. This suggests that low-level sensory areas play an important role in mnemonic maintenance of memory content (Albers, Kok, Toni, Dijkerman, & de Lange, 2013; Harrison & Tong, 2009; Serences, Ester, Vogel, & Awh, 2009). However, other studies have found that only the type of task being performed could be decoded from higher-level areas (Lee, Kravitz, & Baker, 2013; Riggall & Postle, 2012).
These decoding studies support sensorimotor-recruitment models of working memory (Postle, 2016). These models propose that during memory tasks, the same sensory regions that were involved during perception of the stimuli (i.e., primary visual cortex) are activated during VWM. This suggests that the memory content exists in these sensory areas, whereas any activation observed in high-level areas reflects noncontent memory processes.
However, recent research has demonstrated that, in some cases, content-specific mnemonic information can also be decoded from high-level areas (Bettencourt & Xu, 2016; Ester, Sprague, & Serences, 2015; Galeano Weber, Peters, Hahn, Bledowski, & Fiebach, 2016). For example, one of these studies found that a color in memory could be decoded at above-chance levels only in the intraparietal sulcus, and this decoding correlated with behavioral measures of memory precision (Galeano Weber et al., 2016). These conflicting results produce further ambiguity and debate around the neural correlates of VWM.
Modeling-literature controversy
There are two major, competing theories of VWM capacity, known as the discrete-slot model and the flexible-resources model. The discrete-slot model suggests that capacity limits occur once an individual’s unique number of “slots” is exceeded. In very basic discrete-slot memory models, the complexity of the images to be remembered does not affect the number of items an individual is able to hold in mind (Luck & Vogel, 1997). In contrast, the flexible-resources model posits that individuals have a finite pool of resources that can be spread out among many mental representations, and both the number of items to be remembered and the complexity of each item will influence the capacity limit (Bays & Husain, 2008).
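The contrast between the two model families can be made concrete with a minimal sketch. The functions below are illustrative assumptions and are not the fitted models from the cited studies: the slot function ignores item complexity, whereas the resource function divides a fixed pool by both the number of items and a hypothetical complexity weight. The names `slot_probability_stored`, `resource_per_item`, `complexity`, and `total_resource` are introduced here for illustration only.

```python
def slot_probability_stored(set_size, n_slots):
    """Discrete-slot sketch: chance that a probed item occupies a slot.

    Item complexity is deliberately ignored, as in very basic slot models.
    """
    return min(1.0, n_slots / set_size)


def resource_per_item(set_size, complexity=1.0, total_resource=1.0):
    """Flexible-resource sketch: a fixed pool shared equally across items.

    Assumption (hypothetical): the share each item receives is scaled
    down linearly by a 'complexity' weight, so adding items or making
    items more complex both reduce the resource available per item.
    """
    return total_resource / (set_size * complexity)


# Compare how the two sketches behave as set size grows.
for n_items in (1, 2, 4, 8):
    p_slot = slot_probability_stored(n_items, n_slots=3)
    r_simple = resource_per_item(n_items, complexity=1.0)
    r_complex = resource_per_item(n_items, complexity=2.0)
    print(n_items, round(p_slot, 3), round(r_simple, 3), round(r_complex, 3))
```

In this sketch, the slot quantity stays at ceiling until the set size exceeds the number of slots and then falls, whereas the resource quantity declines continuously from the first added item and declines further for more complex items.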
So far, no one theory has reconciled the inconsistent findings throughout the modeling literature. Recent studies have found evidence supporting either a discrete model (Pratte, Park, Rademaker, & Tong, 2017) or a flexible model (Taylor, Thomson, Sutton, & Donkin, 2017; Veksler et al., 2017) of VWM, demonstrating that no real consensus has yet emerged as to how best to understand the capacity limits of VWM.
A Possible Solution to These Controversies: Individual Differences in Strategies Used During VWM Tasks
The capacity of VWM is highly variable, with individual-limit estimates ranging from as few as one item to as many as seven items (Vogel & Machizawa, 2004). Some work has investigated individual differences in attentional control in relation to working memory capacity (for a review, see Kane & Engle, 2002); however, little research has investigated why such large variations in low-level visual-representational capacity limits exist. One study by Vogel and Machizawa (2004) found that an individual’s memory capacity was predicted by the change in event-related potentials from a small to a large set size during the delay period of the memory task, contralateral to the visual presentation (also known as contralateral delay activity, or CDA), with high-capacity individuals showing greater increases in amplitude with larger set sizes. Another study found that individuals who were better at filtering out irrelevant information, measured using the electroencephalographic (EEG) CDA amplitude, had larger VWM capacities (Vogel, McCollough, & Machizawa, 2005), and a correlational study showed that the functionally defined size of the primary visual cortex predicted VWM capacity (Bergmann, Genc, Kohler, Singer, & Pearson, 2016). However, these studies provide little information about differences in the format, modality, or nature of mnemonic representations for high-capacity versus low-capacity individuals.
One possibility is that the large range of differences in working memory performance and the range of findings in the modeling and neuroimaging literature may be due to differences in cognitive strategies and brain areas utilized when completing the tasks. These differences in strategy may be driven by both the demands of the task itself (e.g., a change-detection task vs. continuous recall, capacity vs. accuracy) and the cognitive preferences and profiles of the individuals. A number of reviews have touched on how task differences may influence the variable findings throughout the VWM neuroimaging literature (Christophel, Klink, Spitzer, Roelfsema, & Haynes, 2017; Serences, 2016). However, to date, there has been little VWM research investigating how individual differences in cognitive strategy may drive performance and the observed conflicting results.
Research into individual variability in other, nonvisual forms of working memory indicates that even when individuals are performing the same task, different patterns of neural activation can be seen. These different patterns appear to be consistent over time (Miller et al., 2002) and are related to individual cognitive strategies and styles (Miller, Donovan, Bennett, Aminoff, & Mayer, 2012). Another study found that different neural networks were activated depending on whether participants who performed the same spatial-memory task employed a verbal or a visual strategy (Sanfratello et al., 2014). Similarly, a study investigating mental rotation found that individuals who were defined as good or poor visual imagers used different neural networks to complete the same task (Logie, Pernet, Buonocore, & Della Sala, 2011).
Interestingly, when participants are asked what type of strategies they use for completing visual memory tasks, they typically describe two primary methods. One of those strategies is to pick out salient features in the memory array and encode them in a propositional or phonological form, which is then compared with the test stimuli or array (Berger & Gaunitz, 1979). This strategy most likely uses the visual cache, or possibly even the phonological loop, and therefore should not activate early visual areas. The other strategy commonly described is the creation of detailed mental images during the retention interval, which are then compared with the subsequent test arrays or stimuli (Berger & Gaunitz, 1979; Harrison & Tong, 2009; Keogh & Pearson, 2011, 2014). These descriptions are synonymous with descriptions of mental imagery, and it is likely that these individuals use mental imagery, or the visual buffer and the early visual areas, as a mnemonic tool to retain visual information.
Visual Imagery as a VWM Strategy
Perhaps limits to VWM capacity reflect the degree to which, or the efficiency with which, individuals use the early visual areas to maintain or create visual sensory representations of the to-be-remembered items. The neuroimaging literature suggests that imagery may be used as a strategy during VWM, with a high level of neural overlap (specifically in the visual cortex) between imagery and VWM tasks (Albers et al., 2013; Harrison & Tong, 2009; Kosslyn, Ganis, & Thompson, 2001; Kosslyn & Thompson, 2003; Kosslyn, Thompson, Kim, & Alpert, 1995). In a recent functional MRI (fMRI) decoding study, individuals completed both a visual-imagery and a VWM task (Albers et al., 2013). Interestingly, the decoder was able to predict which item was being remembered or imagined regardless of which task it was trained on. These findings show that the patterns of activity in the early visual areas during imagery and visual memory are very similar.
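The logic of training a decoder on one task and testing it on the other can be sketched with simulated data. The example below is a toy illustration under stated assumptions and is not the analysis pipeline of Albers et al. (2013): it assumes that both tasks evoke the same underlying voxel pattern for each stimulus, adds Gaussian noise, and uses a simple nearest-centroid classifier. All names (`simulate_trials`, `cross_decode`, `grating_A`, `grating_B`) and all parameter values are hypothetical.

```python
import random


def simulate_trials(template, n_trials, noise_sd, rng):
    """Simulated voxel patterns: a shared template plus Gaussian noise."""
    return [[v + rng.gauss(0.0, noise_sd) for v in template]
            for _ in range(n_trials)]


def centroid(patterns):
    """Voxel-wise mean across trials."""
    n = len(patterns)
    return [sum(column) / n for column in zip(*patterns)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cross_decode(train, test):
    """Fit class centroids on one task, classify trials from the other task.

    train and test map a stimulus label to a list of voxel patterns.
    Returns the proportion of test trials assigned to the correct label.
    """
    centroids = {label: centroid(p) for label, p in train.items()}
    correct = 0
    total = 0
    for label, patterns in test.items():
        for pattern in patterns:
            predicted = min(
                centroids,
                key=lambda c: squared_distance(pattern, centroids[c]),
            )
            correct += int(predicted == label)
            total += 1
    return correct / total


rng = random.Random(0)  # fixed seed so the simulation is repeatable

# Assumption: each stimulus evokes the same template in both tasks.
templates = {
    "grating_A": [1.0] * 10 + [-1.0] * 10,
    "grating_B": [-1.0] * 10 + [1.0] * 10,
}
memory_task = {k: simulate_trials(t, 40, 1.0, rng) for k, t in templates.items()}
imagery_task = {k: simulate_trials(t, 40, 1.0, rng) for k, t in templates.items()}

acc_memory_to_imagery = cross_decode(memory_task, imagery_task)
acc_imagery_to_memory = cross_decode(imagery_task, memory_task)
print(acc_memory_to_imagery, acc_imagery_to_memory)
```

Above-chance accuracy in both directions is what one would expect if the two tasks share a representational format; if a participant instead relied on a nonvisual strategy in one task, the shared-template assumption would fail and cross-task accuracy should drop toward chance.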
Recent behavioral studies by Keogh and Pearson (2011, 2014) may shed further light on the use of imagery as a strategy in VWM tasks. Both studies found positive correlations between visual-imagery strength and VWM performance, but not between visual-imagery strength and iconic or numerical working memory. Additionally, both studies found that background luminance (thought to interfere with the creation of low-level visual representations) attenuated performance on both VWM accuracy and capacity, but not on a number-memory task. However, this effect was observed only for “good imagers,” suggesting that only they use imagery as a mnemonic tool to complete VWM tasks.
The proposition that only good imagers use visual imagery, and hence the visual cortex, to perform VWM tasks may help to explain much of the controversy that exists throughout both the behavioral and neuroimaging VWM literature. If some individuals use visual imagery, and hence early visual cortex, to retain visual stimuli in mind during VWM tasks and there happens to be a high proportion of either good or poor imagers in the sample, then it is no surprise that some imaging studies find evidence of the involvement of visual areas and others do not.
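This sampling argument can be illustrated with a small simulation. The sketch below is hypothetical throughout: it assumes that only good imagers contribute a visual-cortex effect (arbitrarily set to 1, with 0 for poor imagers), that half of the population are good imagers, and that each study tests 12 participants. None of these values come from the studies cited here.

```python
import random


def fraction_good_imagers(n_participants, p_good, rng):
    """Draw one simulated sample and return its share of good imagers."""
    n_good = sum(rng.random() < p_good for _ in range(n_participants))
    return n_good / n_participants


def expected_group_effect(frac_good, effect_good=1.0, effect_poor=0.0):
    """Group-mean visual-cortex effect as a mixture of the two subgroups.

    effect_good and effect_poor are arbitrary illustrative units.
    """
    return frac_good * effect_good + (1.0 - frac_good) * effect_poor


rng = random.Random(1)  # fixed seed so the simulation is repeatable

# Ten hypothetical studies, each with 12 participants.
study_fractions = [fraction_good_imagers(12, 0.5, rng) for _ in range(10)]
study_effects = [expected_group_effect(f) for f in study_fractions]

for frac, effect in zip(study_fractions, study_effects):
    print(round(frac, 2), round(effect, 2))
```

With small samples, the share of good imagers drifts from study to study by chance alone, so the group-level effect drifts with it even though nothing about the underlying population has changed.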
Different mnemonic strategies, and hence use of different neural areas during storage, may require alternative model parameters or even different memory models. It is well known that differing levels of the visual hierarchy store and process different types of modality-specific information at differing levels of complexity and abstraction. In the visual system, information processing goes from simple to more complex as it ascends the visual hierarchy because of the pooling of neurons and their receptive fields (Hubel & Wiesel, 1962). It is quite likely that if an individual uses a low-level sensory-imagery strategy (and low-level visual areas), the limitations to what can be held in mind will be very different from those of someone who uses a propositional mnemonic strategy (likely using higher-level areas). This means that the capacity limits of VWM will be driven, or constrained, in part by the physical “real estate” of the neural area used to maintain the visual information in mind (Franconeri, Alvarez, & Cavanagh, 2013). Therefore, if different individuals use different neural areas to create and maintain information in mind, this may well result in the observed disparities in model fitting across studies. These models might be fitting different cognitive strategies and the use of distinct neural structures rather than VWM’s general-capacity store.
Work from our lab and others has utilized binocular rivalry to objectively measure the strength of visual imagery. Binocular rivalry is a visual illusion in which two images are presented, one to each eye, and, over extended viewing, perception alternates between the two images. We previously found that both prior imagery and prior perception can influence what is seen during binocular rivalry, with both weak visual perception and imagery priming subsequent rivalry (i.e., if an individual imagines a red image, or is shown a very weak image, he or she is more likely to see that image in a subsequent binocular-rivalry display). Recent work using binocular rivalry has also shown that visual imagery has its own capacity limits, which are similar to those observed in VWM (Keogh & Pearson, 2017b). Additionally, visual imagery was less precise when participants had to imagine more images simultaneously, and imagery was stronger when individuals imagined more homogeneous arrays, suggesting that imagery is likely a dynamic resource. Future work should aim to model these imagery capacity limits along with VWM, and the strategies used by an individual, to better ascertain the limits of VWM.
Nonimagery Strategies for VWM
What strategy or neural areas are used by individuals without any visual imagery remains an open question. An interesting condition known as aphantasia might help to answer this. People with aphantasia have been characterized by the self-described complete inability to imagine objects, or a loss of imagery phenomenology (their mind is “blind”; Zeman, Dewar, & Della Sala, 2015), and also by a lack of functional sensory imagery measured with the binocular-rivalry technique (Keogh & Pearson, 2017a). Surprisingly, however, recent research suggests that mental spatial manipulations are intact in these individuals (Keogh & Pearson, 2017a; Zeman et al., 2010). A recent case study also suggests that people with aphantasia may still be able to perform easy VWM tasks (Jacobs, Schwarzkopf, & Silvanto, 2017), evidently without using imagery as a mnemonic strategy. Assessing the strategies and neural correlates that people with aphantasia use to complete such working-memory tasks will help to further elucidate the many possible ways VWM tasks can be performed.
Refining Theories of VWM
Information conveyed visually can be stored in several different ways, which likely use distinct neural areas and networks. Here, we propose that it is time to move away from defining VWM in a univariate manner based on the type of stimuli and instead move toward a precision approach, defining it not only by the stimuli presented to the participants but also by the cognitive strategy used by an individual to complete the task (see Fig. 2). We propose that refining our definitions of VWM by including cognitive strategy as an important variable has the potential to unify the multiple ongoing debates regarding both the functional models and neural mechanisms of VWM.

Fig. 2. Schematics showing the current visual working memory framework (a) and our proposed strategy-driven visual working memory framework (b).
How To Identify a VWM Strategy
Memory strategies can be uncovered with surprising ease; simply asking an individual what type of strategy he or she used can be informative. Additionally, structured questionnaires such as the Vividness of Visual Imagery Questionnaire and the Spontaneous Use of Imagery Scale can be easily and quickly administered to assess an individual’s visual imagery. More objective measures, such as the binocular-rivalry paradigm, can also be employed to assess the strength of visual imagery. Although the vividness and strength of imagery do not directly index an individual’s VWM strategy per se, it is likely that the stronger an individual’s imagery, the more likely he or she will be to utilize it as a mnemonic tool (Keogh & Pearson, 2011, 2014). It is also important to keep in mind that strategies could change throughout the duration of a task and that using subjective reports to elucidate the cognitive strategy used by an individual requires the participant to successfully introspect.
A more memory-specific approach is to utilize different methods of specifically perturbing memory maintenance. If an individual uses a visual strategy (and hence the visual cortex), irrelevant retinal stimulation (e.g., increased background luminance or static or dynamic visual noise) should interfere with task performance. In a similar vein, articulatory suppression should impair performance for participants who use a more propositional strategy. Additionally, procedures such as transcranial magnetic stimulation and transcranial direct-current stimulation applied to the visual cortex can be used to infer what type of strategy is being employed. If an individual is using the visual cortex and visual imagery to perform a task, for example, altering the activity in this region should likewise alter performance on the task. On the other hand, if an individual is using higher-level areas, then perturbation of the visual cortex should have a much smaller effect or no effect on performance. In addition, simply using an individual-participant analysis technique for EEG or fMRI data and combining results with reports of strategy could also be informative. Splitting participants into strategy groups when analyzing data, adding strategy as a covariate or mediating variable, or explicitly instructing participants to use only one strategy to complete a task will likewise provide further information about the effects that differences in strategy have on task performance. Neurofeedback could also potentially be used to teach individuals to employ a particular strategy through reinforcement when the desired neural outcome is observed. There is a multitude of ways to measure or rein in the types of strategies used by individuals during visual memory tasks; however, the main takeaway of this piece is that cognitive strategies can and do influence the mechanisms and neural correlates of memory tasks, and this variable should be accounted for.
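One of the simplest versions of this approach, splitting participants by imagery strength and comparing how much a visual perturbation costs each group, can be sketched as follows. The participant records are made-up placeholder values for illustration only, and the field names (`imagery`, `acc_baseline`, `acc_luminance`) and the median split are assumptions rather than a prescribed procedure.

```python
from statistics import mean, median

# Hypothetical, invented values: imagery strength (e.g., a rivalry-priming
# score) and VWM accuracy with and without background luminance.
participants = [
    {"id": "p1", "imagery": 0.82, "acc_baseline": 0.90, "acc_luminance": 0.78},
    {"id": "p2", "imagery": 0.75, "acc_baseline": 0.88, "acc_luminance": 0.79},
    {"id": "p3", "imagery": 0.68, "acc_baseline": 0.85, "acc_luminance": 0.75},
    {"id": "p4", "imagery": 0.41, "acc_baseline": 0.84, "acc_luminance": 0.83},
    {"id": "p5", "imagery": 0.35, "acc_baseline": 0.86, "acc_luminance": 0.86},
    {"id": "p6", "imagery": 0.30, "acc_baseline": 0.80, "acc_luminance": 0.79},
]


def interference(participant):
    """Accuracy cost of the visual perturbation for one participant."""
    return participant["acc_baseline"] - participant["acc_luminance"]


def split_by_imagery(data):
    """Median split into stronger and weaker imagers."""
    cutoff = median(p["imagery"] for p in data)
    strong = [p for p in data if p["imagery"] > cutoff]
    weak = [p for p in data if p["imagery"] <= cutoff]
    return strong, weak


strong_imagers, weak_imagers = split_by_imagery(participants)
mean_cost_strong = mean(interference(p) for p in strong_imagers)
mean_cost_weak = mean(interference(p) for p in weak_imagers)
print(round(mean_cost_strong, 3), round(mean_cost_weak, 3))
```

A larger perturbation cost in the stronger-imagery group would be consistent with that group relying on a visual strategy. With real data, imagery strength could instead be entered as a continuous covariate, which avoids the loss of information that a median split entails.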
Why people use one strategy over another is an area of study that has yet to be fully explored. It may be that people use the best tool available to them, or perhaps it is a matter of past experience and what has previously served them well. It is also likely that strategy choice is constrained by the structural and functional makeup of an individual’s brain.
Delving into the “black box” of the mind to study these differences in strategies will not only help us understand why some people perform better than others on certain cognitive tasks and why some studies find support for one theory over another, but it will also help us understand the large subjective differences experienced in our lives and what it is that makes us unique human beings.
Recommended Reading
Ester, E. F., Rademaker, R. L., & Sprague, T. C. (2016). How do visual and parietal cortex contribute to visual short-term memory? eNeuro, 3(2). doi:10.1523/ENEURO.0041-16.2016. An example of the current state of debate in the visual working memory neuroimaging literature.
Pearson, J. (2014). New directions in mental-imagery research: The binocular-rivalry technique and decoding fMRI patterns. Current Directions in Psychological Science, 23, 178–183. doi:10.1177/0963721414532287. A comprehensive review of visual imagery measurement tools.
Pearson, J., Naselaris, T., Holmes, E. A., & Kosslyn, S. M. (2015). Mental imagery: Functional mechanisms and clinical applications. Trends in Cognitive Sciences, 19, 590–602. doi:10.1016/j.tics.2015.08.003. Another comprehensive review of visual imagery.
Tong, F. (2013). Imagery and visual working memory: One and the same? Trends in Cognitive Sciences, 17, 489–490. doi:10.1016/j.tics.2013.08.005. An opinion article on the relationship between visual imagery and visual working memory.
Xu, Y. (2017). Reevaluating the sensory account of visual working memory storage. Trends in Cognitive Sciences, 21, 794–815. doi:10.1016/j.tics.2017.06.013. Another example of the current state of debate in the visual working memory neuroimaging literature.
Footnotes
Acknowledgements
Both authors contributed equally to this work.
Action Editor
Randall W. Engle served as action editor for this article.
Declaration of Conflicting Interests
The author(s) declared that there were no conflicts of interest with respect to the authorship or the publication of this article.
