Abstract
The visual experience of objects depends on the ability to perceive and integrate their constitutive features. Conjunctive binding (CB) is the cognitive function that integrates the features of objects as wholes. This review covers the main findings (over the last 10 years) concerning the role of CB in visual working memory (VWM) and cognitive theory, its neural correlates, as well as perspectives for future work. First, we discuss the theoretical cognitive models of CB and how these relate to other cognitive functions. We then integrate neuroimaging evidence with cognitive theory to identify the neural functional network of CB for encoding and maintenance. We also describe the field’s transition from experimental to clinical research, which paves the way for work in the area of VWM binding and aging. Finally, we outline the challenges faced by this field of research and analyze its role in the study of dementia and the construction of neuro-cognitive models of conjunctive binding.
INTRODUCTION
To construct an experience of the world and the objects within it, the human brain binds features from different sensory modalities. Feature integration theory [1] aims to explain how binding occurs in perception. In visual working memory (VWM), conjunctive binding (CB) integrates and temporarily maintains features in a unified object representation [2–4]. Della Sala et al. [5] define it as the cognitive function responsible for integrating the multiple features that compose complex stimuli or the different events that compose rich experiences.
Binding research covers a wide range of modalities (i.e., verbal, motor, auditory, and visuospatial skills) that are dependent on short-term memory (STM) [6–9]. Each of these modes of information has a different processing system and demands a different degree of cognitive resources within working memory (WM) [3, 10]. For example, in VWM, authors distinguish between relational and conjunctive binding. Relational binding forms associations between extrinsic features of objects and locations, and demands more attentional resources [11–13]. In contrast, CB integrates and maintains the bound features of an object or event (e.g., color and shape) in STM [2, 14–16].
Research in CB function has recently grown in terms of theoretical and methodological approaches, establishing itself as a new field in the study of VWM. This type of binding has not been found to be sensitive to age, educational level, or prior learning [17–19]. Furthermore, it has been applied to the cognitive assessment of neurodegenerative disorders such as Alzheimer’s disease (AD), showing high sensitivity even at the early stages of the pathology [20, 21].
Here we review CB studies (theoretical, experimental, and clinical) published over the last 10 years and discuss the general cognitive theory about the process of binding in VWM. Next, we review the neural correlates of VWM binding, its experimental and clinical foundations, and the implications for AD, thus introducing a new field of research in binding and aging. Finally, we analyze how experimental results support clinical observations and how these considerations reconstruct the theoretical and experimental framework of VWM binding.
WHAT DO WE KNOW ABOUT CONJUNCTIVE BINDING?
Research into feature binding originated with studies on short-term memory storage [22] as well as attention and visual information processing [1, 23] (see Fig. 1). Phillips [22] introduced the change detection task, in which participants study a visual array containing a set of simple items, remember its features (e.g., shape, color, or number), and are tested after a short delay in one of two possible conditions: single-probe recognition, in which a single stimulus is shown and the participant must decide whether it is new or was previously shown, and whole-display recognition, in which a set of stimuli is presented and the participant must decide whether the target stimulus is in the new set or whether the whole set was previously shown. Data obtained with this task have led to a debate about the attentional resources required to solve it [24]. In single-probe tasks, Luck & Vogel [25] and Woodman & Luck [26] found little evidence of attentional resource requirements, suggesting that feature binding is an automatic process. In contrast, when testing the single- and whole-probe conditions, Wheeler & Treisman [27] observed poor performance when participants had to scan a multiple-target array, suggesting the need for attentional resources in the whole-probe task. The authors suggest that spatial attentional resources are required to integrate distributed features and maintain the bound features over time.

Fig. 1. Graphical representation of the evolution of the change detection paradigm. A) Classical change detection task composed of black-and-white square patterns shown on a screen. Participants have to recognize a change between the patterns after a short time delay [22]. B) Change detection task used by Luck & Vogel [25] and Wheeler & Treisman [27]. This task introduces colored stimuli and conditions such as the “whole array condition”, which requires recognizing a change in a complete set of stimuli; the “single probe condition”, which asks participants to detect a change in a single stimulus of the array; and “continuous report”, which requires reporting the color of the target stimulus. C) Simplified version of the task used by Allen, Baddeley & Hitch [14] and Ueno et al. [37, 38]. In this version, disruptive stimuli (e.g., stimuli similar to the target, or sounds) or an additional task (e.g., counting backwards) are introduced to increase the cognitive demand of the task. D and E) Versions of the modern binding task used in clinical research, including abstract geometrical shapes or everyday familiar stimuli [62, 115].
These results are in line with the “Feature-Integration Theory of Attention” proposed by Treisman & Gelade [1], which aims to explain how an object’s dimensions, features, and locations are processed in parallel across the visual field and identified through the mediation of focal spatial attention in perception. According to this theory, visual perception works in two functionally independent stages: the first is a pre-attentive, automatic stage in which the dimensions of a stimulus (color, shape, size) are coded independently and in parallel in feature maps through sets of dedicated feature detectors, while the second stage involves cross-dimensional processing in which spatial attention binds the constituent features into an integrated percept in a perceptual master map. This theory was expanded by Wheeler & Treisman [27] and supported by other authors [28–30], who argue that constantly maintaining bindings in memory consumes spatial attentional resources. The theory offers an elegant explanation for binding in perception and agrees with classic studies in memory and binding (Vogel [26, 32]) on the automatic encoding of bindings. However, these latter studies do not support the involvement of spatial attention; instead, they argue that maintaining feature bindings in WM is supported by basic gestalt grouping principles of object-based attention.
To determine the role of attention in feature binding, Allen, Baddeley & Hitch [14] modified Wheeler & Treisman’s [27] design by introducing a disruption task in the single- and whole-probe conditions, in which participants count backwards in twos and threes while memorizing colors, shapes, and color-shape bindings. This disruption reduced recognition accuracy, suggesting that encoding involves central executive functioning. However, backward counting lowered accuracy for both individual features and shape-color conjunctions, rather than selectively for conjunctions, suggesting that binding deficits are not directly caused by increased attentional demands. Later studies using different experimental conditions [33, 34] yielded similar results, which suggests that holding feature bindings is mostly an automatic process that does not require domain-general attentional resources [35, 36]. Ueno et al. [37, 38] explored the fragility of bound feature representations by introducing redundant stimuli after the target array and instructing participants to ignore them. They found that disruption is strongest when the interfering stimulus is similar to the target stimulus, but minimal when the two are different. This finding reinforces the notion that encoding and holding features is an autonomous process at the object level that does not depend on spatial attentional demands. Similar results were obtained by Johnson et al. [39] with an experimental condition involving a secondary visual search task during the retention interval; the authors concluded that attention plays a role in binding features into perceptual representations and transferring them to WM, but that feature maintenance does not demand many attentional resources.
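The logic of a shape-color binding trial in the change detection paradigm can be made concrete with a short sketch. The code below is illustrative only and is not taken from any of the cited studies: the stimulus names, the set size, the whole-display test format, and the use of d′ with a log-linear correction as the accuracy score are all assumptions made for the example. The key property it shows is that, on a "change" trial, two items swap colors, so no new shape or color appears and the change can only be detected by remembering which color went with which shape.

```python
import random
from statistics import NormalDist

# Hypothetical stimulus sets, chosen only for illustration.
SHAPES = ["circle", "square", "triangle", "star", "cross", "hexagon"]
COLORS = ["red", "green", "blue", "yellow", "purple", "orange"]


def make_binding_trial(set_size, change, rng):
    """Build one shape-color binding trial with a whole-display test.

    On change trials two items swap colors, so every shape and every
    color from the study array reappears and only the pairing differs.
    """
    shapes = rng.sample(SHAPES, set_size)
    colors = rng.sample(COLORS, set_size)
    study = list(zip(shapes, colors))
    test = list(study)
    if change:
        i, j = rng.sample(range(set_size), 2)
        test[i] = (study[i][0], study[j][1])
        test[j] = (study[j][0], study[i][1])
    return study, test


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' with a log-linear correction.

    The correction (0.5 added to each count, 1 to each trial total)
    is an assumed scoring choice that avoids infinite z-scores.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)
```

A single-feature condition could be sketched the same way by replacing a color with one that was absent from the study array instead of swapping two colors; comparing the two scores is what separates memory for features from memory for their conjunctions.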
Recent studies have tried to expand the experimental paradigm by introducing more demanding disruption tasks [40], thus manipulating attentional load [41]. Specifically, Gao et al. [42] and Shen et al. [43] argue that previously used tasks were designed to restrict domain-general attention. Thus, they expanded the change detection paradigm using secondary tasks such as mental rotation, transparent motion, and object feature report, which they presented during the delay interval under the hypothesis that “if more object-based attention is required for retaining bindings than for retaining constituent features, the secondary task should impair binding performance more so than constituent feature performance” ([42] p. 533). Indeed, in their study, disruption was greater for bindings, leading them to conclude that object-based attention is crucial for binding maintenance and that its role is more than just a visual sketchpad acting like a storage buffer. Another recent study by Wang et al. [44] proposes that features and objects could be stored in different storage systems within VWM instead of a single system. In this proposal, attentional resources for features or objects are switched depending on the cognitive demand of the task.
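The hypothesis quoted above amounts to predicting an interaction: the drop in accuracy caused by the secondary task should be larger for bindings than for constituent features. A minimal sketch of that contrast follows. The function names, the dictionary layout, and the condition labels are hypothetical, and the sketch computes only the descriptive contrast, not the statistical test a study would report.

```python
def dual_task_cost(single_task_acc, dual_task_acc):
    """Accuracy lost when a secondary task is added."""
    return single_task_acc - dual_task_acc


def binding_specific_cost(acc):
    """Interaction contrast: binding cost minus feature cost.

    `acc` maps (memory_condition, load) to proportion correct, with
    memory_condition in {"feature", "binding"} and load in
    {"single", "dual"}. A positive value means the secondary task
    hurt bindings more than constituent features.
    """
    feature_cost = dual_task_cost(
        acc[("feature", "single")], acc[("feature", "dual")]
    )
    binding_cost = dual_task_cost(
        acc[("binding", "single")], acc[("binding", "dual")]
    )
    return binding_cost - feature_cost
```

With made-up accuracies of 0.90 and 0.85 for features and 0.85 and 0.70 for bindings (single and dual task, respectively), the contrast is about 0.10, the pattern predicted if maintaining bindings draws on object-based attention. Equal costs in both conditions give a contrast near zero, the pattern expected if the secondary task taxes features and bindings alike.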
Several computational models have been proposed to explain how features are bound into conjunctions in perception and memory. Some of these models aim to integrate cognitive and biological plausibility using architectures based on artificial neural networks [45–47]. Recent models aim to replicate patterns in human experimental data, including learning performance, use of attentional resources, misbinding errors, or disruption by cognitive load [48–50]. Particularly interesting is the model proposed by Schneegans & Bays [51], which uses a system of neural populations with a conjunctive coding of features. This model offers an interesting perspective on how attentional and storage resources can overlap in the retrieval of conjunctions, as proposed by Wang et al. [44].
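To illustrate what a conjunctive coding of features means, the sketch below builds a toy population in which each unit is tuned to a pair of feature values rather than to one feature. It is not the model of Schneegans & Bays [51] or of any other cited work: the two circular feature dimensions (a color angle and a second "cue" feature, both in degrees), the Gaussian tuning curves, the tuning width, the noise-free activity, and the readout rule are all assumptions chosen to keep the example small. For comparison, it also encodes the same items in two separate single-feature populations.

```python
import math

PREFS = range(0, 360, 15)  # preferred values of the units, in degrees


def circ_dist(a, b):
    """Smallest absolute angular distance between two angles (degrees)."""
    d = abs(a - b) % 360
    return min(d, 360 - d)


def tuning(pref, value, width=30.0):
    """Gaussian tuning curve on a circular feature dimension."""
    return math.exp(-0.5 * (circ_dist(pref, value) / width) ** 2)


def encode_conjunctive(items):
    """Activity of units tuned to (color, cue) pairs.

    Each unit's activity is the product of its two tuning curves,
    summed over all memorized (color, cue) items.
    """
    return {
        (pc, pq): sum(tuning(pc, c) * tuning(pq, q) for c, q in items)
        for pc in PREFS
        for pq in PREFS
    }


def encode_separate(items):
    """Two independent populations, one per feature dimension."""
    color_pop = {pc: sum(tuning(pc, c) for c, _ in items) for pc in PREFS}
    cue_pop = {pq: sum(tuning(pq, q) for _, q in items) for pq in PREFS}
    return color_pop, cue_pop


def recall_color(activity, cue_value):
    """Read out the color bound to a given cue value.

    Units are weighted by how well their preferred cue matches the
    probe, pooled over the cue dimension, and the preferred color
    with the largest pooled activity is returned.
    """
    pooled = {pc: 0.0 for pc in PREFS}
    for (pc, pq), a in activity.items():
        pooled[pc] += a * tuning(pq, cue_value)
    return max(pooled, key=pooled.get)
```

In this toy setting, storing the items (30, 90) and (210, 270) in the conjunctive population allows the color paired with each cue value to be recovered. The separate populations give the same activity whether the items are (30, 90) and (210, 270) or (30, 270) and (210, 90), so they retain which feature values were present but not how they were paired.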
In brief, what we currently know about CB is that it involves perception, object-based attention, and VWM. Encoding could depend more on visual processing mechanisms; after being perceived, conjunctions are encoded and buffered temporarily. These representations could be maintained by object-based attention resources, and their manipulation involves other types of processing and forms of executive control [36, 53]. The stages of binding formation and attentional and storage resources in WM need to be further clarified, and WM models must be adapted according to the new data. Identifying the neural correlates involved could contribute to this effort.
NEURAL CORRELATES OF BINDING
Studies using different techniques to measure brain activity, such as functional magnetic resonance imaging (fMRI), electroencephalography (EEG), and magnetoencephalography (MEG), have shown that several brain structures participate in CB (Table 1), and while these structures vary across studies, some similarities are evident.
Table 1. Summary of studies using neuroimaging techniques to investigate relational and conjunctive binding in VSTM (chronological order)
Studies using fMRI while participants performed intra-object CB tasks have revealed a large cortical network that includes parietal-occipital, frontal, and sub-hippocampal regions. Specifically, reported regions include the lateral occipital complex, fusiform gyrus, perirhinal cortex, intraparietal sulcus, parietal-occipital and parietal-temporal junctions, and the inferior and superior parietal cortex, especially in the right hemisphere [52, 54–61]. Parra et al. [52] report involvement of functional temporo-parietal networks together with frontal areas such as the frontal eye fields and precentral gyrus in the active maintenance of shape-color conjunctions. Wei et al. [59] and Piekema et al. [55] propose a frontal-parietal network that includes the frontal eye fields, intraparietal sulcus, and left superior and left middle frontal gyri. The frontal regions are thought to contribute to the maintenance and processing of multiple features requiring additional resources [52, 62]. Many of the regions reported in feature and conjunctive binding have previously been reported as neural correlates of VWM [63], particularly the frontal-parietal network [64–66]. Furthermore, patients with lesions in the parietal lobe show deficits in feature binding, highlighting a critical role for posterior parietal networks [67]. Also, improved performance on a visual short-term memory (VSTM) binding task was observed following stimulation of the parietal lobe with intracranial electrodes [68].
Data obtained with EEG and MEG during CB WM tasks support the findings obtained with fMRI. In the feature retention condition, phase-synchronous oscillatory EEG activity was observed in the parietal-occipital region together with low synchronous prefrontal activity [69, 70]. This functional connectivity between parietal-occipital and frontal-parietal regions is often reported in recognition WM tasks [70–72]. Holz et al. [69] and Phillips et al. [73] report how alpha-gamma oscillation and the high theta-gamma phase can vary in intensity according to the cognitive demands of the task in the visual search for features or conjunctions (8–12 Hz, 150–200 ms after probe presentation, in feature recognition; 30–38 Hz, 175–250 ms after probe presentation, in the search for conjunctions). Gamma activity has been associated with the response to stimulus features, and its joint oscillation with alpha activity with the encoding of information and maintenance of object features, while theta activity and its joint oscillation with gamma have been associated with maintenance and more demanding cognitive control functions of WM [73–79].
Studies using MEG, on the other hand, describe alpha activity in parietal-occipital regions [76, 80] and theta activity in left medial frontal gyrus and anterior cingulate cortex [81], as well as ventral extrastriate activity during manipulation and maintenance of visual information in WM [82, 83]. The interpretation of this alpha activity has been discussed in two ways: 1) as the maintenance of cognitive load in WM [84, 85], and 2) as the disengagement of functional inhibition of attentional resources to suppress irrelevant information [80, 86].
These neural correlates of CB show a structural articulation and recursive interaction between perceptual and VWM regions at two functional levels. First, findings emphasize the role of a posterior parietal-occipital network in binding as well as an earlier role in the processing stages and subsequent functional coupling with prefrontal regions; this last activation has been associated with binding maintenance (see Fig. 2). Thus, features are initially processed involuntarily, and subsequently stored during object recognition, as has been previously argued (see [78, 87–89]), which partially coincides with Baddeley et al. [4, 53], who argued for passive storage and automatic attention processes. Based on these findings, we suggest that encoding and storage are supported by an occipital-parietal network and the ventral extrastriate pathway, and involve the initial participation of frontal regions. A potential second functional network could start with the functional coupling of frontal-parietal-occipital regions, which would suggest an active form of cognitive processing at the object level. In turn, this object-based network could be driving the episodic buffer (as suggested by Gao et al. [42]) and facilitating binding maintenance and manipulation. This hypothesis gains support from the observed synchronous electrical activity between posterior and anterior regions [76, 90]. These findings also provide support for the notion that there are two neural mechanisms underlying VWM [91, 92]. Further studies are needed to integrate a neurocognitive model of VWM binding.

Fig. 2. Results obtained with different techniques emphasize the role of posterior occipital-parietal and frontal regions in conjunctive binding. 1) After stimuli are perceived, the occipital regions play a role in visual mechanisms and the early encoding of information. 2) The parietal regions (i.e., the occipital-parietal network) could play a dual role, both in the initial storage of information and the integration of representations of feature conjunctions, and in providing the attentional support required to maintain conjunctions “online”. 3) The frontal regions reflect the selection and transfer of visual information into WM. Their coupling with the posterior network suggests participation in the encoding and maintenance of conjunctions in WM against distractors, and in their manipulation over time.
In summary, these findings highlight visual spatial processing in binding encoding involving perirhinal cortex and parietal occipital networks. Maintenance and recognition involve frontal regions coupling functionally with that parietal occipital network.
BINDING AND AGING: FROM EXPERIMENTAL TO CLINICAL RESEARCH
In the last 10 years, the research on binding has clearly moved towards the clinical setting, with experimental findings contributing to the understanding of healthy as well as pathological cognitive aging. Several studies have argued that CB is not affected in healthy aging [9, 93–95] and may not be affected by education level or repeated testing [88, 96–98].
Normal aging is usually accompanied by a minor cognitive decline. Since CB seems to be preserved in healthy aging, its decline may be a cognitive marker of neurodegenerative processes. Clinical-experimental data indicate that impaired VWM binding is specific to AD and is not observed in non-AD dementias such as frontotemporal dementia, Parkinson’s disease, vascular dementia, or Lewy body dementia [5, 98]. Visuospatial functions are also compromised in AD [99, 100]. In fact, performance on CB tasks has been shown to be impaired even at the early stages of AD [16, 101], suggesting that it may be a preclinical indicator of AD before hippocampal structures are affected [102, 103].
Research has shown that patients with familial AD (i.e., carrying the E280A-PSEN1 mutation) [21, 104] and sporadic AD [57] perform worse than controls on both color-color and shape-color binding tasks, even during the asymptomatic and prodromal phases of the disease. There is also evidence that white matter damage in the frontal and anterior portions of the corpus callosum is associated with hereditary AD (by E280A-PSEN1 mutation at the prodromal phase) as well as poor performance on the CB task [21]. Moreover, patterns of reduced brain connectivity in frontal and posterior regions have been observed during VWM binding tests in patients with familial AD in the prodromal and advanced stages [79, 105].
Impaired relational binding in AD has also been reported by Liang et al. [8], Pertzov et al. [11], and Peich, Husain, & Bays [12]. However, this function is associated with the hippocampus [106] and may be diminished by normal aging, whereas CB does not depend directly on hippocampal structures and is not weakened by normal aging or hippocampal deficits [107]. Clinical evidence also suggests that hippocampal activity is specific to relational binding [10, 109]. In line with these findings, Parra et al. [110] presented the case of AE, a 72-year-old patient with bilateral hippocampal damage due to a stroke in the right portion of the posterior artery. The patient presented amnesia-type difficulties in the verbal and visual domains of a standard neuropsychological battery, and performed better on visual conjunctive than on visual relational binding tasks. Likewise, patient Jon, who lacked 50% of both hippocampi without any other apparent medial temporal lobe injury, presented deficits in spatial relational binding tasks despite his successful performance on shape-color binding tasks. Another interesting case is KA, a patient with developmental amnesia and selective damage to the whole hippocampal system [62]. This patient shows poor performance in memory tasks that require relational binding, which is usually associated with hippocampal function. In contrast, she can hold and manipulate color-shape conjunctions in two different tasks. This case also supports the notion that relational and conjunctive binding can dissociate.
Current standard cognitive measures can only detect AD at its prodromal or clinical stages [20, 111]. However, experimental tasks of VWM and binding may have the potential to identify minor cognitive deficits related to neural network alterations [20, 105]. In addition, discriminating the preclinical phase of AD from subjective cognitive decline (SCD) is challenging, as there is a slight decrease in basic cognitive functions in both conditions. This is why studies on binding and aging focus on differentiating between the cognitive trajectories of normal elderly individuals and patients with SCD and AD. Preliminary studies have indicated that VWM binding may contribute to that effort [112–114] and could be a transcultural cognitive indicator of AD [115]. Although CB tests are not yet developed enough to be implemented in clinical settings, interesting results have been observed, such as the experimental assessment of feature binding in objects of everyday life [116], which indicates that we may be able to develop new ways of assessing and diagnosing the cognitive deficits associated with aging and neurodegenerative disorders.
Additionally, some neuroimaging evidence [117–123] suggests that, relative to healthy controls, older patients with amnestic mild cognitive impairment (MCI) who are at risk of developing AD show low glucose utilization, hypometabolism, and less gray matter in the fusiform gyrus, posterior parietal cortex, lateral prefrontal cortex, posterior cingulate cortex, precuneus, hippocampus, insula, anterior-superior temporal sulcus, and right amygdala. Specifically, in visual tasks [124], MCI patients show reduced activity in the visual frontal-parietal network and temporal areas while encoding and retrieving visual information. This is in line with some of the neural correlates reported for CB and the posterior-frontal mechanisms reviewed above.
Future studies should integrate evidence about CB function and its potential use as a cognitive marker together with biomarkers of AD. Furthermore, testing should be conducted with individuals from different cultures to establish CB as a cross-cultural cognitive marker of pathological aging, thus helping to unify dementia diagnosis worldwide. Alladi & Hachinski [125] highlight seven challenges and priorities for the study of dementia worldwide, including exploring life trajectories that potentially impact cognitive reserve; studies of visual short-term memory and CB in aging could provide a cognitive marker to help meet this challenge. Finally, a clinical assessment protocol based on these cognitive functions could become an excellent low-cost diagnostic tool applicable in many contexts.
The goals of a world dementia challenge also involve expanding clinical trials and including the populations of southern countries in cross-cultural neuropsychological testing. The move from experimental to clinical research in CB has advanced our understanding of how processes like attention and STM are related and brings helpful insights into cognitive deficits in aging and the early stages of AD. Finally, this work could contribute to the development of new cross-culturally valid diagnostic tools as well as to understanding the role of cognitive reserve and prevention in different life trajectories.
CONCLUSIONS
Over the past decade, experimental findings in VWM binding have led to important clinical considerations. Further integrating experimental and clinical research could settle what appear to be unresolved debates on the functioning of VWM and clarify whether actual clinical (i.e., diagnostic) tools can be developed. Clinical research in CB can also contribute to theoretical discussions about “the binding problem”. Finally, longitudinal studies of individuals with AD, MCI, and SCD can contribute significantly to this important line of research, as can complementary behavioral, computational, and brain imaging approaches.
