Abstract
Background:
It is increasingly recognized that the complex functions of human cognition are not accurately represented by arbitrarily defined anatomical brain regions. Given the considerable functional specialization within such regions, more fine-grained studies of brain structure could capture such localized associations. However, such studies in large community-dwelling populations are lacking.
Objective:
To perform a fine-mapping of cognitive ability to cortical and subcortical grey matter on magnetic resonance imaging (MRI).
Methods:
In 3,813 stroke-free and non-demented persons from the Rotterdam Study (mean age 69.1 (±8.8) years; 55.8% women) with cognitive assessments and brain MRI, we performed voxel-based morphometry and subcortical shape analysis on global cognition and separate tests that tapped into memory, information processing speed, fine motor speed, and executive function domains.
Results:
We found that the different cognitive tests were significantly associated with grey matter density in distinct but also overlapping brain regions, primarily in the left hemisphere. Clusters of voxels significantly associated with global cognition were located within multiple anatomic regions: left amygdala, hippocampus, parietal lobule, superior temporal gyrus, insula and posterior temporal lobe. Subcortical shape analysis revealed associations primarily within the head and tail of the caudate nucleus, putamen, ventral part of the thalamus, and nucleus accumbens, more equally distributed between the left and right hemispheres. Within the caudate nucleus, both positive (head) and negative (tail) associations were observed with global cognition.
Conclusions:
In a large population-based sample, we mapped cognitive performance to cortical and subcortical grey matter density using a hypothesis-free approach with high-dimensional neuroimaging. Leveraging the power of our large sample size, we confirmed well-known associations and identified novel brain regions related to cognition.
INTRODUCTION
Human cognition comprises a variety of important domains including memory, information processing speed, and executive function. Cognitive ability is associated with important health outcomes [1–3] and varies between individuals and throughout life [3]. It is determined by both genetic and environmental factors [4, 5], which are reflected in the structure of the brain [6, 7].
Many of the initial links between brain structure and cognition arose from clinical observations of patients with localized brain lesions or following surgical interventions [8, 9]. Subsequent neuroimaging studies have used these observations in hypothesis-driven approaches to study the neural substrate of human cognitive ability including its various domains [9, 10]. These studies have primarily focused on aggregate measures over entire brain regions, e.g., volumetric measures of the prefrontal cortex [11, 12], thalamus [13] or hippocampus [14]. However, it is increasingly recognized that the complex functions of human cognition are not accurately represented by anatomical regions that are arbitrarily defined based on macroscopic landmarks or histological microstructure, e.g., Brodmann areas [15–17]. Moreover, considerable functional specialization typically exists within such regions. For example, in Alzheimer’s disease the size of hippocampal subfields contains information important for cognition beyond the gross hippocampal volume [18–20]. The thalamus comprises more than 60 cytoarchitectonically and functionally distinct nuclei, all of which have a different pattern of anatomical connections to other brain regions [21, 22].
An alternative to hypothesis-driven analyses is the use of hypothesis-free approaches that interrogate brain structure at the highest resolution and provide the opportunity to explore associations beyond aggregate measures. For instance, voxel-based morphometry (VBM) studies volumetric differences at the level of the voxel, the smallest unit of measure of an MRI scan. In recent years, a large body of literature has emerged that employs VBM and related techniques to study how brain structure relates to cognition [12, 23–26]. However, several knowledge gaps remain. First, many hypothesis-free brain imaging studies have been performed in relatively small samples, thereby running the risk of false-positive findings or non-significant results. Larger sample sizes can overcome this restriction and yield more robust findings. Second, in addition to VBM analysis, the shape of subcortical structures allows the study of brain regions beyond volumetric measures alone [27] and may represent underlying subfield organization [28]. Against this background, we formulated two hypotheses. First, we hypothesized that VBM analysis would reveal certain cognitive domains to be associated with certain anatomically defined brain regions, for instance, the hippocampus with memory. Second, we expected the subcortical shape analysis to reveal that individual subcortical structures are not homogeneously associated with cognition, but that specific subregions would show more subtle associations than others.
Therefore, using hypothesis-free approaches of VBM and shape analysis we performed a fine-mapping of cognitive ability to cortical and subcortical grey matter on magnetic resonance imaging (MRI) in a large population-based sample of middle-aged and elderly subjects.
MATERIALS AND METHODS
Study population
This study was embedded within the Rotterdam Study, an ongoing prospective population-based cohort designed to investigate chronic diseases in the middle-aged and elderly population [29]. The cohort started in 1990 and comprised 7,983 participants aged ≥55 years. In 2000 and 2006, the study was expanded and at present comprises 14,926 participants aged ≥45 years. Since 2005, brain MRI has been part of the study protocol [30]. Between 2009 and 2014, 4,140 participants underwent brain MRI and cognitive testing. Examinations in this time period were conducted as one project. We excluded participants with prevalent dementia (n = 42), insufficient cognitive screening (n = 21), cortical infarcts (n = 103), or clinical stroke (n = 161). In total, 3,813 participants were available for analysis. The Rotterdam Study has been approved by the medical ethics committee according to the Population Study Act Rotterdam Study, executed by the Ministry of Health, Welfare and Sports of the Netherlands. Written informed consent was obtained from all participants.
MRI acquisition
Brain MRI was performed on a 1.5-T MRI scanner (Signa Excite II, General Electric Healthcare, Milwaukee, WI, USA) using an eight-channel head coil. The protocol included a T1-weighted sequence (T1), a proton density-weighted sequence, and a T2-weighted fluid-attenuated inversion recovery (FLAIR) sequence, as described in detail previously [30]. In brief, we performed a T1-weighted three-dimensional fast radiofrequency spoiled gradient-recalled acquisition in the steady state with an inversion recovery prepulse sequence (repetition time ms/echo time ms/inversion time ms, 13.8/2.8/400; field of view [FOV], 25 cm²; matrix, 416×256; 96 slices; section thickness, 1.6 mm [zero padded to 0.8 mm]).
Voxel based morphometry
VBM was performed according to an optimized VBM protocol [31], using the FSL VBM pipeline for image processing together with an in-house developed Python script for regression and permutation analysis, as previously described [32]. Briefly, all T1-weighted images were segmented into supratentorial grey matter, white matter, and cerebrospinal fluid using a previously described k-nearest neighbor algorithm, which was trained on six manually labeled atlases [33]. All grey matter density maps were non-linearly registered to the standard ICBM MNI152 grey matter template (Montreal Neurological Institute) with a 1×1×1 mm³ voxel resolution. A spatial modulation procedure was used to avoid differences in absolute grey matter volume due to the registration, followed by a smoothing procedure using a 3 mm (FWHM 8 mm) isotropic Gaussian kernel.
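For readers relating the two kernel sizes quoted above, the standard identity between a Gaussian's standard deviation and its full width at half maximum is FWHM = 2·sqrt(2·ln 2)·sigma, roughly 2.355·sigma. The sketch below only evaluates that identity. Reading the 3 mm figure as the kernel's sigma is an assumption made here, and the function names are illustrative rather than part of any pipeline named in the text.

```python
import math

# FWHM = 2 * sqrt(2 * ln 2) * sigma for any Gaussian kernel.
FWHM_PER_SIGMA = 2.0 * math.sqrt(2.0 * math.log(2.0))


def sigma_to_fwhm(sigma_mm: float) -> float:
    """Convert a Gaussian standard deviation (mm) to FWHM (mm)."""
    return FWHM_PER_SIGMA * sigma_mm


def fwhm_to_sigma(fwhm_mm: float) -> float:
    """Convert a Gaussian FWHM (mm) to its standard deviation (mm)."""
    return fwhm_mm / FWHM_PER_SIGMA


# Assumption: the 3 mm value is a sigma and the 8 mm value is an FWHM.
print(f"sigma 3 mm corresponds to FWHM {sigma_to_fwhm(3.0):.2f} mm")
print(f"FWHM 8 mm corresponds to sigma {fwhm_to_sigma(8.0):.2f} mm")
```

Smoothing software differs in which of the two quantities it expects as input, so it is worth checking which convention a given tool uses before comparing kernel sizes across studies.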
Subcortical shapes
The T1-weighted MRI scans were processed using FreeSurfer [34] (version 5.1) to obtain segmentations and volumetric summaries of the following seven subcortical structures for each hemisphere: nucleus accumbens, amygdala, caudate, hippocampus, pallidum, putamen, and thalamus [27]. Next, segmentations were processed using a previously described shape analysis pipeline [27, 35]. Briefly, a mesh model was created for the boundary of each structure. Subcortical shapes were registered using the “Medial Demons” framework, which matches shape curvatures and medial features to a pre-computed template [36]. The templates and mean medial curves were previously constructed and are distributed as part of the ENIGMA-Shape package (http://enigma.usc.edu/ongoing/enigma-shape-analysis/). The resulting meshes for the 14 structures consist of a total of 27,120 vertices [27]. Two measures were used to quantify shape: the radial distance and the natural logarithm of the Jacobian determinant. The radial distance represents the distance of the vertex from the medial curve of the structure. The Jacobian determinant captures the deformation required to map the subject-specific vertex to a template and indicates shape dilation due to sub-regional volume change [27]. The radial distance thus reflects how much thicker or thinner the local structure is, while the Jacobian determinant reflects how much larger or smaller it is volumetrically.
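The two shape measures can be illustrated on toy geometry. The sketch below is not the ENIGMA-Shape implementation: it assumes the medial curve is given as a list of sampled 3D points, and it stands in for the log-Jacobian with the log ratio of a subject triangle's area to the matching template triangle's area, a simplified surface analogue chosen here for illustration. The sign convention (positive when the subject is locally larger) is likewise an assumption, and all names are hypothetical.

```python
import math


def radial_distance(vertex, medial_curve_points):
    """Distance from a surface vertex to the nearest sampled medial-curve point."""
    return min(math.dist(vertex, p) for p in medial_curve_points)


def triangle_area(a, b, c):
    """Area of a 3D triangle from the cross product of two edge vectors."""
    ab = [b[i] - a[i] for i in range(3)]
    ac = [c[i] - a[i] for i in range(3)]
    cross = (
        ab[1] * ac[2] - ab[2] * ac[1],
        ab[2] * ac[0] - ab[0] * ac[2],
        ab[0] * ac[1] - ab[1] * ac[0],
    )
    return 0.5 * math.sqrt(sum(x * x for x in cross))


def log_area_ratio(subject_triangle, template_triangle):
    """Simplified stand-in for the log-Jacobian: log(subject area / template area).
    Positive values mean the subject surface is locally larger than the template."""
    return math.log(triangle_area(*subject_triangle) / triangle_area(*template_triangle))


# Toy example: a medial curve along the x-axis and one surface vertex.
curve = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(radial_distance((1.0, 3.0, 4.0), curve))

# A subject triangle that is the template triangle scaled by 2 in-plane.
template = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
subject = ((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0))
print(log_area_ratio(subject, template))
```

Working on the log scale makes local expansion and contraction symmetric around zero, which is convenient when the measure is used as the dependent variable in a regression.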
Assessment of cognitive functioning
Cognitive function was assessed with a cognitive test battery comprising the Stroop test [37] (word reading, color naming, and a reading/color naming interference task; error-adjusted time in seconds), which tests information processing speed and executive function; the 15-Word learning test (15-WLT) [38], which taps into immediate and delayed recall to investigate memory; and the Letter-digit substitution task (LDST) [39] and Word fluency test (WFT, animal categories), both of which test executive function. The three Stroop test scores were natural log transformed because of their skewed distribution. To allow for comparison across cognitive tests, we calculated z scores (subtracting the population mean and dividing by the standard deviation) for each cognitive test. The z scores for the Stroop tests were inverted because higher scores on the Stroop test indicate poorer performance, whereas higher scores on the other cognitive tests indicate better cognitive performance. In addition, we investigated global cognition by calculating a compound score (G-factor) using a principal component analysis on the delayed recall score of the 15-WLT, Stroop interference test, LDST, and WFT [40]. The G-factor explained 57.2% of the variance in cognitive test scores in the population.
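The scoring steps described above (log transform of Stroop times, z scores, sign inversion, and a first principal component as compound score) can be sketched as follows. This is a minimal illustration on simulated data, not the study's code: the simulated scores, the variable names, and the use of power iteration to obtain the first component are all assumptions made for this example.

```python
import math
import random
import statistics


def zscore(values):
    """Subtract the sample mean and divide by the sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def stroop_z(times_sec):
    """Log-transform skewed completion times, standardize, then flip the sign
    so that higher values mean better performance, as for the other tests."""
    return [-z for z in zscore([math.log(t) for t in times_sec])]


def first_principal_component(columns, n_iter=500):
    """Power iteration on the correlation matrix of standardized columns.
    Returns (loadings, share of variance explained, per-subject scores)."""
    k, n = len(columns), len(columns[0])
    corr = [[sum(a * b for a, b in zip(columns[i], columns[j])) / (n - 1)
             for j in range(k)] for i in range(k)]
    w = [1.0] * k
    for _ in range(n_iter):
        w = [sum(corr[i][j] * w[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
    eigenvalue = sum(w[i] * sum(corr[i][j] * w[j] for j in range(k))
                     for i in range(k))
    scores = [sum(w[i] * columns[i][s] for i in range(k)) for s in range(n)]
    # The trace of a correlation matrix equals k, so eigenvalue / k is the
    # share of total variance captured by this component.
    return w, eigenvalue / k, scores


# Simulated data (hypothetical): one latent ability drives four test scores.
rng = random.Random(42)
n_subjects = 300
ability = [rng.gauss(0, 1) for _ in range(n_subjects)]
recall = [g + rng.gauss(0, 1) for g in ability]
ldst = [g + rng.gauss(0, 1) for g in ability]
wft = [g + rng.gauss(0, 1) for g in ability]
# Stroop time in seconds: shorter is better, and the distribution is skewed.
stroop_sec = [math.exp(3.5 - 0.3 * (g + rng.gauss(0, 1))) for g in ability]

columns = [zscore(recall), stroop_z(stroop_sec), zscore(ldst), zscore(wft)]
loadings, explained, g_factor = first_principal_component(columns)
print("loadings:", [round(x, 2) for x in loadings])
print(f"variance explained by first component: {explained:.1%}")
```

Because the Stroop z scores are inverted before the principal component analysis, all four inputs point in the same direction and the component loadings share one sign, which keeps the compound score interpretable as "higher is better".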
Other measurements
Attained level of education was collected and expressed in years. Mild cognitive impairment (MCI) was defined as the presence of self-perceived cognitive complaints (defined as at least one of 6 questions on memory and daily functioning) and cognitive impairment as assessed with neuropsychological tests in the absence of dementia, in persons aged ≥60 years [41]. In total, N = 183 participants had a diagnosis of MCI. N = 800 participants aged <60 years (the lower age limit in our study for determining MCI [41]) or with no data available on MCI status were excluded. Regarding prevalent dementia, participants were screened for dementia at baseline and at follow-up examinations every 4 to 5 years using a 3-step protocol, which is described in detail elsewhere [42]. History of clinical stroke was assessed at baseline by interview and verified using medical records, and participants were continuously monitored for occurrence of incident stroke through computerized linkage of medical records from general practitioners and nursing home physicians with the study database [43]. According to our study protocol, all images were rated by a group of trained reviewers to determine the presence and location of cortical infarcts [30]. Infarcts showing involvement of grey matter were classified as cortical infarcts.
Statistical analysis
For VBM and shape analysis, linear regression models were fitted with age, sex, education, and cognitive test score as independent variables and the voxel or vertex measure as the dependent variable. We corrected for level of education as a measure of cognitive reserve. As a sensitivity analysis, we repeated the analyses after excluding all MCI cases. For both VBM and shape analysis, we computed the significance threshold with a nonparametric test based on 10,000 random permutations, performed separately for each voxel or vertex [44]. The minimum p-value from every permutation was collected, the values were sorted, and the 5% quantile (α = 0.05) was used as the p-value significance threshold, thereby controlling the family-wise error (FWE). The resulting thresholds were 2.99×10⁻⁷ for VBM and 9.63×10⁻⁶ for shapes, which were subsequently divided by the number of cognitive tests (n = 8) to account for multiple testing.
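The minimum-p permutation procedure can be sketched on simulated data as follows. This is an illustration under simplifying assumptions, not the study's script: it tests a plain correlation without the age, sex, and education covariates, approximates the t distribution with a normal distribution, uses 200 rather than 10,000 permutations, and all names and data are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean, stdev


def standardize(values):
    mean, sd = fmean(values), stdev(values)
    return [(v - mean) / sd for v in values]


def correlation(zx, zy):
    """Pearson correlation of two already-standardized vectors."""
    return sum(a * b for a, b in zip(zx, zy)) / (len(zx) - 1)


def approx_p_value(r, n):
    """Two-sided p-value for a correlation; the normal approximation to the
    t distribution is an assumption that suits large samples."""
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


def min_p_threshold(scores, voxels, n_perm=200, alpha=0.05, seed=0):
    """Shuffle the cognitive score, record the smallest p-value across all
    voxels for each shuffle, and return the alpha quantile of those minima."""
    rng = random.Random(seed)
    n = len(scores)
    z_scores = standardize(scores)
    z_voxels = [standardize(v) for v in voxels]
    min_ps = []
    for _ in range(n_perm):
        rng.shuffle(z_scores)  # breaks any real score-voxel link
        min_ps.append(min(approx_p_value(correlation(z_scores, zv), n)
                          for zv in z_voxels))
    min_ps.sort()
    return min_ps[max(0, int(alpha * n_perm) - 1)]


def bonferroni(threshold, n_tests):
    """Divide a threshold by the number of separately tested outcomes."""
    return threshold / n_tests


# Simulated null data (hypothetical): 100 subjects, 40 voxels.
rng = random.Random(1)
n_subjects, n_voxels = 100, 40
scores = [rng.gauss(0, 1) for _ in range(n_subjects)]
voxels = [[rng.gauss(0, 1) for _ in range(n_subjects)] for _ in range(n_voxels)]

fwe = min_p_threshold(scores, voxels)
print(f"FWE threshold for one cognitive test: {fwe:.2e}")
print(f"after dividing by 8 cognitive tests:  {bonferroni(fwe, 8):.2e}")
```

Taking the minimum across voxels within each permutation is what lets the threshold reflect the spatial correlation between neighbouring voxels, so it is typically less conservative than dividing α by the number of voxels.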
RESULTS
Characteristics of the study population are presented in Table 1. Of the 3,813 participants, 55.8% were women and mean age was 69.1 years (ranging from 51.9 to 97.9 years). Correlations between all cognitive test scores stratified by sex are shown in Supplementary Figure 1.
Table 1. Characteristics of the study population
Data presented as mean (standard deviation) for continuous variables and number (percentages) for categorical variables. The following variables had missing data: education (n = 52), Letter-digit substitution task (LDST, n = 98), Word fluency test (WFT, n = 42), Stroop reading (n = 96), Stroop naming (n = 97), Stroop interference (n = 105), Word learning test (WLT) delayed recall (n = 268), and WLT immediate recall (n = 269).
Voxel-based morphometry analysis
In total, 4,081 grey matter voxels were significantly associated with at least one cognitive test and/or global cognition after correction for multiple testing (Table 2). These significant voxels were clustered within different brain structures, and almost exclusively within the left hemisphere. The strongest positive associations of better global cognition (G-factor) with grey matter density were found in the left amygdala (156 voxels, minimum (min) p-value 4.2×10⁻¹²), hippocampus (173 voxels, min p-value 9.6×10⁻¹²), parietal lobule (517 voxels, min p-value 1.2×10⁻¹⁰), superior temporal gyrus (313 voxels, min p-value 1.5×10⁻¹⁰), insula (142 voxels, min p-value 7.4×10⁻¹⁰), posterior temporal lobe (101 voxels, min p-value 7.7×10⁻¹⁰), postcentral gyrus, inferior and middle frontal gyrus, posterior orbital gyrus, and right caudate nucleus (all <25 voxels, min p-value all <2.9×10⁻⁸) (Fig. 1).
Table 2. Association between cognitive tests and grey matter voxels
All associations are adjusted for age, sex, and education. G-factor, global cognition; WLT, Word learning test; WFT, Word fluency test; LDST, Letter-digit substitution task.

Fig. 1. Association of grey matter voxel density with global cognition. Lateral view of the left hemisphere. FWE-significant voxels, indicated by red-yellow, cluster in the insular cortex (I), Wernicke’s area (II), and the hippocampus (III). All associations are adjusted for age, sex, and education. Neurological orientation axial images: left = left.
With respect to the separate cognitive tests, a higher score on the delayed recall task of the 15-WLT was positively associated with grey matter density in parts of the left hippocampus (466 voxels, min p-value 4.9×10⁻¹⁰), amygdala (385 voxels, min p-value 1.1×10⁻⁹), insula (10 voxels, min p-value 1.7×10⁻⁸), and parahippocampal gyrus (1 voxel, min p-value 3.5×10⁻⁸). A higher immediate recall score on the 15-WLT was positively related to grey matter density in a small portion of the left hippocampus (6 voxels, min p-value 2.9×10⁻⁸). The WFT was associated with the superior temporal gyrus (103 voxels, min p-value 8.8×10⁻⁹) and parietal lobule (81 voxels, min p-value 8.8×10⁻⁹). We did not observe any association with the Stroop Reading test or the Stroop Color Naming test that survived correction for multiple testing. The Stroop interference task showed significant associations in the left hemisphere, including the superior temporal gyrus (442 voxels, min p-value 5.3×10⁻¹¹), parietal lobule (655 voxels, min p-value 5.7×10⁻¹¹), postcentral gyrus (80 voxels, min p-value 1.9×10⁻¹⁰), posterior temporal lobe (152 voxels, min p-value 2.0×10⁻¹⁰), insula (137 voxels, min p-value 2.5×10⁻¹⁰), amygdala (46 voxels, min p-value 9.2×10⁻¹⁰), hippocampus, and caudate nucleus (both <25 voxels, min p-value 1.6×10⁻⁸). The LDST was associated with a small cluster of grey matter voxels in the left insula (7 voxels, min p-value 2.6×10⁻⁸) and inferior frontal gyrus (1 voxel, min p-value 2.7×10⁻⁸). All of these results are depicted in Fig. 2A-E.
Shape analysis

Fig. 2. FWE-significant grey matter voxels in relation to cognitive tests. Significant voxels are indicated by red-yellow. Neurological orientation: left = left. All associations are adjusted for age, sex, and education. WLT, Word learning test; LDST, Letter-digit substitution task.
Supplementary Fig. 2 shows the correlation values between Jacobian determinant and radial distance within each structure. Jacobian determinant and radial distance showed clusters of significant FWE-corrected vertices in relation to cognitive tests, distributed among the left and right hemisphere: 2,819 and 2,298, respectively (Supplementary Table 2). The thalamus, caudate nucleus, and putamen showed the most significant associations (Supplementary Table 2). The largest significant clusters were found for the Jacobian determinant of the left and right thalamus with the Stroop interference task (369 vertices, min p-value 5.8×10⁻¹¹ and 324 vertices, min p-value 6.1×10⁻¹², respectively). Global cognition showed several significant associations, including with the Jacobian determinant of the left and right thalamus (281 vertices, min p-value 4.4×10⁻¹² and 159 vertices, min p-value 1.9×10⁻⁹, respectively) and the radial distance of the left and right caudate nucleus (133 vertices, min p-value 2.9×10⁻¹⁶ and 78 vertices, min p-value 2.9×10⁻¹³, respectively) (Fig. 3). A few inverse associations were observed, primarily between the Jacobian determinant of the caudate nucleus and the WLT (both delayed and immediate recall) and the G-factor. Small clusters of vertices (ranging from 1 to 22) were found in the hippocampus with global cognition, WFT, and the Stroop interference task, but not with the memory tests. Supplementary Figure 3A-F shows all significant findings of the shape analysis of subcortical brain structures in relation to the individual cognitive tests.

Fig. 3. Maps of shape measures of subcortical brain regions in relation to global cognition. Maps show the associations of seven bilateral subcortical structures for the shape measures of Jacobian determinant (A) and radial distance (B), anterior (top row) and posterior (bottom row) view. All associations are adjusted for age, sex, and education. Color map represents the t-statistics and shows the direction of association, with red and blue indicating negative and positive associations, respectively. Highlighted regions represent statistically significant vertices.
The sensitivity analysis, in which the N = 183 MCI cases (out of N = 3,013 participants) were excluded and all analyses were repeated, revealed similar results, although with less statistical significance, as expected given the lower power (Supplementary Table 2 and Supplementary Figures 4 and 5).
DISCUSSION
In this large study of community-dwelling adults, we presented the neuroanatomical fine-mapping of seven cognitive tests and global cognition using VBM and subcortical shape analysis. We found that the different cognitive tests were significantly associated with grey matter density in different brain regions, primarily in the left hemisphere. Moreover, many of the associated regions showed overlap between cognitive tests. Subcortical shape analysis revealed associations primarily within the head and tail of the caudate nucleus, putamen, ventral part of the thalamus, and nucleus accumbens, more equally distributed between the left and right hemispheres. Within the caudate nucleus, both positive (head) and negative (tail) associations were observed with global cognition.
Regarding the VBM analysis, we observed three clusters of grey matter density to be associated with global cognition. These clusters were found in the left amygdala, hippocampus, parietal lobule, insula, posterior temporal lobe, inferior and middle frontal gyrus, postcentral gyrus, and posterior orbital gyrus. Importantly, each of these three clusters was located within multiple anatomic regions. This emphasizes the importance of investigating the association with cognition beyond anatomically defined regions. Global cognition represents the shared variance of the individual cognitive tests, so it is unsurprising that the three significant clusters are also significantly associated with the separate tests. We will therefore discuss our findings in more detail per individual cognitive test below.
Memory research has a long history [45]. The medial temporal lobes, and in particular the hippocampus, have long been implicated in episodic memory, with visuospatial memory predominantly associated with the right and verbal memory with the left hippocampus [46–48]. In line with this, we found that the 15-Word Learning test, with delayed recall more pronounced than immediate recall, was associated with grey matter density in the left hippocampus in particular, as well as in the left amygdala. For decades there has been debate over whether the amygdala is involved in memory [49]. Task-based and resting-state functional MRI studies suggest that the amygdala plays a role in emotion-related memory. However, its role in episodic memory is less well known [50]. We did not observe an association between the shape of the hippocampus and cognitive tests measuring memory function. As was shown in a previous study, the shape of subcortical structures represents a complementary phenotype compared with volumetric measures alone, with its own genetic architecture [27]. Therefore, the absence of a signal may arise because hippocampal shape reflects a different functional specialization that is less sensitive to associations with cognition. Alternatively, our previous work has shown that the reproducibility of the subcortical shape of the hippocampus is relatively poor compared to other subcortical structures. This may have caused the hippocampus to be less sensitive to associations with cognition [27].
The Stroop interference test, WFT, and LDST all tap into executive function. The Stroop interference task has been used extensively in studies designed to explore the efficiency of controlled attentional processes [51]. The Stroop effect reflects the slowing of response time or increase in error rate when persons are required to respond with the identity of an incongruent stimulus relative to a congruent stimulus [52]. Interestingly, we observed that the Stroop interference task was positively associated with grey matter density in the left hippocampus. Over the past decades, there has been increasing interest in the contribution of the hippocampus to processes beyond the memory domain [53]. A study in healthy subjects explored the role of the hippocampus in response conflict in the Stroop task by combining intracranial electroencephalography with region of interest-based functional MRI. The researchers found that the hippocampus is recruited during response conflict. Importantly, it remains questionable whether conflict processing can be disentangled from circumstances in which there is conflicting valence or perceptual information, even in experimental studies that thoroughly control for the effect of memory [54]. Moreover, the WFT and Stroop interference test showed clusters of significant grey matter voxels in the left hemisphere where the posterior frontal lobe, upper segment of the temporal lobe, and parietal lobule (including the supramarginal and angular gyrus) intersect. These brain areas are part of Wernicke’s area, a well-known functional language area [55]. The WFT used in the current study tests semantic fluency. Semantic fluency requires searching for semantic associations within the lexicon [56]. Lower scores on semantic fluency tests may therefore also reflect problems with semantic memory, and not only executive function. In line with this, we found the WFT to be associated with the left hippocampus, although the association was not significant after correction for multiple testing.
The LDST and Stroop interference test both showed associations with the left insula, more specifically the dorsal anterior insula. The left insular cortex is involved in consciousness and plays a role in diverse cognitive functions [57] such as higher cognitive processing and social-emotional processing [58]. The anterior insular cortices are among the most commonly activated brain regions across all cognitive tasks [59]. They are also considered to be part of the cognitive control network, and it has been hypothesized that this network might form a pathway by which information in the insula can affect decision making and thereby influence information processing speed [60, 61].
In line with the literature, our shape analysis results indicate that subcortical structures are heterogeneous and consist of functionally diverging sub-regions [20, 62]. This is illustrated by, e.g., the caudate nucleus, whose head and tail differ in their associations with global cognition. It is thought that the head of the caudate nucleus interacts with medial, ventral, and dorsolateral prefrontal cortex as part of the ‘cognitive’ corticostriatal loop, whereas the tail interacts with inferior temporal areas and appears to be involved in visual stimulus processing [63–65]. In addition, our results suggest that the shape of the other subcortical structures is involved in cognition as well, emphasizing the importance of subcortical shape analysis in understanding cognition.
Strengths of our study include the large sample size, the population-based setting, and the hypothesis-free approach, which enabled fine-mapping of cognition to grey matter. Some limitations deserve to be acknowledged. First, because of the cross-sectional design, no conclusions can be drawn regarding the direction of causality of the associations. Second, our cognitive test battery, limited in time because of the population-based nature, yielded a less extensive cognitive assessment compared to other studies in smaller samples. Third, the study population mainly consists of Caucasians; therefore, the generalizability to other ethnicities is limited. Fourth, the mean age in our study sample was 69.1 years, which may hamper generalizability to younger or older populations. Finally, it is well known that several cognitive processes are lateralized to a functionally dominant hemisphere, and it would therefore have been interesting to investigate handedness as an effect modifier. Unfortunately, we did not have a reliable measure of handedness in our study.
In conclusion, in this population-based study of nearly 4,000 subjects, we mapped cognitive ability to grey matter using the hypothesis-free approaches of VBM and shape analysis. We made the maps of association publicly available (https://neurovault.org/) for any researcher to explore the results, to contrast their own findings against, or to interpret beyond the strict p-value threshold used here. Our results suggest that a more fine-grained analysis of brain structure adds to our understanding of cognitive function in normal aging. Future research could utilize our findings to further disentangle the dynamics of cognitive aging and the involved brain regions. Additionally, longitudinal assessment of cognitive functioning and grey matter atrophy is needed to study causality.
ACKNOWLEDGMENTS
The Rotterdam Study is funded by Erasmus Medical Center and Erasmus University, Rotterdam, Netherlands Organization for the Health Research and Development (ZonMw), the Research Institute for Diseases in the Elderly (RIDE), the Ministry of Education, Culture and Science, the Ministry for Health, Welfare and Sports, the European Commission (DG XII), and the Municipality of Rotterdam. The authors are grateful to the study participants, the staff from the Rotterdam Study and the participating general practitioners and pharmacists.
