Improving Functional Connectome Fingerprinting with Degree-Normalization

Abstract

Background:

Functional connectivity quantifies the statistical dependencies between the activity of brain regions, measured using neuroimaging data such as functional magnetic resonance imaging (fMRI) blood-oxygenation-level dependent time series. The network representation of functional connectivity, called a functional connectome (FC), has been shown to contain an individual fingerprint allowing participant identification across consecutive testing sessions. Recently, researchers have focused on the extraction of these fingerprints, with potential applications in personalized medicine.

Materials and Methods:

In this study, we show that a mathematical operation called degree-normalization can improve the extraction of FC fingerprints. Degree-normalization has the effect of reducing the excessive influence of strongly connected brain areas in the whole-brain network. We adopt the differential identifiability framework and apply it to both original and degree-normalized FCs of 409 individuals from the Human Connectome Project, in resting-state and seven fMRI tasks.

Results:

Our results indicate that degree-normalization systematically improves three fingerprinting metrics, namely differential identifiability, identification rate, and matching rate. Moreover, the results related to the matching rate metric suggest that individual fingerprints are embedded in a low-dimensional space.

Discussion:

The results suggest that low-dimensional functional fingerprints lie in part in weakly connected subnetworks of the brain and that degree-normalization helps uncover them. This work introduces a simple mathematical operation that could lead to significant improvements in future FC fingerprinting studies.

Impact statement

We introduce a simple mathematical operation that systematically improves the extraction of functional connectivity fingerprints from neuroimaging data, according to three different metrics. The results suggest that the information related to individual traits lies in part in weakly connected brain areas and can be compressed in a low-dimensional space. We also show the benefits of using multiple metrics to quantify fingerprints in a data set. Our approach could improve future individual-level studies of functional neuroimaging data, which are crucial for the personalized diagnosis and treatment of neurological disorders, as well as for the study of the relationship between brain and behavior.

Introduction

The study of brain functional connectivity aims to understand how distributed neural regions of interest (ROIs) interact with each other during resting-state and task conditions (Bullmore and Sporns, 2009; Fornito et al., 2016). Thanks to advances in functional magnetic resonance imaging (fMRI), the measurement of blood-oxygenation-level dependent (BOLD) signals provides an estimate of brain activity across conditions (Ogawa et al., 1990). In this context, a widespread approach to quantify functional connectivity is to compute pairwise Pearson's correlation coefficients between BOLD time series measured at each ROI. The resulting symmetric correlation matrix is referred to as a functional connectome (FC) and can be understood as the adjacency matrix of a network where nodes are ROIs and edges represent functional interactions between those ROIs (Bullmore and Sporns, 2009; Fornito et al., 2016).

The network analysis of brain connectivity is able to capture important features of cortical organization, such as integration and segregation (Bullmore and Sporns, 2009; Shine et al., 2016, 2018, 2019), as well as modularity and community structure (Betzel et al., 2016; Betzel et al., 2019; Puxeddu et al., 2020; Sporns and Betzel, 2016). Furthermore, FCs have been used in the study of several brain disorders (Fornito et al., 2015), such as schizophrenia (Gutiérrez-Gómez et al., 2020; Lynall et al., 2010; Micheloyannis et al., 2006) and Alzheimer disease (Supekar et al., 2008; Svaldi et al., 2019).

Several studies demonstrated the existence of a fingerprint embedded in individual-level neuroimaging data, allowing participant identification in test–retest settings. Various types of descriptors have been investigated to uncover brain fingerprints, ranging from anatomical features (Valizadeh et al., 2018) and morphometric measures (Wachinger et al., 2015) to white matter fiber trajectories (Kumar et al., 2017) and multimodal embeddings (Kumar et al., 2018). In parallel to these nonconnectomic studies, a growing interest in the brain fingerprint specific to FCs has emerged (Finn et al., 2015; Gratton et al., 2018; Iturria-Medina et al., 2018; Liu et al., 2018; Mars et al., 2018; Menon and Krishnamurthy, 2019; Pallarés et al., 2018; Satterthwaite et al., 2018; Seitzman et al., 2019). This fingerprint can be extracted through data-driven procedures (Amico and Goñi, 2018; Byrge and Kennedy, 2019) as well as reproduced across sites (Bari et al. 2019). These findings have important implications in the perspective of individual-level functional connectivity analysis. For instance, personalized medicine in the study of brain disorders can benefit from FCs revealing robust individual traits (Iturria-Medina et al., 2018; Svaldi et al., 2019). Moreover, participant identifiability across test–retest fMRI sessions has recently been proposed as an indicator of scan reliability for both researchers and clinicians, provided that appropriate and possibly complementary fingerprinting metrics are investigated (Milham et al., 2021). In sum, uncovering meaningful brain fingerprints enables a shift from population-level research to individual-based scientific investigation and clinical examination.

As recently shown (Rajapandian et al., 2020), fingerprints are also reflected in different network properties of FCs. To characterize the topology of functional networks, numerous network statistics have been introduced (Rubinov and Sporns, 2010). One of the most fundamental measures for binary networks is the degree of a node, that is, the number of nodes it is connected to. In weighted networks, the weighted degree (or the strength) of a node is the sum of the weights of its neighboring edges. The weighted degree sequence denotes the vector gathering the weighted degree of all nodes in the network.

In this study, we show the benefits of applying a mathematical operation, known as degree-normalization, to FCs before extracting functional connectivity fingerprints. Degree-normalization uses the information encoded in the weighted degree sequence to reduce the weight of edges lying between strongly connected nodes (hubs) comparatively to others, thereby balancing their excessive influence in the network. This operation has been applied in previous studies on weighted communicability measures of networks (Crofts and Higham, 2009; Estrada et al., 2012; Rajapandian et al., 2020) as well as in the study of random walks on networks through the use of the normalized Laplacian (Lambiotte et al., 2014). We adopt the differential identifiability framework recently developed by Amico and Goñi (2018) for FC fingerprinting based on a principal components analysis (PCA) decomposition-reconstruction procedure. Because computing the absolute value of FCs is an intermediate step required before applying degree-normalization, we compare the results of this framework applied on (1) the original (signed) FCs, (2) the FCs taken in absolute value, and (3) the degree-normalized FCs. To assess the quality of the fingerprint extraction, we consider two previously introduced metrics, namely differential identifiability (Amico and Goñi, 2018) and identification rate (Finn et al., 2015), and we introduce a variant of the latter called matching rate. Our results show that degree-normalization improves the fingerprinting scores for all metrics and that reconstructing the corresponding optimally identifiable FCs requires fewer principal components compared with original FCs. We also highlight the difference in the interpretation of the identification rate and the matching rate and argue that the latter provides a more robust depiction of the individual fingerprint in FCs.

Materials and Methods

Data set

We included 409 unrelated individuals from the Human Connectome Project (HCP) 1200-participants release (Essen et al., 2013). This subset of unrelated individuals was chosen from the overall data set to ensure that no two participants have a shared parent. The criterion to exclude siblings (whether they share one or both parents) was crucial to avoid confounding effects in our analyses due to family structure. Data from resting-state (REST) and seven fMRI tasks were used: emotion processing, gambling, language, motor, relational processing, social cognition, and working memory. In this study, we will collectively refer to the resting-state and all the tasks as conditions.

For each condition, subjects underwent two sessions corresponding to two different phase-encoding directions (left-to-right and right-to-left). The resting-state fMRI scans were acquired on two different days with a total of four sessions (coded as REST1 and REST2). In this study, we used the two sessions from REST1. The HCP scanning protocol was approved by the institutional review board at Washington University in St. Louis. Full details on the HCP data set have been published previously (Essen et al., 2012; Glasser et al., 2013; Smith et al., 2013).

The brain atlas used in this study is the multimodal parcellation MMP1.0 proposed by Glasser et al. (2016), comprising 180 cortical regions per hemisphere. For completeness, we added 14 subcortical regions (covering the bilateral striatum, thalamus, hippocampus, and amygdala) provided by the HCP release, for a total of $N = 374$ ROIs.

Preprocessing

We used the minimally preprocessed data provided by the HCP (Glasser et al., 2013). This pipeline includes artifacts removal, motion correction, and registration to standard template. Full details on this pipeline can be found in earlier publications (Glasser et al., 2013; Smith et al., 2013).

In addition, we applied the following processing steps to the extracted BOLD signals. For resting-state fMRI data: (1) we regressed out the global gray matter signal from the voxel time courses (Power et al., 2014), (2) we applied a bandpass first-order Butterworth filter in the forward and reverse directions (0.001 to 0.08 Hz; Python function filtfilt from the Scipy package v1.2.1), and (3) the voxel time courses were z-scored and then averaged per brain region, excluding any outlier time points that were outside of three standard deviations from the mean (Workbench software, command -cifti-parcellate). For task fMRI data, we applied the same steps, with a more liberal frequency range for the bandpass filter (0.001 to 0.25 Hz) since the relationship between different tasks and optimal frequency ranges is still unclear (Cole et al., 2014).

Degree-normalization of an FC

We compute a functional connectivity matrix $F C$ as the $N \times N$ matrix of pairwise, zero-lag Pearson's correlation coefficients between the N regional BOLD time series: $F C = [F C_{i j}]$ (1)

where $F C_{i j} \in [- 1, 1]$ and $F C_{i j} = F C_{j i}$ . Without loss of generality, we ignore self-loops in the functional network by setting $F C_{i i} = 0$ . This matrix, which we denote as the Baseline $F C$ , can be directly treated as the adjacency matrix of a weighted, undirected and signed network, as done in previous fingerprinting studies (Finn et al., 2015; Amico and Goñi, 2018). In the present work, we also consider the unsigned version to avoid the occurrence of complex numbers due to the degree-normalization. This is done by taking the entry-wise absolute value of correlation coefficients in $F C$ . We denote this as the Absolute $F C$ , $|F C|$ , with all entries verifying ${|F C|}_{i j} \in [0, 1]$ .

The degree $d_{i}$ of node i of an unsigned network is defined as the sum of the weights of its neighboring edges: $d_{i} = \sum_{j = 1}^{N} {|F C|}_{i j}$ (2)

The degree matrix $D$ is the $N \times N$ matrix containing the degree sequence on its diagonal, and zeros elsewhere: $D_{i i} = d_{i}$ (3) $D_{i j} = 0, \forall i \neq j .$ (4)

The degree-normalization of $|F C|$ is mathematically defined as follows: $ℱ C = D^{- 1 ∕ 2} |F C| D^{- 1 ∕ 2}$ (5)

The resulting matrix $ℱ C$ is symmetric and corresponds to the adjacency matrix of the Normalized $F C$ (Crofts and Higham, 2009; Estrada et al., 2012) where any excessive influence of nodes has been modulated by their corresponding weighted degree. Figure 1 summarizes the degree-normalization procedure. It is worth noting that degree-normalization on signed networks would potentially involve negative node degrees [Eq. (2)], which would in turn generate complex entries in the normalized FCs [Eq. (5)]. For this reason, we restrict our analysis to the degree-normalization of unsigned FCs, that is, FCs taken in absolute value.
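The procedure above can be sketched in a few lines of NumPy. The function name `degree_normalize` and the synthetic time series are illustrative assumptions; the sketch also assumes that every node has a strictly positive weighted degree.

```python
import numpy as np

def degree_normalize(fc):
    """Return D^{-1/2} |FC| D^{-1/2} for a symmetric correlation matrix."""
    abs_fc = np.abs(np.asarray(fc, dtype=float))  # Absolute FC (new array)
    np.fill_diagonal(abs_fc, 0.0)                 # ignore self-loops
    degree = abs_fc.sum(axis=1)                   # weighted degree d_i, Eq. (2)
    inv_sqrt_d = 1.0 / np.sqrt(degree)            # assumes every d_i > 0
    # Entry (i, j) becomes |FC|_ij / sqrt(d_i * d_j), which equals Eq. (5).
    return abs_fc * np.outer(inv_sqrt_d, inv_sqrt_d)

rng = np.random.default_rng(0)
bold = rng.standard_normal((5, 200))  # synthetic: 5 regions x 200 time points
fc = np.corrcoef(bold)                # Baseline FC (Pearson correlations)
norm_fc = degree_normalize(fc)        # Normalized FC
```

Because $D$ is diagonal, the two matrix products in Eq. (5) reduce to an element-wise rescaling, which avoids building $D$ explicitly.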

FIG. 1.

Degree-normalization of an FC. (A) An FC is computed as a matrix of pairwise Pearson's correlation coefficients between regional BOLD time series. Hence, all values in the Baseline FC are within the range [−1, 1]. (B) The next step consists of taking the absolute value of all entries, which produces the Absolute FC, denoted by $|F C|$ . (C) From that unsigned FC, we can extract the weighted degree sequence. (D) The degree matrix is a square matrix containing the weighted degree sequence on its diagonal and zeros elsewhere. (E) Finally, we apply degree-normalization [Eq. (5)] to obtain the Normalized FC. BOLD, blood-oxygenation-level dependent; FC, functional connectome. Color images are available online.

FC fingerprinting

We analyze each fMRI condition separately. To quantify the variability of our results in the population, we use sampling without replacement. We generate 100 random subsamples of the 409 individuals in the database to obtain 100 data sets containing $K = 327$ (80% of 409) different individuals. For each condition, the data set is composed of $2 K = 654$ FCs, that is, two FCs per individual corresponding to the two fMRI phase-encoding directions. Thus, we have for each individual a test FC and a retest FC. To extract functional connectivity fingerprints from this data set, we adopt the differential identifiability framework based on group-level PCA (Amico and Goñi, 2018). In summary, the procedure consists of vectorizing the upper-triangular part (excluding diagonal values) of all FCs in the data set, and then gathering these vectors in a data matrix of $\frac{N (N - 1)}{2}$ rows associated to FC entries, and $2 K$ columns associated to test–retest scans of each individual. Following the PCA decomposition of this matrix, FCs are reconstructed using an incrementally increasing number of components, selected in decreasing order of explained variance.
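A minimal sketch of the subsampling, vectorization, and PCA reconstruction steps is given below. It is not the reference implementation of the framework: the helper names are hypothetical, the matrices are synthetic, and centering each FC entry across scans before the decomposition is an assumption of this sketch.

```python
import numpy as np

def vectorize_fcs(fcs):
    """Stack the strict upper triangle of each FC as one column."""
    n = fcs.shape[1]
    iu = np.triu_indices(n, k=1)  # excludes the diagonal
    return np.stack([fc[iu] for fc in fcs], axis=1)  # (N(N-1)/2, n_scans)

def pca_reconstruct(data, n_components):
    """Rebuild the data matrix from its first n_components principal components."""
    mean = data.mean(axis=1, keepdims=True)  # assumption: center each FC entry
    u, s, vt = np.linalg.svd(data - mean, full_matrices=False)
    m = n_components  # singular values are already sorted in decreasing order
    return (u[:, :m] * s[:m]) @ vt[:m] + mean

rng = np.random.default_rng(0)
n_regions, n_subjects = 8, 10
# Synthetic symmetric matrices: first n_subjects are "test", the rest "retest".
raw = rng.standard_normal((2 * n_subjects, n_regions, n_regions))
fcs = (raw + raw.transpose(0, 2, 1)) / 2
# Draw 80% of the individuals without replacement, as in one subsample.
subsample = rng.choice(n_subjects, size=8, replace=False)
cols = np.concatenate([subsample, subsample + n_subjects])
data = vectorize_fcs(fcs[cols])          # 28 rows (FC entries) x 16 columns
full = pca_reconstruct(data, data.shape[1])
partial = pca_reconstruct(data, 3)
```

Keeping all components recovers the data matrix, whereas a smaller number of components yields a low-rank approximation whose columns can be reshaped back into reconstructed FCs.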

For each number of components, we compute the identifiability matrix $A \in {[- 1, 1]}^{K \times K}$ . The element $A_{i j}$ is the entry-wise Pearson's correlation coefficient between the test FC of individual i and the retest FC of individual j. Therefore, the diagonal elements $A_{i i}$ represent the individuals' self-similarity between test and retest, whereas off-diagonal elements represent between-individuals similarities. Importantly, this means that $A$ is not symmetric. Intuitively, the higher the contrast between diagonal and off-diagonal elements, the better the extracted fingerprints.
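The identifiability matrix can be computed directly from the vectorized FCs. The sketch below uses a hypothetical helper `identifiability_matrix` and synthetic vectors in which each individual shares a common pattern between test and retest.

```python
import numpy as np

def identifiability_matrix(test_vecs, retest_vecs):
    """A[i, j] = Pearson correlation between test FC i and retest FC j.

    Both inputs have shape (n_edges, K), one vectorized FC per column.
    """
    k = test_vecs.shape[1]
    # np.corrcoef treats rows as variables, so each FC is passed as a row.
    corr = np.corrcoef(test_vecs.T, retest_vecs.T)  # (2K, 2K) block matrix
    return corr[:k, k:]                             # test-versus-retest block

rng = np.random.default_rng(0)
k, n_edges = 4, 50
pattern = rng.standard_normal((n_edges, k))  # synthetic individual patterns
test = pattern + 0.3 * rng.standard_normal((n_edges, k))
retest = pattern + 0.3 * rng.standard_normal((n_edges, k))
a = identifiability_matrix(test, retest)
```

Rows index test FCs and columns index retest FCs, which is why $A_{i j}$ and $A_{j i}$ generally differ.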

Quantifying the level of identifiability

We consider three metrics to estimate the amount of fingerprint in each subsample: the differential identifiability ( $I_{d i f f}$ ), the identification rate ( $I D_{r a t e}$ ), and the matching rate ( $M_{r a t e}$ ). Let $I_{s e l f} = ⟨ A_{i i} ⟩$ denote the average of the diagonal elements of the identifiability matrix and let $I_{o t h e r s} = ⟨ A_{i j} ⟩$ , $i \neq j$ be the average of the off-diagonal elements. The differential identifiability score ( $I_{d i f f}$ ) (Amico and Goñi, 2018) is then defined as: $I_{d i f f} = (I_{s e l f} - I_{o t h e r s}) \times 100 .$ (6)
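As a worked example of Eq. (6), the following sketch evaluates the score on a small hand-made identifiability matrix. The function name and the matrix values are illustrative only.

```python
import numpy as np

def differential_identifiability(a):
    """I_diff = 100 * (mean of diagonal - mean of off-diagonal), Eq. (6)."""
    k = a.shape[0]
    i_self = np.trace(a) / k                            # average of A_ii
    i_others = (a.sum() - np.trace(a)) / (k * (k - 1))  # average of A_ij, i != j
    return 100.0 * (i_self - i_others)

# Toy 3 x 3 identifiability matrix (made-up values).
a = np.array([[0.9, 0.2, 0.3],
              [0.1, 0.8, 0.4],
              [0.2, 0.3, 0.7]])
i_diff = differential_identifiability(a)
# Here I_self = 0.8 and I_others = 1.5 / 6 = 0.25, so I_diff is about 55.
```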

Each time a diagonal element $A_{i i}$ is the highest of its row, we state that individual i's retest FC has been correctly identified on the basis of their test FC. The identification rate (Finn et al., 2015) is then: $I D_{r a t e} = \frac{\text{Number of correctly identified individuals}}{\text{Total number of individuals}} .$ (7)

As we can also compute this metric column-wise (i.e., test FC identified from retest FC), we report the average of row-wise and column-wise $I D_{r a t e}$ . Note that as per (Finn et al., 2015), $I D_{r a t e}$ is a procedure with replacement, such that the algorithm was not forced to identify a unique subject on each iteration within a condition.
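The averaged row-wise and column-wise identification rate can be sketched as follows, using a hypothetical function name and a made-up identifiability matrix.

```python
import numpy as np

def identification_rate(a):
    """Average of the row-wise and column-wise identification rates, Eq. (7)."""
    idx = np.arange(a.shape[0])
    # Row-wise: retest FC identified from test FC (argmax along each row).
    row_rate = np.mean(np.argmax(a, axis=1) == idx)
    # Column-wise: test FC identified from retest FC (argmax along each column).
    col_rate = np.mean(np.argmax(a, axis=0) == idx)
    return 0.5 * (row_rate + col_rate)

# Toy matrix: the test FC of individual 1 is most similar to retest FC 0.
a = np.array([[0.90, 0.2, 0.3],
              [0.85, 0.8, 0.4],
              [0.20, 0.3, 0.7]])
id_rate = identification_rate(a)
# Row-wise rate is 2/3 (row 1 fails), column-wise rate is 1, average is 5/6.
```

Since each row and each column is treated independently, the same FC can be selected several times, which is the "with replacement" behavior noted above.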

It might happen that the test FC of an individual i is most similar not only to its own retest FC but also to that of other individuals. In the extreme case of an FC being highly similar to many others, this will negatively impact the identification rate since many individuals will not be correctly identified. To remedy this, we propose a variant of identification rate, called matching rate ( $M_{r a t e}$ ), where every time an FC from the test session is matched with a retest FC (or vice versa) using the highest value of correlation along a row (or column) of an identifiability matrix, the matched test–retest pair is removed before the next comparison is made. In other words, $M_{r a t e}$ is equivalent to $I D_{r a t e}$ but without replacement. This way, all FCs are matched only once, no matter if they are similar to many others or not.
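One possible reading of the matching procedure is sketched below. The order in which rows (or columns) are visited is not specified in the text, so this sketch assumes index order; the function names and the toy matrix are hypothetical, and averaging the two directions mirrors what is done for the identification rate.

```python
import numpy as np

def matching_rate_rows(a):
    """Greedy row-wise matching without replacement (rows visited in index order)."""
    k = a.shape[0]
    available = np.ones(k, dtype=bool)  # retest FCs not matched yet
    correct = 0
    for i in range(k):
        # Already matched retest FCs are masked out before taking the maximum.
        scores = np.where(available, a[i], -np.inf)
        j = int(np.argmax(scores))
        available[j] = False            # remove the matched retest FC
        correct += int(j == i)
    return correct / k

def matching_rate(a):
    """Average of the row-wise and column-wise matching rates (assumption)."""
    return 0.5 * (matching_rate_rows(a) + matching_rate_rows(a.T))

# Toy matrix in which retest FC 0 is highly similar to every test FC.
a = np.array([[0.90, 0.2, 0.3],
              [0.85, 0.8, 0.4],
              [0.95, 0.3, 0.7]])
row_matching = matching_rate_rows(a)
row_identification = np.mean(np.argmax(a, axis=1) == np.arange(3))
```

In this toy case, the row-wise identification rate is 1/3 because retest FC 0 attracts every row, whereas the row-wise matching rate is 1 since that FC can only be matched once.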

Control experiment: surrogate degree-normalization

In the present work, we evaluate the impact of normalizing each FC by its own degree sequence. As a control experiment, we also report the results of normalizing each FC by the degree sequence of a surrogate individual chosen uniformly at random, a process denoted as surrogate degree-normalization. Mathematically, this comes down to performing the fingerprinting analysis with the following normalized FCs for individual u with surrogate v: $ℱ C_{u, s u r r} = D_{v}^{- 1 ∕ 2} {|F C|}_{u} D_{v}^{- 1 ∕ 2} .$ (8)

Here, ${|F C|}_{u}$ is the absolute FC of individual u, $D_{v}$ is the degree matrix of individual v, and $ℱ C_{u, s u r r}$ is the surrogate-normalized FC of individual u. Surrogate individuals were assigned uniformly at random, with the constraint that an individual cannot be associated with its own degree sequence. This permutation was preserved for both test and retest FCs, as well as across fMRI conditions. The ordering of brain regions in the surrogate degree sequence was preserved. In the article, normalizing an FC by its own degree sequence is sometimes referred to as self degree-normalization to avoid any ambiguity with surrogate degree-normalization.
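The surrogate assignment and Eq. (8) can be sketched as follows. Rejection sampling is one simple way to draw a permutation without fixed points and is not necessarily the method used here; the function names and the synthetic matrices are assumptions.

```python
import numpy as np

def random_derangement(k, rng):
    """Uniform random permutation of 0..k-1 with no fixed point (requires k >= 2)."""
    idx = np.arange(k)
    while True:
        perm = rng.permutation(k)
        if not np.any(perm == idx):  # reject permutations mapping u to itself
            return perm

def surrogate_normalize(abs_fc_u, abs_fc_v):
    """Eq. (8): normalize |FC|_u by the degree sequence of surrogate v."""
    d_v = abs_fc_v.sum(axis=1)       # weighted degrees of the surrogate
    inv_sqrt = 1.0 / np.sqrt(d_v)    # assumes strictly positive degrees
    return abs_fc_u * np.outer(inv_sqrt, inv_sqrt)

rng = np.random.default_rng(0)
k, n = 6, 5
# Synthetic absolute FCs: symmetric, non-negative, zero diagonal.
raw = rng.uniform(0.05, 1.0, size=(k, n, n))
abs_fcs = (raw + raw.transpose(0, 2, 1)) / 2
for m in abs_fcs:
    np.fill_diagonal(m, 0.0)

surrogate_of = random_derangement(k, rng)  # surrogate_of[u] = v, with v != u
surr_fcs = np.stack([surrogate_normalize(abs_fcs[u], abs_fcs[surrogate_of[u]])
                     for u in range(k)])
```

The same `surrogate_of` assignment would be reused for test and retest FCs and for every condition, and region ordering is untouched because the degree vector is applied index by index.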

Statistical comparison between modalities

To assess the difference of fingerprinting scores obtained with baseline, absolute, surrogate-normalized, and self-normalized FCs (denoted together as modalities), we compute for each fingerprinting metric ( $I_{d i f f}$ , $I D_{r a t e}$ , and $M_{r a t e}$ ) and each modality the optimal score for the 100 subsamples of the data set. Then, we perform paired t-tests to compare these optimal scores between modalities (significance level: α = 0.005). We perform analogous analysis to compare the number of principal components corresponding to the optimal fingerprinting scores.
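The comparison can be sketched with `scipy.stats.ttest_rel`. The score curves below are synthetic placeholders generated for illustration and do not correspond to any result reported in this article; the variable names are hypothetical.

```python
import numpy as np
from scipy.stats import ttest_rel

rng = np.random.default_rng(0)
n_subsamples, n_components = 100, 50
comps = np.arange(1, n_components + 1)

# Synthetic concave score curves, one row per subsample (placeholder data).
shape = -((comps - 20) / 20.0) ** 2
baseline_curves = 20.0 + shape + 0.1 * rng.standard_normal((n_subsamples, n_components))
normalized_curves = 22.0 + shape + 0.1 * rng.standard_normal((n_subsamples, n_components))

# Optimal score per subsample, and the number of components reaching it.
opt_baseline = baseline_curves.max(axis=1)
opt_normalized = normalized_curves.max(axis=1)
n_opt_baseline = baseline_curves.argmax(axis=1) + 1
n_opt_normalized = normalized_curves.argmax(axis=1) + 1

# Paired t-test across the 100 subsamples, significance level alpha = 0.005.
t_stat, p_value = ttest_rel(opt_normalized, opt_baseline)
significant = p_value < 0.005
```

The analogous test on `n_opt_normalized` and `n_opt_baseline` compares the number of principal components at the optimum.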

Results

We apply the differential identifiability framework (Amico and Goñi, 2018) to baseline, absolute, and normalized FCs. We compute three metrics: differential identifiability score ( $I_{d i f f}$ ) (Amico and Goñi, 2018), identification rate ( $I D_{r a t e}$ ) (Finn et al., 2015), and the newly introduced matching rate ( $M_{r a t e}$ ). The analysis is done for each fMRI condition separately and performed independently on the 100 randomly drawn subsamples.

Degree-normalization modulates the influence of high- and low-degree regions of FCs. To provide insight into the brain regions most affected by degree-normalization, Supplementary Tables S1 and S2 show information about the brain regions with the highest and lowest average weighted degrees, for each fMRI condition. In addition, cortical visualizations of the mean and standard deviation of regional weighted degree across individuals are shown in Supplementary Figures S1 and S2.

Figure 2 presents the results related to differential identifiability ( $I_{d i f f}$ ). We observe that the evolution of $I_{d i f f}$ with respect to the number of principal components used for FC reconstruction is concave, with sharper curves in the case of normalized FCs, for all fMRI conditions. Figure 2D compares the optimal value of differential identifiability reached for baseline, absolute, surrogate-normalized, and self-normalized FCs. We observe that absolute and surrogate-normalized FCs achieve better scores than baseline FCs, for all conditions except the emotion processing task. Self-normalized FCs provide the best $I_{d i f f}$ scores for all fMRI conditions, with an average gain of 9.6% between baseline and self-normalized FCs (minimum gain: 7.9% for emotion; maximum gain: 10.73% for working memory). When comparing the effect of self-normalization with respect to surrogate-normalization for each fMRI condition, Figure 2D shows that the former systematically led to significantly higher $I_{d i f f}$ values than the latter (p < 0.005). We notice, however, that in resting-state, surrogate degree-normalization leads to $I_{d i f f}$ values that are close to that of self degree-normalization. Figure 2E shows the number of principal components corresponding to the optimal $I_{d i f f}$ values of Figure 2D. We observe that absolute and surrogate-normalized FCs require fewer components than baseline FCs, for all conditions except the language processing task and the working memory task for which baseline FCs and absolute FCs require a similar number of components. Self-normalized FCs require the lowest number of components for most conditions, with the exceptions of the gambling task, the relational processing task and resting-state for which the surrogate and self degree-normalization require a comparable number of components.

FIG. 2.

Impact of degree-normalization on differential identifiability. (A–C) Present the evolution of $I_{d i f f}$ with respect to the number of principal components used for FCs reconstruction, for baseline, absolute, and normalized FCs, respectively. Solid lines represent the median value across 100 random subsamples (without replacement) of the database, and shaded areas correspond to the interpercentile range (2.5 and 97.5 percentiles). Please note that the interpercentile range is sometimes small enough that the shaded area is hidden by the solid line. Square symbols highlight the optimum $I_{d i f f}$ of the median curves. (D) Comparison of optimal $I_{d i f f}$ values for baseline, absolute, surrogate-normalized, and self-normalized FCs. Error bars show the interpercentile range (2.5 and 97.5 percentiles) across 100 random subsamples of the database. (E) Number of principal components corresponding to the optimal $I_{d i f f}$ values of (D). Pairs of bars highlighted by “n.s.” indicate paired t-tests that are not significant at the level p < 0.005. Color images are available online.

Figure 3 reports the behavior of $I_{s e l f}$ and $I_{o t h e r s}$ . As shown in Figure 2, when looking at $I_{d i f f}$ , self-normalization systematically performs better than surrogate-normalization (null model) for all fMRI conditions. Hence for the remaining analyses, we focus on baseline, absolute, and normalized FCs. Overall, both $I_{s e l f}$ and $I_{o t h e r s}$ decrease with the number of principal components kept for FC reconstruction. However, we observe for normalized FCs that $I_{o t h e r s}$ decreases faster than $I_{s e l f}$ in the first 200 components. This observation is valid for all fMRI conditions. In Figure 4, we display identifiability matrices obtained with baseline, absolute, and normalized FCs at the optimal $I_{d i f f}$ reconstruction point. We observe that diagonal elements stand out in all cases, indicating that individuals' self-similarity is correctly captured. Moreover, we observe that degree-normalization smooths the distribution of off-diagonal elements while maintaining a good contrast with diagonal elements. Figure 5 highlights, for the motor task as an example, how degree-normalization is able to correct the identifiability profile of some individuals. This observation is valid for all fMRI conditions (results not shown).

FIG. 3.

Impact of degree-normalization on $I_{s e l f}$ and $I_{o t h e r s}$. (A) Evolution of $I_{s e l f}$ with the number of principal components added in descending order of explained variance for baseline (left), absolute (middle) and normalized FC (right). (B) Shows the analogous figures for $I_{o t h e r s}$. Optimal number of components for maximizing $I_{d i f f}$ are shown as square symbols in all cases. (C) Evolution of $\Delta I_{s e l f}$, which is the pointwise $I_{s e l f}$ difference between baseline, absolute, or normalized FC along principal components. (D) Shows the analogous analysis for $\Delta I_{o t h e r s}$. In all plots of the figure, solid lines represent the median value across 100 random subsamples (without replacement) of the database, and shaded areas correspond to the interpercentile range (2.5 and 97.5 percentiles). Please note that the interpercentile range is sometimes small enough that the shaded area is hidden by the solid line. Color images are available online.

FIG. 4.

Impact of degree-normalization on identifiability matrices. Top row: Identifiability matrices obtained with the baseline FCs at the optimal $I_{d i f f}$ value, for all fMRI conditions. For visualization purposes, only 25 randomly selected individuals of one subsample of the database are displayed. Middle and bottom rows show the same analysis for absolute and self-normalized FCs, respectively. fMRI, functional magnetic resonance imaging.

FIG. 5.

Degree-normalization corrects the profile of outlier FCs. Zoom on the panels of Figure 4 related to the motor task. Arrows highlight typical examples of FCs that are very different to any other FC in the cohort. Note that this effect is alleviated after degree-normalization.

Figure 6 presents the results related to the identification rate ( $I D_{r a t e}$ ). Overall, the $I D_{r a t e}$ curves are also concave with a sudden rise in the last 50 components, for all fMRI conditions. This phenomenon is particularly pronounced for normalized FCs and highlights a shortcoming of the identification rate metric. As shown in Supplementary Figure S3, the identification rate is driven down by a few FCs being highly similar to others when around 600 principal components of 654 are used for reconstruction. The last components then correct this bias. Figure 6D compares the optimal identification rates reached for baseline, absolute, surrogate-normalized, and self-normalized FCs. We observe that baseline and absolute FCs provide comparable results, while surrogate-normalization lowers the identification with respect to baseline FCs, for all conditions. Self-normalized FCs provide the best identification rates for all conditions, with an average gain of 16% with respect to baseline FCs (minimum gain: 6% for resting-state; maximum gain: 30% for the motor task). Figure 6E shows the number of principal components corresponding to the optimal identification rates of Figure 6D. We observe that self-normalized FCs require the lowest number of components, for all fMRI conditions. We observe large error bars (2.5–97.5 interpercentile range across 100 random subsamples) in the case of surrogate-normalized FCs for the gambling task, the motor task, and the working memory task. This comes from the fact that in the realization of sampling without replacement, the highest $I D_{r a t e}$ is sometimes reached using all the components and sometimes with around 200 components, leading to a bimodal distribution of the optimal number of components. Ultimately, this produces large error bars. This phenomenon occurs particularly often with surrogate degree-normalization.

FIG. 6.

Impact of degree-normalization on identification rate. (A–C) Present the evolution of $I D_{r a t e}$ with respect to the number of principal components used for FCs reconstruction, for baseline, absolute, and normalized FCs, respectively. Solid lines represent the median value across 100 random subsamples (without replacement) of the database, and shaded areas correspond to the interpercentile range (2.5 and 97.5 percentiles). Please note that the interpercentile range is sometimes small enough that the shaded area is hidden by the solid line. Square symbols highlight the optimum $I D_{r a t e}$ of median curves. (D) Comparison of optimal identification rates for baseline, absolute, surrogate-normalized, and normalized FCs. Error bars show the interpercentile range (2.5 and 97.5 percentiles) across 100 random subsamples of the database. (E) Number of principal components corresponding to the optimal identification rates of (D). Pairs of bars highlighted by “n.s.” indicate paired t-tests that are not significant at the level p < 0.005. Color images are available online.

Figure 7 presents the results related to the matching rate ( $M_{r a t e}$ ). The $M_{r a t e}$ curves increase quickly until they reach a plateau value, except for the emotion processing and the motor tasks with baseline and absolute FCs. Importantly, the sudden rise in the last few components observed with identification rate does not occur with matching rate. Figure 7D compares the optimal matching rates reached for baseline, absolute, surrogate-normalized, and self-normalized FCs. The observations made for identification rate are still valid for matching rate. Self-normalized FCs provide the best matching rates for all conditions, with an average gain of 14% with respect to baseline FCs (minimum gain: 5% for resting-state; maximum gain: 22% for the motor task). Figure 7E shows the number of principal components corresponding to the values shown in Figure 7D. We observe that normalized FCs require the lowest number of components, for all fMRI conditions. The large error bars (2.5–97.5 interpercentile range across 100 random subsamples) for all conditions and all FCs are the result of the noisy plateau behavior of $M_{r a t e}$ curves. Indeed, depending on the subsample, the optimal matching rate can be achieved in a large range of number of components, although its actual value remains stable.

FIG. 7.

Impact of degree-normalization on matching rate. (A–C) Present the evolution of $M_{r a t e}$ with respect to the number of PCA components used for FCs reconstruction, for baseline, absolute, and normalized FCs, respectively. Solid lines represent the median value across 100 random subsamples of the database, and shaded areas correspond to the interpercentile range (2.5 and 97.5 percentiles). Square symbols highlight the optimum $M_{r a t e}$ of median curves. (D) Comparison of optimal matching rates for baseline, absolute, surrogate-normalized, and normalized FCs. Error bars show the interpercentile range (2.5 and 97.5 percentiles) across 100 random subsamples of the database. (E) Number of PCA components corresponding to the optimal matching rates of (D). Pairs of bars highlighted by “n.s.” indicate paired t-tests that are not significant at the level p < 0.005. PCA, principal components analysis. Color images are available online.

Discussion

Extracting fingerprints from FCs is an important challenge for future individual-level studies of functional connectivity. This can be achieved through the decomposition of functional connectivity data into principal components to remove noisy components that deteriorate fingerprinting scores. Here, we showed that the degree-normalization of FCs improves the fingerprinting process, according to three different metrics: differential identifiability, identification rate, and matching rate. Moreover, the results indicate that the fingerprint of degree-normalized FCs is embedded in a lower dimensional space (and hence can be compressed) compared with baseline FCs.

Improved fingerprinting in a lower dimensional space

Throughout our results, we observed that normalizing FCs improves the three fingerprinting scores considered in this work (Figs. 2D, 6D and 7D), for all fMRI conditions. Importantly, this improvement was not driven by the participants' head motion (Supplementary Fig. S4). Instead, such normalization modulates (or balances) the influence of different brain regions in a way that enhances subject-level fingerprints. This is achieved by accounting for the weighted degree sequence in the FCs. The brain regions most affected by the degree-normalization, listed in Supplementary Tables S1 and S2, are the most strongly or most weakly connected in each fMRI condition. The former are cortical hubs predominantly belonging to the visual and default mode networks, whereas the latter are mostly subcortical regions, consistent with previous reports (Buckner et al., 2009; Schaefer et al., 2014; Tomasi and Volkow, 2011). Moreover, the improved fingerprinting scores are achieved with fewer principal components than in the baseline and absolute cases (Figs. 2E, 6E and 7E). This suggests that the degree-normalization concentrates the individuals' fingerprints in a first set of principal components (in descending order of explained variance). In addition, when looking at the cumulative percentage of explained variance of the principal components extracted from the data set (Supplementary Fig. S5), we observe a reduced dominance effect. In other words, the individual contribution of components to the explained variance is much more homogeneous. Together, these results indicate that the variance preserved by the components of normalized FCs, although lower, is highly specific to the contrast between individuals. From this perspective, degree-normalization could be beneficial for future FC fingerprinting research.
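To make the rebalancing effect concrete, the sketch below applies a symmetric degree-normalization to a toy FC in Python. The exact form used here, $D^{-1/2} |A| D^{-1/2}$ with $D$ the diagonal matrix of weighted degrees, the removal of the diagonal before computing degrees, and the function name `degree_normalize` are assumptions made for illustration. The only elements taken from the text are that the FC is taken in absolute value and is normalized by its weighted degree sequence.

```python
import numpy as np


def degree_normalize(fc):
    """Symmetric degree-normalization of an FC taken in absolute value.

    Assumed form: D^{-1/2} |A| D^{-1/2}, where D holds the weighted degrees.
    Returns the normalized matrix and the weighted degree sequence.
    """
    w = np.abs(np.asarray(fc, dtype=float))  # new array, input is untouched
    np.fill_diagonal(w, 0.0)                 # assumption: ignore self-connections
    degrees = w.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(degrees)
    return w * np.outer(inv_sqrt, inv_sqrt), degrees


# Toy FC: regions 0-2 share a strong common signal (hub-like cluster),
# regions 3-5 are only weakly connected to everything else.
rng = np.random.default_rng(1)
n_regions, n_timepoints = 6, 300
signals = rng.standard_normal((n_regions, n_timepoints))
signals[:3] += 2.0 * rng.standard_normal(n_timepoints)
fc = np.corrcoef(signals)

normalized, degrees = degree_normalize(fc)

# Compare a strong edge (0, 1) with a weak edge (3, 4) before and after.
ratio_before = abs(fc[0, 1]) / abs(fc[3, 4])
ratio_after = normalized[0, 1] / normalized[3, 4]
print(f"strong/weak edge ratio before: {ratio_before:.1f}, after: {ratio_after:.1f}")
```

In this toy example the edge between two strongly connected regions is divided by a larger factor than the edge between two weakly connected regions, so the gap between them narrows after normalization.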

Surrogate degree-normalization improves differential identifiability, but not identification rate or matching rate

Figure 2 shows that differential identifiability is improved following degree-normalization for several conditions, whether the correspondence between FCs and their respective degree sequence is preserved (self-normalization) or not (surrogate-normalization). Normalizing FCs has a global effect of lowering the influence of hubs in the network (Crofts and Higham, 2009; Estrada et al., 2012; Rajapandian et al., 2020), which in turn allows better fingerprints to be extracted. This suggests that individual-specific components of functional connectivity might lie (in part) in sparsely connected areas whose contribution to the whole network is brought out by degree-normalization. The fact that surrogate degree-normalization sometimes improves differential identifiability compared with baseline indicates that the weighted degree sequence of FCs is similar across individuals. In Supplementary Figure S6, we show the results of the differential identifiability framework applied to degree sequences instead of functional connectivity matrices. We observe that the weighted degree sequence alone imparts a moderate fingerprinting power regardless of the number of components kept for the reconstruction, as previously reported (Rajapandian et al., 2020). However, matching FCs with their own degree sequence for degree-normalization appears to be beneficial to all metrics and all fMRI conditions, whereas surrogate-normalization has a null or negative effect on the identification rate and the matching rate, compared with baseline (Figs. 2D, 6D and 7D). This indicates that the normalization of FCs by their respective weighted degree sequence helps uncover fingerprints and suggests a synergistic effect that goes beyond the fingerprints of original FCs and degree sequences separately.
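The distinction between self- and surrogate-normalization can be sketched as follows. This Python example is illustrative only: it assumes a symmetric normalization of the absolute-valued FC by the inverse square root of the weighted degrees, and it assumes that a surrogate degree sequence is simply borrowed from another subject through a cyclic shift of subject indices. The helper names and the random toy FCs are hypothetical.

```python
import numpy as np


def weights_and_degrees(fc):
    """Absolute-valued FC without self-connections, and its weighted degrees."""
    w = np.abs(np.asarray(fc, dtype=float))
    np.fill_diagonal(w, 0.0)
    return w, w.sum(axis=1)


def normalize_with(w, degrees):
    """Scale each edge (i, j) by 1 / sqrt(degrees[i] * degrees[j])."""
    inv_sqrt = 1.0 / np.sqrt(degrees)
    return w * np.outer(inv_sqrt, inv_sqrt)


# Toy FCs for a handful of subjects (assumption: random time series).
rng = np.random.default_rng(0)
n_subjects, n_regions, n_timepoints = 5, 8, 200
fcs = [np.corrcoef(rng.standard_normal((n_regions, n_timepoints)))
       for _ in range(n_subjects)]
weights, degrees = zip(*(weights_and_degrees(fc) for fc in fcs))

# Self-normalization: each FC is paired with its own degree sequence.
self_norm = [normalize_with(w, d) for w, d in zip(weights, degrees)]

# Surrogate-normalization: each FC borrows the degree sequence of another
# subject. A cyclic shift guarantees that no FC keeps its own sequence.
donors = np.roll(np.arange(n_subjects), 1)
surrogate_norm = [normalize_with(weights[i], degrees[donors[i]])
                  for i in range(n_subjects)]
```

Both variants rescale edges by node-level factors, so both damp strongly connected regions when degree sequences are similar across subjects. Only the self-normalized version keeps the FC and its degree sequence matched.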

Matching rate as a correction of identification rate

In this work, we observed that the identification rate metric, which has been used in several previous studies (Amico and Goñi, 2018; Finn et al., 2015), is sometimes driven down by a few individuals being highly similar to many others (Fig. 6C and Supplementary Fig. S3). It is noteworthy that the identification rate of an otherwise high-quality fingerprinting data set can thus be compromised by a few subjects or sessions, or even a single one. Post hoc investigations of the particular FCs that markedly lowered $ID_{rate}$ values showed no obvious quality issues in the correlation matrices or their corresponding histograms (see Supplementary Fig. S7 for two examples). To take into account the fact that each individual in our setting appears only once in each of the test and retest data sets, we introduced the matching rate metric. We noticed that the matching rate results are characterized by a plateau value (Fig. 7A–C) rather than a concave behavior with a well-defined maximum, as obtained with differential identifiability and identification rate (Figs. 2A–C and 6A–C). This suggests that, from the perspective of the matching rate metric, the PCA decomposition does not uncover functional connectivity fingerprints, but rather detects the dimensionality to which the data can be compressed while preserving an optimal fingerprinting power.
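The following Python sketch illustrates how a single session that resembles many others can lower the identification rate while leaving a one-to-one matching unaffected. It rests on assumptions that are not stated in this section: the identification rate is computed as the fraction of correct best matches, averaged over both directions (test to retest and retest to test), and the matching rate is computed by solving a maximum-weight assignment on the identifiability matrix with SciPy's `linear_sum_assignment`. The 4-subject matrix is invented for the example.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def identification_rate(ident):
    """Fraction of sessions whose best match is the same subject.

    Averaged over both directions (rows: test, columns: retest).
    """
    idx = np.arange(ident.shape[0])
    hits_rows = np.argmax(ident, axis=1) == idx  # test -> retest
    hits_cols = np.argmax(ident, axis=0) == idx  # retest -> test
    return 0.5 * (hits_rows.mean() + hits_cols.mean())


def matching_rate(ident):
    """Fraction of subjects correctly paired by a one-to-one assignment
    that maximizes the total similarity."""
    rows, cols = linear_sum_assignment(ident, maximize=True)
    return np.mean(rows == cols)


# Invented identifiability matrix: self-similarity 0.8, similarity between
# different subjects 0.3, except that the retest session of subject 0
# (column 0) is highly similar to every test session.
ident = np.full((4, 4), 0.3)
np.fill_diagonal(ident, 0.8)
ident[:, 0] = 0.85
ident[0, 0] = 0.9

print("identification rate:", identification_rate(ident))
print("matching rate:", matching_rate(ident))
```

In this example, three of the four test sessions pick column 0 as their best match, which lowers the identification rate, whereas the assignment can give column 0 to only one subject and recovers every pair.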

Pros and cons of different fingerprinting metrics

As discussed in the Surrogate Degree-Normalization Improves Differential Identifiability, But Not Identification Rate or Matching Rate section, surrogate degree-normalization increases $I_{diff}$, whereas its effects on $ID_{rate}$ and $M_{rate}$ are either neutral or negative when compared with baseline FCs. This highlights a limitation of $I_{diff}$ as a metric: even though surrogate degree-normalization improves the overall contrast between self- and between-subject similarity (increased $I_{diff}$), its effects on the self-similarity are mostly negative (null or decreased $ID_{rate}$ and $M_{rate}$). However, as discussed in the Matching Rate as a Correction of Identification Rate section, $ID_{rate}$ also has limitations: it can be severely affected by one or a few subjects or sessions whose FCs are highly similar to the rest of the population, which hides the underlying fingerprint of the data set. This problem can be alleviated by using $M_{rate}$ instead of $ID_{rate}$. At the same time, we observed (Fig. 7) that $M_{rate}$ does not vary enough with the number of principal components to identify a clear optimal point of reconstruction in the differential identifiability framework. Together, these observations indicate that more than one metric (preferably all three) should be used to estimate the amount of fingerprint in an FC data set, to avoid unforeseen pitfalls. In other words, each of these three metrics captures a different facet of the fingerprint in a sample of FCs.

Limitations and future work

The present work has several limitations. First, we chose to keep for each condition the total number of fMRI volumes available in the database. Previous work reported that larger numbers of frames can positively impact fingerprinting metrics (Abbas et al., 2020b; Amico and Goñi, 2018). Here, as different scanning durations were used for each condition (Supplementary Table S3), our results should be interpreted in light of this limitation. Future work should investigate whether degree-normalization is beneficial to fingerprinting studies using short scanning durations. Additionally, the effect of degree-normalization during functional reconfiguration could be assessed in scanning sessions that combine resting periods and tasks (Amico et al., 2020).

Second, we conducted our experiments on a single data set and used a particular brain parcellation. To evaluate the variability of our results with respect to variations in the data set, we used sampling without replacement. Future work should reassess the impact of degree-normalization on external data sets, possibly obtained with different preprocessing pipelines (Parkes et al., 2018). We are confident that the results presented here are generalizable to other data sets and parcellations since other fingerprinting frameworks have been shown to be reproducible across fMRI conditions (Finn et al., 2015), robust across brain atlases (Amico and Goñi, 2018; Abbas et al., 2020a), and robust across scanning sites (Bari et al., 2019). Future work should include the assessment of this framework for studying brain injuries and neurological disorders.

Third, we identified an optimal set of principal components specific to each subsample, which restricts the possibility of extracting the fingerprint of an unseen individual outside this sample with the same components. Further investigation is required to assess the possibility of deriving a “universal” set of principal components (Sripada et al., 2019) on which one could project a new FC to directly obtain its fingerprint. Initial efforts in this direction have been made by applying a leave-one-out strategy in multisite test–retest scenarios (Bari et al., 2019). A thorough investigation of this important topic would possibly require a significantly larger cohort and involve multisite, multi-scanner fMRI acquisitions.

Finally, in the construction of the identifiability matrix, we considered the statistical similarity between reconstructed FCs, operationalized by the entry-wise Pearson's correlation coefficient. In contrast, recent studies recommended considering the geometric similarity of FCs, leveraging the observation that signed FCs lie on the manifold of positive semidefinite matrices and are therefore associated with a geodesic distance (Abbas et al., 2020b; Venkatesh et al., 2020). However, we would like to note that taking functional connectivity in absolute value, as required by the degree-normalization, breaks the positive semi-definiteness of FCs and therefore proscribes the geometric approach. Besides, the degree-normalization procedure is parameterless, whereas the geometric approach involves a data set-dependent regularization parameter (Abbas et al., 2020b). Overall, we suggest that future work should consider statistical or geometric similarity depending on the context and application of the study.

Conclusion

Fingerprint extraction from FCs is an important step toward refined individual-level studies of brain connectivity, with potential applications in personalized medicine. In this report, we showed that the degree-normalization of FCs is a simple, parameterless mathematical operation producing significant improvements in fingerprinting quality, according to three different metrics, in resting-state and several task conditions. Furthermore, we argued that the fingerprint of FCs can be compressed in a low-dimensional space, especially thanks to degree-normalization. We also showed the potential benefits and pitfalls of three different fingerprinting metrics, each of which uncovers different aspects of the fingerprint present in a sample of FCs. Overall, our results suggest that applying degree-normalization to FCs can be beneficial for future research focused on individual differences in brain networks.

Acknowledgments

B.C. is an FRIA (F.R.S.-FNRS) fellow. The authors would like to thank Jean-Charles Delvenne for his helpful comments and suggestions. Data were provided by the Human Connectome Project, WU-Minn Consortium (Principal Investigators: David Van Essen and Kamil Ugurbil; 1U54MH091657) funded by the 16 NIH Institutes and Centers that support the NIH Blueprint for Neuroscience Research and by the McDonnell Center for Systems Neuroscience at Washington University.

Authors' Contributions

All listed authors contributed to the presented article. Their respective contributions are: B.C.: conceptualization, formal analysis, methodology, validation, visualization, writing (original draft), and writing (review and editing). K.A.: data curation and preprocessing, conceptualization, methodology, validation, writing (original draft), and writing (review and editing). E.A.: conceptualization, methodology, and writing (review and editing). D.A.D.-T.: conceptualization, methodology, and writing (review and editing). F.C.: methodology, supervision, and writing (review and editing). J.G.: conceptualization, methodology, supervision, validation, writing (original draft), and writing (review and editing).

Footnotes

Author Disclosure Statement

No competing financial interests exist.

Funding Information

B.C. is an FRIA fellow (Grant No. 1.E051.18+F, Fonds pour la Formation à la Recherche dans l'Industrie et dans l'Agriculture, Fonds de la Recherche Scientifique, Belgium). E.A. acknowledges financial support from the SNSF Ambizione project “Fingerprinting the brain: network science to extract features of cognition, behavior and dysfunction” (Grant No. PZ00P2_185716). J.G. acknowledges financial support from NIH R01EB022574, NIH R01MH108467, Indiana Alcohol Research Center P60AA07611, and Purdue Discovery Park Data Science Award “Fingerprints of the Human Brain: A Data Science Perspective.”

Supplementary Material

Supplementary Figure S1

Supplementary Figure S2

Supplementary Figure S3

Supplementary Figure S4

Supplementary Figure S5

Supplementary Figure S6

Supplementary Figure S7

Supplementary Table S1

Supplementary Table S2

Supplementary Table S3

References

Abbas, Amico, Svaldi, et al. 2020a. GEFF: Graph embedding for functional fingerprinting. Neuroimage, 221:117181.

Abbas, Liu, Venkatesh, et al. 2020b. Regularization of functional connectomes and its impact on geodesic distance and fingerprinting. arXiv preprint arXiv:2003.05393.

Amico, Dzemidzic, Oberlin, et al. 2020. The disengaging brain: dynamic transitions from cognitive engagement and alcoholism risk. Neuroimage, 209:116515.

Amico, Goñi. 2018. The quest for identifiability in human functional connectomes. Sci Rep, 8:1–14.

Bari, Amico, Vike, et al. 2019. Uncovering multi-site identifiability based on resting-state functional connectomes. Neuroimage, 202:115967.

Betzel, Bertolero, Gordon, et al. 2019. The community structure of functional brain networks exhibits scale-specific patterns of inter- and intra-subject variability. Neuroimage, 202:115990.

Betzel, Fukushima, He, et al. 2016. Dynamic fluctuations coincide with periods of high and low modularity in resting-state functional brain networks. Neuroimage, 127:287–297.

Buckner, Sepulcre, Talukdar, et al. 2009. Cortical hubs revealed by intrinsic functional connectivity: mapping, assessment of stability, and relation to Alzheimer's disease. J Neurosci, 29:1860–1873.

Bullmore, Sporns. 2009. Complex brain networks: graph theoretical analysis of structural and functional systems. Nat Rev Neurosci, 10:186–198.

Byrge, Kennedy. 2019. High-accuracy individual identification using a “thin slice” of the functional connectome. Netw Neurosci, 3:363–383.

Cole, Bassett, Power, et al. 2014. Intrinsic and task-evoked network architectures of the human brain. Neuron, 83:238–251.

Crofts, Higham. 2009. A weighted communicability measure applied to complex brain networks. J R Soc Interface, 6:411–414.

Essen DCV, Smith, Barch, et al. 2013. The WU-Minn human connectome project: an overview. Neuroimage, 80:62–79.

Essen DCV, Ugurbil, Auerbach, et al. 2012. The Human Connectome Project: a data acquisition perspective. Neuroimage, 62:2222–2231.

Estrada, Hatano, Benzi. 2012. The physics of communicability in complex networks. Phys Rep, 514:89–119.

Finn, Shen, Scheinost, et al. 2015. Functional connectome fingerprinting: identifying individuals using patterns of brain connectivity. Nat Neurosci, 18:1664.

Fornito, Zalesky, Breakspear. 2015. The connectomics of brain disorders. Nat Rev Neurosci, 16:159–172.

Fornito, Zalesky, Bullmore. 2016. Fundamentals of Brain Network Analysis. Academic Press, NY: Elsevier.

Glasser, Coalson, Robinson, et al. 2016. A multi-modal parcellation of human cerebral cortex. Nature, 536:171–178.

Glasser, Sotiropoulos, Wilson, et al. 2013. The minimal preprocessing pipelines for the Human Connectome Project. Neuroimage, 80:105–124.

Gratton, Laumann, Nielsen, et al. 2018. Functional brain networks are dominated by stable group and individual factors, not cognitive or daily variation. Neuron, 98:439–452.e5.

Gutiérrez-Gómez, Vohryzek, Chiêm, et al. 2020. Stable biomarker identification for predicting schizophrenia in the human connectome. Neuroimage, 27:102316.

Iturria-Medina, Carbonell, Evans, et al. 2018. Multimodal imaging-based therapeutic fingerprints for optimizing personalized interventions: application to neurodegeneration. Neuroimage, 179:40–50.

Kumar, Desrosiers, Siddiqi, et al. 2017. Fiberprint: a subject fingerprint based on sparse code pooling for white matter fiber analysis. Neuroimage, 158:242–259.

Kumar, Toews, Chauvin, et al. 2018. Multi-modal brain fingerprinting: a manifold approximation based framework. Neuroimage, 183:212–226.

Lambiotte, Delvenne, Barahona. 2014. Random walks, Markov processes and the multiscale modular organization of complex networks. IEEE Trans Netw Sci Eng, 1:76–90.

Liu, Liao, Xia, et al. 2018. Chronnectome fingerprinting: identifying individuals and predicting higher cognitive functions using dynamic brain connectivity patterns. Hum Brain Mapp, 39:902–915.

Lynall M-E, Bassett, Kerwin, et al. 2010. Functional connectivity and brain networks in schizophrenia. J Neurosci, 30:9477–9487.

Mars, Passingham, Jbabdi. 2018. Connectivity fingerprints: from areal descriptions to abstract spaces. Trends Cogn Sci, 22:1026–1037.

Menon, Krishnamurthy. 2019. A comparison of static and dynamic functional connectivities for identifying subjects and biological sex using intrinsic individual brain connectivity. Sci Rep, 9:1–11.

Micheloyannis, Pachou, Stam, et al. 2006. Small-world networks and disturbed functional connectivity in schizophrenia. Schizophr Res, 87:60–66.

Milham, Vogelstein, Xu. 2021. Removing the reliability bottleneck in functional magnetic resonance imaging research to achieve clinical utility. JAMA Psychiatry, 78:587–588.

Ogawa, Lee T.-M., Kay, et al. 1990. Brain magnetic resonance imaging with contrast dependent on blood oxygenation. Proc Natl Acad Sci U S A, 87:9868–9872.

Pallarés, Insabato, Sanjuán, et al. 2018. Extracting orthogonal subject- and condition-specific signatures from fMRI data using whole-brain effective connectivity. Neuroimage, 178:238–254.

Parkes, Fulcher, Yücel, et al. 2018. An evaluation of the efficacy, reliability, and sensitivity of motion correction strategies for resting-state functional MRI. Neuroimage, 171:415–436.

Power, Mitra, Laumann, et al. 2014. Methods to detect, characterize, and remove motion artifact in resting state fMRI. Neuroimage, 84:320–341.

Puxeddu, Faskowitz, Betzel, et al. 2020. The modular organization of brain cortical connectivity across the human lifespan. Neuroimage, 218:116974.

Rajapandian, Amico, Abbas, et al. 2020. Uncovering differential identifiability in network properties of human brain functional connectomes. Netw Neurosci, 4:698–713.

Rubinov, Sporns. 2010. Complex network measures of brain connectivity: uses and interpretations. Neuroimage, 52:1059–1069.

Satterthwaite, Xia, Bassett. 2018. Personalized neuroscience: common and individual-specific features in functional brain networks. Neuron, 98:243–245.

Schaefer, Margulies, Lohmann, et al. 2014. Dynamic network participation of functional connectivity hubs assessed by resting-state fMRI. Front Hum Neurosci, 8:195.

Seitzman, Gratton, Laumann, et al. 2019. Trait-like variants in human functional brain networks. Proc Natl Acad Sci U S A, 116:22851–22861.

Shine, Aburn, Breakspear, et al. 2018. The modulation of neural gain facilitates a transition between functional segregation and integration in the brain. Elife, 7:e31130.

Shine, Bissett, Bell, et al. 2016. The dynamics of functional brain networks: integrated network states during cognitive task performance. Neuron, 92:544–554.

Shine, Breakspear, Bell, et al. 2019. Human cognition involves the dynamic integration of neural activity and neuromodulatory systems. Nat Neurosci, 22:289–296.

Smith, Beckmann, Andersson, et al. 2013. Resting-state fMRI in the human connectome project. Neuroimage, 80:144–168.

Sporns, Betzel. 2016. Modular brain networks. Ann Rev Psychol, 67:613–640.

Sripada, Angstadt, Rutherford, et al. 2019. Basic units of inter-individual variation in resting state connectomes. Sci Rep, 9:1–12.

Supekar, Menon, Rubin, et al. 2008. Network analysis of intrinsic functional brain connectivity in Alzheimer's disease. PLoS Comput Biol, 4:e1000100.

Svaldi, Goñi, Abbas, et al. 2019. Optimizing differential identifiability improves connectome predictive modeling of cognitive deficits in Alzheimer's disease. arXiv preprint arXiv:1908.06197.

Tomasi, Volkow. 2011. Functional connectivity hubs in the human brain. Neuroimage, 57:908–917.

Valizadeh, Liem, Mérillat, et al. 2018. Identification of individual subjects on the basis of their brain anatomical features. Sci Rep, 8:1–9.

Venkatesh, Jaja, Pessoa. 2020. Comparing functional connectivity matrices: a geometry-aware approach applied to participant identification. Neuroimage, 207:116398.

Wachinger, Golland, Kremen, et al. 2015. BrainPrint: A discriminative characterization of brain morphology. Neuroimage, 109:232–248.
