Abstract
Recent developments in neuroimaging and network science allow us to visualize and characterize the whole-brain connectivity map in vivo. Just as volumetric image atlases have proven essential, a common brain connectivity map across individuals (called a connectome atlas) can offer a new window into the neurobiological underpinnings of cognition and behavior related to brain development or neurological disorders. However, a major obstacle to applying classic atlas construction methods to brain networks is that region-to-region connectivity, often encoded in a graph, does not comply with the Euclidean geometry that underlies regular data structures such as grids. To address this challenge, we first turn the brain network (encoded in a graph) into a set of graph signals in Euclidean space via the diffusion mapping technique. Furthermore, we cast connectome atlas construction as a learning-based graph inference model that simultaneously (1) aligns all individual graph signals to a common space spanned by the graph spectrum bases of the latent common network, and (2) learns the graph Laplacian of the common network that is in consensus with all aligned graph signals. We evaluated our connectome atlas method against non-learning-based counterparts on brain networks from both neurodevelopmental and neurodegenerative diseases, where our proposed learning-based method shows more reasonable results in terms of accuracy and replicability.
1 Introduction
Recent advances in non-invasive in-vivo neuroimaging technology offer new insight into the large-scale map of structural and functional connections of the whole brain at the individual level. The ensemble of macroscopic brain connections can then be described as a complex network, the 'connectome' [1]. Graph theory, a powerful framework for characterizing diverse properties of complex networks, has been widely used to understand the characteristics of network organization related to brain development or the neurobiological mechanisms of brain disorders.
Over the last several decades, various computational approaches have been proposed to construct volumetric image atlases by (1) registering all individual images to a common space and (2) averaging the voxel-wise values (such as intensities and tissue probabilities) across the normalized subjects [2]. The image atlas plays a critical role in neuroimaging studies since it provides a common reference space in which to compare and discover alterations in the human brain. In this regard, a connectome atlas, a population-wise average of brain networks, is highly necessary to reveal and characterize network differences across clinical cohorts.
However, it is not straightforward to apply the above image atlas construction approach to brain networks, since a graph, the natural data representation of a brain network, has an irregular geometry. Most existing work on brain connectome atlases focuses on modeling the nodes, such as fine-grained region parcellations based on connectivity patterns [3, 4]. For example, a connectivity-based parcellation framework was proposed in [4] to better characterize functional connectivity information, resulting in a node-wise "brain connectome atlas" with 210 cortical and 36 subcortical regions. Although connectivity is far more relevant than nodes for characterizing network alterations, little attention has been paid to identifying the most representative brain connectivities in a population. Among the few existing connectome atlas works, it is common to average the connectivity values at each link by treating the brain network as a data matrix. Albeit simple, such arithmetic averaging might break the network topology, since a brain network is a structured data representation with complicated geometry. It is worth noting that the matrix diffusion technique was used in [5] to propagate connectivity information from one subject to another until all networks become similar; the connectome atlas is then the average over the diffused matrices, instead of the original network matrices. However, matrix diffusion suffers from network degeneration, where each node ends up with almost equal probability of connecting to every other node.
To address these challenges, we propose a learning-based graph inference framework to unravel the connectome atlas of a population, with a focus on population-wise connectivities. To do so, we first turn each brain network into a set of graph signals via the diffusion mapping technique [6], which embeds the network into Euclidean space. After that, we propose a learning-based approach to discover the common network that is prevalent in the population. Specifically, we present a graph-based inference that learns the most representative topology, i.e., the one with the largest consensus with all aligned graph signals in the common space. Since diffusion mapping integrates local connectivity into the global geometry of the whole network, our learning-based framework offers a new window for constructing multi-scale connectome atlases with the capability to "think globally, fit locally".
We evaluate the accuracy and robustness of our learning-based connectome atlas construction method on various group comparisons for both neurodevelopmental and neurodegenerative diseases, including Autism Spectrum Disorder (ASD), Obsessive-Compulsive Disorder (OCD), and Frontotemporal Dementia (FTD). Our connectome atlas shows more reasonable results than the counterpart non-learning-based approaches, such as network averaging and network diffusion.
2 Methods
Suppose we have a set of brain networks from \( M \) individual subjects. Each brain network \( {\mathcal{G}}_{m} = \left( {\varvec{V},\varvec{A}_{m} } \right) \) \( \left( {m = 1, \ldots ,M} \right) \) consists of \( N \) nodes \( \varvec{V} = \{ v_{i} |i = 1, \ldots ,N\} \) and the adjacency matrix \( \varvec{A}_{m} = \left[ {a_{ij}^{m} } \right]_{i,j = 1, \ldots ,N} \), where each element \( a_{ij}^{m} \) measures the strength of connectivity between two regions. In a structural network, \( a_{ij}^{m} \) is related to the number of fibers connecting the two regions, while in a functional network, \( a_{ij}^{m} \) measures the synchronization of functional activity in terms of the statistical correlation between BOLD (blood oxygen-level dependent) signals. The connectome atlas \( {\mathbb{G}} = \left( {\varvec{V},\varvec{W}} \right) \) represents the center of all individual brain networks in the population, where the population-wise whole-brain connectivity is encoded in the adjacency matrix \( \varvec{W} = \left[ {w_{ij} } \right]_{i,j = 1}^{N} \). For convenience, we assume each element \( w_{ij} \) in \( \varvec{W} \) is non-negative.
2.1 Overcome Irregular Data Geometry: From Network to Signals
First, we address the challenge of irregular data structure by embedding the brain network into a diffusion space. Since the diffusion mapping procedure is the same for each network \( {\mathcal{G}}_{m} \), we omit the subject index \( m \) in Sect. 2.1 for clarity. In general, it is reasonable to assume that each adjacency matrix is symmetric (i.e., \( a_{ij} = a_{ji} \)) and non-negative \( \left( {a_{ij} \ge 0} \right) \). Here, we go one step further and normalize each row of the adjacency matrix by \( \varvec{P} = \varvec{D}^{ - 1} \varvec{A} \), resulting in a Markov matrix \( \varvec{P} \), where \( \varvec{D} \) is the diagonal matrix whose \( i^{th} \) diagonal element equals \( \sum\nolimits_{j = 1}^{N} {a_{ij} } \). Each element \( p_{ij} \) of \( \varvec{P} \) can be interpreted as the transition probability of a single step from node \( v_{i} \) to \( v_{j} \) in the network. By taking powers of the Markov matrix, we can increase the number of steps taken: each element \( p_{ij}^{t} \) in \( \varvec{P}^{t} \) sums over all walks of length \( t \) from node \( v_{i} \) to node \( v_{j} \). As we increase \( t \), we characterize the connectivity of the brain network in a local-to-global manner, as shown in the top row of Fig. 1(c).
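The row normalization and matrix powers described above can be sketched in a few lines of NumPy. This is a minimal illustration on a hypothetical toy 4-node network, not code from the paper:

```python
import numpy as np

def markov_matrix(A):
    """Row-normalize a symmetric, non-negative adjacency matrix A
    into a Markov transition matrix P = D^{-1} A."""
    degrees = A.sum(axis=1)        # D's diagonal: row sums of A
    return A / degrees[:, None]    # divide each row by its degree

# Toy network: two tightly connected pairs joined by a weak bridge.
A = np.array([[0.0, 1.0, 0.1, 0.0],
              [1.0, 0.0, 0.0, 0.0],
              [0.1, 0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
P = markov_matrix(A)
P2 = np.linalg.matrix_power(P, 2)  # two-step transition probabilities
```

Each row of \( \varvec{P} \) (and of any power \( \varvec{P}^{t} \)) is a probability distribution over destination nodes, which is what makes the random-walk interpretation valid.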
Diffusion Distance.
Given the Markov matrix \( \varvec{P} \), the diffusion distance \( d_{t} \) between two nodes is a time-dependent metric that measures the connectivity strength at a specific scale \( t \) as \( d_{t} \left( {v_{i} ,v_{j} } \right) = \sum\nolimits_{k = 1}^{N} {\left( {p_{ik}^{t} - p_{jk}^{t} } \right)^{2} } \). As shown in Fig. 1(a), the Euclidean distance between points \( Q_{1} \) and \( Q_{2} \) is roughly the same as the distance between \( Q_{1} \) and \( Q_{3} \). However, as the diffusion process runs forward, the diffusion distance between \( Q_{1} \) and \( Q_{3} \) becomes much larger, since any pathway between \( Q_{1} \) and \( Q_{3} \) has to pass through the bottleneck between the two clusters (Fig. 1(b)). Since the diffusion distance encodes the topology of the whole network, it is more robust to noise and spurious connections than the classic Euclidean distance.
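The definition above compares the \( t \)-step transition profiles of two nodes. A minimal sketch (our own illustration, reusing the toy network from before) might look like:

```python
import numpy as np

def diffusion_distance(P, t, i, j):
    """Squared diffusion distance d_t(v_i, v_j) = sum_k (P^t[i,k] - P^t[j,k])^2,
    comparing the t-step transition profiles of nodes i and j."""
    Pt = np.linalg.matrix_power(P, t)
    return float(np.sum((Pt[i] - Pt[j]) ** 2))

# Toy network: two tightly connected pairs joined by a weak bridge.
A = np.array([[0.0, 1.0, 0.1, 0.0],
              [1.0, 0.0, 0.0, 0.0],
              [0.1, 0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
P = A / A.sum(axis=1, keepdims=True)

d = diffusion_distance(P, 2, 0, 3)  # distance between the two pairs at scale t = 2
```

Note that \( d_{t} \) is zero between a node and itself and symmetric in its two node arguments, as expected for a (squared) distance.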
Approximate Diffusion Distance in the Euclidean Space.
Denote the set of eigenvectors of \( \varvec{P} \) as \( \left\{ {\psi_{c} } \right\}_{c = 1}^{N} \), with corresponding eigenvalues \( \left\{ {\lambda_{c} } \right\}_{c = 1}^{N} \,\left( {\lambda_{1} \ge \lambda_{2} \ge \ldots \ge \lambda_{N} } \right) \). The diffusion map \( {\varvec{\Psi}}_{t} = \left[ {\xi_{c,t} } \right]_{c = 1}^{N} \) is formed by a set of orthogonal column vectors, where \( \xi_{c,t} = \lambda_{c}^{t} \psi_{c} \) is a time-dependent basis and \( \left\{ {\xi_{c,t} } \right\}_{c = 1}^{N} \) form the spectrum bases of the underlying network. Meanwhile, each row vector of \( {\varvec{\Psi}}_{t} \) can be regarded as the diffusion embedding vector of the corresponding network node, where each element is the projection coefficient onto one spectrum basis. Thus, the diffusion distance \( d_{t} \left( {v_{i} ,v_{j} } \right) \) can be approximated by the squared \( \ell_{2} \) distance in Euclidean space: \( d_{t} \left( {v_{i} ,v_{j} } \right) \approx \mathop \sum \limits_{c = 1}^{N} \left\| {\xi_{c,t} \left( i \right) - \xi_{c,t} \left( j \right)} \right\|^{2} \). It is clear that nodes that are strongly connected in the network have similar diffusion embedding vectors. The bottom row of Fig. 1(c) displays the reconstructed networks using the approximated diffusion distance at scales \( t = 0 \), \( t = 1 \), \( t = 2 \), and \( t = 12 \). Compared to the corresponding powers of the Markov matrix in the top row, the diffusion map efficiently embeds the brain network from the irregular data domain into Euclidean space, where the Euclidean distance approximates the diffusion distance.
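For a regular graph (all node degrees equal) \( \varvec{P} \) is symmetric with orthonormal eigenvectors, and the spectral identity above is exact rather than approximate. The sketch below (ours, not the paper's code) verifies this on a 4-node ring:

```python
import numpy as np

# 4-node ring: a regular graph, so P is symmetric and its eigenvectors
# are orthonormal, making the spectral identity exact.
A = np.array([[0., 1., 0., 1.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [1., 0., 1., 0.]])
P = A / A.sum(axis=1, keepdims=True)

t = 2
lam, psi = np.linalg.eigh(P)       # columns of psi are eigenvectors of P
Xi = psi * lam**t                  # diffusion map: xi_c = lam_c^t * psi_c (columns)

Pt = np.linalg.matrix_power(P, t)
d_direct = np.sum((Pt[0] - Pt[1]) ** 2)   # diffusion distance from P^t
d_embed = np.sum((Xi[0] - Xi[1]) ** 2)    # Euclidean distance between embedding rows
```

On general (non-regular) brain networks the two quantities agree only approximately, which is the sense in which the text uses the embedding.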
Instead of considering each \( \xi_{c,t} \) as a plain data array, we further regard \( \xi_{c,t} \) as a graph signal [7] in the following text, where the element-to-element relationship within \( \xi_{c,t} \) is interpreted in the context of the latent common network \( {\mathbb{G}}_{t} \) (shown in Fig. 2(c)). In other words, the graph signal \( \xi_{c,t} \) resides on the common network \( {\mathbb{G}}_{t} \). Next, we present the learning-based graph inference model for the connectome atlas, which learns the Laplacian matrix of the common network \( {\mathbb{G}}_{t} \) from the set of graph signals across individual networks at scale \( t \).
2.2 Learn Graph Laplacian of Common Network: From Signals to Network
Since the graph signals \( \left\{ {\xi_{c,t}^{m} } \right\} \) are defined in Euclidean space, it would be straightforward to construct the common network by averaging them across individuals. However, the graph signals \( \left\{ {\xi_{c,t}^{m} } \right\} \) are not yet aligned across subjects, because different \( \xi_{c,t}^{m} \) correspond to different eigenvalues; simple averaging might therefore introduce spurious and irrelevant connectivity information. To overcome this limitation, we propose a learning-based approach that learns the Laplacian matrix of the common network \( {\mathbb{G}}_{t} \) from the set of graph signals \( \left\{ {\xi_{c,t}^{m} } \right\} \), which mainly consists of two alternating steps (shown in Fig. 2).
- Step 1: Modulate graph signals to the common graph spectrum space. We use \( \varvec{L}_{t} \) to denote the Laplacian matrix of the latent common network \( {\mathbb{G}}_{t} \). Then, \( {\varvec{\Phi}}_{t} = \left[ {\varphi_{c,t} } \right]_{c = 1}^{N} \) is the \( N \times N \) eigenvector matrix of the Laplacian matrix \( \varvec{L}_{t} \), where \( \varphi_{c,t} \) is the column vector corresponding to the \( c^{th} \) eigenvalue of \( \varvec{L}_{t} \). Recall that each \( \xi_{c,t}^{m} \) represents an oscillation mode. At the \( c^{th} \) mode, \( \xi_{c,t}^{m} \) and \( \xi_{c,t}^{{m^{\prime}}} \) \( \left( {m \ne m^{'} } \right) \) may not share the same eigenvalue (frequency). Therefore, we first align each graph signal \( \xi_{c,t}^{m} \) at the \( c^{th} \) eigenmode by modulating it onto the corresponding eigenbasis \( \varphi_{c,t} \) of the latent common network \( {\mathbb{G}}_{t} \): \( \tilde{\xi }_{c,t}^{m} \left( n \right) = \sqrt N \xi_{c,t}^{m} \left( n \right)\varphi_{c,t} \left( n \right) \) [7]. Thus, all modulated graph signals \( \left\{ {\tilde{\xi }_{c,t}^{m} } \right\} \) are aligned to the same frequency band that the common network holds at the \( c^{th} \) mode, as shown in Fig. 2(e).
- Step 2: Learn the Laplacian matrix of the common network. First, we seek the representative graph signal \( y_{c,t} \) that is close to all aligned graph signals \( \left\{ {\tilde{\xi }_{c,t}^{m} } \right\}_{m = 1}^{M} \) at the corresponding \( c^{th} \) eigenmode. Second, we examine the consensus between the common network geometry and the current representative graph signal \( y_{c,t} \) by minimizing a graph smoothness term, in the sense that the \( i^{th} \) and \( j^{th} \) elements of \( y_{c,t} \) should have similar values if nodes \( v_{i} \) and \( v_{j} \) are strongly connected in the common network \( {\mathbb{G}}_{t} \). The graph smoothness under \( {\mathbb{G}}_{t} \) can be measured by the quadratic form of the graph Laplacian, \( y_{c,t}^{T} \varvec{L}_{t} y_{c,t} \) [7]. Third, we apply the matrix Frobenius norm \( \left\| {\varvec{L}_{t} } \right\|_{F}^{2} \) and the trace constraint (i.e., \( tr\left( {\varvec{L}_{t} } \right) = N \)) to guarantee that \( \varvec{L}_{t} \) is a valid Laplacian matrix. Combining these terms, the overall energy function for learning the Laplacian matrix of the common network \( {\mathbb{G}}_{t} \) is:

\( \mathop {\min }\limits_{{\left\{ {y_{c,t} } \right\},\varvec{L}_{t} }} \sum\nolimits_{c = 1}^{N} {\left( {\left\| {y_{c,t} - \sum\nolimits_{m = 1}^{M} {\tilde{\xi }_{c,t}^{m} } } \right\|^{2} + \alpha y_{c,t}^{T} \varvec{L}_{t} y_{c,t} } \right)} + \beta \left\| {\varvec{L}_{t} } \right\|_{F}^{2} ,\quad s.t.\;tr\left( {\varvec{L}_{t} } \right) = N \)  (1)
where \( \alpha \) and \( \beta \) are two scalars controlling the strength of the constraints on graph smoothness and the Laplacian matrix.
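Step 1, the modulation \( \tilde{\xi }_{c,t}^{m} \left( n \right) = \sqrt N \xi_{c,t}^{m} \left( n \right)\varphi_{c,t} \left( n \right) \), is a simple element-wise operation. A minimal sketch (our own illustration; the eigenbasis here is a hypothetical example, not learned from data):

```python
import numpy as np

def modulate(xi, phi):
    """Align a subject-level graph signal xi to the common network's
    eigenbasis phi: xi_tilde(n) = sqrt(N) * xi(n) * phi(n)."""
    N = len(xi)
    return np.sqrt(N) * xi * phi

# Example: modulating onto the constant eigenvector phi = 1/sqrt(N)
# (the zero-frequency mode of any Laplacian) leaves the signal unchanged.
N = 4
xi = np.array([1.0, 2.0, 3.0, 4.0])
phi = np.ones(N) / np.sqrt(N)
xi_tilde = modulate(xi, phi)
```

For non-constant eigenvectors the modulation reshapes the signal so that its energy sits in the frequency band of the corresponding common-network mode.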
Optimization.
We optimize \( \left\{ {y_{c,t} } \right\} \) and \( \varvec{L}_{t} \) in an alternating manner. First, discarding terms unrelated to \( y_{c,t} \) in Eq. (1), the optimization of each \( y_{c,t} \) is a classic Tikhonov regularization problem with the closed-form solution \( \widehat{y}_{c,t} = \left( {\alpha \varvec{L}_{t} + \varvec{I}_{N \times N} } \right)^{ - 1} \left( {\sum\nolimits_{m = 1}^{M} {\tilde{\xi }_{c,t}^{m} } } \right) \).
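The closed-form update amounts to solving one linear system per eigenmode. A minimal sketch on a hypothetical 3-node path graph (our own toy example):

```python
import numpy as np

def representative_signal(L, xi_sum, alpha):
    """Closed-form Tikhonov update for the representative signal:
    y = (alpha * L + I)^{-1} (sum of the modulated subject signals)."""
    N = L.shape[0]
    # Solving the linear system is preferable to forming the inverse explicitly.
    return np.linalg.solve(alpha * L + np.eye(N), xi_sum)

# Toy Laplacian of a 3-node path graph 0 - 1 - 2.
L = np.array([[ 1., -1.,  0.],
              [-1.,  2., -1.],
              [ 0., -1.,  1.]])
xi_sum = np.array([1.0, 0.0, -1.0])
y = representative_signal(L, xi_sum, alpha=0.5)
```

With \( \alpha = 0 \) the update returns the summed signal unchanged; as \( \alpha \) grows, the term \( \alpha \varvec{L}_{t} \) smooths the signal along the graph, since \( \varvec{L}_{t} \) is positive semi-definite.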
Second, fixing \( \left\{ {y_{c,t} } \right\} \), we optimize \( \varvec{L}_{t} \) by minimizing \( \alpha \sum\nolimits_{c = 1}^{N} {y_{c,t}^{T} \varvec{L}_{t} y_{c,t} } + \beta \left\| {\varvec{L}_{t} } \right\|_{F}^{2} \). Since each element of the adjacency matrix \( \varvec{W}_{t} \) of the common network is non-negative, it is easier to search for a valid adjacency matrix \( \varvec{W}_{t} \) than for \( \varvec{L}_{t} \). Following [8], minimizing \( \left\| {\varvec{L}_{t} } \right\|_{F}^{2} \) is equivalent to jointly minimizing \( \left\| {\varvec{W}_{t} } \right\|_{F}^{2} \) and the norm of the degree vector \( \varvec{W}_{t} \varvec{1} \):

\( \mathop {\min }\limits_{{\varvec{W}_{t} \ge 0}} \sum\nolimits_{c = 1}^{N} {y_{c,t}^{T} \varvec{L}_{t} y_{c,t} } + \gamma \left( {\left\| {\varvec{W}_{t} } \right\|_{F}^{2} + \left\| {\varvec{W}_{t} \varvec{1}} \right\|^{2} } \right),\quad s.t.\;tr\left( {\varvec{L}_{t} } \right) = N \)  (2)

where \( \gamma = \beta /\alpha \) controls the sparsity of \( \varvec{W}_{t} \). We first convert Eq. 2 into an augmented Lagrangian with KKT (Karush-Kuhn-Tucker) multipliers and then optimize \( \varvec{W}_{t} \) using the classic ADMM (Alternating Direction Method of Multipliers) [8].
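To convey the behavior of Eq. 2 without reproducing the full ADMM solver of [8], the sketch below uses a much simpler projected-gradient scheme (entirely our own simplification, not the paper's algorithm): links with large signal disagreement \( \sum\nolimits_{c} {\left( {y_{c,t} \left( i \right) - y_{c,t} \left( j \right)} \right)^{2} } \) are driven toward zero, while the constraints (symmetry, non-negativity, zero diagonal, \( tr\left( {\varvec{L}_{t} } \right) = N \)) are enforced by projection:

```python
import numpy as np

def pairwise_sq_dist(Y):
    """Z[i, j] = sum_c (Y[i, c] - Y[j, c])^2, for signals y_c stacked as columns of Y."""
    sq = np.sum(Y**2, axis=1)
    return sq[:, None] + sq[None, :] - 2 * Y @ Y.T

def learn_adjacency(Y, gamma=1.0, lr=0.01, iters=500):
    """Simplified projected-gradient sketch (NOT the ADMM solver of [8]):
    minimize 0.5 * <W, Z> + gamma * (||W||_F^2 + ||W @ 1||^2)
    subject to W symmetric, non-negative, zero diagonal, tr(L) = N."""
    N = Y.shape[0]
    Z = pairwise_sq_dist(Y)
    W = np.ones((N, N)) - np.eye(N)       # uniform initialization
    for _ in range(iters):
        d = W.sum(axis=1)                  # node degrees W @ 1
        # Symmetrized gradient of the objective with respect to W.
        grad = 0.5 * Z + 2 * gamma * W + 2 * gamma * (d[:, None] + d[None, :])
        W = W - lr * grad
        W = np.maximum((W + W.T) / 2, 0)   # project: symmetric, non-negative
        np.fill_diagonal(W, 0)
        if W.sum() > 0:                    # tr(L) = sum of all weights = N
            W = W * (N / W.sum())
    return W

# Four nodes whose signals form two clusters: {0, 1} and {2, 3}.
Y = np.array([[0.0], [0.1], [1.0], [1.1]])
W = learn_adjacency(Y)
```

As expected, the learned adjacency concentrates weight on within-cluster links (where the signals agree) and suppresses between-cluster links.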
Note that we repeat the same procedure for each scale \( t \), obtaining multi-scale connectome atlases \( \left\{ {{\mathbb{G}}_{t} |t = 1, \ldots ,T} \right\} \).
3 Experiments
As shown in Eq. 1, \( \alpha \) and \( \beta \) are the only parameters required in the optimization. We fix \( \alpha = 0.5 \) and \( \beta = 1.0 \) in all experiments via grid search. The counterpart methods are (1) matrix averaging on the original networks (called "matrix averaging") and (2) graph-based averaging with matrix diffusion [5] (called "diffusive shrinking graphs"). Note that neither counterpart method is learning-based.
Description of Datasets.
In total, three datasets are used in the following experiments. (1) ASD dataset: resting-state fMRI data of 45 ASD patients and 47 healthy controls (HC) from the NYU Langone Medical Center site of the Autism Brain Imaging Data Exchange (ABIDE) database. (2) OCD dataset: resting-state fMRI data of 65 HC subjects, 62 OCD subjects, and 73 unaffected first-degree relatives of OCD subjects. (3) FTD dataset: 90 FTD subjects and 101 age-matched HC subjects selected from the recent NIFD database, which is managed by the Frontotemporal Lobar Degeneration Neuroimaging Initiative. For all imaging data, we follow the parcellation of the AAL template and construct a functional network with 90 nodes for each subject, where each region-to-region connection is measured by Pearson's correlation coefficient.
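Building such a functional network from region-averaged BOLD time series can be sketched as follows. This is our own illustration on random data; the paper does not state how negative correlations are handled, so taking absolute values here is an assumption (one common convention that keeps weights non-negative, as Sect. 2 requires):

```python
import numpy as np

def functional_network(bold):
    """Build an N x N functional connectivity matrix from region-averaged
    BOLD time series (bold: T time points x N regions) using Pearson's
    correlation. Absolute values keep weights non-negative (an assumed
    convention; see lead-in)."""
    A = np.abs(np.corrcoef(bold.T))  # N x N correlation magnitudes
    np.fill_diagonal(A, 0.0)         # remove self-connections
    return A

rng = np.random.default_rng(0)
bold = rng.standard_normal((200, 90))   # e.g., 200 volumes, 90 AAL regions
A = functional_network(bold)
```

The result is a symmetric, non-negative 90 x 90 adjacency matrix \( \varvec{A}_{m} \) of the form assumed in Sect. 2.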
Evaluation of the Replicability.
In this experiment, we use the ASD dataset to evaluate the replicability of the connectome atlas. For clarity, we focus on a single scale \( \left( {t = 1} \right) \) of our learning-based method. For each connectome atlas construction method, we apply the following resampling procedure 1,000 times: (1) randomly sample 70 networks from the total of 92; (2) sample two further sets of 5 networks each; (3) form two paired cohorts by combining the networks sampled in steps 1 and 2; (4) run the underlying atlas construction method on the two cohorts independently. Since the two paired cohorts differ in only 6.7% of subjects, we can evaluate the replicability of each method by examining whether the resulting connectome atlases show a significant difference at each link via a paired t-test \( \left( {p < 0.01} \right) \). Figure 3 visualizes the connectivities presenting statistically significant differences between the two largely overlapping datasets for the three construction methods, where the thickness and color of each edge indicate the magnitude of the difference (t score) from the paired t-test. To display the locations of these connectivities in the whole brain, we map the indices (yellow dots) of connectivities showing significant differences under the resampling test. Since our learning-based method leverages global information such as network topology during atlas construction, it shows better replicability than the other methods.
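The resampling procedure above can be sketched as follows (a simplified illustration with our own function names and fewer rounds; `build_atlas` stands in for any of the three atlas construction methods):

```python
import numpy as np
from scipy import stats

def replicability_mask(networks, build_atlas, n_rounds=50, seed=0):
    """Sketch of the resampling test: each round draws 70 shared networks
    plus two disjoint sets of 5, builds an atlas for each cohort, and a
    paired t-test at each link flags significant differences (p < 0.01)."""
    rng = np.random.default_rng(seed)
    atlases_a, atlases_b = [], []
    for _ in range(n_rounds):
        perm = rng.permutation(len(networks))
        shared, extra_a, extra_b = perm[:70], perm[70:75], perm[75:80]
        cohort_a = [networks[i] for i in np.concatenate([shared, extra_a])]
        cohort_b = [networks[i] for i in np.concatenate([shared, extra_b])]
        atlases_a.append(build_atlas(cohort_a))
        atlases_b.append(build_atlas(cohort_b))
    _, p = stats.ttest_rel(np.array(atlases_a), np.array(atlases_b), axis=0)
    return p < 0.01   # boolean mask of links with significant differences

# Demo: 92 random symmetric "networks" and simple matrix averaging.
rng = np.random.default_rng(1)
nets = [(lambda R: (R + R.T) / 2)(rng.random((10, 10))) for _ in range(92)]
mask = replicability_mask(nets, lambda g: np.mean(g, axis=0))
```

A more replicable method produces fewer flagged links, since the two cohorts are nearly identical by construction.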
Evaluation of the Accuracy.
We apply matrix averaging, diffusive shrinking graphs, and our learning-based connectome atlas construction method to the 92 functional networks from the ABIDE database, 200 functional networks from the OCD dataset, and 191 functional networks from the NIFD database. The connectome atlases produced by matrix averaging, diffusive shrinking graphs, and our learning-based method \( \left( {t = 1} \right) \) are shown in Fig. 4(a)-(c), respectively. By visual inspection, the connectome atlases built by our learning-based method show more distinct modular structure than those of the counterpart methods. Furthermore, we calculate the distance between each network and the connectome atlas using the Gromov-Hausdorff distance [9]. As shown in Fig. 4(d), the overall network distance to the connectome atlas is much smaller for our learning-based method than for the two counterpart methods, indicating that our connectome atlas is more reasonable, since it is closer to the population center, which by definition should have minimal discrepancy to all other subjects.
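Following the dendrogram-based formulation of [9], a Gromov-Hausdorff-style distance between two networks can be sketched by comparing their single-linkage (cophenetic) distance matrices; the specific conversion to node distances and the use of SciPy here are our own assumptions, not details from the paper:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import squareform

def gromov_hausdorff(D1, D2):
    """Dendrogram-based Gromov-Hausdorff distance in the spirit of [9]:
    half the maximum difference between the networks' single-linkage
    (cophenetic) distance matrices. D1, D2 are symmetric node-distance
    matrices with zero diagonal (e.g., 1 - correlation)."""
    sl1 = cophenet(linkage(squareform(D1, checks=False), method='single'))
    sl2 = cophenet(linkage(squareform(D2, checks=False), method='single'))
    return 0.5 * float(np.max(np.abs(sl1 - sl2)))

# Two random symmetric distance matrices as a stand-in for brain networks.
rng = np.random.default_rng(0)
R1, R2 = rng.random((6, 6)), rng.random((6, 6))
D1, D2 = (R1 + R1.T) / 2, (R2 + R2.T) / 2
np.fill_diagonal(D1, 0.0)
np.fill_diagonal(D2, 0.0)
d = gromov_hausdorff(D1, D2)
```

The distance is zero for identical networks and grows as their hierarchical clustering structures diverge.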
Multi-scale Connectome Atlas.
In Fig. 4(e), we display the multi-scale connectome atlases for 62 OCD patients (bottom), 73 OCD siblings (middle), and 65 age-matched HC subjects (top), where the scale \( t \) ranges from 0.5 to 10. It is clear that (1) high-level information of network organization, such as modules, is captured at the global scale, and (2) the difference in network organization between the OCD and HC cohorts becomes more distinct as the scale \( t \) increases. It is also apparent that the multi-scale connectome atlas of the OCD siblings cohort is more similar to that of the HC cohort than to that of the OCD cohort, since these sibling subjects have not been diagnosed with OCD.
4 Conclusion
Constructing a connectome atlas is much more challenging than constructing an image atlas, due to the irregular graph structure. To address this problem, we proposed a learning-based approach to discover the common network in the setting of graphs. Our proposed method offers a new window to investigate population-wise whole-brain connectivity profiles of the human brain. Promising results have been obtained in various brain network studies, including ASD, OCD, and FTD, which indicates the applicability of our proposed learning-based connectome atlas construction method in neuroimaging applications.
References
Sporns, O.: Networks of the Brain. The MIT Press, Cambridge (2011)
Joshi, S., Davis, B., Jomier, M., Gerig, G.: Unbiased diffeomorphic atlas construction for computational anatomy. NeuroImage 23, S151–S160 (2004)
Shen, X., Tokoglu, F., Papademetris, X., Constable, R.T.: Groupwise whole-brain parcellation from resting-state fMRI data for network node identification. Neuroimage 82, 403–415 (2013)
Fan, L., et al.: The human brainnetome atlas: a new brain atlas based on connectional architecture. Cereb. Cortex 26, 3508–3526 (2016)
Rekik, I., Li, G., Lin, W., Shen, D.: Estimation of brain network atlases using diffusive-shrinking graphs: application to developing brains. In: International Conference on Information Processing in Medical Imaging, pp. 385–397 (2017)
Coifman, R.R., Lafon, S.: Diffusion maps. Appl. Comput. Harmonic Anal. 21, 5–30 (2006)
Shuman, D.I., Narang, S.K., Frossard, P., Ortega, A., Vandergheynst, P.: The emerging field of signal processing on graphs: extending high-dimensional data analysis to networks and other irregular domains. IEEE Signal Process. Mag. 30, 83–98 (2013)
Kalofolias, V.: How to learn a graph from smooth signals. Proc. Mach. Learn. Res. 51, 920–929 (2016)
Lee, H., Kang, H., Chung, M.K., Kim, B.N., Lee, D.S.: Persistent brain network homology from the perspective of dendrogram. IEEE Trans. Med. Imaging 31, 1387–1402 (2012)
© 2019 Springer Nature Switzerland AG
Kim, M. et al. (2019). Constructing Multi-scale Connectome Atlas by Learning Graph Laplacian of Common Network. In: Shen, D., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2019. MICCAI 2019. Lecture Notes in Computer Science(), vol 11766. Springer, Cham. https://doi.org/10.1007/978-3-030-32248-9_81
Print ISBN: 978-3-030-32247-2
Online ISBN: 978-3-030-32248-9