The term "functional connectivity" derives from electrophysiology, where it was used to describe coherence relationships between time series of relatively high temporal resolution. Friston and colleagues [1] differentiated between "structural" connectivity, referring to anatomical axonal projections observed by imaging white matter, or via tract-tracing experiments performed on macaque monkeys; "functional" connectivity, referring explicitly to statistical dependence relationships between functional time series produced by methods such as EEG, PET, or fMRI; and "effective" connectivity, referring to inferences about (possibly directed) coupling derived through model comparison (are the observed data more probable given the presence of connection X, or its absence?). These terms have since become more or less dogma in brain connectivity research, and papers investigating these phenomena probably number in the thousands.
The term "functional connectivity", however, has been the source of more than a little confusion. This is largely because the word "connectivity" implies some sort of physical connection, which is clearly not inferable solely from the evidence of a correlative relationship. The lowliest undergrad is taught very early on in Statistics 101 that correlation does not imply causation. And yet, in brain connectivity research, we commonly employ a term that implies just that. In defense of this terminology, I commonly encounter the argument that, while functional connectivity is perhaps a misnomer, researchers in general are careful not to infer physical connectivity from correlative relationships. A reviewer once pointed out to me (whilst berating me for not being sufficiently optimistic for his/her taste) that "one of the most common refrains I observe across papers examining functional connectivity is that there is not a 1:1 correspondence between functional connectivity and measures of structural or anatomical connectivity". This is quite true. Buried somewhere in the fine print of most manuscripts is such a disclaimer, in much the same way that ads for internet gambling sites invariably include vague disclaimers indicating that they are not, in fact, gambling sites.
Except they are. And despite all the disclaimers, there are countless examples of peer-reviewed articles on functional connectivity whose methods and conclusions cannot be interpreted in any way other than that the authors are, actually, drawing the inference that functional correlations are equivalent to physical connectivity [2]. It is much sexier to be able to conclude that "Factor X is related to a decrease in connectivity in Network Y" than to conclude that "We found a structured covariance pattern in BOLD activity, some of which was significantly correlated with Factor X" — even if the latter is accurate and the former is simply a misleading overinterpretation of the data.
Functional covariance analysis is far from useless. I want to make that assertion from the start, in case this discussion is mistaken for a diatribe against connectivity research generally. Covariance patterns are extremely useful for dimensionality reduction, in that they allow us to segregate and cluster regions of brain tissue that co-activate (over a certain time window) in the presence (or absence) of task demands. It is when we try to use functional covariance to infer integrative relationships that things fall apart. To quote Friston [3]:
"By definition, functional connectivity does not rest on any model of statistical dependencies among observed responses. This is because functional connectivity is essentially an information theoretic measure that is a function of, and only of, probability distributions over observed multivariate responses. This means that there is no inference about the coupling between two brain regions in functional connectivity analyses: the only model comparison is between statistical dependency and the null model (hypothesis) of no dependency. This is usually assessed with correlation coefficients (or coherence in the frequency domain). This may sound odd to those who have been looking for differences in functional connectivity between different experimental conditions or cohorts. However, as we will see later, this may not be the best way of looking for differences in coupling."
The importance of noise
But those are just words. Just how well can functional covariance capture physical connectivity in practice? Let's consider a simple abstraction: a three-node network. In the figure below, a sequential network is shown. If we make the A node our input node, with a 30 Hz sinusoid as a signal, we can propagate that signal along physical connections, specified by delays \(\delta_{ij}\) and connection strengths \(c_{ij}\) (for simplicity we'll set the delays to zero). Each node in the network is assigned a noise amplitude \(w_i\), which scales the Gaussian noise added to its functional signal \(Y_i\) (we can think of this signal as representing an average firing rate, or as a BOLD response). This noise variable is crucial to understanding the behaviour of functional covariance. It should not be considered measurement noise (let's pretend we have perfect observations), but rather "neural" noise; it represents the sum of spontaneous neuronal activity intrinsic to the brain area represented by the node, and all afferent projections to node \(i\) that aren't explicitly modelled.
We can use this little simulation (Matlab code available here as "fc_models") to ask some specific questions. How well can correlation coefficients between functional signals from two regions be used to estimate the physical connectivity \(c_{ij}\) between them? Can thresholding correlation coefficients segregate existing connections from non-existing ones? To answer these we are going to keep the connection strengths \(c_{ij}\) constant. That is, in all simulations, all edges have equal connection strength. If our functional covariance is a valid estimate of these values, it should be equal for all edges regardless of how we manipulate noise in the model, right? Of course, this is not the case, because covariance is a function of noise.
In the figure above, I'm showing what happens when we vary the relative baseline amplitude of noise for all nodes (image y-axis, line plot x-axis), and the relative amplitude for just one of the nodes, B (image x-axis). A few important conclusions can be drawn from this result:
1. As noise increases, correlations decrease (okay that's trivial).
2. When noise is equal across nodes (\(w_a = w_b = w_c\)), we find that correlations \(\rho_{ab} = \rho_{bc}\), indicating that they do indeed predict connectivity strength \(c\).
3. When noise varies across nodes, their correlations diverge; when \(w_b\) is less than the baseline, \(\rho_{ab} > \rho_{bc}\), whereas when \(w_b\) is greater than the baseline, \(\rho_{bc} > \rho_{ab}\). Thus, the resulting correlation coefficients depend on the relative noise level in B, which implies that, if noise is unequally distributed across nodes, correlation cannot be used to estimate connection strength.
4. In all cases, \(\rho_{ac}\) is less than that for the other two edges. Since there is no physical connection between A and C, this suggests that for our simple sequential network it may be possible to binarize a network to segregate existent from non-existent connections, given the appropriate threshold. But does this generalize to other more complex networks? And how do we know what the appropriate threshold is?
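The qualitative pattern in the second to fourth conclusions can be reproduced with a minimal Python sketch of the sequential network. This is not the MATLAB "fc_models" code. The function names, the 1 kHz sampling rate, and in particular the choice to z-score each node's afferent drive before adding noise (so that \(w_i\) acts as a noise amplitude relative to the incoming signal) are assumptions made purely for illustration, so the exact values should not be expected to match the figures.

```python
import math
import random


def zscore(x):
    """Rescale a series to zero mean and unit variance."""
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    return [(v - mean) / sd for v in x]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    return sum(a * b for a, b in zip(zscore(x), zscore(y))) / len(x)


def node(drive, c, w, rng):
    """Node output: c * (z-scored afferent drive) + w * Gaussian 'neural' noise.

    Z-scoring the drive is an assumption of this sketch, not of the original model.
    """
    return [c * d + w * rng.gauss(0.0, 1.0) for d in zscore(drive)]


def chain_correlations(w_a, w_b, w_c, c=1.0, n=20000, fs=1000.0, seed=0):
    """Simulate A -> B -> C with zero delays; return the pairwise correlations."""
    rng = random.Random(seed)
    source = [math.sin(2 * math.pi * 30.0 * i / fs) for i in range(n)]  # 30 Hz input
    y_a = node(source, 1.0, w_a, rng)
    y_b = node(y_a, c, w_b, rng)
    y_c = node(y_b, c, w_c, rng)
    return {"ab": pearson(y_a, y_b), "bc": pearson(y_b, y_c), "ac": pearson(y_a, y_c)}


if __name__ == "__main__":
    baseline = 0.5
    # Quiet, baseline, and noisy B; the connection strength c never changes.
    for w_b in (0.1, 0.5, 1.5):
        r = chain_correlations(baseline, w_b, baseline)
        print(f"w_b={w_b:.1f}  r_ab={r['ab']:.2f}  r_bc={r['bc']:.2f}  r_ac={r['ac']:.2f}")
```

Under these assumptions the population correlation across an edge is \(c / \sqrt{c^2 + w_j^2}\), where \(j\) is the downstream node. Two edges with identical \(c\) therefore only share a correlation when their target nodes are equally noisy, and the unconnected pair A and C always correlates less than either connected pair.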
Parallel processing streams
We can test the generalizability of point 4 above with another simple network configuration, this time containing four nodes, two of which have a common afferent B. This is shown in the figure below. This parallel network is important to consider, because having a common input will lead to correlated time series, even in the absence of a direct connection. Moreover, we already know that the mammalian brain is highly symmetric, and composed of major parallel processing streams, leading to strong apparent homotopic connectivity.
As before, we'll make \(c_{ij}\) equal for all edges, and in order to make my point we're going to set \(w_b\) a bit higher to make B noisy, and then vary \(w_c\) and \(w_d\) together. Showing the same plots as before:
We see that:
- Where C and D (which have no physical connection) have relatively low noise, they correlate more than A and B (which do have a physical connection). Thus, correlation cannot be used to threshold and binarize a network.
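A minimal Python sketch of this parallel configuration (A projects to B, which projects to both C and D) shows the same effect. It is an illustrative model rather than the original MATLAB simulation: the helper names, the sampling rate, the specific noise values, and the z-scoring of each node's input (so that \(w_i\) is a relative noise amplitude) are all assumptions.

```python
import math
import random


def zscore(x):
    """Rescale a series to zero mean and unit variance."""
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    return [(v - mean) / sd for v in x]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    return sum(a * b for a, b in zip(zscore(x), zscore(y))) / len(x)


def node(drive, c, w, rng):
    """Node output: c * (z-scored afferent drive) + w * Gaussian 'neural' noise."""
    return [c * d + w * rng.gauss(0.0, 1.0) for d in zscore(drive)]


def parallel_sim(w_b, w_cd, w_a=0.5, c=1.0, n=20000, fs=1000.0, seed=0):
    """Simulate A -> B, B -> C, B -> D (no C-D edge); return selected correlations."""
    rng = random.Random(seed)
    source = [math.sin(2 * math.pi * 30.0 * i / fs) for i in range(n)]  # 30 Hz input
    y_a = node(source, 1.0, w_a, rng)
    y_b = node(y_a, c, w_b, rng)
    y_c = node(y_b, c, w_cd, rng)  # C and D share the afferent B
    y_d = node(y_b, c, w_cd, rng)  # but have independent noise
    return {"ab": pearson(y_a, y_b), "bc": pearson(y_b, y_c), "cd": pearson(y_c, y_d)}


if __name__ == "__main__":
    # B is kept noisy (w_b = 1.0) while the noise in C and D is varied together.
    for w_cd in (0.2, 0.5, 1.0, 1.5):
        r = parallel_sim(w_b=1.0, w_cd=w_cd)
        print(f"w_c=w_d={w_cd:.1f}  r_ab={r['ab']:.2f}  r_cd={r['cd']:.2f}")
```

With quiet C and D, the unconnected pair C and D correlates more strongly than the connected pair A and B, so no single threshold keeps the real edge while discarding the spurious one.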
Of course, this example was gerrymandered by me to make my point. It is quite possible that such configurations are rare, and on the whole the relationship between functional covariance and physical connectivity is still fairly generalizable. On the other hand, it may not be. Numerous examples from the literature underscore this issue. For example, functional covariance is typically strongest between homotopic brain regions, and while physical connections clearly do exist between these regions, much of their covariance might also be explained by the interhemispheric symmetry of brain activation patterns. Patients who have had their corpus callosum completely resected [4] and individuals with agenesis of the corpus callosum [5] have still been found to have robust (if somewhat reduced) homotopic functional connectivity. Additionally, using a generative model to simulate functional signals, Honey and colleagues found an average correlation of roughly 0.5 between functional and structural connectivity [6]. This means that, even in a model where the ground truth is known a priori, only about 25% of functional covariance (\(r^2 = 0.5^2 = 0.25\)) can be explained by the physical connections used to generate it.
Partial correlations to the rescue?
Partial correlations, which are obtained after controlling for the influence of all known covariates, can potentially allow us to deal with the noise conundrum highlighted above. Specifically, if the fraction of noise attributable to the influence of competing afferents can be removed from the signal, the residual should have a stronger dependence on the remaining afferent of interest. Indeed, partial correlation has been broadly employed in brain connectivity research, and one commonly gets the impression that the issue of noise has been solved.
So let's put it to the test. The most pressing question is: Can partial correlations resolve the parallel network issue highlighted above?
Some interesting stuff happening here.
Firstly, and most strikingly, the non-existent CD connection has been completely eliminated by partial correlation. Great! Partial correlations may be useful for thresholding a network to eliminate non-existent edges.
On the other hand, some weird stuff is happening to edge AB. As C and D get noisier, \(\rho_{ab}\) increases. Why? Basically, as they get noisier, they share less and less variance with A, and thus less shared variance is removed when considering the partial correlation between A and B. This implies that partial correlation hasn't really solved the noise issue, it's simply shifted it; the relationship between \(\rho_{ab}\) and \(\rho_{bc}\) changes as a function of \(w_c\) and \(w_d\). Thus, partial correlations also cannot reliably be used to estimate physical connectivity.
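Both observations can be checked without simulating any time series, under two assumptions that are mine rather than part of the original simulation: the network is linear and tree-shaped, so the correlation between any two nodes is the product of the edge correlations along the path joining them, and the correlation across an edge with strength \(c\) and relative target noise \(w\) is \(c / \sqrt{c^2 + w^2}\). The sketch below (function names are hypothetical) builds those population correlations for the parallel network and applies the standard recursion for partial correlation.

```python
import math


def edge_r(c, w):
    """Correlation across one edge whose target adds noise of relative amplitude w."""
    return c / math.sqrt(c * c + w * w)


def parallel_correlations(w_b, w_cd, c=1.0):
    """Population correlations for A -> B -> {C, D}, using path products in a tree."""
    r_ab = edge_r(c, w_b)
    r_bc = r_bd = edge_r(c, w_cd)
    return {
        ("a", "b"): r_ab, ("b", "c"): r_bc, ("b", "d"): r_bd,
        ("a", "c"): r_ab * r_bc, ("a", "d"): r_ab * r_bd, ("c", "d"): r_bc * r_bd,
    }


def partial(r, x, y, given):
    """Partial correlation of x and y controlling for the nodes in `given`.

    Uses the recursion
    r_xy.zS = (r_xy.S - r_xz.S * r_yz.S) / sqrt((1 - r_xz.S^2) * (1 - r_yz.S^2)).
    """
    if not given:
        return r[tuple(sorted((x, y)))]
    z, rest = given[0], given[1:]
    r_xy = partial(r, x, y, rest)
    r_xz = partial(r, x, z, rest)
    r_yz = partial(r, y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


if __name__ == "__main__":
    # B is kept noisy (w_b = 1.0) while the noise in C and D is varied together.
    for w_cd in (0.2, 0.5, 1.0, 1.5):
        r = parallel_correlations(w_b=1.0, w_cd=w_cd)
        p_ab = partial(r, "a", "b", ("c", "d"))
        p_bc = partial(r, "b", "c", ("a", "d"))
        p_cd = partial(r, "c", "d", ("a", "b"))
        print(f"w_c=w_d={w_cd:.1f}  p_ab={p_ab:.3f}  p_bc={p_bc:.3f}  p_cd={p_cd:.3f}")
```

In this idealized setting the C-D partial correlation vanishes for every noise level, because B accounts for everything C and D share, while the A-B partial correlation climbs as C and D get noisier even though \(c\) is never changed.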
Another observation of note with respect to partial correlations is that they are generally quite low in magnitude, resulting in very sparse networks. This has led to partial correlation occasionally being lauded as a method which can be used to isolate the connections about whose existence we can be most confident, and this is in a way true. However, it is also in a way false, because as we've seen, there is no trivial relationship between the actual strength of physical connectivity between regions and the resulting partial correlations; connections with equal strength will be made more or less prominent purely as a function of relative noise levels, and the configuration of projections influencing their activity patterns. Moreover, we know from animal studies that the brain is not likely sparsely connected, but more likely quite densely connected (66% ipsilaterally in macaques; [7]), despite speculative claims to the contrary. Thus, considering only the very sparsely connected networks derived from partial correlations, even if they did not suffer from the aforementioned noise bias, severely limits one's ability to draw useful inferences about whole-brain connectivity.
What does this mean for graph theory?
Graph theory is a well-established, highly useful field of mathematics that has had an enormous influence on, for example, game, signal, and network theories. There's nothing wrong with graph theory. Any mathematical method, however, is only as good as the data that are fed into it. The dictum "garbage in, garbage out" applies. Constructing graphs from functional correlation coefficients is, in my opinion, almost always a matter of "garbage in". The toy examples above largely illustrate why. When constructing a graph, it is crucial to ask ourselves, what does such a structure represent? With internet graphs, transportation system graphs, or social graphs, this is obvious; and consequently, analyzing the topology of those graphs can allow us to derive useful, interpretable conclusions.
In the case of functional covariance, if we construct a binary graph by thresholding, what do our edges represent? How do we choose a threshold? How do we have any confidence that our edges represent physical connections, and our missing edges represent non-existent connections? How do we get a handle, in other words, on false positives and false negatives? If we construct a weighted graph using correlation coefficients, what do the edges represent? Basically, some complex interaction between physical connectivity, noise, and network architecture. What does path length mean in such a structure? What does clustering coefficient mean? Betweenness centrality? Worst of all, what does efficiency mean? These metrics all have useful interpretations when computed on graphs whose edges have actual physical meaning. They are less than useful when those edges represent a poorly understood statistical dependence relationship. They are "garbage out".
In my opinion, it is imperative that we acknowledge this severe limitation as neuroscientists. As the field of brain connectivity matures, it is no longer sufficient to hand-wave about graph theoretical metrics derived from functional covariance when we haven't even established a link between functional covariance and the physical world.
We can start by refusing to call it functional connectivity.
[1] Friston et al., Human Brain Mapping, 1995
[2] I am using the term physical connectivity here to refer to both anatomical (axon-synapse) and effective connectivity, in the Friston sense: anatomical projections between regions that conduct action potentials, which in turn evoke postsynaptic firing in the target region. Classical connectivity, of the sort one is taught in any first-year neuroscience course.
[3] Friston, Brain Connectivity, 2011
[4] Uddin et al., Neuroreport, 2008
[5] Tyszka et al., J. Neurosci, 2011
[6] Honey et al., PNAS, 2008
[7] Markov et al., Cerebral Cortex, 2014