Edited on 12/01/22 to add the \(R_{adj}^2\) plot
In a previous blog post, I introduced simple linear regression, in which statistical inference can be used to assess the degree to which two variables in a population (an outcome and a predictor) can be modelled as a line. This is the most intuitive way to introduce the topic, not least because it can be easily visualized as a two-dimensional scatterplot. Commonly, however, we are interested in more than just a single predictor variable. There are a number of possible reasons we may want to do this; here are three:
- We want to know how well several variables together can be used to predict the outcome variable
- We want to be able to account for the influence of one or more confounders
- We are interested in seeing the influence of one variable on the linear relationship between the outcome variable and another predictor variable; i.e., an interaction effect
In this blog post, I will introduce multiple linear regression (MLR), which is quite simply a generalization of simple regression to cases with two or more predictor variables. First, I'll go over the three-dimensional case, where we can still visualize the problem as a 3D scatterplot, with a regression plane instead of a line. Next, I'll introduce how to perform statistical inference on models with multiple predictors. Then, I'll discuss the more conceptual idea of a hyperplane, which is a generalization of lines and planes to an arbitrary number of dimensions. And finally, I'll explain why including additional predictor variables can lead to the problem of overfitting, and how this can be addressed.
3D regression: From line to plane
In the 2D case, we used a line equation to relate the predictor and outcome variables, with \(\beta_0\) as our intercept and \(\beta_1\) as our slope (recall that \(\hat{y}\) means "the predicted value of \(y\)" ):
$$\hat{y}=\beta_0+\beta_1 x$$
How do we extend this to three dimensions? Let's consider an example. Say a researcher is interested in the link between depression and cardiovascular function [1], and measures depression severity (DS) and resting heart rate (HR) in a sample of 100 individuals. Plotting these two variables against each other reveals a positive linear relationship:
They find that the corresponding simple linear model, \(\hat{DS} = \beta_0 + \beta_1 HR\), is statistically significant (\(p<0.01\)), warranting a rejection of the null hypothesis (\(H_0: \beta_1=0\)). However (as an astute postdoc points out), depression leads to higher rates of smoking, and smoking is also associated with higher HR [2]. Smoking is thus a likely confounder of the observed relationship. The research team asks the participants to report their tobacco smoking frequency (SF), and finds the following associations:
As suspected, SF is positively associated with both DS and HR. But how much variability does it explain? To answer this, we need to combine both variables into a single model. Adding a second predictor variable to a linear model is as easy as adding another term [3]:
$$\hat{y}=\beta_0+\beta_1 x_1+\beta_2 x_2$$
Or, for our example:
$$\hat{DS}=\beta_0+\beta_1 HR + \beta_2 SF$$
Notably, this equation is equivalent to the equation for a plane, and the residual error \(DS-\hat{DS}\) is then the distance between the observed data points and this plane.
This is something we can visualize:
Okay, deep breath. There's a lot going on in this interactive figure, so let's unpack it.
We are looking at a three-dimensional point cloud, where each orange sphere is a single participant, with coordinates for each of the measured variables: DS (\(y\)), HR (\(x_1\)), and SF (\(x_2\)). For simplicity (and so that we can keep this to three dimensions), our variables have all been zero-centered; this allows us to effectively ignore the \(\beta_0\) parameter, because \(\beta_0 = \bar{y} = 0\).
The yellow square is the plane representing our linear model of the data. You can use your mouse/trackpad/finger to change the view of this 3D plane (you'll have to experiment I'm afraid as this will be device-dependent). The black lines are the residual errors, or the distance \(y-\hat{y}\) along the y-axis.
Starting out, our model plane is the same as the \(x_1\)-\(x_2\) plane: in other words, both \(\beta\) parameters are zero. In this case, our prediction \(\hat{y}\) is the mean of the data \(\bar{y}\). This is our best estimate of \(y\) with no knowledge of the other two variables.
To get a feel for the relationship between the 2D case of simple regression and the 3D case of the current model, we can look at how our two predictor variables relate to our outcome variable. If you click the "X1" button, the view will show us the relationship between \(y\) and \(x_1\). This gives us the same 2D scatterplot as above, indicating a positive relationship. As in the simple regression example, you can use the \(\beta_1\) slider to adjust the plane to best fit the data points; notably, the plane appears as a line because it is the same across all values of \(x_2\). That's the first important take-home: setting the \(\beta\) parameter for a variable to zero effectively removes it from the model.
You can set \(\beta_1\) back to zero, click the "X2" button, and do the same thing for \(x_2\). Both variables are positively associated with \(y\).
Now let's look at the cool coloured image on the left. This is the 3D equivalent of the plot from our simple regression example, showing the mean squared error (MSE, the mean squared length of the residual lines) for different values of \(\beta_1\) and \(\beta_2\). Dark red is a low MSE and yellow is a high MSE. The cyan crosshairs show the current values of \(\beta_1\) and \(\beta_2\); the corresponding MSE is shown at the top of the figure.
In the simple regression blog post, we learned that the objective of ordinary least squares (OLS) is to find the minimal MSE, from the U-shaped MSE-by-\(\beta\) plot. We can do this by finding the point where the derivative \(\frac{d MSE}{d \beta_1}=0\). This extends to multiple linear regression, but generally we need to look at partial derivatives to solve these systems. For the above example, we are looking for the point where the partial derivatives of both \(\beta\) parameters are zero:
$$\frac{\partial MSE}{\partial \beta_1}=\frac{\partial MSE}{\partial \beta_2}=0$$
This point corresponds to the middle of the dark red circle in the image plot above. Moreover, because the function is convex, we know that it has only one such point. If you click near this point, the \(\beta\) sliders and the plane will move close to the OLS solution for this point cloud. If you click somewhere in the yellow region, you'll see what a poor solution looks like.
See if you can use the sliders to fine-tune the fit by minimizing MSE.
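If you'd rather let a computer do the fine-tuning, here is a minimal sketch of the same idea in Python using NumPy (this is not the code behind the interactive figure, and the simulated, zero-centered variables are stand-ins for HR, SF, and DS): it builds a design matrix from the two predictors and finds the \(\beta\) values that minimize the squared residuals.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical, zero-centered data standing in for the example:
# x1 ~ heart rate (HR), x2 ~ smoking frequency (SF), y ~ depression severity (DS)
n = 100
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)           # the predictors are themselves correlated
y = 0.4 * x1 + 0.3 * x2 + rng.normal(size=n)

# Design matrix with a column of ones for the intercept (beta_0)
X = np.column_stack([np.ones(n), x1, x2])

# Ordinary least squares: find the betas that minimize the sum of squared residuals
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

y_hat = X @ beta
mse = np.mean((y - y_hat) ** 2)
print("beta estimates:", np.round(beta, 3))
print("MSE at the OLS solution:", round(mse, 3))
```

The resulting \(\beta\) values correspond to the centre of the dark red region in the MSE image above.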
The coefficient of determination
I know, the term sounds a bit epic. In a nutshell, the coefficient of determination, or \(R^2\), is a useful way to assess the proportion of variance of \(y\) our model is explaining. This is also called its goodness of fit. It is calculated as:
$$R^2 = 1 - \frac{SS_R}{SS_T} $$
where \(SS_R\) is the sum of squared residual error:
$$SS_R = \sum{(y-\hat{y})^2}$$
and \(SS_T\) is the sum of squared total error:
$$SS_T = \sum{(y-\bar{y})^2}$$
This ratio relates the error of the model (\(SS_R\)) to the total variability of the data about its mean (\(SS_T\)). In other words, it quantifies how much variance is explained by the predictor variables \(\mathbf{X}\), in addition to our best guess without knowledge of them, which is \(\bar{y}\).
From this equation, it is clear that, as long as the model does at least as well as the mean at predicting \(y\), \(R^2\) should range between zero (when \(SS_R=SS_T\), i.e., the predictors add no explanatory power) and one (when \(SS_R=0\), i.e., the predictors perfectly predict \(y\)).
A peculiarity of this definition of \(R^2\) is that, for choices of \(\beta\) parameters that increase error beyond the variability around the mean (i.e., the model is worse than the mean at predicting \(y\)), \(R^2\) will be negative. You can see this for yourself above, by selecting values of \(\beta_1\) and \(\beta_2\) that lie in the yellow part of the MSE distribution. In practice, this should never happen, because any parameters that perform worse than the mean are by definition sub-optimal (i.e., setting them all to zero would result in a better solution).
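To make this concrete, here is a small Python sketch (on simulated data, not the example above) that computes \(R^2\) directly from the definitions of \(SS_R\) and \(SS_T\), and shows that a deliberately poor choice of \(\beta\) values can indeed drive it below zero:

```python
import numpy as np

def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_R / SS_T."""
    ss_r = np.sum((y - y_hat) ** 2)          # residual sum of squares
    ss_t = np.sum((y - np.mean(y)) ** 2)     # total sum of squares
    return 1.0 - ss_r / ss_t

rng = np.random.default_rng(0)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([0.0, 0.4, 0.3]) + rng.normal(size=n)

# OLS fit: R^2 lies between 0 and 1
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
print("R^2 (OLS fit):   ", r_squared(y, X @ beta_ols))

# A deliberately bad choice of betas can push R^2 below zero
beta_bad = np.array([0.0, -2.0, 3.0])
print("R^2 (bad betas): ", r_squared(y, X @ beta_bad))
```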
Statistical inference with MLR models
Statistical inference on linear regression models can be done using an F-ratio test, in an identical manner to simple linear regression (see this blog post for a recap). The number of predictor variables \(k\) is reflected in the degrees of freedom equations for this analysis:
$$df_M = k$$ $$df_R = n - 1 - k$$
where \(n\) is the sample size.
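As a quick numerical illustration of how these degrees of freedom enter the F-ratio test (the previous post covers the test itself), here is a short sketch on simulated data; the variable names and the use of SciPy are my own, not from the example above:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, k = 100, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([0.0, 0.4, 0.3]) + rng.normal(size=n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta

ss_r = np.sum((y - y_hat) ** 2)          # residual sum of squares
ss_t = np.sum((y - y.mean()) ** 2)       # total sum of squares
ss_m = ss_t - ss_r                       # sum of squares explained by the model

df_m = k                                 # model degrees of freedom
df_r = n - 1 - k                         # residual degrees of freedom
F = (ss_m / df_m) / (ss_r / df_r)
p = stats.f.sf(F, df_m, df_r)            # p(F >= observed F | H0)
print(f"F({df_m}, {df_r}) = {F:.2f}, p = {p:.4g}")
```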
It is often more interesting, however, to understand which individual variables are predictive, and to generalise at the level of variables, rather than the full model. To return to the heart rate example above: can our research team infer whether resting heart rate is predictive of depression, after accounting for the shared relationship with smoking?
They can go about this in a number of ways. I'll elaborate on two.
In the simultaneous regression approach, the goal is to include all covariates of interest in a single model, and perform statistical tests on each variable separately. For a given predictor variable \(i\), this analysis involves a test of the null hypothesis that the coefficient \(\beta_i\) is zero. For a one-tailed test (i.e., where we hypothesise that the association is in a particular direction, e.g., positive), we have:
$$H_0: \beta_i = 0$$ $$H_a: \beta_i > 0$$
Under \(H_0\), the estimate of \(\beta_i\) divided by its standard error follows a t-distribution centered on zero, with \(df_R = n-1-k\) degrees of freedom. Our researchers can therefore compute a test statistic for their observed data and test it against this null distribution [4]:
$$t_{\beta_i} = \frac{\beta_i}{\text{se}(\beta_i)}$$
where \(\text{se}(\beta_i)\) is the standard error of \(\beta_i\) (see this site for details on how to estimate this).
They can then obtain a p value estimate as \(p(t \geq t_{\beta_i} | H_0)\), and reject \(H_0\) if this value is less than some threshold \(\alpha\). If, for example, they reject \(H_0\) for \(\beta_1\) in the model \(\hat{DS} = \beta_0 + \beta_1 HR + \beta_2 SF\), then they could conclude that heart rate is predictive of depression severity even after accounting for smoking frequency.
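Here is a sketch of the simultaneous approach on simulated stand-in data. The standard errors are computed as \(\sqrt{\hat{\sigma}^2 \, \text{diag}((\mathbf{X}^T\mathbf{X})^{-1})}\), with \(\hat{\sigma}^2 = SS_R / (n-1-k)\), which is one standard way of estimating them; the data and variable names are hypothetical:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, k = 100, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])  # [intercept, HR, SF]
y = X @ np.array([0.0, 0.4, 0.3]) + rng.normal(size=n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

df_r = n - 1 - k
sigma2 = np.sum(resid ** 2) / df_r                  # residual variance estimate
cov_beta = sigma2 * np.linalg.inv(X.T @ X)          # covariance of the beta estimates
se_beta = np.sqrt(np.diag(cov_beta))                # standard errors

t_stats = beta / se_beta
p_one_tailed = stats.t.sf(t_stats, df_r)            # p(t >= observed t | H0)
for name, t, p in zip(["beta_0", "beta_1 (HR)", "beta_2 (SF)"], t_stats, p_one_tailed):
    print(f"{name}: t = {t:.2f}, one-tailed p = {p:.4g}")
```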
Our hypothetical researchers may not be satisfied with this, however. It would be even more interesting to know how much additional variability is explained by heart rate. This can be accomplished with hierarchical regression, where two models are evaluated — one including this variable and one without it:
\(\text{Model 1:}\) \(\hat{DS} = \beta_0 + \beta_1 SF + \beta_2 HR\)
\(\text{Model 2:}\) \(\hat{DS} = \beta_0 + \beta_1 SF\)
Since Model 2 is a reduced version of Model 1, it is said to be nested within it. Our researchers can compute \(F\) and \(R^2\) values for both models, and the change in the latter, or \(\Delta R^2\), indicates how much additional variance is explained by the inclusion of HR.
To infer whether this change is generalisable (i.e., to test the null hypothesis \(H_0: \Delta R^2=0\)), we can use the following test statistic [5]:
$$F_\Delta = \frac{(SS_{R,2}-SS_{R,1}) / (k_1-k_2)}{SS_{R,1} / (n-1-k_1)}$$
where \(SS_{R,i}\) is the residual sum of squares for Model \(i\), and \(k_i\) is the number of predictor variables in Model \(i\). Under \(H_0\), this statistic has an F distribution with (\(k_1-k_2\), \(n-1-k_1\)) degrees of freedom, so we can reject \(H_0\) if \(p(F \geq F_\Delta|H_0) < \alpha\) [6].
If this is the case, our researchers can then happily report that HR predicts an additional \(\Delta R^2\) of the variance of DS (however, see below for a discussion of overfitting).
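Below is a minimal sketch of the hierarchical comparison, again on simulated stand-in data: it fits the reduced and full models, computes \(\Delta R^2\), and tests it with the \(F_\Delta\) statistic above.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 100
sf = rng.normal(size=n)                   # smoking frequency (stand-in)
hr = 0.5 * sf + rng.normal(size=n)        # heart rate (stand-in)
ds = 0.4 * hr + 0.3 * sf + rng.normal(size=n)  # depression severity (stand-in)

def fit_ss_r(X, y):
    """Fit OLS and return the residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ beta) ** 2)

ones = np.ones(n)
X2 = np.column_stack([ones, sf])          # Model 2: SF only (k2 = 1)
X1 = np.column_stack([ones, sf, hr])      # Model 1: SF + HR (k1 = 2)

ss_r2, ss_r1 = fit_ss_r(X2, ds), fit_ss_r(X1, ds)
ss_t = np.sum((ds - ds.mean()) ** 2)
delta_r2 = (1 - ss_r1 / ss_t) - (1 - ss_r2 / ss_t)

k1, k2 = 2, 1
F_delta = ((ss_r2 - ss_r1) / (k1 - k2)) / (ss_r1 / (n - 1 - k1))
p = stats.f.sf(F_delta, k1 - k2, n - 1 - k1)
print(f"Delta R^2 = {delta_r2:.3f}, F = {F_delta:.2f}, p = {p:.4g}")
```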
What about higher dimensions?
The physical world in which our visual systems have evolved has three spatial dimensions, and our ability to perform spatial reasoning is largely limited to these three. To intuit this, try as hard as you can to imagine a four-dimensional space [7]. Treating variables as dimensions in a physical space has the huge advantage of allowing us to visualize data sets, and relationships between variables, as 1-, 2-, or 3-dimensional plots (such as the ones above) that we can intuitively interpret.
In mathematics, however, Euclidean space can be generalized to \(n\) dimensions, where \(n\) is any non-negative integer. In this generalized notion of space, linear objects are referred to as hyperplanes, and are defined by the equation:
$$y = \beta_0 + \beta_1 x_1 + \ldots + \beta_i x_i + \ldots + \beta_{k} x_{k}$$
where \(m=k+1\) is the number of dimensions (i.e., including \(y\)).
This reduces to the equation of a line for two dimensions (\(m=2,k=1\)):
$$y = \beta_0 + \beta_1 x_1$$
...and the equation of a plane for three (\(m=3,k=2\)):
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$$
Often, you'll see this hyperplane equation expressed in matrix form:
$$\mathbf{y} = \mathbf{X} \mathbf{\beta}$$
where \(\mathbf{\beta}\) is a vector of \(m\) coefficients and \(\mathbf{X}\) is a matrix of size \(n \times m\). In this form, the \(\beta_0\) (or intercept) term is represented in \(\mathbf{X}\) as a column of ones.
Even though we can't visualize an \(m\)-dimensional linear model, the same logic applies as for the 2- and 3-dimensional cases we've discussed so far. Namely, we can use OLS to determine which \(\beta\) coefficient values minimize the mean squared error (MSE). This is the point at which all partial derivatives of MSE with respect to the \(\beta\) coefficients are zero:
$$ \frac{\partial MSE}{\partial \beta_0} = \ldots = \frac{\partial MSE}{\partial \beta_i} = \ldots = \frac{\partial MSE}{\partial \beta_{k}} = 0$$
Incidentally, this minimization problem is often shown as:
$$\hat{\beta} = \text{argmin}_{\beta} (SS_R)$$
where \(\hat{\beta}\) is the estimated value of the \(\beta\) coefficients, and \(\text{argmin}_{\beta}\) can be read as "find the values of \(\beta\) that minimize the expression on the right" [8].
Working through the math, it can be shown that the solution for this minimization is:
$$\hat{\beta} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y}$$
where \(\mathbf{X}^T\) denotes the transpose of the matrix \(\mathbf{X}\), and \((\mathbf{X}^T \mathbf{X})^{-1}\) denotes the inverse of \(\mathbf{X}^T \mathbf{X}\).
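In code, this closed-form solution is a one-liner. The sketch below uses simulated data; solving the normal equations directly (rather than forming the inverse explicitly) is the numerically preferred route, but it yields the same \(\hat{\beta}\):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])   # column of ones = intercept
true_beta = np.array([1.0, 0.5, -0.3, 0.2])
y = X @ true_beta + rng.normal(scale=0.5, size=n)

# Closed-form OLS solution: beta_hat = (X^T X)^{-1} X^T y,
# computed by solving the normal equations (X^T X) beta = X^T y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print("estimated betas:", np.round(beta_hat, 3))
```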
Dealing with overfitting
William of Ockham wrote: "It is futile to do with more what can be done with fewer" [9]. Ockham's philosophy has been summarized as "Occam's Razor", which, when applied to statistical models, warns against the problem of overfitting.
As you make a model more complex by adding parameters to it, you will always be able to better fit a particular set of observations. To illustrate this: you can perfectly fit any two random points with a linear model of size \(k=1\) (i.e., a line), any three points with a model of size \(k=2\) (i.e., a plane), and in general any \(n\) points with a model of size \(k=n-1\).
For a linear model, this means that, even if the data are completely random, its goodness of fit \(R^2\) will still increase with the model complexity \(k\) and decrease with the sample size \(n\). Because of this, the utility of the model for predicting new data from a sample will be a trade-off between its ability to capture patterns that exist in the underlying population, and the fit to random noise produced by its complexity.
In short, too many parameters means poorer generalizability!
The figure below captures these relationships. Here, I've simulated completely random data (\(1000 \times k \times n\) iterations) and varied both the model complexity \(k\) and the sample size \(n\):
From the plot on the left, we can see that our goodness of fit \(R^2\) increases with the number of parameters \(k\) (left-to-right), and this increase is more pronounced as we decrease our sample size \(n\) (bottom-to-top). In the top right corner, for cases where we have as many or more predictor variables than observations, \(R^2\) is always 1, indicating a perfect fit (to 100% noise). For the plot on the right, I've shown the result of adjusting \(R^2\) as described below; this effectively removes the overfitting.
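If you'd like to reproduce the gist of this simulation without the full Matlab code linked below, here is a scaled-down Python sketch (far fewer iterations, and only a single sample size), showing both the inflation of \(R^2\) with \(k\) on pure noise and how the adjustment described below removes it:

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_r2(n, k, n_iter=200):
    """Average R^2 and adjusted R^2 for pure-noise data of size n with k predictors."""
    r2, r2_adj = [], []
    for _ in range(n_iter):
        X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
        y = rng.normal(size=n)                       # outcome is pure noise
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        ss_r = np.sum((y - X @ beta) ** 2)
        ss_t = np.sum((y - y.mean()) ** 2)
        r2.append(1 - ss_r / ss_t)
        r2_adj.append(1 - (ss_r / (n - 1 - k)) / (ss_t / (n - 1)))
    return np.mean(r2), np.mean(r2_adj)

n = 30
for k in (1, 5, 10, 20):
    r2, r2_adj = mean_r2(n, k)
    print(f"k = {k:2d}: mean R^2 = {r2:.2f}, mean adjusted R^2 = {r2_adj:+.2f}")
```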
We can deal with overfitting in a variety of ways:
- Recall that the F-ratio depends on degrees of freedom (see above), with \(df_M=k\) and \(df_R=n-1-k\). From this, we can see that the F distribution accounts to some extent for overfitting; see this blog for an interactive F-distribution plot that shows how \(n\) and \(k\) affect the shape of the curve. In particular, it is clear that when \(k=n-1\), the F-ratio is undefined, and when \(k \geq n\), it is negative; in short, the F-ratio test assumes that \(k < n-1\).
- An adjusted \(R^2\) value can be computed to account for its spurious inflation by additional predictor variables. It uses an alternative formulation based on mean sums of squares (MS). As with the F-ratio above, this adjustment corrects using the degrees of freedom:
$$R^2_{adj} = 1 - \frac{MS_R}{MS_T} = 1 - \frac{SS_R/df_R}{SS_T/df_T} = 1 - \frac{SS_R/(n-1-k)}{SS_T/(n-1)}$$
- In cases where a model is very ill-posed (e.g., when \(n < k\)), OLS is not an appropriate approach. Regularization is commonly used to adjust the minimization process such that a penalty is applied to model complexity. Some commonly used regularization methods are ridge regression and elastic nets; see this blog post for a discussion of how the latter can be applied to functional brain imaging data.
Summing up
There are no hard and fast rules about which predictor variables to include in your model. It is a balance between omitting confounders, which makes a model difficult to interpret meaningfully, and making the model so complex that your interpretation will not generalize to the population of interest. This xkcd comic sums it up with typical dry humour:
In the end, understanding these issues can be quite beneficial to the design of a research study and corresponding analysis that allows you to draw meaningful, generalizable conclusions about the phenomenon you are interested in.
Okay... if you're still with me, you've (hopefully) learned a good deal about the ins and outs of multiple linear regression. Of course, we've only scratched the surface, and the next step is to try to apply what you've learned to actual data. There are plenty of software tools/languages (especially: R) that simplify the practical application of linear regression analysis, but I encourage you to validate the results of those software tools by crunching the numbers yourself.
Here are some additional free resources that cover this topic much more thoroughly:
- Russ Poldrack: Statistical Thinking for the 21st Century (online e-book)
- Murray Logan: Show Us Your R's (website)
- James et al.: An Introduction to Statistical Learning - 2nd Edition (PDF)
NOTE: Matlab code for the overfitting plot above, and Javascript code for the other plots, are available here.
_
See, e.g., Glassman et al., 1998 and Schiweck et al., 2019
See, e.g., Breslau et al. and Papathanasiou et al., 2013
Note that I've added subscripts here to differentiate the two predictor variables.
See this website for the gritty details.
Notably, it must be true that \(k_1>k_2\); in other words Model 2 is nested in Model 1, not vice versa. It is also interesting to note that this F ratio can never be negative, because adding more predictor variables will never increase residual error (we can always set their \(\beta\) parameters to zero): \(SS_{R,2} \geq SS_{R,1}\).
Not that this stops people from theorizing...
Note that minimizing \(SS_R\) is equivalent to minimizing MSE, since \(MSE = SS_R / n\), and \(n\) is constant.
Original Latin: "Frustra fit per plura quod potest fieri per pauciora". From Hoffman et al..