August 3, 2009

uh-oh

Uh-oh. There's another problem. And, like the first problem, it's well-known, and has been for years; but the limitations imposed by it have not always been respected by the fMRI community.

The first problem – you may recall! – is voodoo baloney. That's what happens when you use some criterion to select data, and then perform, on those selected data, a statistical test, related to that selection criterion, which assumes that the data are completely random, rather than pre-selected. Why is this a problem? Because if you try so very many different things that one of them is bound to turn out right, just at random, but then pretend that the one that turned out right was the only one you tried, you're fibbing. In the context of functional neuroimaging, those "many different things" are the many different locations in the brain.

Voodoo baloney is the topic of Vul, Harris, Winkielman and Pashler's paper, formerly known as "Voodoo correlations in social neuroscience", now known as "Puzzlingly high correlations in fMRI studies of emotion, personality, and social cognition", Perspectives on Psychological Science 4:274-290, 2009. A "correlation" is a mathematical summary of how two variables are related; it is known as "r"; r=1 is complete correlation. The correlation here is between brain activity (measured by fMRI) and some behavioral or cognitive measure (for example, IQ). The key point is that these correlations are between variations in brain and performance measures over different people. The goal is to probe brain-mind relationships by exploiting inter-individual differences, by asking where differences-in-the-brain "track" with differences-between-people's behavior or cognition.

So, is Vul's charge that researchers using fMRI in social neuroscience were guilty of looking at fMRI signals from a hundred thousand different brain locations, finding, just by pure random chance, some locations with signals that correlated with behavior, and then reporting the very high correlation between signal in those regions, and behavior? Almost. Vul charged not that such correlations were completely bogus; Vul charged that when such correlations exist (weakly) in reality, this analytical approach will artificially inflate them. Inflate them enough to make 'em "voodoo" or "curiously high".
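To see that inflation in action, here is a rough Python sketch. It is my own toy simulation, not anything taken from Vul et al.'s paper. It assumes 5,000 independent "voxels", 20 subjects, and a selection cutoff of r > 0.6; those numbers, and the function names, are made up for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_selection(true_r, n_subjects=20, n_voxels=5000, cutoff=0.6, seed=0):
    """Correlate many simulated voxels with behavior, then keep only
    the voxels whose observed correlation exceeds the cutoff.

    Every voxel has the SAME underlying correlation (true_r) with
    behavior; only sampling noise differs. All defaults are
    illustrative assumptions.
    """
    rng = random.Random(seed)
    behavior = [rng.gauss(0, 1) for _ in range(n_subjects)]
    noise_scale = math.sqrt(1 - true_r ** 2)
    observed = []
    for _ in range(n_voxels):
        signal = [true_r * b + noise_scale * rng.gauss(0, 1) for b in behavior]
        observed.append(pearson_r(signal, behavior))
    selected = [r for r in observed if r > cutoff]
    mean_all = sum(observed) / len(observed)
    return mean_all, selected


if __name__ == "__main__":
    for true_r in (0.0, 0.2):
        mean_all, selected = simulate_selection(true_r)
        mean_selected = sum(selected) / len(selected) if selected else float("nan")
        print(f"true r={true_r}: mean over all voxels={mean_all:.2f}, "
              f"voxels selected={len(selected)}, "
              f"mean r among selected={mean_selected:.2f}")
```

The average over all voxels stays near the true value, while the average over the selected voxels cannot be lower than the cutoff, whatever the truth is. Reporting only the second number is the inflation Vul is complaining about.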

That's the charge. Now, Perspectives on Psychological Science published not just Vul et al.'s paper, but many responses to it, too. Most of the responses are pretty much on topic - they say, essentially, that Vul's reporting and analyses are right and here's why, or Vul's reporting and analyses are wrong and here's why. Except one.

Tal Yarkoni's paper, entitled "Big Correlations in Little Studies: Inflated fMRI Correlations Reflect Low Statistical Power – Commentary on Vul et al. (2009)" says that many reported correlations are surely inflated, but for a completely different reason, namely, lack of power. Because when we avoid voodoo baloney by correcting for multiple comparisons, we raise the threshold for determining that correlations are significant. And then, if we haven't scanned enough people, we will erroneously report an inflated correlation.

Too abstract? Here's the example given by Yarkoni: Assume there are ten brain regions-of-interest, each with fMRI signal correlated with behavior at r=0.4, and that our study has 20 participants (this is the "big N"; so we can write N=20). Then, if we properly correct for multiple comparisons, by raising the threshold for significance, based on the number of comparisons, we will find, on average, that 1.3 (call that one, on average) of those ten regions pass the threshold and are declared correlated. But, we cannot report a correlation of r=0.4, because the threshold for significance will have become r=0.6. So, thanks to random variation, we'll find that one region is correlated with the behavior, with a correlation of at least r=0.6.
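The arithmetic can be roughly checked in a few lines of Python. This is a back-of-envelope sketch, not Yarkoni's actual calculation: I am assuming a Bonferroni-style correction (0.05 divided by the ten regions, two-sided) and using the Fisher z approximation for the sampling distribution of r. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def critical_r(n_subjects, alpha_per_test):
    """Smallest sample correlation that counts as significant
    (two-sided), using the Fisher z approximation."""
    z_crit = NormalDist().inv_cdf(1 - alpha_per_test / 2)
    return math.tanh(z_crit / math.sqrt(n_subjects - 3))


def detection_probability(true_r, n_subjects, alpha_per_test):
    """Approximate chance that one region with underlying correlation
    true_r produces a sample correlation above the threshold.
    (The tiny chance of a significant NEGATIVE correlation is ignored.)"""
    r_crit = critical_r(n_subjects, alpha_per_test)
    z_gap = (math.atanh(r_crit) - math.atanh(true_r)) * math.sqrt(n_subjects - 3)
    return 1 - NormalDist().cdf(z_gap)


if __name__ == "__main__":
    n_regions = 10
    alpha = 0.05 / n_regions  # assumed Bonferroni correction
    for n in (20, 50, 100):
        r_crit = critical_r(n, alpha)
        p_detect = detection_probability(0.4, n, alpha)
        print(f"N={n}: threshold r={r_crit:.2f}, "
              f"expected regions detected={n_regions * p_detect:.1f} of {n_regions}")
```

Under these assumptions, N=20 should give a threshold close to r=0.6 and somewhere between one and two regions detected on average, in the same ballpark as the figures quoted above. The exact numbers depend on which correction and which test you assume.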

Ain't that troubling? Yarkoni's simulation says that when ten brain regions are truly correlated with behavior, each at r=0.4, our N=20 study is likely to find just one region, with r=0.6! So, we'll exaggerate the correlation, and also exaggerate the degree of brain localisation, by turning what had been a distributed network of ten regions, into a single "hot spot." What's more, if we (or others) repeat the study, we're just as likely to find a different one of those ten regions. The solution is much larger sample sizes. If we study 50 or 100 or hundreds of people, this problem goes away. 
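Here is a small Monte Carlo version of that thought experiment, again as a hedged sketch and not a reproduction of Yarkoni's own simulation. It assumes ten regions with independent noise, each truly correlated with behavior at r=0.4, a Bonferroni-style threshold computed with the Fisher z approximation, and 300 simulated studies per sample size. All names and defaults are my own.

```python
import math
import random
from statistics import NormalDist


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def critical_r(n_subjects, alpha_per_test):
    """Two-sided significance threshold for r (Fisher z approximation)."""
    z_crit = NormalDist().inv_cdf(1 - alpha_per_test / 2)
    return math.tanh(z_crit / math.sqrt(n_subjects - 3))


def simulate_studies(n_subjects, true_r=0.4, n_regions=10, n_studies=300, seed=1):
    """Run many imaginary studies; return the mean number of regions
    that pass the threshold and the mean correlation that would be
    reported for those regions."""
    rng = random.Random(seed)
    r_crit = critical_r(n_subjects, 0.05 / n_regions)  # assumed correction
    noise_scale = math.sqrt(1 - true_r ** 2)
    total_hits = 0
    reported = []
    for _ in range(n_studies):
        behavior = [rng.gauss(0, 1) for _ in range(n_subjects)]
        for _ in range(n_regions):
            signal = [true_r * b + noise_scale * rng.gauss(0, 1) for b in behavior]
            r = pearson_r(signal, behavior)
            if r > r_crit:  # only positive correlations are counted
                total_hits += 1
                reported.append(r)
    mean_hits = total_hits / n_studies
    mean_reported = sum(reported) / len(reported) if reported else float("nan")
    return mean_hits, mean_reported


if __name__ == "__main__":
    for n in (20, 50, 100):
        mean_hits, mean_reported = simulate_studies(n)
        print(f"N={n}: regions found per study={mean_hits:.1f} of 10, "
              f"mean reported r={mean_reported:.2f} (true r=0.4)")
```

With small N, only a region or two survives per study and the reported correlation sits well above 0.4. As N grows, most of the ten regions are found and the reported correlation drifts back toward the true value.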

So, please, the next time you see a claim based on an fMRI study of inter-individual differences - check the N!
