Machines Learn How Brains Change

In last week's Science, Dosenbach et al. describe a set of sophisticated machine learning techniques they've used to predict age from the way that hemodynamics correlate both within and across various functional networks in the brain. As described over at the BungeLab Blog, and at Neuroskeptic, the classification is amazingly accurate, generalizes easily to two independent data sets with different acquisition parameters, and has some real potential for future use in the diagnosis of developmental disorders - made all the easier since the underlying resting-state functional connectivity data take only about 5 minutes to acquire from a given subject.

Somehow, their statistical techniques learned the characteristic features of functional change between the ages of 7 and 30 years. How exactly did they manage this?

First, they started with three data sets of resting-state BOLD activity. The first consisted of 238 resting-state scans from 192 individuals between 7 and 30 years of age, acquired on a 3T scanner. The second consisted of 195 scans from 183 subjects aged 7-31 years; each scan was a concatenation of "rest" blocks extracted from blocked fMRI designs, acquired on a 1.5T scanner with a different pulse sequence than the first data set. The third consisted of 186 scans from 143 subjects aged 6-35 performing linguistic tasks, with task-related activity regressed out, using the same pulse sequence as the second data set.

All the data were transformed to a single atlas and sent through a standard artifact-removal pipeline; next, activity in each of 160 10-mm spherical ROIs was calculated for each image in each scan, with the ROIs determined by a series of five meta-analyses the authors undertook on data of their own (wow!). The full matrix of pairwise correlations between ROI timecourses was then calculated (yielding 160 × 159 / 2 = 12,720 unique correlations for each scan) and z-transformed.
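To make the feature count concrete: 160 ROIs give 160 × 159 / 2 = 12,720 unique ROI pairs. Below is a minimal sketch of that step in plain Python. It assumes each ROI's signal is just a list of numbers and that "z-transformed" means the Fisher r-to-z transform; the toy data and function names are mine, not the authors' pipeline.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def connectivity_features(roi_timeseries):
    """Return Fisher z-transformed correlations for every unique ROI pair."""
    features = []
    for i, j in combinations(range(len(roi_timeseries)), 2):
        r = pearson(roi_timeseries[i], roi_timeseries[j])
        # Clamp so atanh stays finite if two signals are perfectly correlated.
        r = max(min(r, 0.999999), -0.999999)
        features.append(math.atanh(r))  # Fisher r-to-z
    return features


# Toy stand-in for one scan: 6 ROIs x 50 time points of random noise.
random.seed(0)
toy_scan = [[random.gauss(0, 1) for _ in range(50)] for _ in range(6)]
toy_features = connectivity_features(toy_scan)

n_pairs_toy = len(toy_features)    # 6 choose 2
n_pairs_paper = math.comb(160, 2)  # 160 choose 2
```

Each scan ends up as one long feature vector, which is what the classifier in the next step sees.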

Next they took this massive correlation matrix and used a support vector machine (SVM with soft margin, including a radial basis function "kernel trick") to classify each scan as belonging to a child (7-11 years old) or an adult (24-30 years old), tested with leave-one-out cross-validation. They kept only the highest-ranked 200 features of the trained SVMs for further analyses (a process of recursive feature elimination didn't really help, so they just stuck with 200). Across all validations, the same set of 156 features consistently ended up in the top 200, and these were used for visualization of the feature weights. In this step they could classify adults vs. children at 91% accuracy.
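The logic of that validation loop can be sketched in a few lines. To keep it dependency-free I've swapped in a nearest-centroid classifier and a simple mean-difference feature ranking as stand-ins; these are NOT the soft-margin RBF SVM or the feature ranking used in the paper, and the toy data are invented. What the sketch does show is the structure: features are re-selected inside every fold so the held-out scan never influences the selection, and the "consensus" set is whatever survives in the top-k across all folds (my reading of where the 156 features come from).

```python
import random


def top_k_features(X, y, k):
    """Rank features by absolute difference in class means; return top-k indices."""
    scores = []
    for f in range(len(X[0])):
        ones = [row[f] for row, label in zip(X, y) if label == 1]
        zeros = [row[f] for row, label in zip(X, y) if label == 0]
        scores.append(abs(sum(ones) / len(ones) - sum(zeros) / len(zeros)))
    return sorted(range(len(scores)), key=lambda f: scores[f], reverse=True)[:k]


def nearest_centroid_predict(X, y, x_new, feats):
    """Predict the class whose mean over the selected features is closest."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        rows = [row for row, lab in zip(X, y) if lab == label]
        centroid = [sum(r[f] for r in rows) / len(rows) for f in feats]
        dist = sum((x_new[f] - c) ** 2 for f, c in zip(feats, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out(X, y, k):
    """Return (accuracy, consensus feature set), re-selecting features per fold."""
    correct, consensus = 0, None
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        feats = top_k_features(X_train, y_train, k)  # held-out scan not used
        consensus = set(feats) if consensus is None else consensus & set(feats)
        correct += int(nearest_centroid_predict(X_train, y_train, X[i], feats) == y[i])
    return correct / len(X), consensus


# Toy data: 20 "children" (0) and 20 "adults" (1) with 30 features,
# of which only features 0-4 actually differ between the groups.
random.seed(1)
y = [0] * 20 + [1] * 20
X = [[random.gauss(2.0 * label if f < 5 else 0.0, 1.0) for f in range(30)]
     for label in y]
accuracy, consensus = leave_one_out(X, y, k=5)
```

If feature selection were done once on the full data set before cross-validation, the reported accuracy would be optimistically biased, which is why the per-fold selection matters.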

They next used support vector regression to predict, based on the retained 200 features, the age of the subject in the scanner. Predicted ages were converted into a "functional connectivity maturation index" which had a mean of 1.0 for ages 18 to 30 (we'll come back to this), and revealed beautiful curves you've no doubt seen elsewhere by this point:

[Figure: DosenbachCurve.jpg]

The best-fitting line here is actually either the Pearl-Reed (gray line - used in other contexts to model the growth of human populations in settings with limited resources) or the Von Bertalanffy (black line - used to model the growth of animals). The same basic effects were replicated on all three data sets.
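The index conversion is easy to sketch, assuming (my reading, not something spelled out above) that each predicted age is simply divided by the mean predicted age of the 18-30 year olds, so that this group averages 1.0. The von Bertalanffy function below is the textbook growth form, y = a(1 - e^(-b(t - t0))), and not necessarily the exact parameterization fitted in the paper. All numbers and parameter values are made up for illustration.

```python
import math


def maturation_index(predicted_ages, true_ages, lo=18, hi=30):
    """Scale predictions so subjects aged lo..hi average exactly 1.0."""
    adult_preds = [p for p, a in zip(predicted_ages, true_ages) if lo <= a <= hi]
    adult_mean = sum(adult_preds) / len(adult_preds)
    return [p / adult_mean for p in predicted_ages]


def von_bertalanffy(age, asymptote, rate, age_zero):
    """Textbook growth curve: fast early change that levels off at `asymptote`."""
    return asymptote * (1.0 - math.exp(-rate * (age - age_zero)))


# Made-up predictions for six subjects (not data from the paper).
true_ages = [7, 10, 14, 18, 24, 30]
predicted = [9.0, 12.0, 17.0, 21.0, 22.0, 23.0]
index = maturation_index(predicted, true_ages)
adult_index_mean = sum(index[3:]) / 3  # subjects aged 18, 24 and 30

# Hypothetical parameters: the curve is close to its asymptote by age 30.
curve = [von_bertalanffy(a, asymptote=1.0, rate=0.2, age_zero=0.0)
         for a in true_ages]
```

The flattening of the curve toward its asymptote is the shape to keep in mind when the caveats come up later in the post.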

The rest of the paper is mostly dedicated to visualizing what exactly the SVMs were basing their surprisingly accurate predictions on. It turns out that twice as much of the predicted age-related variance was explained by functional connectivity that decreased with advancing age as by that which increased with age. Moreover, decreasing connectivity was more common among nearby regions, whereas increasing functional connectivity tended to occur among more far-flung regions (similar to the local-to-distributed shift discussed previously). Functional connections that increased with age were more aligned in the anterior-posterior dimension than those that decreased with age; the single most age-discriminative set of ROIs was the "cingulo-opercular" network (also discussed previously), and the most age-discriminative individual ROI was the right anterior prefrontal cortex.

If all that wasn't complicated enough, here's a glimpse of the paper's money shot:

[Figure: DosenbachMoney.jpg]

Obviously, this is an incredibly impressive set of results with real-world value. But what are some of the potential pitfalls here?

One is that the classification actually took place in higher-dimensional space (>200 dimensions, as I understand it), meaning that the results are dependent on interactions of changes in functional connectivity among and within the 156 features visualized above. This kind of thing is not easily captured in the way the results have been visualized.

A second thing to be wary of is the conversion of predicted age to the brain maturity index. I don't follow exactly why this conversion was necessary, but I assume it was due to a fall-off in the accuracy of the age predictions for subjects who are, in reality, between the ages of 18 and 30. This likely indicates that functional connectivity approaches an asymptote around that time, so it carries little information about age differences within that range. In other words, it's likely not capturing whatever "wisdom" a 30 year old might have that differentiates them from an 18 year old.

(Assuming such a thing actually exists, it seems like it's not "in" the functional connectivity data. On the other hand, some of their data sets may have under-sampled the older part of the age distribution - perhaps wisdom just takes statistical mega-power to detect.)

These caveats aside, it's really beautiful work, and I believe it will really help real people really soon (TM). That's far more than can be said about most of the work being done in this area, which is far more theoretical in nature.

It's not that easy to tell from the graph (which I realize illustrates only 2 dimensions) but it looks like there are non-linearities. I wonder if a non-linear method like k-NN would outperform it.

Researchers should release the raw data for things like these. There are lots of data scientists who would love to come up with better methods of classification and modeling. Or they could launch a competition at kaggle.com.
