23andMe performs genome-wide association study on NFL players, fails to find athlete genes

Details are pretty sketchy, but a press release announced today suggests that personal genomics company 23andMe has performed a genome-wide association study comparing 100 current or former professional NFL players with a set of controls of unspecified sample size.

The shocking result:

The study did not find the tested players to be genetic outliers, suggesting that genetics may not be a good predictor of athletic success.

It's unsurprising that the results of this study are negative (more on this below), but the conclusions they draw from this are fallacious. In fact we know from twin and family studies that many (but not all) traits related to athletic performance are highly heritable; researchers just haven't been able to track down the vast majority of the genetic variants responsible yet, and this study is no exception.

What 23andMe have actually shown here is that the limited subset of genetic variation captured by their genotyping chip (which almost exclusively targets genetic variants with a frequency of greater than 5%) doesn't include any variants with an extremely strong association with NFL prowess.
That shouldn't come as a surprise to anyone who's been following advances in human genetics for the last few years; a genome-wide association study on a highly complex trait with a sample size of 100 has, historically speaking, a vanishingly small chance of yielding any positive results at all. (Yes, there are exceptions, but I don't think a sensible prior expectation would be that athletic performance has a similar genetic architecture to macular degeneration.)
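To put rough numbers on the power problem, here is a minimal sketch using a normal approximation to a two-sided allelic test. The control sample size (100), the control allele frequency (0.2), the odds ratios and the genome-wide threshold of 5e-8 are all illustrative assumptions; none of them come from the press release, and the function name `allelic_power` is hypothetical.

```python
from statistics import NormalDist


def allelic_power(n_cases, n_controls, control_freq, odds_ratio, alpha=5e-8):
    """Approximate power of a two-sided allelic (two-proportion) z-test.

    All inputs are illustrative assumptions, not values from the study.
    """
    std_normal = NormalDist()
    # Allele frequency in cases implied by the per-allele odds ratio.
    odds = odds_ratio * control_freq / (1.0 - control_freq)
    case_freq = odds / (1.0 + odds)
    # Each person contributes two alleles, hence 2 * n in the denominators.
    variance = (case_freq * (1.0 - case_freq) / (2 * n_cases)
                + control_freq * (1.0 - control_freq) / (2 * n_controls))
    z_effect = (case_freq - control_freq) / variance ** 0.5
    # Critical value for a two-sided test at significance level alpha.
    z_crit = -std_normal.inv_cdf(alpha / 2)
    return std_normal.cdf(z_effect - z_crit) + std_normal.cdf(-z_effect - z_crit)


if __name__ == "__main__":
    # A modest effect versus an extremely strong one, 100 cases each time.
    for odds_ratio in (1.3, 20.0):
        power = allelic_power(100, 100, 0.2, odds_ratio)
        print(f"OR={odds_ratio}: approximate power {power:.2e}")
```

Under these assumptions, a common variant with a modest odds ratio is essentially undetectable with 100 cases, while only an extremely strong effect would stand out. The exact figures shift with the assumed allele frequency and control count, but the qualitative picture is unlikely to change.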
The press release argues that the results "speak to the breadth of the genetic research the company is undertaking". That may be so, but I certainly hope they aren't indicative of the general quality of 23andMe's research program. Much as I hate to say it about a company whose work I generally admire, this study carries all the hallmarks of being pure PR fluff. If you want to do a GWAS for athletic performance, at least wait until you have a homogeneous sample that's well-powered enough to have a fighting chance of detecting real associations.
On the bright side, 23andMe has been building up much more sensible sample sizes for other projects, including 4,500 older amateur athletes and over 3,000 Parkinson's disease patients. I'm hopeful that we'll see something a little more interesting than this NFL story roll out of the company over the next few months.


they should have studied athletes, not football players. :)

What 23andMe have actually shown here [...]

*cough* power *cough*

rb: take a look at Adrian Peterson and tell me that he isn't an athlete. Or even look at any NFL or major college offensive lineman versus a heavyweight Olympic weightlifter. You must have a very narrow definition of athlete.

Another problem with this study is the diversity of subjects. How can you pool a 5'11", 190 lbs defensive back with 5% body fat with a 6'6", 310 lbs offensive lineman with 24% body fat? In terms of experimental design, this is a clear example of "quick and dirty".

By natural cynic (not verified) on 13 Oct 2009 #permalink

Hey Chris,

Well, I did throw in the vague term "extremely strong association". If there was a common variant with an OR of 20 for being an NFL player, there's a reasonable chance they would have found it. Of course the study was horribly under-powered for detecting variants with any sort of realistic effect size.

I can't wait till the video comes out! Yet another freaking example of PR people running the show at the asylum. What a joke. Seriously, how can we think anything good about a company who would publish anything with 100 samples.....

This is pure media swill.


When press releases of this sort come out, what does it mean? Money running out, finance needed? They certainly don't need these releases to bump up the stock price and there is no good reason for the release from a scientific point of view - or even marketing really, given the results...

Hey Daniel - looks like a few more of your predictions for this year are coming true!

Attention will move gradually away from complex diseases and towards the genetic dissection of disease-related traits. Tick.

Most of the "missing heritability" will stay missing. Tick.

We will start to see "bad" genome-wide association studies. Oh dear.

Hey Neil,

Excellent... more fuel for a post at the end of the year gloating about how awesome my predictions were. :-)

Of course, the prediction that's potentially most relevant here is this one:

There will be an uptick in "it's all lies" stories about personal genomics, impacting on public perception of the industry.

That's already happening, and personal genomics companies can only turn the tide by being scrupulous in avoiding the temptation to sacrifice scientific credibility on the altar of cheap publicity. This press release is a step in the wrong direction.


This press release will spawn a few dozen minor mainstream press articles and blog posts scattered across the web, many of them simply echoing the feel-good fallacy that "genetics may not be a good predictor of athletic success". It all contributes to the gestalt that 23andMe are doing actual meaningful research (which, to be fair, they are - this just isn't a great example of it).

This blog post fits in perfectly with the thrust of one of those papers in the October 8th edition of Nature, Manolio et al.'s review "Finding the missing heritability of complex diseases" (full text!). Manolio's Table 1 lists eight complex traits; the "Proportion of heritability explained" for five of them is under 7%. (Macular degeneration tops the list at 50%, of course.)

How about a betting pool: Prior to release, the authors of GWASs like this release the topic under study, the N, and the method (array/sequencing). Genetic Future readers can then place wagers on the effect size and significance of the result.

Hey Anti-Manichean,

Very droll. :-)

Three quick points, though, while Steve is composing his response: (1) the same sample size has very different power for a candidate gene study than a GWAS, due to the horrors of multiple testing; (2) 2003 is not 2009; and (3) I'm still frankly astonished that the ACTN3 study actually replicated, and I'm honest enough to admit that this had more to do with good luck than good judgement.
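Point (1) can be illustrated with a rough sketch of the Bonferroni correction. The figure of one million tests for a GWAS is an assumed round number, the function names are hypothetical, and the sample size comparison relies on the usual normal approximation in which the required sample size for a fixed effect scales with the square of the critical value plus the power quantile.

```python
from statistics import NormalDist


def critical_z(alpha, n_tests):
    """Two-sided critical z-score after Bonferroni correction for n_tests."""
    per_test_alpha = alpha / n_tests
    return -NormalDist().inv_cdf(per_test_alpha / 2)


def sample_size_inflation(n_tests, alpha=0.05, power=0.8):
    """Ratio of sample sizes needed for the same effect and power when
    correcting for n_tests instead of running a single test."""
    z_power = NormalDist().inv_cdf(power)
    single_test = (critical_z(alpha, 1) + z_power) ** 2
    many_tests = (critical_z(alpha, n_tests) + z_power) ** 2
    return many_tests / single_test


if __name__ == "__main__":
    gwas_tests = 1_000_000  # assumed number of variants tested
    print(f"candidate gene z threshold: {critical_z(0.05, 1):.2f}")
    print(f"GWAS z threshold: {critical_z(0.05, gwas_tests):.2f}")
    print(f"sample size inflation: {sample_size_inflation(gwas_tests):.1f}x")
```

On these assumptions, a genome-wide scan needs several times the sample of a single candidate gene test to detect the same effect with the same power.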

The study by Daniel MacArthur is a completely different scenario because the number of tests of association they did differs by several orders of magnitude from what 23andMe did.
Happy fishing.

By Wade Davis, PhD (not verified) on 14 Oct 2009 #permalink

I am surprised that the results of any genetic study on athletic performance, even a poor study, are revealed. In spite of the strong interest in sports and fitness in the US, few US genetics studies have been reported in the literature. I have assumed that this is due to political correctness, but maybe there are other reasons.

This is rather embarrassing, and I am surprised that 23andMe doesn't see this as something they should rather hide. Why would anyone want to trust a company like this with a genetic analysis when they don't seem to understand the basics of genetics? Chances were pretty much zero from the get-go to find any meaningful association. 100 samples to find associations with such a complex and hard to define trait - it's a joke!

Hey Anti-Manichean and Daniel,
Thanks for the quick points. Which are partly what I want to talk about. The bigger issue is precisely what I have a problem with. It is called talking out of your A$$ in a scientific publication....

"We have recently demonstrated that α-actinin-3 deficiency is common in the general population and is due to homozygosity for a premature stop codon in ACTN3 (R577X) (North et al. 1999). It is likely that α-actinin-2 is able to “compensate” for the absence of α-actinin-3 in type 2 fibers, although there is no upregulation of α-actinin-2 levels in response to α-actinin-3 deficiency (authors' unpublished observations)."

Well then sparky, how does it "compensate?"

Then they claim that athletic performance is "Highly Genetic" and link to this review from 2001 http://www.ncbi.nlm.nih.gov/pubmed/12165675

Seriously. I am absolutely amazed by what swill gets published these days.....

The same thing happens in the media and happens in the world.....a screaming circus which no one can shut out....

We need to crystallize our focus and this is not the way. Crappy studies designed to get on the cover of magazines again is not the way to go for this company. In 12 months they will be just another Google App.....Mark my words.

Why do the purveyors of what is important (Google) choose to highlight that which is absolutely not important (23andSergey)???

My only answer is Vanity. This is a distraction to the greater aims of the only vested investor.......

I agree with Daniel, the backlash is already here and will continue to put out the hot flames of hype.....at least when it comes to DTC SNP scans....


Steve, you do realize that the paper you quoted and ripped is *Daniel's* paper, don't you?

"The same thing happens in the media and happens in the world.....a screaming circus which no one can shut out...."

Nice phrase Steve, that's really insightful -- I guess sometimes it's possible to be self-aware and not even know it....


Your point is well taken (as are Daniel's).

However, my comment was an experiment to test a different hypothesis--the result of which is now evident.

By Anti-Manichean (not verified) on 16 Oct 2009 #permalink

Steve Murphy, can you please make an attempt to spell and write coherently? Could you at least try to write in full sentences? Your use of "long repeat" ellipses alone (namely random insertions of between 4 and 10 periods in the middle of raving prose) makes me constantly amazed that you are affiliated with Yale in any capacity.

Finally: what papers have *you* published on 10000 person cohorts? And what has "Helix Health" done to move the ball forward here?

I must admit, returning to a gloomy Cambridge morning after almost two weeks in Hawaii was pretty depressing - but reading back over this comments thread has cheered me up immensely. :-)

It is dangerous to draw any scientific conclusion from a Press Release, but it seems that the studies relied in part on SNP "raw data files" from 23andMe - and subsequently specific interrogation of particular genes was conducted by third parties (e.g. Bucks Institute); NOT following the SNP protocol of 23andMe.

If it is so, the study left out most (almost all) of the hologenome. Present DTC interrogates less than 1 M bases (microarray capacity is less than 1.6 M), and the very few genes mentioned are likely to amount to maybe another 1 M bases. Since the full human DNA is 6.2 billion bases, even if only the genome is concerned, practically the entire genome regulatory mechanism could be missed.

And that is only half of the story. Genomic- and epigenomic channels these days should be considered as two sides of the same coin.

Is it perhaps that athletes are under special diets - thus at least some of their uniqueness is due more to epigenomic rather than (a mere handful of genes from) their genomic channels?