Predicting height: the Victorian approach beats modern genomics

By dgmacarthur on March 3, 2009.

Yurii S Aulchenko, Maksim V Struchalin, Nadezhda M Belonogova, Tatiana I Axenovich, Michael N Weedon, Albert Hofman, Andre G Uitterlinden, Manfred Kayser, Ben A Oostra, Cornelia M van Duijn, A Cecile J W Janssens, Pavel M Borodin (2009). Predicting human height by Victorian and genomic methods. European Journal of Human Genetics. DOI: 10.1038/ejhg.2009.5

Human height is a strongly genetic trait: in well-nourished Westerners somewhere in the vicinity of 80-90% of the variation in height is due to genetic factors; if your parents are tall, there's a very good chance you will be too. That means that if we understood the genetic factors that influenced height we could predict the future height of a child (or even an embryo) with a reasonable degree of accuracy.

However, developing that genetic understanding has proved extremely difficult. It turns out that height is also the classic model of a genetically complex trait: a spate of very large recent genome-wide association studies has nailed down over 50 different regions of the genome that affect height, which in total explain less than five percent of the overall variation - suggesting that hundreds (if not thousands) of individual genetic variants contribute, most of them nudging us upwards or downwards by just a few millimetres.

Height is not the only human trait to demonstrate such a convoluted molecular basis; aside from a few unusual traits such as skin pigmentation, the majority of traits that vary between humans are genetically complex. This is one of the reasons why embryo screening for traits like height or IQ is unlikely to be effective in the near future.

A recent paper in the European Journal of Human Genetics (also covered by Dienekes) illustrates this point by comparing the predictive power of modern genetics with a method for height prediction developed back in 1886 by Sir Francis Galton. The results are a humbling reminder of just how much we have to learn about the genetic architecture of variable human traits.

The study looked at all 54 of the markers identified by the three studies linked above as being associated with height, and used these markers to generate a "genotypic score" - basically a count of the number of "tall" variants that each individual carried. This score was generated for 5,748 individuals of known height.
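
To make the "genotypic score" concrete, here is a minimal Python sketch of the unweighted version: a simple tally of "tall" alleles. The function name and the five-marker example are made up for illustration, and coding each marker as 0, 1 or 2 copies of the tall allele is my assumption about how such a count would typically be built, not a description of the authors' pipeline.

```python
def genotype_score(tall_allele_counts):
    """Unweighted score: total number of 'tall' alleles carried across all markers.

    Each entry is the number of copies (0, 1 or 2) of the height-increasing
    allele at one marker. This coding is an assumption for illustration.
    """
    for count in tall_allele_counts:
        if count not in (0, 1, 2):
            raise ValueError("each marker must carry 0, 1 or 2 copies of the tall allele")
    return sum(tall_allele_counts)


# Hypothetical individual typed at five markers (the study used 54).
example_person = [2, 1, 0, 1, 2]
score = genotype_score(example_person)
print(score)
```

Under this coding, a 54-marker score can range from 0 (no tall alleles at all) to 108 (two tall alleles at every marker).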

This graph shows the results: sex- and age-adjusted height are shown on the vertical axis, and genotype score on the horizontal axis.

[Figure: sex- and age-adjusted height plotted against the 54-marker genotype score, with regression line]

The blue line shows the slope of best fit for the data; you can see that the individuals form a diffuse cloud around the line, with a barely discernible upwards trend. In fact, genotype score explained just 3.8% of the variance in height, although the authors could squeeze this up to 5.6% by carefully weighting the markers. (For the curious, the red lines represent the mean residual height in people from the top and bottom 5% of the profile distribution.)
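
For readers who want to see what "variance explained" means in practice, here is a rough Python sketch. For a single predictor fitted with a straight line, the fraction of variance explained is just the squared correlation between predictor and outcome. The function names are hypothetical, and the weighted score assumes one numeric weight per marker (for instance an estimated per-allele effect); the post does not spell out how the authors chose their weights.

```python
def variance_explained(predictor, outcome):
    """Squared Pearson correlation between predictor and outcome.

    For a one-predictor linear fit this equals the fraction of outcome
    variance accounted for by the fitted line.
    """
    n = len(predictor)
    if n != len(outcome) or n < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    mean_x = sum(predictor) / n
    mean_y = sum(outcome) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predictor, outcome))
    sxx = sum((x - mean_x) ** 2 for x in predictor)
    syy = sum((y - mean_y) ** 2 for y in outcome)
    if sxx == 0 or syy == 0:
        raise ValueError("predictor and outcome must both vary")
    return sxy ** 2 / (sxx * syy)


def weighted_score(tall_allele_counts, weights):
    """Weighted genotype score: each marker's tall-allele count times its weight.

    The weights are assumed inputs; a weight of zero drops a marker entirely.
    """
    if len(tall_allele_counts) != len(weights):
        raise ValueError("need exactly one weight per marker")
    return sum(w * c for w, c in zip(weights, tall_allele_counts))
```

Feeding `variance_explained` a list of genotype scores and the matching adjusted heights would give the kind of figure quoted above, a number between 0 and 1.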

That's a fairly disappointing result from the best set of markers that modern genomics has to offer. Adding insult to injury, the authors go on to compare this result to the predictive power of the 123-year-old Galtonian height prediction method, which relies simply on the average deviation of the two parents from the population average (corrected for age and sex). This measure was calculated for a smaller set of 550 individuals for whom parental height data was available.

Here's the data for this approach:

[Figure: measured height plotted against the parental-height predictor, with best-fit and slope-1 lines]

In this graph, the blue line is the best fit, and the green line has a slope of 1; the difference between these two lines is due to regression toward the mean. The correlation between this predictor and measured height is much stronger than for the genotype score above: this measure predicts around 40% of the variance in height, 6 to 10 times more than is explained by genotype score.
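
The Galtonian predictor is simple enough to sketch in a few lines of Python. This is only an illustration: the population means below are placeholder values I have assumed, the function names are hypothetical, and the sketch ignores the age correction the authors applied. A `slope` of 1 corresponds to the green line (taking the parents' average deviation at face value), while a fitted slope below 1 pulls predictions back toward the population mean.

```python
# Placeholder sex-specific population means in cm (assumed, not from the paper).
POPULATION_MEAN_CM = {"male": 175.0, "female": 162.0}


def midparent_deviation(father_cm, mother_cm, means=POPULATION_MEAN_CM):
    """Average deviation of the two parents from their sex-specific mean."""
    father_dev = father_cm - means["male"]
    mother_dev = mother_cm - means["female"]
    return (father_dev + mother_dev) / 2


def predict_height(father_cm, mother_cm, child_sex, slope=1.0, means=POPULATION_MEAN_CM):
    """Predicted adult height: the child's sex-specific mean plus the scaled mid-parent deviation."""
    return means[child_sex] + slope * midparent_deviation(father_cm, mother_cm, means)


# Both parents 10 cm above their own sex-specific (placeholder) mean.
print(predict_height(185.0, 172.0, "male"))             # slope 1: no shrinkage
print(predict_height(185.0, 172.0, "male", slope=0.5))  # shrunk toward the mean
```

The slope of 0.5 is purely for demonstration; the real value would come from fitting the regression to data.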

The two measures were highly correlated, as you would expect, meaning that adding genetic data to the Galtonian method only improved the variance explained by ~1.3%.
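
Why does a correlated second predictor add so little? For a linear model with two predictors, the combined variance explained can be written in terms of the three pairwise correlations, which is a standard result for two-predictor regression. The Python sketch below uses hypothetical function names and made-up correlations chosen to make the arithmetic obvious; they are not the study's values.

```python
def combined_variance_explained(r_y1, r_y2, r_12):
    """R^2 of a two-predictor linear model, from pairwise correlations.

    r_y1, r_y2: correlation of each predictor with the outcome.
    r_12: correlation between the two predictors.
    """
    if abs(r_12) >= 1.0:
        raise ValueError("the two predictors must not be perfectly correlated")
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


def gain_from_second_predictor(r_y1, r_y2, r_12):
    """Extra variance explained by adding predictor 2 to a model that already has predictor 1."""
    return combined_variance_explained(r_y1, r_y2, r_12) - r_y1 ** 2


# Made-up correlations, for illustration only.
print(gain_from_second_predictor(0.6, 0.3, 0.0))  # uncorrelated predictors: full r_y2**2 is gained
print(gain_from_second_predictor(0.6, 0.3, 0.5))  # overlapping predictors: the gain vanishes
```

In the second call the weaker predictor's link to the outcome runs entirely through the stronger one (0.3 = 0.6 × 0.5), so it contributes nothing new. That is the extreme version of what happens when genotype score is added to parental height.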

The difference in predictive power between these two methods (which are, in effect, measuring the same thing) is a powerful testament to our current ignorance of the genetic determinants of human traits.

Of course, that ignorance is a temporary state. I know there is at least one massive meta-analysis of height genome-wide association data that should be published early this year, and will add a few dozen more common markers; the hunt for rare height-altering variants will take a little longer but should bear fruit over the next couple of years.

So, will adding more and more genetic markers eventually provide a useful predictive test that exceeds the power of simply measuring parental heights? Well, here's what that first graph would look like under the hypothetical scenario that we knew everything about height genetics (i.e. had markers capturing the full 80% of the heritable variance):

[Figure: height plotted against an idealised genotype score under the hypothetical complete-knowledge scenario]

You can see immediately that the predictions made under this hypothetical future scenario would be extremely accurate. However, whether we can actually find all of the genetic variants underlying height - or even the majority of them - remains to be seen. It's entirely possible that a large fraction of this variation is determined (for instance) by very large numbers of variants with extraordinarily small effect sizes, or by subtle non-additive interactions, in which case the sample size required to characterise the full spectrum of variants may be larger than the population available for sampling. Now that would be a disappointing outcome.
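
One way to put numbers on "accurate" is to ask how much scatter is left around the prediction. For a linear predictor, the residual standard deviation is the population standard deviation multiplied by the square root of one minus the variance explained. The Python sketch below applies that to the figures quoted in this post; the function name is hypothetical, and it assumes the leftover scatter is the same across the whole range of the predictor.

```python
import math


def residual_sd_fraction(variance_explained):
    """Residual SD as a fraction of the population SD, for a linear predictor."""
    if not 0.0 <= variance_explained <= 1.0:
        raise ValueError("variance explained must lie between 0 and 1")
    return math.sqrt(1.0 - variance_explained)


# Variance-explained figures quoted in the post.
scenarios = {
    "genotype score (unweighted)": 0.038,
    "genotype score (weighted)": 0.056,
    "parental height": 0.40,
    "hypothetical complete markers": 0.80,
}

for label, r2 in scenarios.items():
    print(f"{label}: residual SD is {residual_sd_fraction(r2):.2f} of the population SD")
```

On these figures the leftover scatter is about 0.98, 0.97, 0.77 and 0.45 of the population standard deviation respectively.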

It's worth noting that the better performance of the Galtonian approach is not universal. For some traits with low heritability (such as some serum lipid levels) the Galtonian approach performs even more poorly than current genetic markers; this indicates that it won't be much longer before genetic tests are better than family history for risk prediction, at least for a subset of traits.

Still, this paper is a timely reminder of the primitive state of our current understanding of complex trait genetics, and just how far we have still to go before personal genomics can provide useful, novel predictions for the majority of human traits.

...it won't be much longer before genetic tests are better than family history for risk prediction, at least for a subset of traits.

But if they have low heritability, genetic tests still won't be much good, of course.

Hi Bob,

Absolutely. The authors note in their discussion:

Thus, for lipid levels the genomic prediction is already doing as good as (or as bad as) the Galtonian one. However, the genomic profiling, unlike the Galtonian, still has the potential to improve, as more loci affecting the phenotype of interest are discovered.

However, they also note that even a small predictive value can be of some use - the genomic score prediction for height was pretty bad, but it actually has similar discriminatory power to predicting the risk of coronary heart disease using low-density lipid levels!

It probably matters that stature is not genetic, for the most part.

"It probably matters that stature is not genetic, for the most part."

I wish that people, when flat out disagreeing with a statement made in the post ("in well-nourished Westerners somewhere in the vicinity of 80-90% of the variation in height is due to genetic factors") would say what they're basing their disagreement on.

Greg,

I'm not sure what you're talking about - as I noted in the first sentence of the post, height is 80-90% heritable among well-nourished Westerners (the population being studied here). It's also worth noting that in this population, similarity between siblings in height is almost entirely due to genetics rather than a consequence of shared environmental factors.

that's really interesting stuff. Did they have any ideas as to why their genomic study didn't do well? Do you think it would have been more accurate had they used FEWER genes rather than more?

Hey scicurious,

I think it's just that the genetic markers discovered so far are the tip of the iceberg; the more markers we find, the more graph (a) will start to look like graph (c).

As for whether fewer markers would have helped: it's possible, but I think their weighted analysis would have effectively done this for them (by assigning the unnecessary markers a weight of zero) - and that still only explained 5.6% of the variance.

Do humans have a homolog of the canine IGF-1? In dogs, it's supposed to account for most of the size variation.

http://www.nih.gov/news/pr/apr2007/nhgri-05.htm

Yes, humans have IGF1 too. In fact, mutations in this gene lead to growth retardation (and other things) http://www.ncbi.nlm.nih.gov/entrez/dispomim.cgi?id=608747
However, here we're talking about height in the 'normal' population, not about the extremes of dwarfism or gigantism. Of course, looking at genes in these extremes could provide valuable information. This is how the old candidate gene approach worked. But for example in this GWAS of height ( http://www.ncbi.nlm.nih.gov/pubmed/18952825 ), their top hit leads to lower IGF1 levels.

Daniel, maybe the previous poster was joking and talked about stature in the context of status?

Just curious, what role does methylation (epigenetics) play in height. Perhaps the interplay of environment with gene silencing (or turn on) might be the place to look for the remaining 20%.

something is missing from this story
do you mean relative height?
average height has been increasing
at an alarming rate.
I am 5 feet tall and really feel left behind

Daniel,
The moral of the story......Until we can do a genome for 100 bucks, family history is key. Even when we can do a genome for 100 USD, it will take 10 years before it trumps a good family history. Too bad you can't charge 399 to do a family history online.....

-Steve
www.thegenesherpa.blogspot.com

Hi Catherine,

In the Western world the population as a whole may have been increasing in height, but the variation between people is still primarily genetic. This is because the environmental improvements underlying this increase (I'd guess mainly improved infant nutrition and a massive reduction in infectious diseases) have affected pretty much everyone in society. With the major sources of environmental variation removed, the remaining variation in height is almost entirely due to genetics.

Steve,

No, you can't charge $399 to do an online family history; but you could add a family history section to your existing $399 genome scan, to improve the accuracy of your gene-based risk predictions.

The combination of both sources of data will always be better (albeit sometimes marginally) than either one alone, and gene-based predictions will improve over time while the value of family history remains static.

Will it be 10 years before genetics beats family history? As I said in the post, for some disease-related traits the two approaches are currently about even, so genetics will trump family history within a year or two. But I agree that for many traits (the high-heritability genetically complex ones) it seems like family history will provide better predictions than genome scans for at least the next five to ten years.

Quote: "As I said in the post, for some disease-related traits the two approaches are currently about even, so genetics will trump family history within a year or two"

Come back in a year or two and make that statement again.

You're seriously arguing that genetics won't be a better predictor than family history for any disease-related trait within two years? Shall we put some money on that?

Within two years? No! You must combine a family history with the genetic information.

Daniel,
Interesting paper. I have recently co-authored a book on height called "Normal at Any Cost: Tall Girls, Short Boys, and the Medical Industry's Quest to Manipulate Height." A portion of the book examines the science of height prediction, which is still akin to looking into a crystal ball. And Catherine, don't feel bad about your height. Average American height is actually falling; the Dutch, on the other hand, keep growing and growing and no one knows why.
--Chris

That's because we Dutch drink lots of milk ;-)
But I don't think height prediction is like looking into a crystal ball. I haven't read your book, but estimates using the average parents' height +/- x cm (can't remember the numbers exactly; different for boys and girls) usually give a good estimate of height. Of course, this will not give everybody's exact height, but at least a good estimate for the majority and better than genetics can do at the moment.

Being a natural "giant", i.e. over 6'6" (as per BMA definition), without some sort of pituitary problem, I can honestly say that after extensive, often painful tests, the old system wouldn't have predicted my eventual height. Father 6'2", mother 5'8"; my height (allowing for quite bad lardosis) at 6'11" (6'8" with lardosis) is outside the standard deviation of either predictive test. It is a pain in the neck (literally). I think one of the contributing factors is diet. Both parents suffered malnutrition in their childhood (c. 1930s Britain), so not reaching their "natural" height, therefore skewing both predictive tests. And anyone needing their ceiling painted can call me on.......

That is lordosis not lardosis (I have weight issues so a Freudian slip?) :)
