Gene Expression

For the past few decades there has been a long-standing debate as to the origins of modern Europeans. The two alternative hypotheses are:

* Europeans are descended from Middle Eastern farmers, who brought their Neolithic cultural toolkit less than 10,000 years ago.

* Europeans are descended from Paleolithic hunter-gatherers, who acculturated to the farming way of life through diffusion of ideas.

The two extreme positions are not really accepted in such stark forms by anyone. Rather, the debate is over the relative effect size of #1 vs. #2. Bryan Sykes, a geneticist at Oxford, has been arguing for the primacy of #2 for many years. His argument is most fully laid out in The Seven Daughters of Eve. In short, the model is that on the order of 80% of the ancestors of Europeans today derive from Paleolithic hunter-gatherers, while 20% derive from Middle Eastern farmers. Foremost amongst those who argue for #1 would be the famed genetic anthropologist L. L. Cavalli-Sforza. Cavalli-Sforza has objected strongly to Sykes’ characterization of his own position, and suggests that the most recent data do not refute his model in any way. His point is that the “demic diffusion” model simply points to the critical role of demographic advance, and does not posit total genetic replacement, or even a preponderant effect, seeing as how there will be dilution of the genetic signal along the wave of advance. It is therefore a glass-half-empty vs. glass-half-full argument. Remember also that Sykes’ values are averaged across Europeans, so the signal of Middle Eastern farmers would be greater in southeast Europe than in the British Isles.

Of course there are some methodological issues here. Sykes’ argument relied on mitochondrial DNA, passed only through mothers. Cavalli-Sforza initially relied on classical autosomal markers, though his group later focused on Y chromosomes, passed through males. Some workers have found values closer to 50% for the Middle Eastern contribution. And most importantly, results from DNA extracted from ancient remains are suggesting that inferences made from contemporary patterns of variation may not give us an accurate map of past patterns of variation. These lines of evidence are coming together and suggesting that European hunter-gatherers in fact left a much smaller contribution to the ancestry of modern Europeans than Sykes et al. have inferred.

In this unsettled landscape comes a new paper which turns some assumptions about Y chromosomal variation in Europe on their head. The focus is on a subclade of the R1b haplogroup, which has its highest frequencies in Western Europe, in particular along the Atlantic fringe. The pattern of variation has led many to infer that this lineage, in particular the R1b1b2 haplogroup, is a marker of the Paleolithic populations of Western Europe. The high frequency of this marker among the Basques in particular is seen as evidence of this, because this group speaks a language which is a pre-Indo-European isolate (the Basques are used as a Paleolithic reference group in many papers). But perhaps not. A Predominantly Neolithic Origin for European Paternal Lineages:

The relative contributions to modern European populations of Paleolithic hunter-gatherers and Neolithic farmers from the Near East have been intensely debated. Haplogroup R1b1b2 (R-M269) is the commonest European Y-chromosomal lineage, increasing in frequency from east to west, and carried by 110 million European men. Previous studies suggested a Paleolithic origin, but here we show that the geographical distribution of its microsatellite diversity is best explained by spread from a single source in the Near East via Anatolia during the Neolithic. Taken with evidence on the origins of other haplogroups, this indicates that most European Y chromosomes originate in the Neolithic expansion. This reinterpretation makes Europe a prime example of how technological and cultural change is linked with the expansion of a Y-chromosomal lineage, and the contrast of this pattern with that shown by maternally inherited mitochondrial DNA suggests a unique role for males in the transition.

Let’s look at Figure 1 first to see how they get to their conclusion.


The first panel shows the standard SE-NW expansion of agriculture. The second shows the west-east decline in R1b1b2 frequency. These I’ve been aware of, and the patterns pointed to an inverse relationship between agriculturalists and R1b1b2. But the third panel points to something different. It shows that variation in R1b1b2 tracks panel A: diversity is highest toward the southeast, where agriculture arrived first. This is counterintuitive in light of our previous assumption. Regions where lineages have been extant the longest should have the highest variation. This is the insight which allows researchers to be confident that modern human beings emerged from Africa within the last 50,000 years or so; Africa has by far the most genetic variance of any region of the world. By contrast, the New World has the least. This is because serial bottlenecks from population A → B result in a loss of information, like serial photocopying. Genetic drift results in the extinction of many lineages, and the stochastic rise in frequency of a few lineages. In other words, in this model the high frequency of R1b1b2 in Western Europe is a function not of that marker’s long residence in that region, but of a rapid population expansion of the lineage from a small group of founders, whereby other lineages simply went extinct due to stochastic factors on the wave of advance.
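
To make the photocopying analogy concrete, here is a toy simulation of serial founder events. It is only a sketch under assumptions of my own choosing: the numbers (50 starting lineages, 10 founders per step, colonies of 1,000) and the function names are made up for illustration and are not taken from the paper.

```python
import random
from collections import Counter


def found_colony(source, n_founders, colony_size, rng):
    """One bottleneck: a few founders leave, then the colony regrows from them."""
    founders = [rng.choice(source) for _ in range(n_founders)]
    return [rng.choice(founders) for _ in range(colony_size)]


def serial_bottlenecks(n_lineages=50, n_founders=10, colony_size=1000,
                       n_steps=8, seed=1):
    """Return (step, distinct lineages, frequency of commonest lineage) per step.

    All parameter values are hypothetical, chosen only to show the trend.
    """
    rng = random.Random(seed)
    # Source population: every lineage starts at the same frequency.
    population = [i % n_lineages for i in range(colony_size)]
    history = []
    for step in range(n_steps + 1):
        counts = Counter(population)
        top_freq = counts.most_common(1)[0][1] / len(population)
        history.append((step, len(counts), top_freq))
        if step < n_steps:
            # Each new colony is drawn from the previous one, not from the source.
            population = found_colony(population, n_founders, colony_size, rng)
    return history


if __name__ == "__main__":
    for step, n_distinct, top_freq in serial_bottlenecks():
        print(f"colony {step}: {n_distinct} lineages, commonest at {top_freq:.0%}")
```

The lineage count can only fall from one colony to the next, while the share of the commonest lineage tends to climb. High frequency with low diversity at the far end of the chain is the pattern being proposed for R1b1b2 in Western Europe.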

Here’s a graphic illustration of how the variation relates to geography. Remember that R-squared is the % of the variation of Y which can be explained by variation of X.
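
For anyone who wants to see what goes into that number, here is a minimal sketch of R-squared for a straight-line fit, computed as the squared Pearson correlation. The longitude and diversity values are hypothetical numbers I made up for the example, not values from the paper, and `r_squared` is just an illustrative helper.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation: the share of the variance in ys
    accounted for by a straight-line fit on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical numbers, NOT values from the paper:
# x = longitude of a sample (degrees east), y = some diversity measure.
longitude = [-8.0, -3.0, 2.0, 10.0, 20.0, 30.0]
diversity = [0.20, 0.26, 0.24, 0.33, 0.35, 0.44]

print(f"R-squared = {r_squared(longitude, diversity):.2f}")
```

A value near 1 means the points sit close to a line; a value near 0 means that knowing X tells you almost nothing about Y.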


But probably the clincher has to be their calculations of the “time to most recent common ancestor” (TMRCA) of the individual lineages of R1b1b2 within the various populations. These data seem to suggest that it is in Turkey that R1b1b2 has the most time depth; that is, from the founding of the lineage via its unique mutations there was diversification over time. When new populations are founded they tend to reflect only a small proportion of the variation, so the TMRCA will be a lower value because there is a shallower time depth for those populations to build up new mutations.
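
The link between diversity and time depth can be sketched with one of the simplest microsatellite dating estimators, the average squared distance (ASD) from a founder haplotype. This is not necessarily the method the authors used. It assumes a symmetric single-step mutation model and a star-like genealogy, and the repeat counts, the mutation rate of 0.002 per locus per generation, and the 25-year generation time below are all hypothetical.

```python
def asd_tmrca(haplotypes, founder, mutation_rate, years_per_generation=25):
    """Estimate time depth from average squared distance to a founder haplotype.

    Under a symmetric single-step mutation model and a star-like genealogy,
    the expected ASD grows as mutation_rate * generations, so dividing the
    observed ASD by the mutation rate gives an estimate in generations.
    """
    n_loci = len(founder)
    total = 0
    for hap in haplotypes:
        total += sum((allele - anc) ** 2 for allele, anc in zip(hap, founder))
    asd = total / (len(haplotypes) * n_loci)
    generations = asd / mutation_rate
    return generations, generations * years_per_generation


# Hypothetical repeat counts at two microsatellite loci (not real data).
founder = [10, 12]
diverse_population = [[10, 12], [11, 13], [9, 12], [12, 12]]
uniform_population = [[10, 12], [11, 12], [9, 12], [10, 12]]

for name, pop in [("diverse", diverse_population), ("uniform", uniform_population)]:
    gens, years = asd_tmrca(pop, founder, mutation_rate=0.002)
    print(f"{name}: about {gens:.0f} generations, {years:.0f} years")
```

The more the haplotypes in a population have wandered from the founder, the older the estimate. The estimate scales directly with the assumed mutation rate, which is one reason dates in this area move around so much.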

But take a look at the 95% confidence intervals. There’s a lot of overlap. It is nice that the trend fits expectations (regions with more R1b1b2 diversity should be those where there’s more time depth to build up that diversity), but there have been so many flip-flops in this area that people will probably question the dating here. One issue to note: if the model presented here is true, and R1b1b2 is a newcomer from the Middle East which rapidly expanded in frequency across Western Europe, it’s going to be hard to get the clarity you need from molecular clock based methods, because the demographic processes occurred rather rapidly. We know from archaeology that agricultural societies could sprout up almost instantaneously, as if they simply transplanted their culture to new locales. Some of this likely occurred via sea, using the Mediterranean and the Atlantic fringe.

The authors point out that in places like Japan and India there is a great deal of circumstantial evidence for agriculture resulting in the expansion of particular lineages, so the preponderance of acculturation in Europe as the mode of transmission seems atypical. Though I don’t see citations to the recent DNA extraction results which have unsettled historical population genetic orthodoxy, I think those should also change our priors a bit in terms of how much weight we give to this particular result. New data from India, which posit a hybrid autosome, predominantly recent exogenous Y lineages, and indigenous mtDNA lineages, may serve as a model for Europe. The authors here do not dispute that mtDNA is predominantly indigenous, so they posit that agriculture spread with male lineages from the Middle East who intermarried with the daughters of locals. One analog might be the emergence of mestizos in the New World, who have predominantly European male lineages and native female lineages. Finally, one question a friend brought up: if the higher frequency of R1b1b2 is a function of the wave of advance, why is it the same haplogroup all along the wave front? Standard population genetic theory tells us that fragmented small groups will tend to lose genetic diversity and fix particular alleles, but those alleles are not going to be the same. It seems more plausible that there were serial bottlenecks through coastal migrations, and eventually these expanded inland once they stumbled onto the northwest European plain. But that’s just speculation.

Update: Also see Dienekes, I assume there’ll be a robust discussion thread on this paper….

Citation: Balaresque P, Bowden GR, Adams SM, Leung H-Y, King TE, et al. (2010) A Predominantly Neolithic Origin for European Paternal Lineages. PLoS Biol 8(1): e1000285. doi:10.1371/journal.pbio.1000285


  1. #1 frog
    January 19, 2010

    Look at SP6 & FR2 regions: western Spain and the Netherlands. Those regions have the highest diversity in Western Europe, while also having high frequency of R1b1b2.

    So it seems plausible that the coastal settlements by Neolithic farmers were there (instead of going east to west, it went west to east after the initial small number of boats traveled the Mediterranean).

    It also seems unsurprising that it’s male lines that are amplified by association with high technology. After all, the maximum number of children a male can have is much larger than the maximum number a female can have, therefore in a situation of a technological front, a few male lines can quite easily explode, particularly for the Y chromosome. It’s probably a situation that rarely arose before the neolithic, since technological wave fronts composed of steep gradients are difficult to imagine before the founding of cities.

  2. We had my husband’s mitochondrial DNA analyzed through the National Geographic Genographic project, and discovered that – despite his Middle Eastern birth – his ancient mother was from Europe. And – to be honest – we need a middle of the road guide to explain it to us. The explanations we encounter are either too basic, or jump ahead to scientific jargon we don’t understand. Any ideas?

    This subject fascinates us, and the more we know, the more we don’t!

  3. #3 vineviz
    January 19, 2010

    Finally, one question a friend brought up: if the higher frequency of R1b1b2 is a function of the wave of advance, why is it the same haplogroup all along the wave front?

    Is it?

    We can see glimmers of different distributions for the subclades of R1b1b2, with R-U106 having a somewhat different distribution than R-P312 for example. And within R-P312, we see R-L21 and R-U152 having different distributions.

    They overlap, but they have different frequency peaks. I think the forthcoming Cruciani/Scozzari paper will shed more light on this to some extent.


  4. #4 Kirsten Saxe
    January 19, 2010

    Kayla, have you tried joining the International Society of Genetic Genealogy? It’s a very good online organization with free membership, and they have a group for people who consider themselves Newbies.

  5. #5 Joe Walker
    January 19, 2010

    If most Western Europeans are descended from Middle Eastern farmers then doesn’t it seem odd that celiac disease is very common in Western and Northern European populations such as the Irish? I find it difficult to believe that an inability to process wheat gluten would be common in Middle Eastern farmers.

  6. #6 rec1man
    January 19, 2010

    Can someone explain how the Basques, having the same R1b as the rest of the west Europeans, don’t speak an Indo-European language?

  7. #7 Ponto
    January 20, 2010

    I can explain the gluten thing. Northern Europeans are but a subset of Southern Europeans and not a large one. Northern Europeans are very much alike, because they come from a smaller gene pool. Founder effect. Just as in groups that have budded off much larger groups but with a limited number of founders, mutations can move quickly into that budded off group. Ashkenazim Jews, the Finns as a whole, Amish in the USA are examples of what happens when small groups with limited founders bud off a larger group. The same effect with the retention of the lactase enzyme into adulthood, except that mutation was not deleterious. Mutations can work both ways, good and bad. It is higher in Northern Europeans than Southern Europeans. You can then see that the differences between Northern and Southern Europeans are just from founder effect with later sexual selection, as in the physical features which are often reported to the point of exaggeration, like fair hair or blue eyes or pink skin.

    When it comes to Y chromosome haplogroups, none is native to Europe. Some minor downstream SNPs within the haplogroup may have developed in Europe. Those ones which mean so much to some of you: Celtic, Dalriadic, Iberian…yawn. If you are looking for native European haplogroups it is in the female side you need to look. Cheddar Man belonged to U5. He died before agriculture was introduced to Britain i.e pre Neolithic farmers. Mito haplogroup U* is ancient, however not all mito haplogroup U* clades are equally ancient. A Paleolithic European’s remains were found in Russia, he was U2, and he was dated to more than 20,000 years ago. U2 is not common in Europe, more so in South Asian populations suggesting its origin point.

    Now the Basques speak an unusual language for Europe. Well so do the Hungarians, Estonians and Finns. The Maltese speak a Semitic language and Turkey has territory in Europe where its language is spoken. When it comes to genetics, the Basques show the effects of inbreeding, and isolation. There are other Europeans that show odd genetics due to isolation and inbreeding. Does that make them original and Paleolithic? Indo European languages have an Asian origin. I don’t understand why those languages are so special other than being common. At one time, non I.E languages were common in Europe, the Iberians did not speak I.E languages or what is Basque. The Etruscans and Pelasgians did not speak I.E languages. I.E languages only entered some parts of Europe in the historic period. Languages are transmitted by women, mothers. The Basques obviously had recent I.E language speaking male forebears, the female forebears spoke Proto-Basque. The women prevailed. Isolated populations sometimes have matrilineal societies. The opposite happened in Hungary, male transmission of language. In Bulgaria, female transmission of language.

    The Basques other than their odd language are just Europeans whose isolation in their valleys has made them European genetic outliers.

  8. #8 razib
    January 20, 2010

    The same effect with the retention of the lactase enzyme into adulthood, except that mutation was not deleterious.

    that’s not founder effect at all. there’s a large literature on this now, and you are wrong. i’m skeptical that founder effect can explain most of the traits you are talking about, but in the case of lactase persistence it’s more than skepticism, rather plain contradiction.

  9. #9 toto
    January 20, 2010

    I guess the obvious question is: why is there such a striking negative correlation between diversity and frequency?

    Or in other words: if the haplotype really originates from Asia Minor and the Balkans (as it seems to do), then why was it virtually wiped out from there? Did the Asian Turks really make that much of a genetic impact?

  10. #10 razib
    January 20, 2010

    why is there such a striking negative correlation between diversity and frequency?

    i think the assumption is that random genetic drift resulted in the extinction of all the other founder lineages. anatolia has the full range of diversity. i don’t know if i buy that though…..

    Did the Asian Turks really make that much of a genetic impact?


  11. Thank you Kirsten! I am now a proud member of the International Society of Genetic Genealogy, and have already learned a ton. What a great resource.

  12. #12 vineviz
    January 20, 2010

    I guess the obvious question is: why is there such a striking negative correlation between diversity and frequency? Or in other words: if the haplotype really originates from Asia Minor and the Balkans (as it seems to do), then why was it virtually wiped out from there?

    The contention of the paper is, by reference, that R1b1b2 was not necessarily a major lineage in southwest Asia. The authors conclude that R1b1b2 increased in frequency as it expanded across Europe. This is the so-called surfing effect.

    This “surfing”, which happens when a mutation arises near the front of an expanding wave, produces a cline of increasing frequency and decreasing variance. Exactly what we see with R1b1b2.

  13. #13 Tod
    January 20, 2010

    Why would they go along the coast? Sealing and fishing would mean the coast was the least nomadic and most populated part of Europe. Unlike the hunters who were moving north with the herds, coastal people would be likely to stay put and cause trouble; the plain would be where the best land would be.

    The appearance of the inhabitants of west coast Ireland (highest percentage of R1b1b2) hardly suggests they have much ME ancestry.

  14. #14 plschwartz
    January 20, 2010

    Sykes and the diffusion model have never made much sense to me. Farming is not just putting a seed in the ground. There are endless pieces of information needed which don’t get diffused easily. But it’s more than “I’m gonna put down my spear and get me a hoe.” There is a whole culture around hunting which just doesn’t get picked up and transferred into a farming culture. There are lots of former hunter-gatherer cultures around and few of them have made a successful transition to farming. The !Kung are a good example. As are the natives of New Guinea.
    I even imagine that with assortative mating, the physical and psychological traits best for survival in a hunting-gathering group are quite different from those in a farming group.

  15. #15 Rafe Kelley
    January 21, 2010

    Fascinating. Ever since the results came out on the Mesolithic hunter Y chromosomes I have been waiting for the other shoe to drop on the origin of R1b, the putative UP Y chromosome. I was hoping it would turn out to originate on the Russian steppe and act as confirmation of the steppe origin of Indo-Europeans. So much for that.