Africa's genetic diversity revealed by full genomes of a Bushman and a Bantu


Meet !Gubi, the tribal elder of a group of Bushmen (or Khoisan), one of the oldest known human lineages. He lives the life of a hunter-gatherer in the Namibian part of the Kalahari Desert. But he also has a strange connection to James Watson, the American scientist who helped to discover the structure of DNA. For a start, they're both around 80 years old. But more importantly, they are two of just 11 humans to have had their entire genomes sequenced.

Along with Archbishop Desmond Tutu, !Gubi is one of two southern Africans whose full genomes have been sequenced by Stephan Schuster and an international team of scientists. Schuster's team also analysed the genes of three other Bushmen - G/aq'o, D#kgao and !Aî (see footnote for pronunciation guide) - focusing on the parts of their genomes that code for proteins. Like !Gubi, these men are tribal elders and all are around 80 years old. Despite the fact that the four Bushmen come from neighbouring parts of the Kalahari, their genetic diversity is astounding. Pick any two and peer into their genomes and you'd see more variety than you would between a European and an Asian.

This diversity reveals just how important it is to include African people in genome sequencing projects. Until now, the nine complete human genomes have included just one African - a Yoruban man from Nigeria. The rest have hailed from Europe, America, China, Korea and, most recently, Greenland circa 4,000 years ago. This is a major oversight. Africa is the birthplace of humanity and its people are the most genetically diverse on the planet. To understand human genetics without understanding Africa is like trying to learn a language by only looking at words starting with z.

The Bushmen certainly provide a glimpse into this diversity. Desmond Tutu was also selected because his ancestry covers the two largest of southern Africa's Bantu groups - the Tswana and the Nguni - making him an excellent representative for many southern Africans. Vanessa Hayes, who led the study, says, "This work is very expensive so we wanted to maximise the amount of diversity we could get in one individual." The team had other reasons for sequencing the archbishop. "He's a voice for southern Africans and for his people. He's a chairman of the Global Elders. He provides a genome with a lot of medical history behind it, having survived prostate cancer, polio and TB, diseases that affect many southern Africans." But most importantly, Hayes says, "He wanted to participate. He himself wanted to study medicine so this for him was a personal endeavour."

The researchers hope that their new data will allow medical research to become more inclusive. Hayes says that she found HIV research in South Africa very difficult because most genetic databases are severely Eurocentric, which rules a lot of Africans out of medical research. Without this knowledge, for example, we have no way of knowing if a drug that was developed and tested in Western patients will have the same benefits and risks in African ones.


Schuster and Hayes compared the two new African genomes to the reference version (a composite of several anonymous people) and looked for "single nucleotide polymorphisms" - places that differed by a single DNA letter. They found these "SNPs" in their hundreds of thousands. Both !Gubi and Tutu have around a million unique SNPs that they don't share with each other or any of the other fully sequenced humans. Likewise, the proteins of the five Africans had over 27,000 amino acids that differed from the reference sequence, around half of which are unique to them.
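For readers who like to see the idea spelled out, here is a minimal toy sketch of what "looking for SNPs" means: line up a sample against the reference, note every position where a single letter differs, then keep the differences nobody else shares. The sequences and the function names `find_snps` and `unique_snps` are invented for illustration; the team's actual pipeline is not described here, and real genomes need proper alignment and variant-calling software.

```python
# Toy illustration only: real genomes are billions of letters long and are
# compared with alignment and variant-calling tools, not simple string loops.

def find_snps(reference: str, sample: str) -> dict[int, str]:
    """Return {position: sample_letter} wherever an aligned sample differs from the reference."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return {i: s for i, (r, s) in enumerate(zip(reference, sample)) if r != s}


def unique_snps(target: dict[int, str], others: list[dict[int, str]]) -> dict[int, str]:
    """Keep only the target's SNPs that none of the other individuals carry."""
    seen = {(pos, base) for snps in others for pos, base in snps.items()}
    return {pos: base for pos, base in target.items() if (pos, base) not in seen}


# Hypothetical ten-letter "genomes", made up for this example.
reference = "ACGTACGTAC"
person_a = "ACGAACGTTC"
person_b = "ACGAACCTAC"

snps_a = find_snps(reference, person_a)
snps_b = find_snps(reference, person_b)

print(snps_a)                          # SNPs in person A relative to the reference
print(unique_snps(snps_a, [snps_b]))   # SNPs that person A does not share with person B
```

In this toy case both people differ from the reference at the same position, so that difference is shared, while each also carries one difference of their own.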

Some of these differences are easily explained by the Bushmen's lifestyle, reflecting adaptations to hunting and gathering in a hot, dry climate. Being dark-skinned hunter-gatherers, all of them lack the genetic variants that give Europeans light-coloured skin and allow them to cope with eating dairy products.

All of the Bushmen had a version of the vitamin D receptor that is associated with denser bones and three of them have a variant linked to better sprinting performance. Some of the SNPs grant the carrier the ability to taste bitter plant chemicals, and hunter-gatherers would certainly find it useful to avoid toxic plants. One of !Gubi's variants could allow him to break down foreign substances or resist parasites. Another might make his kidneys better at reabsorbing vital chloride ions and reduce the loss of salt and water, an important skill when you live in the scorching desert.

Other differences are perhaps more surprising - the five Africans all lack a genetic variant that is specific to Africa and that grants resistance against malaria. It's possible that the Bushmen may not need such resistance. But that could well change as their populations dwindle and they become forced into agricultural lifestyles, which carry higher risks of disease. Again, these newfound genetic markers could allow scientists to watch how they adapt to such challenges at a genetic level.

Most surprising of all, many of their unique SNPs are actually fairly recent developments. The Bushmen are one of the oldest human groups on the planet and you might expect their genes to reflect humanity's most ancestral state. But not the SNPs - Schuster found that only 6% of !Gubi's newfound SNPs matched the equivalent sequences in the chimpanzee genome; by comparison, the same positions in the human reference genome are an 87% match for the chimp one. They can't be ancestral sequences. They must have turned up after the Bushmen dynasty diverged from other human populations, and they provide hints about the history of this most ancient of human lineages.

The southern African genomes will also make geneticists re-evaluate what they know about how our genes affect our health and risk of disease. Our current knowledge in this area is incredibly biased towards Western societies and the results of studies in such populations don't always translate to other continents. For example, one of the Bushmen had a SNP that is reputedly linked to Wolman's syndrome, a disease that prevents people from storing fat properly and kills at a young age. Try telling that to the eighty-something gentleman! Hayes says, "!Gubi is a very fit and healthy man and much better skipper on a skipping rope than I am."

Reference: Schuster, S., Miller, W., Ratan, A., Tomsho, L., Giardine, B., Kasson, L., Harris, R., Petersen, D., Zhao, F., Qi, J., Alkan, C., Kidd, J., Sun, Y., Drautz, D., Bouffard, P., Muzny, D., Reid, J., Nazareth, L., Wang, Q., Burhans, R., Riemer, C., Wittekindt, N., Moorjani, P., Tindall, E., Danko, C., Teo, W., Buboltz, A., Zhang, Z., Ma, Q., Oosthuysen, A., Steenkamp, A., Oostuisen, H., Venter, P., Gajewski, J., Zhang, Y., Pugh, B., Makova, K., Nekrutenko, A., Mardis, E., Patterson, N., Pringle, T., Chiaromonte, F., Mullikin, J., Eichler, E., Hardison, R., Gibbs, R., Harkins, T., & Hayes, V. (2010). Complete Khoisan and Bantu genomes from southern Africa. Nature, 463 (7283), 943-947. DOI: 10.1038/nature08795

A note on names: The Bushmen languages include a variety of clicks, which explains the strange characters in their names. The ! is an alveolar click, made by pulling the tip of the tongue down sharply from the roof of the mouth to make the sound of a popping cork. The # is a palatal click, which is a softer version of the alveolar one and made with a flat tongue. The / is a dental click, which is made by sucking air through the front teeth and sounds like an English tsk!

Another note on names: I'm aware that there's controversy over the use of the term "Bushmen" in some circles. I'm using it because it's by far the most commonly used term in the paper, which also mentions San or Khoisan. Note the capital B to denote an actual group of people rather than a colloquial descriptor.


Ed what is your take on this statement with respect to the position of at least one science blogger that there is no evidence of genetic determinants for athletic performance (and that the very idea is a thinly-veiled cover for racism):

"All of the Bushmen had a version of the vitamin D receptor that is associated with denser bones and three of them have a variant linked to better sprinting performance."

I'd say that some of these SNP associations aren't exactly conclusive and this study, if anything, confirms that. The sprinting thing was mentioned in the paper, which is why I mention it here. Worth noting that the authors themselves cite the uncertainty of some of these associations - will drag out actual quote when I get to a computer.

Great stuff. Re "only 6% of !Gubi's newfound SNPs matched the equivalent sequences in the chimpanzee genome; by comparison, the same positions in the human reference genome are an 87% match for the chimp one. They can't be ancestral sequences."

Could the reference genome be ancestral, and this man have a derived sequence?

and you might expect their genes to reflect humanity's most ancestral state. But not the SNPs - Schuster found that only 6% […] matched the equivalent sequences in the chimpanzee genome

Of course, the present state of the chimpanzee genome isn't the same as the state our last common ancestor had some 6 million years ago. It's not like mutation and drift, or even selection, had stopped for chimps.

The Bushman language

It's a large, large language family, easily comparable to Indo-European at the very least.

Another note on names:

It being a language family, there's no self-designation for the whole family in any of the languages. So we can't just "call them what they call themselves"…

"Khoisan" is a term for the language family formed by the Kxoekxoe (Hottentot) and, well, Bushman language families. That name is an artificial composite of "Kxoekxoe" and "San".

By David Marjanović (not verified) on 17 Feb 2010 #permalink

adaptations to malaria in africa are pretty recent on an evolutionary scale. i think on the order of 5,000 or so. no surprise that hunter-gatherer peoples lack the variants.

Could the reference genome be ancestral, and this man have a derived sequence?

Of course, but probably both are derived to varying extents.

By David Marjanović (not verified) on 17 Feb 2010 #permalink

Not only have chimps obviously evolved somewhat since our divergence with the last common ancestor (although not nearly enough to account for only 6% similarity! Even 87% is a bit low...) but the state of the chimp genome within the databases is a bit miserable. The assembly is simply wrong in some places.

James Watson is British?

Aw hell. You can tell I wrote this in a rush, can't you? Fixed the "Bushman language" thing and Watson's nationality. Thanks to everyone for the constructive feedback.

What does it mean for a SNP to 'match' since a SNP is a locus at which there is variation. Presumably one allele or the other is ancestral at every SNP.

The 'old lineage' stuff doesn't quite seem sensible either. Pick an allele at a locus from me, an allele from the same locus from a Bushman, follow them back to their common ancestor. The length of the two branches, hence the probability of a mutation having happened, is exactly the same on each branch. Hence the probability of the ancestral state (or the expected frequency of the ancestral state if we extend the sample of size 2 to larger samples) is exactly the same in both of us. Alan Rogers recently published a nice paper showing this in greater detail.

By Henry Harpending (not verified) on 17 Feb 2010 #permalink

Genetic diversity increases where there is low selection pressure. Africa has a mild climate without seasonal food scarcity. It is also true that the founder effect decreases genetic diversity the farther a population migrates from their origins.

By Scientist (not verified) on 17 Feb 2010 #permalink

With regard to African athletic performance, the prevalence of myostatin mutations ranges from ten to twenty percent in Africa, while only 1% in Europe and Asia. As well the slavers culled for heavily muscled slaves because they fetched a better price and were more likely to survive the arduous journey bound and shackled, stacked like cordwood in the hold of a ship.

By scientist (not verified) on 17 Feb 2010 #permalink

One interesting implication (not quite enough data for a result, yet) is that this group has managed to avoid inbreeding.

It seems pretty common to assume aboriginal groups (especially when they have small populations today) were somehow always isolated. Logically, that should be the exception, not the rule, since inbreeding isn't exactly a good thing for the long term success of a population. Yeah, the exceptions are cool simplified cases we can make more interesting conclusions about, but again, not the rule.
Maybe we still have too many geneticists trained on lab strains.

Also, as I mentioned elsewhere, these results seem to be exactly what we should expect for a species which has undergone a recent radiation. We've seen the same sort of thing numerous times in other clades.

I think that you will find that Archbishop Tutu is of Tswana, not "Tswama" origin.
I also question "scientist's" description of "Africa" having a mild climate. Africa is a very large continent with a huge variety of climatic zones. Even just in Southern Africa you have the Mediterranean climate of the southern Cape, the tropical climate of northern KwaZulu-Natal and Mozambique, the veld of the central plateau and the extreme desert of the Namib. It is also subjected to periodic drought conditions that have waxed and waned based on oceanic and climate cycles. I can assure you that the Kalahari where the majority of the San peoples live is in no way a mild climate. Extremely hot with very scarce water and major fluctuations of food availability based on animal migrations and a growing season based on the very seasonal rains. I agree with the authors of the study that the genetic diversity is primarily based on the age and ancestral nature of the populations rather than relaxed selection pressure. Having spent a considerable amount of time out in the Southern African bush including the Kalahari I challenge anyone who thinks that the climate and selection pressure are "mild" to live like the San peoples do for a month...

It's a large, large language family, easily comparable to Indo-European at the very least.

If at all. Isn't it more of a convenient cultural/geographical lumping?

By Trond Engen (not verified) on 19 Feb 2010 #permalink