Massive study of African genetic diversity

By dgmacarthur on April 30, 2009.

This is a profoundly impressive paper - a study of the patterns of genetic variation in 2,400 individuals from 113 African populations, by far the most comprehensive analysis of African genetic diversity ever performed.

I had heard that Sarah Tishkoff's group had assembled a large collection of African DNA samples, a daunting achievement in itself given the logistical, ethical and social challenges involved (New Scientist notes that "the researchers often had to use their vehicle batteries to power the centrifuges used to separate out white blood cells from their samples") - but the breadth of the analysis in this paper has blown me away.

The number of markers used in the study is limited by current standards - just 1,328 variants compared to the hundreds of thousands used in modern genome-wide association studies - but by subjecting these markers to an extremely detailed and careful analysis Tishkoff and her co-authors have done justice to the sheer scale of the genetic diversity within the African continent.

Here's an image that provides just a glimpse of that diversity:

[Image: tishkoff_africa_map.jpg]
This map was generated by first using the program STRUCTURE to infer 14 ancestral populations that best define worldwide human genetic diversity; each of these clusters has been assigned a colour, and the pie graphs above show the proportions of each of these clusters contributing to each of the African populations in the study.
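For readers curious how per-population pie charts can be derived from this kind of output: STRUCTURE reports, for each individual, a set of membership coefficients across the inferred clusters that sum to 1. A minimal sketch of one plausible summary step, averaging those coefficients within each population, is below. This is an assumption about the general approach, not the authors' actual pipeline; the function name, the population labels and the numbers are all made up for illustration.

```python
from collections import defaultdict


def population_cluster_proportions(memberships, populations):
    """Average per-individual cluster membership vectors within each population.

    memberships: list of lists, one row per individual, one column per cluster
                 (hypothetical stand-in for a STRUCTURE-style Q matrix).
    populations: list of population labels, one per individual.
    """
    sums = {}
    counts = defaultdict(int)
    for q, pop in zip(memberships, populations):
        if pop not in sums:
            sums[pop] = [0.0] * len(q)
        for k, value in enumerate(q):
            sums[pop][k] += value
        counts[pop] += 1
    # Divide each cluster total by the number of individuals in that population.
    return {pop: [s / counts[pop] for s in totals] for pop, totals in sums.items()}


# Toy, invented data: three individuals, three clusters, two populations.
memberships = [
    [0.50, 0.25, 0.25],
    [1.00, 0.00, 0.00],
    [0.00, 0.50, 0.50],
]
populations = ["pop_A", "pop_A", "pop_B"]

proportions = population_cluster_proportions(memberships, populations)
for pop, props in proportions.items():
    print(pop, props)
```

Each resulting list would correspond to the slices of one pie on a map like this one, with one slice per cluster colour.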

By contrast, using this colour scheme the whole of East Asia is a virtually undifferentiated sea of pink, Europe a block of blue, and even the diversity of India is reduced to a mix of just two colours. The reason for this is simple: our species evolved in Africa, and all of us non-Africans represent just a paltry sub-sample of the genetic variation that arose there.

The best bit about this study is that it is just the beginning. Tishkoff et al. lay out the future of research in the area:

Given the extensive amount of ethnic diversity in Africa, additional sampling, particularly from under-represented regions such as North and Central Africa, is important. Because of the extensive levels of substructure in Africa, ethnically and geographically diverse African populations need to be included in re-sequencing, genome wide association (GWAS) and pharmacogenetic studies, to identify population or regional-specific functional variants associated with disease or drug response (1). The high levels of mixed ancestry from genetically divergent ancestral population clusters in African populations could also be useful for mapping by admixture disequilibrium (MALD). Future large scale re-sequencing and genotyping of Africans will be informative for reconstructing human evolutionary history, for understanding human adaptations, and for identifying genetic risk factors, and potential treatments, for disease in Africa.

Increasing the number of markers used - and ultimately sequencing entire genomes - from Tishkoff's already impressive collection of samples will no doubt be the first step; collecting more samples (especially from disease patients) of African populations is also a major priority for a number of research groups internationally. The sheer diversity in the African genome provides substantial power for genome-wide association studies looking to narrow down the regions associated with specific diseases, so work in this area will benefit us genetically impoverished non-Africans - as well, I hope, as the Africans themselves.

It will take me a long time to fully digest this paper - the supplementary information alone (warning: large PDF) contains 33 figures and 9 tables! - but while I flounder around looking for more superlatives to describe this work, you should check out Dienekes, Razib, GenomeWeb Daily News and the Spittoon for coverage of the major messages.

"The number of markers used in the study is limited by current standards - just 1,328 variants compared to the hundreds of thousands used in modern genome-wide association studies..."

Yep, a few hundred thousand SNPs would be better, because that's what's generally expected today, but then again, 10–20 SNPs are required to equal the power of four to six microsatellite loci (Phillip A. Morin et al, 2004). So this ain't bad.

Why were the northern countries of Africa excluded?

Why were the northern countries of Africa excluded?

how much further north do you get than algeria???

Maybe the Mozabite aren't a very good representation of Algerians or other Arab North Africans, or indeed of most other Berbers? They're a fairly isolated Saharan group with comparatively a lot of Sub-Saharan African admixture.

Btw, interesting to note that little sample from Western Sahara coming out more blue "Eurasian" than the Mozabite. But I think that's one of the problems with the resolution of the markers used in this study. More markers would perhaps distinguish between true Eurasian blue and African (possibly the ancestral) blue?

I am merely an economic historian, so please be patient with my ignorance. What intrigues me is that genetic diversity would be accompanied by a diversity of talents, that is, a high variance in various sorts of intellectual and physical abilities, to the extent that they are inheritable. Africa already has the tallest and the shortest people. Therefore (I say optimistically) when in the next 50-100 years Africa attains modern riches (routinely sending a high percentage of young people to university, for example) we should see a cultural explosion. Twenty Einsteins. Ten Mozarts. Does such an argument (believe me on the economics!) make sense to you genetic folk?

Blah, I just checked the paper. They used 848 microsatellites, 476 in-dels and...drum roll...3 SNPs.

It's fascinating that at k=13, the Dogon of the Sahel look about 85% European and 15% San in this study - images of them also bear this out, except that they are phenotypically darker than either European or San.

What does this say about the peopling of the Sahel I wonder?

Deirdre, one of the first things students learn in genetics is that the phenotype (or the observable trait) of an organism is dependent not only on its genetic inheritance but dependent also on the environment. Thus while the genetic diversity in Africa may mean diversity in intelligence and physical ability associated genes, I would think that the lack of cultural infrastructure would impede an imminent cultural explosion from Africa. Moreover, one must presuppose that the genetic diversity with regard to intelligence and physical ability associated genes in Africa is much larger than elsewhere.

who cares? you just need them as a non-sub-saharan african outgroup. that's what they're in there for, right? they're not as outgroupish as you might like, but for a first approximation they give you africa north of the sahara. on the case of african diversity there just isn't much on the south shore of the mediterranean.

Therefore (I say optimistically) when in the next 50-100 years Africa attains modern riches (routinely sending a high percentage of young people to university, for example) we should see a cultural explosion. Twenty Einsteins. Ten Mozarts. Does such an argument (believe me on the economics!) make sense to you genetic folk?

well, if you follow your logic out we'll be looking at pygmy and khoisan physicists in the late 21st century as these populations exhibit more heterozygosity than other sub-saharan african groups. in any case, there's plenty of genetic variation differences in other regions of the world and it doesn't seem to track in the manner you're implying, so i'd be skeptical to say the least. total genome content variation doesn't imply variation on any given set of traits, right? africans are FAR less diverse on the set of genes for skin color than the populations of the middle east and india, as an example. but these latter populations remain a subset of africans genetically.

Isn't the social construction of race ironic when you consider the genetic variation?

Note: the scientists most likely had a hard time distinguishing between true Eurasian blue and African (ancestral) blue.

Therefore it's more than likely that most of the blue associated with Eurasia among the three Saharan groups, the Dogon, the Mozabite and the Beja, is that of the indigenous Saharo-Sudanic ancestral group that eventually gave rise to the younger Eurasian-specific group.

Previous tests have linked the Dogon, the Mozabite and the Beja to ancestral Saharan groups, so the blue represented in their gene pool is likely that of an ancestral Saharo-Sudanic group.

The Dogon ALWAYS cluster with other West/Central/Saharo-Sudanic Africans, while the Beja cluster with various African groups along the Sahel-Saharo and Nile Valley/Horn of Africa.

The Mozabite, unlike what some posters have stated, are a very good representation of the general North African population, being predominantly African with a significant Eurasian component, clustering between the two extremes.

Arab North Africans are largely of indigenous Berber origin; they were Arabized.

Also, Africa has the most diversified genotypic and phenotypic variation profile, surpassing both Indian and Middle Eastern populations when it comes to skin color and other physical traits.

Aki is completely false.
North Africans are predominately of West-Eurasian extraction.
Fuck Afro-centrists.

There are few North Africans chosen because there are very few isolated populations. This is why this study is misleading. Africa has many exotic phenotypes, but only the minority have a great deal of diversity. I'd think it's highly likely that most Bantu language speakers (600 million Africans) are on par with Eurasians with monodiversity.
