Massive study of African genetic diversity

This is a profoundly impressive paper - a study of the patterns of genetic variation in 2,400 individuals from 113 African populations, by far the most comprehensive analysis of African genetic diversity ever performed.

I had heard that Sarah Tishkoff's group had assembled a large collection of African DNA samples, a daunting achievement in itself given the logistical, ethical and social challenges involved (New Scientist notes that "the researchers often had to use their vehicle batteries to power the centrifuges used to separate out white blood cells from their samples") - but the breadth of the analysis in this paper has blown me away.

The number of markers used in the study is limited by current standards - just 1,328 variants compared to the hundreds of thousands used in modern genome-wide association studies - but by subjecting these markers to an extremely detailed and careful analysis Tishkoff and her co-authors have done justice to the sheer scale of the genetic diversity within the African continent.

Here's an image that provides just a glimpse of that diversity:

[Image: map of Africa with a pie graph for each sampled population (tishkoff_africa_map.jpg)]
This map was generated by first using the program STRUCTURE to infer 14 ancestral populations that best define worldwide human genetic diversity; each of these clusters has been assigned a colour, and the pie graphs above show the proportions of each of these clusters contributing to each of the African populations in the study.

By contrast, using this colour scheme the whole of East Asia is a virtually undifferentiated sea of pink, Europe a block of blue, and even the diversity of India is reduced to a mix of just two colours. The reason for this is simple: our species evolved in Africa, and all of us non-Africans represent just a paltry sub-sample of the genetic variation that arose there.
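For readers wondering how a pie graph like those in the map might be built, here is a minimal sketch. It assumes each individual comes with a set of estimated cluster membership fractions (the kind of per-individual estimate a program like STRUCTURE produces) and that a population's pie is simply the average of its members' fractions. The function name, the data layout and the numbers are all made up for illustration, and the paper may well have summarised its results differently.

```python
from collections import defaultdict


def population_cluster_proportions(individuals):
    """Average per-individual cluster memberships within each population.

    `individuals` is a list of (population_name, memberships) pairs, where
    `memberships` is a list of K fractions that sum to 1 for that person.
    This layout is a hypothetical one chosen for the example.
    """
    sums = {}
    counts = defaultdict(int)
    for population, memberships in individuals:
        if population not in sums:
            sums[population] = [0.0] * len(memberships)
        for k, fraction in enumerate(memberships):
            sums[population][k] += fraction
        counts[population] += 1
    # Divide each cluster total by the number of individuals sampled.
    return {
        population: [total / counts[population] for total in totals]
        for population, totals in sums.items()
    }


# Invented numbers, K = 3 clusters rather than the 14 used in the paper.
toy_individuals = [
    ("Population A", [0.50, 0.25, 0.25]),
    ("Population A", [0.25, 0.50, 0.25]),
    ("Population B", [1.00, 0.00, 0.00]),
    ("Population B", [0.75, 0.25, 0.00]),
]

pies = population_cluster_proportions(toy_individuals)
for population, proportions in pies.items():
    print(population, [round(p, 3) for p in proportions])
```

In this toy case Population A ends up as a three-way mix while Population B is dominated by a single cluster, which is roughly the contrast between the multicoloured African pies and the single-colour blocks seen elsewhere.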

The best bit about this study is that it is just the beginning. Tishkoff et al. lay out the future of research in the area:

Given the extensive amount of ethnic diversity in Africa, additional sampling, particularly from under-represented regions such as North and Central Africa, is important. Because of the extensive levels of substructure in Africa, ethnically and geographically diverse African populations need to be included in re-sequencing, genome wide association (GWAS) and pharmacogenetic studies, to identify population or regional-specific functional variants associated with disease or drug response (1). The high levels of mixed ancestry from genetically divergent ancestral population clusters in African populations could also be useful for mapping by admixture disequilibrium (MALD). Future large scale re-sequencing and genotyping of Africans will be informative for reconstructing human evolutionary history, for understanding human adaptations, and for identifying genetic risk factors, and potential treatments, for disease in Africa.

Increasing the number of markers used - and ultimately sequencing entire genomes - from Tishkoff's already impressive collection of samples will no doubt be the first step; collecting more samples (especially from disease patients) of African populations is also a major priority for a number of research groups internationally. The sheer diversity in the African genome provides substantial power for genome-wide association studies looking to narrow down the regions associated with specific diseases, so work in this area will benefit us genetically impoverished non-Africans - as well, I hope, as the Africans themselves.

It will take me a long time to fully digest this paper - the supplementary information alone (warning: large PDF) contains 33 figures and 9 tables! - but while I flounder around looking for more superlatives to describe this work, you should check out Dienekes, Razib, GenomeWeb Daily News and the Spittoon for coverage of the major messages.


"The number of markers used in the study is limited by current standards - just 1,328 variants compared to the hundreds of thousands used in modern genome-wide association studies..."

Yep, a few hundred thousand SNPs would be better, because that's what's generally expected today, but then again, 10–20 SNPs are required to equal the power of four to six microsatellite loci (Phillip A. Morin et al., 2004). So this ain't bad.

Why were the northern countries of Africa excluded?

Maybe the Mozabite aren't a very good representation of Algerians or other Arab North Africans, or indeed of most other Berbers? They're a fairly isolated Saharan group with comparatively a lot of Sub-Saharan African admixture.

Btw, interesting to note that little sample from Western Sahara coming out more blue "Eurasian" than the Mozabite. But I think that's one of the problems with the resolution of the markers used in this study. More markers would perhaps distinguish between true Eurasian blue and African (possibly the ancestral) blue?

I am merely an economic historian, so please be patient with my ignorance. What intrigues me is that genetic diversity would be accompanied by a diversity of talents, that is, a high variance in various sorts of intellectual and physical abilities, to the extent that they are inheritable. Africa already has the tallest and the shortest people. Therefore (I say optimistically) when in the next 50-100 years Africa attains modern riches (routinely sending a high percentage of young people to university, for example) we should see a cultural explosion. Twenty Einsteins. Ten Mozarts. Does such an argument (believe me on the economics!) make sense to you genetic folk?

By Deirdre McCloskey on 01 May 2009

It's fascinating that at k=13, the Dogon of the Sahel look about 85% European and 15% San in this study - images of them also bear this out, except that they are phenotypically darker than either Europeans or San.

What does this say about the peopling of the Sahel I wonder?

Deirdre, one of the first things students learn in genetics is that the phenotype (or observable trait) of an organism depends not only on its genetic inheritance but also on the environment. Thus while the genetic diversity in Africa may mean diversity in genes associated with intelligence and physical ability, I would think that the lack of cultural infrastructure would impede an imminent cultural explosion from Africa. Moreover, one must presuppose that the genetic diversity in those particular genes is much larger in Africa than elsewhere.

Maybe the Mozabite aren't a very good representation of Algerians or other Arab North Africans, or indeed of most other Berbers? They're a fairly isolated Saharan group with comparatively a lot of Sub-Saharan African admixture.

who cares? you just need them as a non-sub-saharan african outgroup. that's what they're in there for, right? they're not as outgroupish as you might like, but for a first approximation they give you africa north of the sahara. on the case of african diversity there just isn't much on the south shore of the mediterranean.

Therefore (I say optimistically) when in the next 50-100 years Africa attains modern riches (routinely sending a high percentage of young people to university, for example) we should see a cultural explosion. Twenty Einsteins. Ten Mozarts. Does such an argument (believe me on the economics!) make sense to you genetic folk?

well, if you follow your logic out we'll be looking at pygmy and khoisan physicists in the late 21st century as these populations exhibit more heterozygosity than other sub-saharan african groups. in any case, there's plenty of genetic variation differences in other regions of the world and it doesn't seem to track in the manner you're implying, so i'd be skeptical to say the least. total genome content variation doesn't imply variation on any given set of traits, right? africans are FAR less diverse on the set of genes for skin color than the populations of the middle east and india, as an example. but these latter populations remain a subset of africans genetically.

note: the scientists most likely had a hard time distinguishing between true Eurasian blue and African (ancestral) blue.

Therefore it's more than likely that most of the blue associated with Eurasia among the three Saharan groups, the Dogon, the Mozabite and the Beja, is that of the indigenous Saharo-Sudanic ancestral group that eventually gave rise to the younger Eurasian-specific group.

Previous tests have linked the Dogon, the Mozabite and the Beja to ancestral Saharan groups, so the blue represented in their gene pool is likely that of an ancestral Saharo-Sudanic group.

The Dogon ALWAYS cluster with other West/Central/Saharo-Sudanic Africans, while the Beja cluster with various African groups along the Sahel-Saharo and Nile Valley/Horn of Africa.

The Mozabite, unlike what some posters have stated, are a very good representation of the general North African population, being predominantly African with a significant Eurasian component, clustering between the two extremes.

Arab North Africans are largely of indigenous Berber origin; they were Arabized.

Also, Africa has the most diverse genetic and phenotypic variation profile, surpassing both Indian and Middle Eastern populations when it comes to skin color and other physical traits.

Aki is completely false.
North Africans are predominately of West-Eurasian extraction.
Fuck Afro-centrists.

There are few North Africans chosen because there are very few isolated populations. This is why this study is misleading. Africa has many exotic phenotypes, but only the minority have a great deal of diversity. I'd think it's highly likely that most Bantu language speakers (600 million Africans) are on par with Eurasians in their lack of diversity.