Gene Expression

Genomic diversity & historical inference

p-ter points me to a new paper, Global distribution of genomic diversity underscores rich complex history of continental human populations:

Characterizing patterns of genetic variation within and among human populations is important for understanding human evolutionary history and for careful design of medical genetic studies. Here, we analyze patterns of variation across 443,434 SNPs genotyped in 3,845 individuals from four continental regions. This unique resource allows us to illuminate patterns of diversity in previously under studied populations at the genome-wide scale including Latin America, South Asia, and Southern Europe. Key insights afforded by our analysis include quantifying the degree of admixture in a large collection of individuals from Guadalajara, Mexico; identifying language and geography as key determinants of population structure within India; and elucidating a North-South gradient in haplotype diversity within Europe. We also present a novel method for identifying long-range tracts of homozygosity indicative of recent common ancestry. Application of our approach suggests great variation within and among populations in the extent of homozygosity suggesting both demographic history (such as population bottlenecks) and recent ancestry events (such as consanguinity) play an important role in patterning variation in large modern human populations.
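To make the "tracts of homozygosity" idea concrete, here is a deliberately naive sketch. It is not the paper's novel method, which the abstract doesn't spell out. It assumes genotypes coded as 0, 1, or 2 copies of one allele at ordered SNPs on a chromosome (1 meaning heterozygous), and the function name is made up for illustration. It just finds the longest unbroken stretch of homozygous calls.

```python
def longest_homozygous_run(genotypes):
    """Return (start_index, length) of the longest run of homozygous sites.

    Assumed coding (illustrative): 0 or 2 = homozygous, 1 = heterozygous.
    Real methods also handle genotyping error, missing calls, and
    physical or genetic distance between SNPs; this sketch does not.
    """
    best_start, best_len = 0, 0
    run_start = None  # index where the current homozygous run began
    for i, g in enumerate(genotypes):
        if g != 1:
            if run_start is None:
                run_start = i
            length = i - run_start + 1
            if length > best_len:
                best_start, best_len = run_start, length
        else:
            run_start = None  # a heterozygous site breaks the run
    return best_start, best_len


if __name__ == "__main__":
    toy = [0, 2, 1, 0, 0, 2, 2, 1, 0]
    print(longest_homozygous_run(toy))
```

In real data, someone whose parents are closely related would be expected to carry unusually long runs like this, which is why such tracts point to recent common ancestry.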

These authors did not start out with the HGDP sample. Rather, their data sets tended to come from urban locations, while the HGDP samples derive from more isolated groups. Interestingly, they found that the genetic clusters tended to be less discrete than in the HGDP sample, which is reasonable in light of the disparate sampling strategies. The first figure has STRUCTURE and PC-based charts to display genetic variation. I've rotated it clockwise so as to increase resolution (the screen width is 500 pixels).

[Figure: i-bac309f9f52c08f6cc6901550b1f2ac7-pcbust1.jpg]

Here’s a figure from a related paper:

[Figure: i-415cf0d20d3c458e5b36e84efca54010-bust2.jpg]

Note the similarity between Mexicans from Guadalajara and African Americans. Both are admixed populations, and the admixture is not equally distributed among individuals, so in each case individuals are spread across a wide range between the two source populations. South Asians overlap with Mexicans not because of shared recent ancestry, but because the genetic distance between South Asians and East Asians is smaller than that between Europeans and East Asians (though South Asians are closer to Europeans than they are to East Asians). Interestingly, rotating the figure brings out the clear geographic pattern in genetic variation within South Asia: the cluster on the PC chart replicates the rough triangular geography of South Asia. I looked in the data source and the "Dravidian" samples are all from the far south of India, with two-thirds of the South Asians being Indian Punjabis.
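A toy simulation shows why unevenly admixed individuals smear out between their source clusters on a PC chart. This is only a sketch under made-up assumptions, not the paper's analysis. The allele frequencies are random, the SNPs are unlinked, there are just two source populations called "A" and "B", and `simulate_pc1` is a hypothetical name. PCA here is a plain SVD of the column-centered genotype matrix.

```python
import numpy as np


def simulate_pc1(n_per_group=50, n_snps=500, seed=0):
    """Simulate two source populations plus admixed individuals; return PC1.

    Returns (pc1, ancestry, labels), where ancestry is each individual's
    fraction of population-A ancestry (1 for A, 0 for B, random for admixed).
    """
    rng = np.random.default_rng(seed)

    # Invented allele frequencies for the two source populations.
    freq_a = rng.uniform(0.1, 0.9, size=n_snps)
    freq_b = rng.uniform(0.1, 0.9, size=n_snps)

    # Admixed individuals get an uneven share of A ancestry, drawn from 0 to 1.
    ancestry = np.concatenate([
        np.ones(n_per_group),
        np.zeros(n_per_group),
        rng.uniform(0.0, 1.0, size=n_per_group),
    ])
    labels = np.repeat(["A", "B", "admixed"], n_per_group)

    # Each individual's expected allele frequency is a mix of the two sources.
    p = ancestry[:, None] * freq_a + (1.0 - ancestry[:, None]) * freq_b
    genotypes = rng.binomial(2, p)  # 0, 1, or 2 allele copies per SNP

    # PCA via SVD on the column-centered genotype matrix.
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    pc1 = u[:, 0] * s[0]
    return pc1, ancestry, labels


if __name__ == "__main__":
    pc1, ancestry, labels = simulate_pc1()
    for group in ("A", "B", "admixed"):
        scores = pc1[labels == group]
        print(group, "mean PC1:", scores.mean(), "spread:", scores.std())
```

The two source groups should come out as tight clusters at opposite ends of PC1, with the admixed group strung out between them roughly in proportion to each person's ancestry fraction. That is the same qualitative picture as the Mexican and African American samples in the figures.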

Also, the paper notes that inbreeding is not equally distributed within a population. So those messed up people you see around….