How many SNPs to distinguish Japanese & Chinese?

By razib on March 4, 2010.

A new paper in PLoS Genetics, Rapid Assessment of Genetic Ancestry in Populations of Unknown Origin by Genome-Wide Genotyping of Pooled Samples:

Many association studies have been published looking for genetic variants contributing to a variety of human traits such as obesity, diabetes, and height. Because the frequency of genetic variants can differ across populations, it is important to have estimates of genetic ancestry in the individuals being studied. In this study, we were able to measure genetic ancestry in populations of mixed ancestry by genotyping pooled, rather than individual, DNA samples. This represents a rapid and inexpensive means for modeling genetic ancestry and thus could facilitate future association or population-genetic studies in populations of unknown ancestry for which whole-genome data do not already exist.

All fine & good. But I thought figure 4 was interesting. I've highlighted the part that I thought noteworthy:

The panel on the left shows that with 420 Ancestry Informative Markers (AIMs) you can separate the Japanese and Chinese (from Beijing) pretty well. These are markers which exhibit a lot of between-population variation. But note the last section: with 3,100 random SNPs you can generate the same separation. And remember that you'd need way fewer markers for African populations, since they have more diversity. In any case, I'll keep these numbers in mind when people ask me genetic distance related questions and I know that Fst numbers won't mean anything....

Citation: Chiang CWK, Gajdos ZKZ, Korn JM, Kuruvilla FG, Butler JL, et al. (2010) Rapid Assessment of Genetic Ancestry in Populations of Unknown Origin by Genome-Wide Genotyping of Pooled Samples. PLoS Genet 6(3): e1000866. doi:10.1371/journal.pgen.1000866

I think that a lengthy post summing up this kind of question would be valuable: the difference between neutral markers (which Cavalli-Sforza tried to use) and significant differences (e.g. lactose intolerance), and the idea that groups are populations defined statistically by mixes of genes rather than by identities, all put in common sense language.

...with 3,100 random SNPs you can generate the same separation. And remember that you'd need way fewer markers for African populations, since they have more diversity.

Okay, here are my two questions Razib: If two bushmen have more diversity between them than a European and an East Asian, does that mean that the offspring of a Bushman and a European or an East Asian is less diverse at the genotype level than a full-blooded bushman? Also, I know that 'inbreeding' is not a healthy thing, but if anybody could do it with the least amount of problems would it be the bushmen?..

Razib,

Here's a theoretical analysis of resolving power as a function of FST, number of SNPs, etc.

http://infoproc.blogspot.com/2008/12/resolution-of-genetic-population.h…

With current technology any two European nationalities (or Chinese and Japanese) are easily distinguishable with random SNPs.

"For example, given a 100,000 marker array and a sample size of 1,000, then the BBP threshold for two equal subpopulations, each of size 500, is FST = .0001. An FST value of .001 will thus be trivial to detect. To put this into context, we note that a typical value of FST between human populations in Northern and Southern Europe is about .006 [15]. Thus, we predict: most large genetic datasets with human data will show some detectable population structure."
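The numbers in that quote are consistent with a threshold of the form FST = 1/sqrt(n * m) for two equal subpopulations, where m is the number of markers and n is the total sample size. As far as I recall, this is the form given in the Patterson, Price and Reich eigenanalysis paper, so treat the formula as an assumption rather than something stated in this thread. A rough sketch of the arithmetic, with hypothetical function names:

```python
import math


def bbp_fst_threshold(n_markers, n_samples):
    """Approximate smallest detectable FST for two equal subpopulations.

    Assumes the threshold form FST = 1 / sqrt(n_samples * n_markers);
    this formula is an assumption, not something derived in the post.
    """
    return 1.0 / math.sqrt(n_markers * n_samples)


def markers_needed(fst, n_samples):
    """Invert the same assumed formula: markers required to reach a given FST."""
    return math.ceil(1.0 / (n_samples * fst ** 2))


# The example from the quote: 100,000 markers, 1,000 samples.
print(bbp_fst_threshold(100_000, 1_000))

# Random markers needed to reach the quoted North/South Europe FST of .006
# with the same 1,000 samples, under the assumed formula.
print(markers_needed(0.006, 1_000))
```

Under that assumed formula the quoted example comes out at .0001, as stated. Note this is only the detection threshold; clean separation of individuals, as in the figure, would take more markers than the bare minimum.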

PS Thanks for that followup on lactose tolerance and height

Is there a reason why the AIMs seem to identify more Chinese individuals as Japanese than vice versa? Or is that just due to the sample size?

i wouldn't put too much stock in that, though i assume that the chinese population will be more diverse than the japanese (the main caveat being that the japanese are probably a recent admixture between yayoi, 3 parts, and jomon, 1 part).
