Guest post: Kai Wang on the McClellan and King critique of genome-wide association studies

Kai Wang is a postdoctoral fellow at the Center for Applied Genomics, Children's Hospital of Philadelphia and an author on numerous genome-wide association studies. He left this lengthy comment as a response to my recent post on this comment by McClellan and King in Cell, and I felt it warranted promotion to a full post (with Kai's permission). For more discussion of the M&K review see also two recent posts by Steve Turner at Getting Genetics Done, and an excellent post from p-ter at Gene Expression. 

A similar version of this comment is also published at Getting Genetics Done. I've done some mild editing here for clarity, added some sub-headings and links, and deleted two statements that could be regarded as ad hominem arguments. None of these changes affect the substance of Kai's argument.

Citation: McClellan, J., & King, M. (2010). Genetic Heterogeneity in Human Disease. Cell, 141(2), 210-217. DOI: 10.1016/j.cell.2010.03.032

Quite a few people have mentioned the McClellan et al paper to me, along with the related Internet posts about it (including those on Genetic Future). The paper's discussion of at least three diseases (hearing loss, SCA and autism) cited some of my published papers, and I therefore decided to post my comments on the Internet to set the record straight. 
Although I whole-heartedly agree that rare variants play a substantial role in human diseases, I also think that the section on GWAS reflects misunderstandings of the concept of GWAS, ignorance of standard practices in GWAS, misinterpretation of published primary research data, and as a result, is misinforming the general readership of Cell. These issues need to be rectified for the good of the scientific community, and for the healthy development of methodology and practice of human genetic research.

For impatient readers, these are the major points: 

  1. GWAS interrogate disease loci through linkage disequilibrium, so the lack of known biological function on GWAS SNPs does not justify the attack against GWAS by McClellan et al; 
  2. Methods for adjusting for population stratification are well established in the GWAS community; it is not valid to explain most GWAS signals (with odds ratios less than 2) by stratification, especially when a family-based study design is used (as in the autism GWAS); 
  3. McClellan et al used rs4307059 (from the autism GWAS) as a "particularly dramatic" example of stratification because its frequency varies across Europe and it is monoallelic in Africa; this is neither scientifically nor statistically justified. In fact, it is the nature of SNPs to have differing allele frequencies across populations, and almost half of the SNPs on the Illumina array have higher Fst population divergence values than rs4307059 (that is, almost half the SNPs are more variable than rs4307059 across human populations). 

Below I elaborate on these points for interested readers. 

1. Lack of known biological function doesn't invalidate GWAS

McClellan et al use the fact that most SNPs detected in GWAS are from intergenic regions to question the utility and the reliability of GWAS, and raise a serious question: "How did genome-wide association studies come to be populated by risk variants with no known function?". 

In fact, GWAS do not attempt to identify functional SNPs, but rather to identify the approximate location of loci that harbor disease variants. This is possible because of the extensive linkage disequilibrium (LD) between segregating sites in a given human population. Most SNPs on SNP arrays have unknown biological function simply because most SNPs in HapMap are outside of coding regions and because manufacturers of SNP arrays usually do not select SNPs by known function. Unfortunately, this fact may not be well known outside of the GWAS community, for example among most readers of the journal Cell. 
McClellan and King did mention LD, but they did not recognize that GWAS do not attempt to interrogate causal variants in the first place. More interestingly, they discussed the SCA GWAS and hearing loss GWAS that I published; the hits in both GWAS are actually outside but close to the causal genes (HBB and GJB2), yet they tag exonic variants in the causal genes, representing two particularly vivid and classic examples of how GWAS work through LD. 
It is unclear how McClellan and King can discuss these two examples extensively while ignoring the basic fact that both non-coding hits faithfully tag the causal variants in the causal genes through the magic of LD. For readers not familiar with GWAS, I also need to emphasize that GWAS variants are typically referred to as "risk variants" only by convention in the published literature, not because they are the actual functional variants that confer risk. Contrary to what some readers may conclude from McClellan and King, the fact that 100% of Africans carry a risk allele does not suggest that all subjects of African descent are predisposed to risk; it merely suggests that LD patterns in European and African populations differ at that locus. One cannot interpret GWAS results without acknowledging these basic facts.
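
To make the tagging idea concrete, here is a minimal sketch of the standard r-squared measure of LD between a genotyped tag SNP and an untyped causal variant. The function name and the example frequencies are illustrative assumptions (only the 1.2% causal allele frequency echoes a figure quoted later in this post); nothing here is taken from the published analyses.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """r^2 between two biallelic sites.

    p_ab: frequency of haplotypes carrying allele A at site 1 and allele B at site 2
    p_a:  frequency of allele A at site 1 (e.g. the causal allele)
    p_b:  frequency of allele B at site 2 (e.g. the tag allele on the array)
    """
    d = p_ab - p_a * p_b  # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    return d * d / denom


# Hypothetical scenario: a causal allele at 1.2% frequency that always sits on
# haplotypes carrying a tag allele of 5% frequency (an assumed value).
r2 = ld_r_squared(p_ab=0.012, p_a=0.012, p_b=0.05)
print(f"r^2 between tag and causal variant: {r2:.3f}")
```

An r-squared above zero means the tag SNP carries some of the association signal of the causal variant even though the tag itself has no function, which is the sense in which GWAS hits mark a location and not a mechanism.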

2. Population stratification is not a plausible explanation for most GWAS hits

McClellan and King erroneously attributed many published GWAS hits to population stratification, as if GWAS used the same strategies as candidate gene association studies. Without any scientific support, they even claimed that "an odds ratio of 3.0, or even of 2.0 depending on population allele frequencies" would be robust enough to be interrogated by GWAS. 
In fact, the beauty of whole-genome SNP data is that inflation of test statistics due to population substructure can be identified and adjusted for. Populations do not differ at one or two SNPs; they differ at many loci, which explains why whole-genome data help identify stratification, and several recent studies have already shown how extremely fine-scale sub-populations in Europe can be separated using whole-genome data. The GWAS community has established methods to deal with population stratification, and it is uncontroversial in the field that these methods are fairly effective for common variants. 
There are certainly some challenges in analyzing rare variants or recently admixed populations, and these are research topics that we are actively studying. McClellan and King failed to inform readers of the standard practices of genomic control, EigenStrat, multi-dimensional scaling or the many dozens of other approaches for addressing stratification, which are now commonly used in case/control GWAS. Furthermore, a family-based study design in GWAS has the advantage of protecting against stratification, which should be emphasized to readers. For example, McClellan and King attack our autism paper as a false positive due to population stratification, but our paper is largely driven and replicated by family-based cohorts, not case/control cohorts. 
Therefore, their general claim lacks scientific support, ignores massive amounts of work by the statistical genetics community in developing stratification adjustment methods, and reflects unrealistic speculation and unfamiliarity with standard GWAS practices.
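
As one illustration of the adjustment methods mentioned above, the following is a minimal sketch of genomic control: the inflation factor is estimated as the median of the genome-wide 1-df chi-square statistics divided by the expected median under the null, and every statistic is then divided by that factor. The simulated statistics and the 1.3 inflation are assumptions for demonstration only, and real analyses would normally use an established GWAS package.

```python
import random
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Genomic inflation factor: observed median / expected null median."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


def genomic_control_adjust(chi2_stats):
    """Divide every statistic by lambda, with lambda floored at 1."""
    lam = max(genomic_inflation(chi2_stats), 1.0)
    return [x / lam for x in chi2_stats]


# Simulated (hypothetical) genome-wide statistics: null chi-squares that
# have been uniformly inflated, as mild stratification might do.
rng = random.Random(0)
null_stats = [rng.gauss(0, 1) ** 2 for _ in range(10000)]
inflated = [1.3 * x for x in null_stats]

print("lambda before adjustment:", round(genomic_inflation(inflated), 2))
adjusted = genomic_control_adjust(inflated)
print("lambda after adjustment: ", round(genomic_inflation(adjusted), 2))
```

The point of the sketch is that the correction uses all SNPs at once: stratification shifts the whole distribution of test statistics, so it is visible and removable in a way that was not possible in single-gene candidate studies.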

3. The provided example of a false positive hit is exaggerated

McClellan and King mistakenly treat GWAS hits as "false positive" if their allele frequencies vary across European populations or HapMap populations. The allele frequency variation for ANY (I mean it, ANY!) SNP across populations is not something that should be surprising to researchers with substantial GWAS knowledge. Of course, it is the very nature of ANY SNP to have variable allele frequencies across human populations, so that Asians, Caucasians and Africans differ from each other. 
It appears that McClellan and King are surprised because they believe that most SNPs should have similar allele frequencies in all populations. Specifically, they described the SNP rs4307059, reported by us to be associated with autism, as a "particularly dramatic example of the perils of cryptic population stratification". Their reasoning on "stratification" is that the frequency of the proposed risk variant varies from 0.21 to 0.77 across European populations and that it is monomorphic in African populations. 
In reality, the allele frequency of rs4307059 is fairly consistent among large cohorts of European Americans (MAF=39%), WTCCC (MAF=38%), POPRES British (MAF=39%) and POPRES Spanish (MAF=37%). In the HGDP data, I did confirm that the allele frequency differs in Tuscany (MAF=75% in 7 samples, yes you read it right, SEVEN) and Orcadian (MAF=25% in 15 samples), but readers should be aware that the precision of a frequency estimate depends on the sample size (seriously, mathematically, what would you expect from 7 or 15 samples, and how much do these two populations contribute to genes in European Americans?). 
[Update: Kai adds: "I realized that the Toscani population is actually part of HapMap3, so the allele frequency can be inferred from there (n=102, still small but good enough). I assumed that "Toscani in Italia" in HapMap is similar to "Tuscan Italy" in HGDP. The MAF (C allele) is indeed 41% in the HapMap sample (202 chromosomes; HapMap 3 release 3 (warning: huge file)), which is fairly similar to European Americans and not even remotely close to the 77% number inferred from n=7 by McClellan et al."]
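
The sample-size point can be sketched numerically. The snippet below computes an approximate 95% Wilson score interval for an allele frequency; it assumes two independently sampled chromosomes per diploid individual (14 for 7 people, 30 for 15 people) and uses the 202 chromosomes quoted in the update for HapMap. The function name is hypothetical and this is an illustration, not a reanalysis of the HGDP or HapMap data.

```python
import math


def wilson_interval(p_hat: float, n_chrom: int, z: float = 1.96):
    """Approximate 95% Wilson score interval for an allele frequency.

    p_hat:   observed allele frequency
    n_chrom: number of chromosomes sampled (assumed independent)
    """
    denom = 1 + z * z / n_chrom
    center = (p_hat + z * z / (2 * n_chrom)) / denom
    half = z * math.sqrt(
        p_hat * (1 - p_hat) / n_chrom + z * z / (4 * n_chrom ** 2)
    ) / denom
    return center - half, center + half


# Frequencies are those quoted in the post; chromosome counts for the two
# HGDP samples are assumed to be twice the number of individuals.
samples = [
    ("HGDP Tuscan", 0.75, 14),
    ("HGDP Orcadian", 0.25, 30),
    ("HapMap3 Toscani", 0.41, 202),
]
for label, freq, n_chrom in samples:
    lo, hi = wilson_interval(freq, n_chrom)
    print(f"{label}: {freq:.2f} (approx. 95% interval {lo:.2f} to {hi:.2f})")
```

The interval from 14 chromosomes spans several tenths of allele frequency, whereas the one from 202 chromosomes is far narrower, which is why the larger HapMap sample is the more trustworthy estimate.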
Furthermore, assuming that the allele frequency measures are indeed accurate, if we want to do science rigorously we need appropriate control experiments. So let us compare this SNP with others in the same genomic region: there is no evidence of increased population differentiation for this particular SNP in the 2Mb genomic region across human populations (chr5:25500000..26499999 in the HGDP browser). Finally, if we examine the SNP in the context of the whole genome using the HGDP browser, we can see that 44% of SNPs on the Illumina array (-log(0.44)/log(10)=0.35 for rs4307059 in the "Fst" track, raw data) have more extreme Fst values than this SNP, so nearly half of the SNPs show stronger population divergence than this SNP. One cannot just take a random SNP from the MIDDLE of a ranked list and claim it as a "particularly striking" example of population stratification. Any such claim needs to be made in the context of a comparative analysis with other SNPs; otherwise it is not scientifically rigorous practice and serves only to misinform readers outside of the field.
[DM: for a graphic illustration of this point, see this post from Steven Turner.]
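
The comparative ranking described above can be sketched as follows. The snippet assumes, as Kai's parenthetical implies, that the browser's "Fst" track reports minus log10 of the fraction of SNPs with a more extreme Fst; the function names and the toy Fst values are hypothetical.

```python
import math


def fraction_more_extreme(fst_values, target_fst):
    """Fraction of SNPs whose Fst is strictly greater than the target SNP's."""
    return sum(1 for f in fst_values if f > target_fst) / len(fst_values)


def to_neg_log10(fraction):
    """Convert an empirical fraction into a -log10 score."""
    return -math.log10(fraction)


def from_neg_log10(score):
    """Convert a -log10 score back into an empirical fraction."""
    return 10 ** (-score)


# Toy genome-wide Fst values (made up for illustration only).
toy_fst = [0.01, 0.05, 0.10, 0.20, 0.30]
frac = fraction_more_extreme(toy_fst, target_fst=0.10)
print("fraction more extreme:", frac)
print("-log10 score:", round(to_neg_log10(frac), 2))

# A track score of 0.35 corresponds to roughly 45% of SNPs being more extreme.
print("fraction implied by a score of 0.35:", round(from_neg_log10(0.35), 3))
```

A SNP would only count as unusually differentiated if this fraction were small (a high score); a score near 0.35 places the SNP close to the middle of the genome-wide distribution.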

4. Misinterpretation of the autism GWAS

McClellan and King's interpretation of the autism locus is wrong. They used it as an example of a "false positive" without any valid scientific evidence (differences in allele frequencies in Tuscans and Africans do NOT suggest a false positive in European Americans!). Another study (Weiss et al.) cited by McClellan and King was not able to garner evidence for this SNP, but that study had a very small non-overlapping sample and therefore little power to "replicate" loci with moderate effect sizes. Furthermore, Weiss et al. used a family-based association test (the TDT), so there is no comparison of case/control allele frequencies of the kind mentioned by McClellan and King. 
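
For readers unfamiliar with the TDT, here is a minimal sketch of the classic transmission disequilibrium statistic. It uses only counts of how often heterozygous parents do or do not transmit an allele to affected children, so population allele frequencies never enter the calculation. The counts below are made up for illustration and the function names are hypothetical.

```python
import math


def tdt_statistic(transmitted: int, untransmitted: int) -> float:
    """McNemar-type TDT chi-square statistic with 1 degree of freedom.

    transmitted:   times heterozygous parents passed the allele to an affected child
    untransmitted: times heterozygous parents did not pass the allele
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parents")
    return (transmitted - untransmitted) ** 2 / total


def chi2_1df_pvalue(stat: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


# Hypothetical counts from 100 informative transmissions.
stat = tdt_statistic(transmitted=60, untransmitted=40)
print("TDT chi-square:", stat)
print("p-value:", round(chi2_1df_pvalue(stat), 4))
```

Because each heterozygous parent serves as his or her own control, differences in allele frequency between populations cannot by themselves create a signal in this test.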
Due to power issues and sample comparability issues, Weiss and Arking (both are nice people whom I know) faithfully described their research results in the paper without comment, yet McClellan and King mistakenly extrapolate from these primary results without scientific support and attach a "false positive" label that completely misleads the scientific community. On the other hand, McClellan and King failed to mention another companion study identifying this same locus purely with family-based cohorts. In addition, a paper in press shows that the SNP also functions as a quantitative trait locus for autistic traits in ~8000 children born in the same year in a single UK city, which pretty much blows away any concern about stratification in case/control studies. 
For me, this is compelling evidence that population stratification does not explain the signal, though I think that functional studies are certainly necessary to identify causal variants and to study their roles. In summary, their criticism of the autism locus lacks any rigorous scientific support whatsoever. 

5. Misinterpretation of hearing loss and sickle cell anemia GWAS

McClellan and King misinterpreted the hearing loss GWAS and sickle-cell anemia GWAS that we published in PLoS Biology. Interestingly, their interpretation of the primary research data presented in our paper is more or less the opposite of ours: our original purpose was to demonstrate how rare variants may contribute to human diseases (and may show up in GWAS through LD with common SNPs on Illumina arrays), so our paper should really be interpreted as supporting the arguments for studying rare variants in their paper.
For readers, I need to clarify that sickle-cell anemia is a classic example of heterozygote advantage in any genetics textbook, and our study demonstrates how rare alleles under balancing selection can show up in GWAS. On the other hand, hearing loss is known to be caused by many genes but the major cause is GJB2 mutation, so the GWAS demonstrates that moderately rare alleles (MAF=1.2%) can be picked up by GWAS without balancing selection. I simply do not understand what they are trying to get at with "had inherited hearing loss been investigated in a region where it is more common (e.g., in the Middle East),", as any GWAS should be focused on a specific ethnic group; I cannot just combine Caucasians and Middle Eastern people, as of course this would dilute the signal in GWAS. 
Why would I even bother to apply GWAS "in heterogeneous populations of common diseases" at all, as suggested by McClellan and King, when the very power of GWAS comes from examination of LD? I do not understand how they can take exactly the same results and arrive at a drastically different interpretation of the data.

Conclusions

I will send a shortened version of my comments to Cell. I cannot predict what the outcome of this appeal will be, but I would appreciate comments from readers of this post and I will try to address them. 
I wonder what is the appropriate balance between academic freedom and scientific responsibility for researchers to make comments on subjects outside of their expertise in the absence of rigorous scientific support; I also wonder what is the appropriate standard for basic fact checking for journals to publish especially strong claims, even for non-research articles (essays/commentary/review), and what is the appropriate response from well-respected journals to recognize and rectify these mistakes. Let us wait and see.

A piece such as this should have been put through peer review, since it is positioned as a scientific critique and since King has standing as a geneticist (she did contribute, after all, to the original BRCA mappings). Whether it was or not is probably impossible to determine from simple examination. Even if journals do not adhere to an open review process (which I am ambivalent about), they should clearly label which articles have been through review and which have not.

Couldn't population stratification be completely controlled for by limiting studies to within-family comparisons?

Could not agree more with Keith Robinson - MC King x Cell = incredibly influential. The fact that it is pure personal opinion and not based on any particular research effort should be very clear - Nature & Science do a much better job of differentiating between opinion and hard research.

I agree entirely that a response is required to Cell. Some top-quality peer-reviewed work has been dismissed needlessly.

However, the focus on McClellan and King's misrepresentation of GWAS has downplayed any criticism of the rest of the opinion piece.

Basically, even if we decided, yeah, we've done enough GWAS now, and "it is time to sequence" - who do we sequence?

So, the McClellan and King opinion piece ends (with my emphasis):

A Time to Sequence - With an Appreciation to Maynard Olson

...

Genome-wide screening for mutations remains the most effective and unbiased way to discover genes involved in complex illnesses. Heretofore, the identification of rare severe disease-causing variants was limited by the resolution of mutation detection strategies. The widespread availability of next-generation sequencing technology renders this limitation essentially moot. Designs based on genome-wide identification of all exonic variants, all variants in a defined genomic region, or even all variants in a whole genome are replacing genome-wide association approaches. However, although the power of sequencing is enormous, genetic heterogeneity remains a daunting challenge. With next-generation sequencing technology, the issue is not finding potentially deleterious mutations but rather determining which of many potential deleterious mutations in an individual play a role in disease.

Two powerful strategies for identifying critical mutations are (1) tracing coinheritance of potential disease alleles with the illness in severely affected families, and (2) identifying different rare functional mutations in the same gene in unrelated affected individuals.

Do you think it is time to say that BRCA mutations are not a very helpful paradigm in complex disease genetics?

My guess would be that while both strategies might give early results - by separating rare highly-penetrant sub-diseases out of a complex disease - neither will have much to say about the genetic component of most sporadic cases.

To give a concrete example: in type 1 (childhood) diabetes, a VNTR near the insulin gene, INS, is associated with the disease. A small number of cases (< 0.1% - some of those with very early onset), have mutations in the INS gene itself.

* Do these mutations explain the association? No: there aren't enough people with the mutations.

* Does knowing there are mutations advance the understanding of the disease? No: the region was first identified in 1984 and has been extensively worked.

While I can well imagine that some families would want to know whether they are carriers - although, unlike other known rare mutations, this would not affect treatment - sequencing these people is a clinical genetic testing service, and not a research proposal.

So - who and what do we sequence? It is not hypothesis-free ...

Underlying assumptions about the nature of genomes are the major issue of this discussion. 'Science' recently published a wide-ranging study of a plant, Arabidopsis thaliana, which revealed no single genome across the global spread of that species. A very plastic genome was the term used, throwing doubt on the idea of undifferentiated, concrete species genomes, and posing instead an image of undifferentiated and highly mobile genomes, that produce wide percentages of variability in genes within a single species.
GWAS depends for its logical basis upon species genomes being undifferentiated and more or less static across global dispersions, something which is an assumption.
Studies of creatures with minimal genomes cast doubt upon this assumption.

Science article ref: One species, many genomes, 20 July 2007, www.eurekalert.org/pub_releases/2007-07/m-osm072007.php

By Graham Philip (not verified) on 30 Apr 2010

woops, missed a mistake above, should read: A very plastic genome was the term used, throwing doubt on the idea of DIFFERENTIATED, concrete species genomes.
sorry.

By Graham Philip (not verified) on 30 Apr 2010

The paper by McClellan and King argues that many findings from GWAS may be false positives based on cryptic population stratification, of a kind that has not been corrected for by current GWAS protocols. Whether this is true or not, it is only one part of their argument. More fundamentally, they argue that it is expected that the variants contributing the most to phenotypic variance in individuals will be rare and of large effect size.

This is based on very sound evolutionary genetic arguments and modeling (e.g., see paper by Adam Eyre-Walker, below). It also has strong empirical support from two angles: first, even accepting the GWAS positives as real, they have been so few and with such small effect size that one can draw the general conclusion that common variants do not contribute substantially to phenotypic variance (which is why they are common).

Second, a growing number of rare, highly-penetrant mutations are being identified for all kinds of "complex" disorders. Such disorders appear complex when viewed across the population but this may simply reflect the fact that many clinical diagnoses (like autism or schizophrenia) are umbrella terms for very heterogeneous groups of disorders.

(This is not to underestimate the added complexity of phenotypic expression due to genetic background effects and non-genetic effects on the phenotype)

See Mitchell and Porteous for a discussion of these issues in relation to schizophrenia and the Wiring the Brain blog for more:

http://wiringthebrain.blogspot.com/2010/04/mad-mice.html
http://wiringthebrain.blogspot.com/2010/03/is-mental-illness-good-for-y…
http://wiringthebrain.blogspot.com/2009/07/hot-news-in-genetics-of-schi…

Mitchell, K., & Porteous, D. (2010). Rethinking the genetic architecture of schizophrenia. Psychological Medicine. DOI: 10.1017/S003329171000070X

Eyre-Walker, A. (2010). Evolution in health and medicine Sackler colloquium: Genetic architecture of a complex trait and its implications for fitness and genome-wide association studies. Proc Natl Acad Sci U S A, 107 Suppl 1, 1752-6. http://www.pnas.org/content/107/suppl.1/1752.long