In this study, we first conducted a genome-wide association study in a Hong Kong Chinese population, followed by replication in three other cohorts from Mainland China and a cohort from Thailand, which totaled 3,300 Asian patients and 4,200 ethnically and geographically matched controls. We identified novel variants in ETS1 and WDFY4 associated with SLE with genome-wide significance and confirmed the association of HLA locus, STAT4, BLK, IRF5, BANK1, and TNFSF with the disease. ETS1 encodes a critical transcription factor involved in Th17 and B cell development. Allelic expression study showed a significantly lower expression of ETS1 from the risk allele, which provided functional support to the genetic findings. WDFY4 is a huge protein with unknown function but is predominantly expressed in primary and secondary immune tissues, and a nonsynonymous SNP in this gene was found to be highly associated with SLE susceptibility. Our findings shed new light on the function of these genes as well as the mechanism of this devastating disease.
This sort of work is necessary because you can’t always extrapolate associations found in studies of population X to population Y. Even this minimal level of population genetic variance might cause issues:
An interesting analysis result from our GWAS data is the difference between Hong Kong samples and samples collected in Taiwan and Beijing, shown by principal component analysis… It suggests population substructure for Chinese living in different regions, which may cause spurious findings in association studies when cases and controls are not well matched. With most of the genetic variants of relatively larger effect sizes already being identified, GWAS becomes more susceptible to effects from mismatches between cases and controls in dealing with SNPs of smaller effect sizes. Our analysis echoed two very recent reports delineating population substructures in Chinese populations living in different geographical regions.
Imagine that the NIH funded genome-wide association studies where the subjects were Americans of Chinese descent. Because of the nature of American Chinese communities, the sample would be strongly skewed toward Cantonese and Fujianese. In China itself Cantonese and Fujianese are only a small minority, so that might limit how well the inferences generalize. Of course, if you threw diverse regional and dialect groups into one big pool, then you might get spurious associations, which is the worry mooted above.
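To make the pooling worry concrete, here is a minimal sketch with made-up numbers (none of them come from the paper). Two hypothetical regional groups carry a variant at different frequencies, and cases are drawn mostly from one group while controls come mostly from the other. Within each group the variant has no effect at all, yet the pooled comparison shows a sizable odds ratio. The stratified Mantel-Haenszel estimate is used here only as one standard way to adjust for a known grouping; the paper itself looked at substructure with principal component analysis. All names (`AlleleTable`, `make_table`, `group_a`, `group_b`) are illustrative.

```python
from dataclasses import dataclass


@dataclass
class AlleleTable:
    """2x2 allele counts for one group: (case, control) x (risk, other)."""
    case_risk: int
    case_other: int
    control_risk: int
    control_other: int

    @property
    def total(self) -> int:
        return (self.case_risk + self.case_other
                + self.control_risk + self.control_other)


def make_table(risk_pct: int, case_alleles: int, control_alleles: int) -> AlleleTable:
    """Build a table where cases and controls share the same risk-allele
    frequency (risk_pct percent), i.e. no true association in this group."""
    case_risk = case_alleles * risk_pct // 100
    control_risk = control_alleles * risk_pct // 100
    return AlleleTable(case_risk, case_alleles - case_risk,
                       control_risk, control_alleles - control_risk)


def odds_ratio(t: AlleleTable) -> float:
    """Plain allelic odds ratio for a single 2x2 table."""
    return (t.case_risk * t.control_other) / (t.case_other * t.control_risk)


def pool(tables: list[AlleleTable]) -> AlleleTable:
    """Ignore group labels and add the counts together."""
    return AlleleTable(
        sum(t.case_risk for t in tables),
        sum(t.case_other for t in tables),
        sum(t.control_risk for t in tables),
        sum(t.control_other for t in tables),
    )


def mantel_haenszel_or(tables: list[AlleleTable]) -> float:
    """Mantel-Haenszel odds ratio: a weighted combination of per-group tables."""
    num = sum(t.case_risk * t.control_other / t.total for t in tables)
    den = sum(t.case_other * t.control_risk / t.total for t in tables)
    return num / den


# Hypothetical groups (assumed numbers, for illustration only):
# group A: risk allele at 60%, supplies most of the cases
# group B: risk allele at 20%, supplies most of the controls
group_a = make_table(risk_pct=60, case_alleles=800, control_alleles=200)
group_b = make_table(risk_pct=20, case_alleles=200, control_alleles=800)
strata = [group_a, group_b]

pooled_or = odds_ratio(pool(strata))
stratified_or = mantel_haenszel_or(strata)

print(f"within group A: {odds_ratio(group_a):.2f}")
print(f"within group B: {odds_ratio(group_b):.2f}")
print(f"pooled:         {pooled_or:.2f}")
print(f"stratified:     {stratified_or:.2f}")
```

With these assumed counts the odds ratio is 1 inside each group and 1 after stratifying, but roughly 2.8 when the groups are pooled. The "association" is entirely a product of cases and controls coming from different places, which is why matching matters more as the real effect sizes being hunted get smaller.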
Citation: Yang W, Shen N, Ye D-Q, Liu Q, Zhang Y, et al. (2010) Genome-Wide Association Study in Asian Populations Identifies Variants in ETS1 and WDFY4 Associated with Systemic Lupus Erythematosus. PLoS Genet 6(2): e1000841. doi:10.1371/journal.pgen.1000841