Large-scale differences in human genomes

Razib points me to a great plain-language article reviewing our current scientific understanding of human genetic variation.

The major focus is on copy-number variants (CNVs) - genetic variants involving the insertion or deletion of large chunks of DNA, sometimes spanning over a million bases. These large-scale variants lurked essentially unknown within the human genome until the recent advent of chip-based platforms, which make it possible to assay almost the entire genome for their presence very rapidly. Such surveys have revealed that large-scale CNVs are far more common than anyone expected - 65 to 80 percent of the population carries copy-number variants affecting over 100,000 bases.

From my point of view, one of the more striking findings of CNV research over the last couple of years has been a somewhat disappointing one: there have been surprisingly few convincing, well-replicated associations between CNVs and common disease. Sure, a few signals from genome-wide association studies for obesity and autoimmune diseases have turned out to be linked to common CNVs, and there has been a rash of papers proposing links between very rare CNVs and psychiatric diseases such as schizophrenia and autism, but the yield has been lower than I would have expected given that these things are large (often affecting one or more genes) and remarkably common.

I suspect this is due, to some extent, to the relative immaturity of technology for assaying these variants: although platforms for surveying single-base genetic variants are now extremely accurate, current methods still struggle both to find CNVs and to unambiguously determine how many copies of a particular CNV an individual is carrying - particularly in cases of complex genetic rearrangements. That is already changing with the latest generation of chips, and will change even further once cheap long-read DNA sequencing becomes available from emerging third-generation sequencing technologies (e.g. Pacific Biosciences or Oxford Nanopore). No doubt examples of disease-associated CNVs will begin to mount up as technology improves.

Anyway, kudos to Tina Hesman Saey for a very accessible and timely review of an important topic.
