Genetic Future

Harvard biostatistician Peter Kraft (co-author of an excellent recent article on genetic risk prediction in the New England Journal of Medicine) has just added an interesting comment on his experience of this week’s Consumer Genetics Show:

I just wanted to share what for me were two stand-out moments at the
CGS. First was Zak Kohane’s discussion of the “Incidentalome”–a great
turn of phrase that captures something I’ve been mulling over myself.
(A less eloquent statement of this idea made it into the recent Nick
Wade NYT article on genetic risk prediction). Basically, the idea is
that even if you have great tests with high clinical validity
[accuracy]–which is not the case right now by a long shot with genetic
risk models–if you do lots of tests, then your chance of at least one
false positive goes up–and so does the risk of unnecessary

From Zak’s JAMA article on the “Incidentalome” (12jul2006):

“Physicians know that as the number of tests increases, the chance
that a spurious abnormal test result will arise also increases. They
also know that it is difficult to ignore abnormal findings, and they
often must embark on a sequence of more expensive tests to investigate
the findings.”

This was a theme of several presentations: that, despite the
potential of “personalized medicine” to reduce costs and improve
health, its immediate impact may be to increase costs by leading to
increased testing [or just increased demands on clinicians' time
explaining why these tests are not clinically useful] without a
decrease in disease burden. [See "Raiding the medical commons," JAMA.

Kari Stefansson pushed back a little, saying [if I understood
correctly] that the risk of false positives is not a problem, as what
DecodeMe [etc.] give the consumer are estimates of their risk [not
binary tests of high/low risk], and that these estimates are unbiased.
I’ll concede the last point here, modulo the fact that these estimates
are based on current knowledge which is incomplete and changing quickly
[cf my NEJM article with David Hunter]. But I don’t think providing
unbiased estimates of relative or absolute risk solves the problem of
the Incidentalome. To me, everybody [after discussions with their
clinician, of course] has a set of risk thresholds above which they
will take action [if there's an action available]. So in practice you
end up with a set of decision rules: do nothing, watch more carefully,
intervene aggressively.
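
The arithmetic behind Kohane's point is easy to sketch. The snippet below is a minimal illustration, not anything taken from the JAMA article: it assumes every test is independent and shares the same per-test false positive rate, which real genetic tests will not do exactly. The function name and the 1% rate are hypothetical choices made for the example.

```python
def prob_at_least_one_false_positive(n_tests: int, false_positive_rate: float) -> float:
    """Chance that a person with no true findings gets at least one false positive.

    Assumes (for illustration only) that the tests are independent and that
    each has the same false positive rate.
    """
    if n_tests < 0:
        raise ValueError("n_tests must be non-negative")
    if not 0.0 <= false_positive_rate <= 1.0:
        raise ValueError("false_positive_rate must be between 0 and 1")
    # P(no false positives) = (1 - rate)^n, so take the complement.
    return 1.0 - (1.0 - false_positive_rate) ** n_tests


if __name__ == "__main__":
    # Hypothetical 1% false positive rate per test.
    for n in (1, 10, 100, 1000):
        p = prob_at_least_one_false_positive(n, 0.01)
        print(f"{n:>5} tests: {p:.1%} chance of at least one false positive")
```

Under these assumptions, a 1% per-test rate gives roughly a 10% chance of at least one false positive after 10 tests and well over 50% after 100, which is why even a highly accurate test becomes a problem when it is run thousands of times across a genome.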

The term “incidentalome” will probably sound somewhat familiar to clinicians; it’s derived from the term “incidentaloma”,
which refers to “a tumour found by coincidence without clinical
symptoms or suspicion”, usually as a result of a whole-body scan. Such
tumours can often be perfectly benign, but nonetheless result in a whole
series of additional (and often invasive) tests to determine their

Any test that generates large amounts of potentially health-relevant data is prone to incidental findings (both genuine unexpected findings and spurious artefacts), and this will certainly be the case for whole-genome sequencing.

The first type of finding will be complete technical artefacts – false positives due to sequencing error – which will be a non-trivial problem over the next few years as the wrinkles are ironed out of rapid sequencing technologies, but will reduce in number as accuracy improves. Once sequencing accuracy is high enough these sorts of findings can be ruled out fairly easily through downstream validation assays (although there certainly is a need to design faster, cheaper and more readily customisable assays for novel sequence variants).

A much larger problem will be genuine sequence changes that would be predicted to seriously mess with the function of an important gene, but for which the health consequences are unknown. Accurate functional assays don’t exist for most genes (and are cumbersome and imperfect even for well-studied genes such as BRCA1; hence the large numbers of BRCA mutations ending up in the “variants of uncertain significance” category), so it will often be difficult, expensive or simply downright impossible to get a handle on the effects of a newly discovered variant on disease risk.

Is this really a good argument against widespread genome sequencing, however, as some people have suggested? I don’t think it’s a compelling one; rather, it’s a strong incentive for the medical establishment to start thinking hard about developing evidence-based strategies for dealing with uncertain genetic data and deciding which of the three strategies Kraft notes is most appropriate: do nothing, watch more carefully, or intervene aggressively. Preventing people from getting access to genetic information is obviously not a productive long-term solution to the problem of incidental findings.

Kraft continues:

The other high point for me was RC Green’s talk on the results of
the REVEAL (Risk Evaluation and Education for Alzheimer’s Disease)
study. This is one of the few studies [that I know of] that has
measured how folks react to genetic risk testing–whether intensive
counseling pre- and post-test minimizes adverse psychological effects
or maximizes information recall/understanding, how folks interpret and
act on risk estimates. Lots of ink has been spilled about these things,
but there has not been much empirical research along these
lines–although that is sure to change soon. In his keynote, Francis
Collins highlighted this [empirical research into how best to convey
information from genetic tests, how these tests are used by physicians
and patients] as an important area for future research.

This is indeed an incredibly important (and astonishingly under-studied) field. I recently saw a presentation by Theresa Marteau describing her currently unpublished systematic literature review of studies looking at the effects of genetic testing results on behaviour. I was shocked at how little literature actually exists on this topic, but also intrigued by the general findings of the studies done so far: it seems as though testing results – even for serious diseases – actually have virtually no impact on long-term behaviour or quality of life.

There’s clearly much more research to be done here, but if this general finding is confirmed it would be both good and bad news for personal genomics companies: good news in that it means that the wilder claims of critics (customers jumping off bridges after receiving news of an increased Alzheimer’s risk) are overblown, but obviously bad news in that one of the primary stated motivations of companies like 23andMe is to motivate customers to improve their lifestyle.

Anyway, it sounds as though we will soon have a much clearer idea one way or the other about the effects of genetic data on behaviour. It will be good to see the discussion on this issue driven by data rather than protectionist fear-mongering on the one hand and commercial hype on the other.


  1. #1 Steven Murphy MD
    June 12, 2009

    Anyone who does clinical genetics work will tell you that the rate of genomic false positives is high. I have mentioned this in no fewer than 5 of my blog posts. This IS the dirty little secret of these sequencing companies. I will keep us in biz for a while…

  2. #2 Steven Murphy MD
    June 13, 2009

    The tolerance for false positives is directly proportional to the severity of the disease the test aims to predict. Thus indels, CNVs, etc. in gene deserts, or with no family history of disease, may be less tolerated than you think. Until the cost is 99 dollars or less for a whole genome, most people will want insurance to pay for the novelty. They sure as hell don’t put up with false positive work-ups…

    Zak said this in 2006 and he is spot on today.


  3. #3 wei
    June 15, 2009

    Dr Murphy:
    According to my reading, Peter Kraft’s NEJM paper actually argues that those ‘false positive’ findings, which were not replicable in follow-up replication studies, may be true associations. If there are 200-400 true associations and you only see a few of them in any one study, it is very likely that a different study will give you a completely different set of associations; and people will need quite a large number of replication studies to actually confirm those ‘false’ associations.
    It is quite an important piece of work, both for telling the funding agencies that we are not wasting their money and for helping graduate students regain their morale.

  4. #4 Peter Kraft
    June 16, 2009


    Two quick comments.

    1) We’re talking about two different kinds of false positives: a) folks who are predicted to develop disease by a certain age, but don’t; b) genetic associations observed in one study that fail to replicate, either because they were chance findings or because of some bias in the original study. The discussion of the Incidentalome is focused on a); you’re thinking of b). We touched on both in the NEJM piece.

    2) It’s not so much that some associations failed to be replicated; it’s that replication efforts for any single GWAS, because they are so costly, have been focused on a tiny proportion of the most promising markers. There are lots of true associations waiting to be replicated.

    Agreed on grounds for optimism. Whether or not every finding from a GWAS ends up having a direct role in the clinic as part of some risk prediction algorithm, there are certainly more disease/trait loci to be found (as of this moment) using the current GWAS paradigm. For arguably most diseases/traits we haven’t gotten the most out of the data we’ve assembled yet [e.g. there has been no large meta-analysis of breast cancer GWAS or any cancer I can think of]–so I don’t think it’s time to jump ship and move all our investment into new approaches like sequencing to find rare variants [although these are intriguing and deserve attention as well]. These new findings are likely to give us insights into basic trait/disease biology.

  5. #5 Daniel MacArthur
    June 16, 2009

    I completely agree with Peter that there is plenty more useful information remaining to be squeezed out of existing GWAS data. While I’m a pretty strong advocate of moving towards large-scale sequencing as soon as practical, it would be foolish to completely abandon the experimental and informatic infrastructure built around GWAS studies before the time is right.

    As I mentioned in this post, it looks as though we’ll see a considerable overlap period over the next couple of years as GWAS chips dig down into rarer variants (and employ increasingly sophisticated analytical methods) while sequencing slowly becomes affordable enough to scale up to studies involving thousands of patients and controls with whole genome sequences.

    In the meantime, mining existing GWAS data in clever ways – with the aim of uncovering new signals, refining existing signals and revealing the biological basis of the associations – promises to be a lucrative area of research for the next couple of years.

  6. #6 Paul Jones
    June 16, 2009

    If you were told that you had a rare allele for a preventable disease, would you pay $100 to re-sequence that particular gene?

    I would.

    If you were told that you had a rare allele of a gene conferring a ~0.01% chance of hyperlipidemia, would you pay $100 to re-sequence that gene?

    I would not.

    Context is king.