Matthew Herper rounds up some of the discussion about the decreasing cost of genomics. But one thing that hasn’t been discussed much at all is the cost of all of the other things needed to make sense of genomes, like metadata. I briefly touched on this issue previously:
A related issue is metadata, the clinical and other non-genomic data attached to a sequence. Just telling me that a genome came from a human isn’t very useful: I want to know something about that human. Was she sick or healthy, and so on. These metadata, too, will have to be standardized: I can’t say one genome came from someone who was “sick”, while you provide another genome from someone who had “inflammatory bowel disease.” Worse, I can’t say my patient had IBD, while yours had Crohn’s disease. The data fields have to be standardized, so we’re not comparing apples and oranges.
But I think this has to be considered in much more depth when thinking about the cost of genomics (as opposed to sequencing).
It isn’t that expensive to genotype, and very soon it won’t be that expensive to anonymously sequence, the first 30,000 people who left North Station on a given morning. But that wouldn’t be very useful. Why? Because we need to know all sorts of characteristics about those people. That’s why the various genotyping cohorts (groups of well-characterized subjects) are so critical. Importantly, when someone thinks of a new question (i.e., metadata that haven’t been collected), it’s often possible to return to those subjects, so we don’t have to sequence or genotype new people.
Collecting this information hasn’t really been considered too costly, since the cost of sequencing/genotyping (and the other needed molecular biology steps) has been far greater than the cost of collecting the relevant clinical information. But as sequencing gets cheaper, the clinical information is going to become relatively more expensive. Before people start ranting about ‘regulatory issues’, I’m simply talking about the cost of hiring people to collect and manage the metadata. That’s not cheap; it would be a substantial cost.
As the discussion over ‘missing heritability’ indicates, we’re not quite ready to ‘go full diagnostic’ yet (not that there aren’t useful things being done; there are). We still need good clinical metadata.
If you’re in the advanced metadata class, you might ask, “Why don’t we just pull the data from the patients’ records?” Well, that’s a huge challenge. Medical systems organize their data in different ways, so avoiding the apples-and-oranges problem isn’t trivial. Likewise, the records would probably contain metadata that you wouldn’t want to collect, and those have to be stripped out. Pulling data from records has been proposed for antibiotic resistance, which has much simpler data and metadata, and it has come to naught, so I’m not entirely optimistic.
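To make the apples-and-oranges problem concrete, here is a minimal Python sketch of what standardizing diagnosis labels from different sources might look like. Everything in it is an assumption made for illustration: the tiny vocabulary, the synonym table, and the function names (`standardize`, `ancestors`, `comparable`) are hypothetical, and a real effort would rely on an established clinical vocabulary rather than a hand-written dictionary.

```python
# Hypothetical controlled vocabulary: each standard term points to its
# broader parent term (None for a top-level term). Illustration only.
PARENT = {
    "inflammatory bowel disease": None,
    "crohn's disease": "inflammatory bowel disease",
    "ulcerative colitis": "inflammatory bowel disease",
}

# Hypothetical site-specific spellings mapped onto the standard terms.
SYNONYMS = {
    "ibd": "inflammatory bowel disease",
    "crohns": "crohn's disease",
    "crohn disease": "crohn's disease",
    "uc": "ulcerative colitis",
}


def standardize(label):
    """Return the standard term for a free-text label, or None if unmapped."""
    # Lowercase, straighten curly apostrophes, and collapse whitespace.
    key = " ".join(label.lower().replace("\u2019", "'").split())
    key = SYNONYMS.get(key, key)
    return key if key in PARENT else None


def ancestors(term):
    """Return the term followed by all of its broader parent terms."""
    chain = []
    while term is not None:
        chain.append(term)
        term = PARENT[term]
    return chain


def comparable(label_a, label_b):
    """Return the most specific term both labels share, or None."""
    a, b = standardize(label_a), standardize(label_b)
    if a is None or b is None:
        return None  # e.g. "sick" is too vague to map to anything
    broader_b = ancestors(b)
    shared = [t for t in ancestors(a) if t in broader_b]
    return shared[0] if shared else None
```

With this toy vocabulary, a record labeled “IBD” and one labeled “Crohn’s disease” can only be compared at the broader level of inflammatory bowel disease, while a record labeled “sick” can’t be compared with anything. The code is the easy part; someone still has to build and maintain the tables for every field, and that is the labor cost I’m talking about.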
If you want to sequence your genome for your own needs, it could be really cheap. But placing it into context will cost more, maybe a lot more.