Precision medicine: Hype over hope?

By oracknows on September 8, 2015.

I am fortunate to have become a physician in a time of great scientific progress. Back when I was in college and medical school, the thought that we would one day be able to sequence the human genome (and now sequence hundreds of cancer genomes), to measure the expression of every gene in the genome simultaneously on a single "gene chip," and to assess the relative abundance of every RNA transcript, coding and noncoding (such as microRNAs) simultaneously through next generation sequencing (NGS) techniques was considered, if not science fiction, so far off in the future as to be unlikely to impact medicine in my career. Yet here I am, mid-career, and all of these are a reality. The cost of rapidly sequencing a genome has plummeted. Basically, the first human genome cost nearly $3 billion to sequence, while recent developments in sequencing technology have brought that cost down to the point where the "$1,000 genome" is within sight, if not already here, as illustrated in the graph above published by the National Human Genome Research Institute. Whether the "$1,000 genome" is truly here or not, the price is down to a few thousand dollars. Compare that to the cost of, for instance, the OncoType DX 21-gene assay for estrogen receptor-positive breast cancer, which costs nearly $4,000 and is paid for by insurance because its results can spare many women from even more expensive chemotherapy.

So, ready or not, genomic medicine is here, whether we know enough or not to interpret the results in individual patients and use it to benefit them, so much so that President Obama announced a $215 million plan for research in genomic mapping and precision medicine known as the Precision Medicine Initiative. Meanwhile, the deeply flawed yet popular 21st Century Cures bill, which passed the House of Representatives, bets heavily on genomic research and precision medicine. As I mentioned when I discussed the bill, it's not so much the genomic medicine funding that is the major flaw in the bill but rather its underlying assumption that encouraging the FDA to decrease the burden of evidence to approve new drugs and devices will magically lead to an explosion in "21st century cures," the same old antiregulatory wine in a slightly new bottle. Be that as it may, one way or the other, the federal government is poised to spend lots of money on precision medicine.

Because I'm a cancer doctor, and oncology is the one area of medicine in which precision medicine is being hyped the hardest, it's hard for me not to think that the sea change that is going on in medicine really hit the national consciousness four years ago. That was when Walter Isaacson's biography of Steve Jobs revealed that, after his cancer had recurred as metastatic disease in 2010, Jobs had consulted with research teams at Stanford, Johns Hopkins, and the Broad Institute to have the genome of his cancer and normal tissue sequenced, making him one of the first twenty people in the world to have this information. At the time (2010-2011), each genome sequence cost $100,000, which Jobs could easily afford. Scientists and oncologists looked at this information and used it to choose various targeted therapies for Jobs throughout the remainder of his life, and Jobs met with all his doctors and researchers from the three institutions working on the DNA from his cancer at the Four Seasons Hotel in Palo Alto to discuss the genetic signatures found in Jobs' cancer and how best to target them. Jobs' case, as we now know, was a failure. However much Jobs' team tried to stay one step ahead of his cancer, the cancer caught up and passed whatever they could do.

That's not to say that there haven't been successes. For instance, in 2012 I wrote about Dr. Lukas Wartman, at the time a recently minted oncologist who had been diagnosed with acute lymphoblastic leukemia as a medical student, was successfully treated, but relapsed five years later. He underwent an apparently successful bone marrow transplant, but his leukemia recurred again. At that point, there appeared to be little that could be done. However, Dr. Timothy Ley at the Genome Institute at Washington University decided to do something radical. He sequenced the genes of Wartman's cancer cells and normal cells:

The researchers on the project put other work aside for weeks, running one of the university's 26 sequencing machines and supercomputer around the clock. And they found a culprit — a normal gene that was in overdrive, churning out huge amounts of a protein that appeared to be spurring the cancer's growth.

That was 2011 as well. Today, the sequencing could have been done much more rapidly. In any case, Ley identified a gene that was overactive and could be targeted by a new drug for kidney cancer. Wartman's cancer went into remission, and he is now the assistant director of cancer genomics at Washington University.

The technology now, both in terms of sequencing and bioinformatics, has advanced enormously even since 2011. With it has advanced the hype. But how much is hype and how much is really hope? Let's take a look. Also, don't get me wrong. I do believe there is considerable promise in precision medicine. However, having personally begun my research career in the 1990s, when angiogenesis inhibitors were being touted as the cure to all cancer (and we know what happened there), I am also skeptical that the benefits can ever live up to the hype.

The origin of "precision" medicine

"Precision medicine" is now the preferred term for what used to be called "personalized medicine." From my perspective, it is a more accurate description of what "personalized medicine" meant, given that many doctors objected to the term because they felt that every good doctor practices personalized medicine. Even so, "precision medicine" is no less a marketing term than was "personalized medicine." If you don't believe this, look at the hype on the White House website:

Today, most medical treatments have been designed for the "average patient." In too many cases, this "one-size-fits-all" approach isn't effective, as treatments can be very successful for some patients but not for others. Precision medicine is an emerging approach to promoting health and treating disease that takes into account individual differences in people's genes, environments, and lifestyles, making it possible to design highly effective, targeted treatments for cancer and other diseases. In short, precision medicine gives clinicians new tools, knowledge, and therapies to select which treatments will work best for which patients.

If you think this sounds like what alternative medicine quacks (but I repeat myself) routinely say about "conventional medicine," you'd be right. It's not that precision medicine advocates don't have a germ of a point, but they fail to put this criticism into historical context. Medicine has always been personalized or "precision." It's just that in the past the only tools we had to personalize our care were things like family history, comorbid conditions, patient preferences, and aspects of the patient's history that might impact which treatment would be most appropriate. In other words, our tools to personalize care weren't that "precise," making our precision far less than we as physicians might have liked. Genomics and other new sciences offer the opportunity to change that, but at the risk that too much information will paralyze decision making. Still, at its best, precision medicine offers the opportunity to "personalize" medicine in a science-based manner, rather than the "make it up as you go along" and "pull it out of my nether regions" method of so many alternative medicine practitioners. It could also offer the clinical trials tools to do it, such as NCI-MATCH. At its worst, precision medicine is companies jumping the gun and selling genomic tests direct to the consumer without having an adequate scientific basis to know what they mean or what should be done with the results.

In any case, up until 2011, the term "personalized" medicine tended to be used to describe a form of medicine not yet in existence in which each patient's unique genomic makeup would serve as the basis to guide therapies. Then, the National Academy of Sciences Committee issued a report, "Toward Precision Medicine: Building a Knowledge Network for Biomedical Research and a New Taxonomy of Disease", which advocated the term "precision medicine" and differentiated it from "personalized medicine" thusly:

"Personalized medicine" refers to the tailoring of medical treatment to the individual characteristics of each patient. It does not literally mean the creation of drugs or medical devices that are unique to a patient, but rather the ability to classify individuals into subpopulations that differ in their susceptibility to a particular disease or their response to a specific treatment. Preventive or therapeutic interventions can then be concentrated on those who will benefit, sparing expense and side effects for those who will not. (PCAST 2008) This term is now widely used, including in advertisements for commercial products, and it is sometimes misinterpreted as implying that unique treatments can be designed for each individual. For this reason, the Committee thinks that the term "Precision Medicine" is preferable to "Personalized Medicine" to convey the meaning intended in this report.

As I said, "precision medicine" is a marketing term, but it's actually a better marketing term than "personalized medicine" because it is closer to what is really going on. That's why I actually prefer it to "personalized medicine," even though I wish there were a better term. Whatever it is called, however, the overarching belief that precision medicine is the future of medicine has led to what has been called an "arms race" or "gold rush" among academic medical centers to develop precision medicine initiatives, complete with banks of NGS machines, new departments of bioinformatics and genomics, and, of course, big, fancy computers to analyze the many petabytes of data produced, so much data that it's hard to have enough media upon which to store it and we don't know what to do with it. Genomic sequencing is producing so much data that IBM's Watson is being used to analyze cancer genetics. It's not for nothing that precision medicine is being likened to biology's "moon shot"—and not always in a flattering way.

So what is the real potential of precision medicine?

Complexity intrudes

I discussed some of the criticism of precision medicine when I discussed the 21st Century Cures Act three weeks ago. I'll try to build on that, but after a brief recap. Basically, I mentioned that I was of a mixed mind on the bill's emphasis on precision medicine, bemoaning how now, at arguably the most exciting time in the history of biomedical research, the dearth of funding means that, although we've developed all these fantastically powerful tools to probe the deepest mysteries of the genome and use the information to design better treatments, scientists lack the money to do so. I even likened the situation to owning a brand new Maserati but there being no gasoline to be found to drive it, or maybe having the biggest, baddest car of all in the world of Mad Max but having to fight for precious gasoline to run it. I also noted that I thought precision medicine was overhyped (as I am noting again in this post), referencing skeptical takes on precision medicine in recent op-eds by Michael Joyner in The New York Times, Rita Rubin in JAMA declaring precision medicine to be more about politics, Cynthia Graber in The New Yorker, and Ronald Bayer and Sandro Galea in The New England Journal of Medicine. Basically, the number of conditions whose outcome can be greatly affected by targeting specific mutations is relatively small, and the impact is likely to be far smaller than that of duller, less "sexy" interventions, such as figuring out how to get people to lose weight, exercise more, and drink and smoke less. The question is whether focusing on the genetic underpinnings of disease will provide the "most bang for the buck," given how difficult and expensive targeted drugs are to develop.

Over the weekend, there was a great article in The Boston Globe by Sharon Begley entitled "Precision medicine, linked to DNA, still too often misses" that gives an idea of just how difficult reaching this new world of precision medicine will be. It's the story of a man named John Moore, who lives in Apple Valley, UT. Moore has advanced melanoma and participated in a trial of precision medicine for melanoma. His outcome shows the promise and limitations of such approaches:

Back in January, when President Obama proposed a precision medicine initiative with a goal of "matching a cancer cure to our genetic code," John Moore could have been its poster child. His main tumors were shrinking, and his cancer seemed to have stopped spreading because of a drug matched to the cancer's DNA, just as Obama described.

This summer, however, after a year's reprieve, Moore, 54, feels sick every day. The cancer — advanced melanoma like former president Jimmy Carter's — has spread to his lungs, and he talks about "dying in a couple of months."

The return and spread of Moore's cancer in a form that seems impervious to treatment shows that precision medicine is more complicated than portrayed by politicians and even some top health officials. Contrary to its name, precision medicine is often inexact, which means that for some patients, it will offer false hope rather than a cure.

On the other hand, in the Intermountain study, after two years, progression-free survival in the group with advanced cancer treated using precision medicine techniques was nearly twice what it was in those who underwent standard chemotherapy, 22.9 weeks versus 12 weeks. Moore himself reports that with a pill he had one year of improved health and quality of life before his cancer started progressing again. It's not yet clear in this trial whether this will translate into an improvement in overall survival, the gold standard endpoint, but it's a very promising start. It is, however, not a miraculous start.

Here's the problem. I've alluded to it before. Cancer genomes are messed up. Really messed up. And, as they progress, thanks to evolution they become even more messed up, and messed up in different ways, so that the tumor cells in one part of a tumor are messed up in a different way than the tumor cells in another part of the tumor, which are messed up in a different way than the metastases. It's called tumor heterogeneity.

Now enter the problem in determining which mutations are significant (commonly called "driver" mutations) and which are secondary or "just along for the ride" (commonly called "passenger" mutations):

But setbacks like Moore's show that genetic profiling of tumors is, at this point, no more a cure for every cancer than angiogenesis inhibitors, which cut off a tumor's blood supply, or other much-hyped treatments have been.

A big reason is that cancer cells are genetically unstable as they accumulate mutations. As a result, a biopsy might turn up dozens of mutations, but it is not always clear which ones are along for the ride and which are driving the cancer. Only targeting the latter can stop a tumor's growth or spread.

Knowing which mutation is the driver and which are passenger mutations is so complicated that the Intermountain researchers established a "molecular tumor board" to help.

Composed of six outside experts in cancer genomics, the board meets by conference call to examine the list of a patient's tumor mutations and reach a consensus about which to target with drugs. Tumor profiling typically finds up to three driver mutations for which there are known drugs, and the board reviews data on how well these drugs have worked in other patients with similar tumors.

And:

The next difficulty, Nadauld said, is that "the mutations may be different at different places in a tumor." But oncologists are reluctant to perform multiple biopsies. The procedures can cause pain and complications such as infection, and there is no rigorous research indicating how many biopsies are necessary to snare every actionable mutation.

But a cancer-driving mutation that happens to lie in cells a mere millimeter away from those that were biopsied can be missed. Similarly, cancer cells' propensity to amass mutations means that metastases, the far-flung descendants of the primary tumor, might be driven by different mutations and therefore need different drugs.

Or, as I like to say: Cancer is complicated. Really complicated. You just won't believe how vastly, hugely, mind-bogglingly complicated it is. I mean, you may think it was tough to put a man on the moon, but that's just peanuts to curing cancer, especially metastatic cancer. (Apologies to Douglas Adams.) Because of this, precision medicine as it exists now can lead to what Dr. Don S. Dizon calls a new kind of disappointment when genomic testing fails to identify any driver mutations for which targeted drugs exist because "discovery is an ongoing process and for many, we have not yet discovered the keys that drive all cancers, the therapies to address those mutations, and the tools to predict which treatment will afford the best response and outcome—an outcome our patients (and we) hope will mean a lifetime of living, despite cancer."

Too true.

None of this is to say that precision medicine can't be highly effective in cancer. I've already described one patient for whom it was. It's also important to consider that even an extra year of life taking a pill with few side effects is "not too shabby," either, if the alternative is death a year sooner. Prolonging life with good quality is a favorable outcome, even if the patient can't be saved in the end.

What is precision medicine, anyway?

As I thought about precision medicine during the writing of this post, one thing that stood out to me is that, although precision medicine is rather broadly defined, in the public eye (and, indeed, in the eyes of most physicians and scientists) its definition is much narrower. This narrower definition of precision medicine is the sequencing of patient genomes in order to find genetic changes that can be targeted for treatment, predict the response to therapy of various pharmaceuticals or dietary interventions, or predict disease susceptibility. In other words, it's all genomics, genomics, genomics, much of it heavily concentrated in oncology. (I concentrated on oncology for this post because it is what I know best.) If you reread the definition from the National Academy of Sciences Committee report, you'll see that precision medicine is defined much more broadly. Other similar definitions include metabolomics, environmental factors and susceptibilities, immunological factors, our microbiome, and many more, although even a recent editorial in Science Translational Medicine emphasized genomics more than other factors.

In fact, in the most recent JAMA Oncology, there are two articles, a study and a commentary, examining the effect of precision medicine in breast cancer. What is that "precision medicine"? It's the OncoType DX assay, which is generically referred to as the 21 Gene Recurrence Score Assay.

Basically, this assay is used for estrogen receptor-positive (i.e., hormone-responsive) breast cancer that has not yet spread to the axillary lymph nodes. Twenty-one different genes related to proliferation, invasion, and other functions are measured, and an empirically derived formula is used to calculate a "recurrence score." Scores below 18 indicate low risk of recurrence as metastatic disease and insensitivity to chemotherapy. Patients with low scores generally receive hormonal therapy but not chemotherapy. Scores over 30 indicate high risk and greater sensitivity to chemotherapy. For such patients, chemotherapy and hormonal therapy are recommended. Patients who score in the "gray" area from 18-30 remain a conundrum, but clinical trials are under way to better define the cutoff point for a chemo/no chemo recommendation. In any case, this study indicates that the use of OncoType DX is associated with decreased use of chemotherapy, but because of limitations in the Surveillance, Epidemiology, and End Results (SEER) data set with linked Medicare claims, it wasn't clear whether this decline was in appropriate patients. In any case, there's no reason why genomic tests (like the OncoType DX test) that are rapidly proliferating shouldn't be considered "precision medicine," and they are in practice already. Contrary to the image of oncologists wanting to push that poisonous chemotherapy, OncoType DX was designed with the intent of decreasing chemotherapy use in patients who will not benefit. Imagine that.
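To make the cutoffs just described concrete, here is a minimal Python sketch that sorts a recurrence score into the three groups named above. The function name and the 0-100 score range check are my assumptions for illustration, and the handling of the exact boundary values (18 and 30 both land in the "gray" zone) simply follows the wording above. It is an illustration of the thresholds, not a clinical tool and not the assay's actual scoring formula.

```python
def recurrence_risk_group(score: int) -> str:
    """Sort a 21-gene recurrence score into the risk groups described in the post.

    Assumption (not from the post): scores are expected to lie between 0 and 100.
    """
    if not 0 <= score <= 100:
        raise ValueError("recurrence score is expected to fall between 0 and 100")
    if score < 18:
        # Low risk: hormonal therapy, generally no chemotherapy
        return "low"
    if score <= 30:
        # The "gray" zone where the chemo/no chemo decision is unclear
        return "intermediate"
    # High risk: chemotherapy plus hormonal therapy recommended
    return "high"


for s in (10, 18, 30, 31):
    print(s, recurrence_risk_group(s))
```

The point of writing it out is how little "precision" there is in the middle branch: everything from 18 through 30 gets the same uninformative label.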

Conclusion: Medicine that works is just medicine

In the end, I don't really like the term "precision medicine" that much. It seems to be a term that reminds me, more than anything, of Humpty Dumpty's famously scornful boast, "When I use a word, it means just what I choose it to mean—neither more nor less." It's a sentiment that definitely seems to apply to the term "precision medicine." To me, when new tests or factors that predict prognosis or response to therapy or suggest which therapies are likely to be most effective are developed and validated, it's an artificial distinction to link them to genomics, proteomics, or whatever, as well as "big data" and refer to them as "precision medicine." To me, medicine that works is just "medicine."

"Precision medicine" is an unfortunate term for a variety of reasons, among which is that "precision" is not the same thing as "accuracy". In a medical context, we can take "accuracy" to mean "treatment that addresses, if not cures, the underlying condition." Some diseases have well-known cures that don't need to be precise, and with others, like cancer, precision does not always help.

I like to illustrate the distinction between precision and accuracy by quoting Archbishop Ussher's estimate for the creation of the Earth. He names a precise time (6 PM local time at the Garden of Eden; the often-quoted 9 AM, which contradicts "the evening and the morning were the first day", is apocryphal) on a precise date about 6 ka ago, with the main uncertainty being due to not knowing exactly where the Garden of Eden was. That's on the order of a part in 10^8, which is extraordinarily precise for Ussher's day. But it's not accurate; the actual age of the Earth is closer to 4.5 Ga.

I don't know of any case in medicine where "precision" leads to a "precisely wrong" treatment in the way Ussher's Bible studies led him to his precisely wrong answer about the age of the Earth, but the field is yet young, and there will likely be plenty of opportunities to make such a mistake.

Orac does stipulate that he is first discussing this in relation to his specialty, which is understandable. But for more general application-- if this 1K figure is indeed realistic, we could sequence every child born in the US in one year for the cost of a few dozen F-35's or similar tradeoffs.

Now that would be a study.

It might actually tell us something useful about autism and obesity and such, as well as more physically acute conditions, as we follow the cohort through life.

Gets my vote.

Ah, but medico-marketing can be so persuasive--if you practice precision medicine, then ain't you a "precise doctor?" Yup, a sad state when marketing/propaganda trumps science.

we could sequence every child born in the US in one year

It seems to me that would be an expensive way of generating vast amounts of data that no one would have any idea what to do with.

From Sam Kean's 'The Violinist's Thumb' (Venter and Collins were big players in the HGP):

Most human geneticists aim to cure diseases, and they felt certain that the HGP would reveal which genes to target for heart disease, diabetes, and other widespread problems. Congress in fact spent $3 billion largely on this implicit promise. But as Venter and others have pointed out, virtually no genetic-based cures have emerged since 2000; virtually none appear imminent, either. Even Collins has swallowed hard and acknowledged, as diplomatically as possible, that the pace of discoveries has frustrated everyone.

It turns out that many common diseases have more than a few mutated genes associated with them, and it’s nigh impossible to design a drug that targets more than a few genes. Worse, scientists can’t always pick out the significant mutations from the harmless ones. And in some cases, scientists can’t find mutations to target at all. Based on inheritance patterns, they know that certain common diseases must have significant genetic components—and yet, when scientists scour the genes of victims of those diseases, they find few if any shared genetic flaws. The “culprit DNA” has gone missing.

I think we may have to wait until computer processing power has become even cheaper before that kind of venture would be worthwhile.

I thought this was also interesting (op cit):

In addition, a comparison between Venter’s genome and the Platonic HGP genome revealed far more deviations than anyone expected—four million mutations, inversions, insertions, deletions, and other quirks, any of which might have been fatal. Yet Venter, now approaching seventy years old, has skirted these health problems. Similarly, scientists have noted two places in Watson’s genome with two copies of devastating recessive mutations—for Usher syndrome (which leaves victims deaf and blind), and for Cockayne syndrome (which stunts growth and prematurely ages people). Yet Watson, well over eighty, has never shown any hint of these problems.

It looks as if knowing a person's genome isn't quite as useful as one might think.

Krebiozen once again quotes out of context so he can reply with a non-sequitur.

Yawn.

One major problem with precision medicine is that it relies on the false idea that a complex disease requires a complex treatment. This is not true, since many tumors can be treated by surgery without extensive molecular knowledge.
The same principle can apply to chemotherapy, by using the fact that all cancers have a common feature, uncontrolled growth. So the future of cancer treatment, immunotherapy aside, will come from G2 checkpoint inhibitors and protection of normal cells by cell inflation, combined with chemotherapy targeting dividing cells.
http://www.ncbi.nlm.nih.gov/pubmed/24156014

zebra@5

Krebiozen once again quotes out of context so he can reply with a non-sequitur.

He had a valid point regarding processing power. That much data would be difficult just to store let alone process. Why bother collecting all that data when we don't even have the infrastructure to store it, let alone analyze it? From the conclusion of the study talked about in an article Orac linked to:

Genomics clearly poses some of the most severe computational challenges facing us in the next decade. Genomics is a “four-headed beast”; considering the computational demands across the lifecycle of a dataset—acquisition, storage, distribution, and analysis—genomics is either on par with or the most demanding of the Big Data domains. New integrative approaches need to be developed that take into account the challenges in all four aspects: it is unlikely that a single advance or technology will solve the genomics data problem.

It seems to me that would be an expensive way of generating vast amounts of data that no one would have any idea what to do with.

Let me calculate, by Fermi problem methods, how much information is involved here. For purposes of this post, cows are spherical, etc. (I am a physicist, after all).

A human has a few tens of thousands of genes. (Probably more than a zebrafish, which has about 20k, but probably less than 100k.) Let's call it 30k, just to keep things in round numbers. Each gene codes for a protein that has between a few hundred and a few thousand amino acids--let's take 1000 for an average figure. At three base pairs per amino acid, that's around 100 million base pairs per genome, not counting junk DNA. There are four bases, so each base pair takes two bits, and we are discussing something like 30 MB of data for one person. The US has a population of a bit over 300 million, which implies about 5 million births per year. So we are looking at 100-200 TB of data for a single year's birth cohort, or around 10 PB of data for the entire US population. That's large but not excessive for a Big Data project these days (many physics research projects will produce hundreds of terabytes per year, and some discard large amounts of data to keep the total that low).
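The Fermi estimate above is easy to reproduce as a short Python sketch. All inputs are the round numbers from the comment (30k genes, 1000 amino acids per gene, 5 million births per year, 300 million people); the constant and function names are made up for illustration, and the two-bits-per-base packing is the idealized encoding the estimate implies, not how sequencing data are actually stored.

```python
# Round-number inputs taken from the estimate above
GENES = 30_000
AMINO_ACIDS_PER_GENE = 1_000
BASES_PER_AMINO_ACID = 3
BITS_PER_BASE = 2            # four possible bases fit in two bits
BIRTHS_PER_YEAR = 5_000_000
US_POPULATION = 300_000_000


def coding_bytes_per_person() -> float:
    """Bytes needed for one person's coding sequence at 2 bits per base."""
    bases = GENES * AMINO_ACIDS_PER_GENE * BASES_PER_AMINO_ACID
    return bases * BITS_PER_BASE / 8


per_person = coding_bytes_per_person()
cohort = per_person * BIRTHS_PER_YEAR
population = per_person * US_POPULATION

print(f"per person:  {per_person / 1e6:.1f} MB")
print(f"birth cohort: {cohort / 1e12:.1f} TB")
print(f"population:  {population / 1e15:.2f} PB")
```

Without the intermediate rounding, this gives about 22.5 MB per person, roughly 112 TB for one birth cohort, and just under 7 PB for the whole population, the same orders of magnitude as the rounded figures.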

But I agree that it won't do any good to collect the data if you don't know what you are going to do with it. At least with a Big Data physics project, you have some well-defined science question, and to undertake it you need to convince a leading funding agency that your project is worth funding. Zebra's proposed data collection effort sounds (if I may put on my reviewer's hat for a moment) like a solution in search of a problem. I can't answer for NIH, but I know that NSF and NASA do not like to fund fishing expeditions like that.

Eric Lund,
It wasn't storage I was thinking of so much as the processing power to find associations between groups of mutations and specific conditions. I thought it was interesting that Venter and Watson (as in Crick) both had mutations that are associated with serious physical conditions that they did not have - presumably this is some epigenetic phenomenon that turns the relevant genes off (or on). When you add in the epigenetic data that would be required to figure out what is going on I don't think current computers have the necessary power.

Krebiozen@9
From that PLoS Biology paper:

For population and medical genomics, identifying the genomic variants in each individual genome is currently one of the most computationally complex phases. Variant calling on 2 billion genomes per year, with 100,000 CPUs in parallel, would require methods that process 2 genomes per CPU-hour, three-to-four orders of magnitude faster than current capabilities [42].
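The throughput figure in that passage can be checked with a few lines of Python. This is only a sanity check on the quoted arithmetic; the function name is made up, and it assumes every CPU runs around the clock for a full 365-day year.

```python
HOURS_PER_YEAR = 365 * 24  # assumes CPUs run continuously all year


def genomes_per_cpu_hour(genomes_per_year: float, cpus: int) -> float:
    """Required per-CPU throughput to keep up with a yearly genome volume."""
    return genomes_per_year / (cpus * HOURS_PER_YEAR)


rate = genomes_per_cpu_hour(2e9, 100_000)
print(f"{rate:.2f} genomes per CPU-hour")
```

The result is a little over 2 genomes per CPU-hour, which matches the quoted figure.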

It goes on to say that this is an issue not necessarily solved by Moore's law:

Improvements to CPU capabilities, as anticipated by Moore’s Law, should help close the gap, but trends in computing power are often geared towards floating point operations and do not necessarily provide improvements in genome analysis, in which string operations and memory management often pose the most significant challenges. Moreover, the bigger bottleneck of Big Data analysis in the future may not be in CPU capabilities but in the input/output (I/O) hardware that shuttles data between storage and processors [44], a problem requiring research into new parallel I/O hardware and algorithms that can effectively utilize them.

Krebiozen's point was entirely valid, not a "non-sequitur".

Eric Lund #8,

Eric, this would just be a database, like census information, that could be used by individual research projects. The sooner we have the data, the sooner that can begin to happen. Is the census a "fishing expedition"?

The numbers are big because this is a big country; maybe we could pay Canada to do it?

capnkrunch #10

Still having language problems? Look up "non-sequitur" and look up "valid".

Eric Lund, I run a CLIA NGS lab, and you're underestimating :) We don't just sequence each base once; we do it 30x, on average, for statistical power. Our latest exome data (which is what you've described) is ~15-20 GB in its raw form, not to mention any files that are made for analysis purposes. Genomes are closer to 120 GB.

As everyone's been describing though, the big reason we don't just "sequence babies" - or the whole population, as some have suggested - now really is because sequencing is the easy part. We can generate genomes, and even store them (though that's a pain), until we're blue in the face, but determining the needle in the haystack causative mutation in sick people is hard enough, much less predicting what may go wrong in healthy folks. We just haven't done enough of the (much harder) genetics work to figure out how to appropriately interpret the data.
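
To put the 120 GB figure in perspective, here is a rough storage estimate for one year's births. It is a sketch under stated assumptions: roughly 4 million births per year (my round number, not a figure from the comment above), raw data only, no analysis files, no backups.

```python
GB_PER_GENOME = 120           # raw 30x genome, per the figure above
BIRTHS_PER_YEAR = 4_000_000   # assumption: rough US annual birth count

def cohort_storage_pb(genomes: int, gb_each: float) -> float:
    """Raw storage in petabytes, using decimal units (1 PB = 1,000,000 GB)."""
    return genomes * gb_each / 1_000_000

print(cohort_storage_pb(BIRTHS_PER_YEAR, GB_PER_GENOME), "PB per birth cohort")
```

Under those assumptions that is 480 PB per cohort, before any redundancy or derived files are counted.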

Daniel, how's that search for evidence demonstrating cell inflation represents an effective treatment for cancer coming? Got anything that approximates proof of concept, let alone that it has the potential (as you claimed when you first appeared on RI a couple of years ago) to be a ‘universal’ treatment for cancer?

The numbers are big because this is a big country; maybe we could pay Canada to do it?

Oh, yes, please! Our little hamlet will get right on that.

zebra@12

Still having language problems? Look up “non-sequitur” and look up “valid”

non-sequitur noun
: a statement that is not connected in a logical or clear way to anything said before it

valid adjective
: fair or reasonable

zebra@2

But for more general application– if this 1K figure is indeed realistic, we could sequence every child born in the US in one year for the cost of a few dozen F-35’s or similar tradeoffs.

Krebiozen@4 (responding)

I think we may have to wait until computer processing power has become even cheaper before that kind of venture would be worthwhile.

me@10

Krebiozen’s point was entirely valid, not a “non-sequitur”.

Swapping in definitions:
Krebiozen’s point was entirely reasonable, it is a statement that is connected in a logical or clear way to what was said before it.

I believe you misused "non-sequitur" to simply dismiss a valid point without addressing it.

Still having language problems? Look up “non-sequitur” and look up “valid”.

It would help if you looked up the former, as your attachment to that erroneous hyphen is quite grating.

^ Grumble grumble blockquote grumble.

JGC
What you do not seem to realize is that CIAC represents a lot of investment with no financial return. As for the evidence, to make a comparison, it is like saying that gene therapy is the way to treat genetic disease: you don't really need a proof of concept. The only thing needed is enough money to make things work. For me it's better to work on G2 abrogation because, with drugs, you can attract investors, but I think that CIAC is safer. And I am quite optimistic that it will be done in one country or another.

capnkrunch #15

"Krebiozen’s point was entirely valid, not a “non-sequitur”."

Is itself a non-sequitur.

"Krebiozen's point is not a non-sequitur, [because it is] entirely valid."

But "being valid" does not refute the claim of something being a "non-sequitur".

Why don't you just admit that you learned something again instead of wasting bandwidth? I already know how to suck eggs.

Are you really trying to correct someone over a word you can't even spell correctly, or am I having a stroke?

Why bother collecting all that data when we don’t even have the infrastructure to store it, let alone analyze it?

While the computing power needed to analyze the data would be expensive, I'm not sure the algorithms needed for bulk comparison are there, and we likely don't even know where to look, getting the data would be a first step when it's cheap enough. We might not be able to use the full data set in any good way for 10 or 20 years, but then the children won't have developed all of the conditions one might correlate to the genetic data, either.

Other issues of such a study would be providing the ongoing follow-up and maintaining adequate confidentiality. Each genome would need to be linked to the individual's medical records for the patient's lifetime in order to get the best data. This has the potential for abuse or inadvertent disclosure.

@ Mephistoles
Another problem with big data is that the current academic reward system is based on experimental papers. If you just come up with a new interpretation of published data you will have a hard time publishing.

sorry MephistoPHEles

Mephistopheles O'Brien #21

Basically correct on the first point. If we had the data tomorrow, it would be possible to begin preliminary sorting, which might well reduce computational cost later on as "sick" or otherwise characterized populations are identified over the decades.

Issues of confidentiality don't seem that big a problem to me, unless it gets hacked and published on the internet... oh wait... . But realistically, no, assuming reasonable care and stiff penalties for misuse, I can't see any real risk.

Mephistopheles O'Brien

While the computing power needed to analyze the data would be expensive, I’m not sure the algorithms needed for bulk comparison are there, and we likely don’t even know where to look, getting the data would be a first step when it’s cheap enough.

One thing I would be concerned about is that I/O speed might be the bottleneck (see my post #10). It seems like a waste of resources to store data we can't use and that will likely need to be copied to new hardware just to be usable. The logistics involved in a project like that would be a nightmare. Heck, with that amount of data, software changes would be a nightmare as well. With the amount of resources required, it's really best to have the proper container prepared before trying to fill it. I think it would be a far more efficient use of resources to sort out the hardware and software requirements prior to mass collection of data.

I totally agree with you about confidentiality. The government, insurers, hospitals, etc haven't been inspiring much faith in their ability to protect confidential data.

Is the census a “fishing expedition”?

No, because there are specified uses for that data set. In order to properly apportion representatives, we need to know how many people there are and where they live. In the US, the Constitution provides for an "actual Enumeration" every ten years. (Details are different in other countries, but any halfway functional democracy needs similar data at regular intervals for this purpose--the alternative is "rotten boroughs" and "pocket boroughs" such as existed in the UK at various times in history.) Other data collected by the Census Bureau is routinely used for a number of studies: demographics, wealth distribution, and many others of this kind. It's also a much smaller data set than individual genomes: these days, you could store all of that data on one commercially available hard drive. There are also laws (at least in the US) requiring that the data remain confidential for a period of time (72 years IIRC), after which they are made public--those census records are handy for people who do genealogical research since they can often be used to track where certain people move.

A population-wide genome database would presumably also be confidential--HIPAA either would or should cover it. But it involves quite a bit more storage. As others note, data analysis gets a lot trickier; e.g., you would have to have some way of tying it to medical records for it to be useful in any way (as MO'B notes above). Comparisons are a hard problem as well; if you are not careful about how you design the algorithm, you will get something that scales as N^2 where N is the number of people in the database (because with N people you have N[N-1]/2 pairs of people), or worse if you are doing multi-way comparisons.

To me, the confidentiality issues MO'B brings up are sufficient reason not to collect the data any sooner than it would be of practical use, because IME any sufficiently large database of confidential information eventually will be abused in some fashion. Think hackers stealing credit card info, or the NSA's dragnet collection of telephone metadata, but on a much larger scale.
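
The N[N-1]/2 scaling mentioned above is easy to see with a few numbers. A minimal sketch; the cohort sizes are illustrative, and 4 million is only a rough stand-in for one year of US births.

```python
def pair_count(n: int) -> int:
    """Number of distinct pairs among n people: n(n-1)/2."""
    return n * (n - 1) // 2

# Illustrative cohort sizes only
for n in (1_000, 1_000_000, 4_000_000):
    print(f"{n:>9,} people -> {pair_count(n):,} pairs")
```

Going from a thousand people to four million multiplies the head count by 4,000 but the number of pairs by roughly 16 million, which is why a naive all-against-all comparison is a non-starter.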

Since this does begin to intrude on my area of expertise - I can say that Zebra has no idea exactly how much data he's talking about - both storing and processing.

The data alone would take years to process - even the initial collection and export into a database. There are also few, if any, current database technologies that allow for the storing and analysis of datasets even approaching the size in question & none that would allow for the connection of so many disparate individual items of comparison.

By the time the technology "might" be available to do it, it is, in all likelihood, probable that better diagnostic tests would already be available to solve many of the issues that would have made this data interesting in the first place.

The Census is also a relatively small amount of data - meaning that the variables are both known and quantifiable (there are only a set number of questions on the forms & check boxes for the individual person).

Full gene sequencing for hundreds of thousands, if not millions, involves an unknown number of variables - and no idea, how any or all of the variables might be connected to each other.

Eric Lund@26

No, because there are specified uses for that data set.

Therein lies the issue I was trying to get at in #25. I was once involved in a project to upgrade server hardware and software as well as migrating our code base from VB 6 to VB.NET. It was a mess. There's no reason to set ourselves up for that when there's no current use for the data. It is so much easier to build the proper infrastructure (both hardware and software) the first time around than to have to upgrade later.

Lawrence #27,

What "disparate individual items of comparison" are you talking about?

I get the sense that people are projecting some complicated scenario onto a simple suggestion. You record the genome-- admittedly a long bit of information-- along with, say, social security number.

What's the problem?

Lawrence@27

...I can say that Zebra has no idea exactly how much data he’s talking about...

This is kind of zebra's MO. I wouldn't be surprised if he comes back and claims that he knows better than you.

Shorter Daniel @18

"No, I don't have any evidence because MONEY."

Where else have I heard this? Oh, yes--every alt-med proponent arguing that there is no funding for studies proving what they know to be true and that vitamin C/baking soda/aromatherapy etc. cures cancer because they aren't patentable.

@zebra - and exactly how big is a single genome? How do all of the genes relate to one another & in what context?

You've put forth the idea that this data could be collected simply and just stuck somewhere - but by what method would you do so?

Having the data is merely one thing, but that data must be processed, stored and retrieved in some fashion for that information to be valuable - I am merely pointing out that the storage systems (i.e. databases) don't exist to be able to handle this quantity of information in any meaningful way to allow for searching or using the results.....not in any sense that would take less than years - by which time, the information is no longer valuable.

Lawrence #33,

I just don't understand what it is you are imagining.

Let's begin like this: Can you tell me the maximum length a sequence would have to be in order for you (you imply you are an IT person) to be able to handle it?

JGC
The money argument is valid whoever uses it. And yours is obviously a fallacy:
https://en.wikipedia.org/wiki/Faulty_generalization

capnkrunch #32

"This is kind of zebra’s MO. I wouldn’t be surprised if he comes back and claims that he knows better than you."

Given how easily I got him to fold, maybe I do.

Some of you may have missed malia's comment at #13 which was held up in moderation. It's worth reading, I think.

My point, which I thought was obvious, was that most people seem to think it's simply a matter of getting lots of genomes and correlating it with physical illnesses, including autism and obesity, and figuring out which genes differing from the 'standard' version* are responsible. It is a great deal more complex than that, with some mutated genes being turned off or on by other genes and by other epigenetic factors we do not yet understand. I think there are better ways the NIH or whoever** could spend $5 billion (assuming $1,000 per genome and five million births per year).

* How do we establish what is the normal human genome? The current 'standard' HGP genome is an average of a number of different people's genome, but since we all have dozens of serious mutations this is a somewhat moot point.

** Oddly it was the US Energy Department that started on the HGP, the rationale being that they were investigating the effects of radiation on DNA.

Sorry about the grammar fails in my last comment - rushed editing.

malia #13

It's great to have a real expert pitching in.

"We just haven’t done enough of the (much harder) genetics work to figure out how to appropriately interpret the data."

For those of us who are not experts (and don't pretend to be), could you explain what the "genetics work" entails?

A quick question - we each have two copies of each chromosome (apart from the y chromosome in men, of course). Presumably both are sequenced - does anyone know how that works?

@ zebra - the reason the genetics work is hard is that there's lots of ways to skin a cat, but I'll try.
- A lot of it is basic genetics - we "break" a gene in a model organism and see what happens phenotypically. But this only works for genes that change a phenotype. Krebiozen's post above, where he mentions epigenetics, talks about some of the reasons why it's hard to correlate genotype and phenotype; but there are a myriad of others that we have to take into account - gene families where one gene can "rescue" another can hide a gene's function, for instance, or genes that have such a subtle phenotype that we can't pinpoint a change by looking.
- Some of it is looking at patterns in large cohorts of people - but again, a lot of that information can be masked by the same issues I mentioned above, and in many cases, when we start with "phenotype first", there may be lots of different genes creating something that *looks* the same to us on the outside, which can confuse the issue.
These types of research aren't sexy, nor do they use fancy machines, so they're rather under-funded, to the frustration of every working geneticist, ever.

@Krebiozen - being diploid is one reason why we sequence at depth rather than just 1x. We physically chop all 46 chromosomes (23 pairs) into manageable bits, and sequence them, presumably equally; then to put them back together, we compare the sequence data to the "human genome" (called hg19, which is really a mix of about 6 people) to find canonical differences from that reference, and we look for any differences in our patient's sequence - where we have 50% of one nucleotide, and 50% of the other nucleotide, we know we have a difference between the maternal and paternal chromosome. But, what we can't easily do is "phase" these differences - so, we don't know whether a set of mutations that are physically close to one another live together on a single chromosome, or whether they're dispersed between a chromosome pair. (FYI - this problem with phasing is also a contributing factor in determining if a mutation profile is "disease causing" or benign in some cases)
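
The "50% of one nucleotide, 50% of the other" idea can be shown with a toy example. This is only an illustrative sketch, not how production variant callers work (they use statistical models of sequencing error); the function name and the 20% cutoff are made up for the example.

```python
from collections import Counter

def call_genotype(pileup: str, reference: str, min_fraction: float = 0.2) -> str:
    """Toy genotype call at one position from the bases seen in overlapping reads.

    An allele is kept if it appears in at least min_fraction of the reads
    (an arbitrary cutoff chosen for illustration).
    """
    counts = Counter(pileup.upper())
    depth = sum(counts.values())
    if depth == 0:
        return "no coverage"
    alleles = sorted(b for b, c in counts.items() if c / depth >= min_fraction)
    if alleles == [reference]:
        return "homozygous reference"
    if len(alleles) == 1:
        return f"homozygous variant ({alleles[0]})"
    return "heterozygous (" + "/".join(alleles) + ")"

# 30 reads covering one site; the reference base is A
print(call_genotype("A" * 16 + "G" * 14, "A"))  # roughly half A, half G
print(call_genotype("A" * 29 + "G", "A"))       # a lone G looks like a read error
```

With 30 reads a 16/14 split is clearly two alleles, while a single odd base gets ignored, which is the statistical power that 30x depth buys. Note that nothing in the per-site counts says which chromosome copy each allele sits on; that is the phasing problem.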

malia #43.

Your reply is greatly appreciated. But I remain puzzled as to why anyone would object to my suggestion, since I am offering what you appear to need. Let's do this with Canada as a more manageable source of data:

Yearly births of 385,000 times $1,000 per genome is $385,000,000.
An F-35 (USAF, the cheaper model) is about $150,000,000.

So the US could buy 4 fewer of these (in a projected fleet of 450 plus) and easily cover data collection costs for our friendly neighbors to the north with their more rational health care system. Need more data, say from a larger country, lose a few more planes.

So this is what I don't get. You say you are underfunded, but here I am giving you (and geneticists everywhere) free access to all the data you could possibly use. I understand that you need computing power to work with it, but since you aren't paying for the sequencing, you have more funds to do the cat-skinning.

And the data will be useful even after every baby has grown old and died, assuming other records are maintained. The processing is only going to get easier and cheaper over time.

So really, what is the problem?

Thanks malia, that answers my question perfectly. It seems to me that phasing is going to be a serious issue in the future, unless/until someone finds an ingenious way of figuring it out.

And the data will be useful even after every baby has grown old and died, assuming other records are maintained. [emphasis added, obviously]

And that the gene data is collected and labeled properly, and the other data is accurate, and that the other data can be perfectly correlated with the gene data, and that the other data is actually the data that is needed for the unforeseen analyses, and ...

Think it through, don't just go off half cocked, like you seem so fond of doing. Do like adults do: foresee problems and address them before they bite. This is your brainfartstorm; be responsible rather than defensive.

@ Krebiozen
Performing single sperm sequencing in the same male individuals?

assuming other records are maintained

That, historically, has been a very generous assumption. Storage media age--yes, even hard drives, but this was even more of an issue with magnetic tapes, which were standard for decades. Interface technologies change. Software that was designed for a particular computer architecture may not be maintained as computers using that architecture age out of service. Et cetera. Dealing with these issues takes time and resources. Only recently have people in my (relatively data-heavy) field begun to devote the necessary time and resources to addressing this problem.

Show of hands here: how many of you who are over 30 can read every single computer file you have created over the last 20 years? I certainly can't. At least two pieces of software I used extensively in the late 1990s (ClarisWorks and Canvas) no longer exist, but I still have files I created with those programs. In 1995 my home machine was still a Mac Plus (new in 1988) with a 400k floppy drive and a 20 MB external hard drive with SCSI connection--I still have the machine in a box in my basement somewhere, but I have no way of exporting data on those media to something my current machine can read, without paying major bucks to somebody who has maintained such a machine. I also have zip disks.

I work with people who have decades worth of data collected, some of it on nine-track tapes. Nine-track tape readers were once ubiquitous in this field; today the number of still-operating readers in the world can be counted on the fingers of your hands. And many of those tapes are too brittle to read. Even when data are on media we can read, we have to hope that documentation was kept of the data format (the data is generally in a binary format, because data storage was at a premium in those days). In some cases software to read the data exists, but is in some ancient version of Fortran that won't necessarily compile on a modern computer, even if it had a Fortran compiler (they are no longer automatically included with many operating systems). A significant fraction of the data were never examined more than superficially.

Now scale this problem up to the size that Malia mentions for the human genome. And ponder the question of who is going to maintain such a database, and who is going to cover the costs of maintaining it--which are likely to be the same order of magnitude, per year, as it would cost to collect all of that data in the first place.

There is no shame in underestimating how much of a problem this is--lots of people do that. I have had occasion to recommend rejection of a proposal that I thought was making that mistake. But it's better to understand the magnitude of the problem before we collect a bunch of data we will never be able to use.

Eric Lund #48,

This is very strange reasoning. We would "be able to use" the data immediately-- my suggestion that it might still be usable 100 years from now is only to illustrate that this is a long-term project.

I also think your experience with magnetic tapes and consumer-type software is truly irrelevant-- this would be a serious scientific endeavor backed by world governments and scientific institutions. (In the 21st century-- no "stone knives and bearskins"; no slide rules and punch cards.)

So I still await an objection from someone (who one hopes would be an actual expert) who can explain why he or she would not like to have this resource available for research.

Eric Lund,

After having just spent hours transferring files on floppy disks using a borrowed external floppy disk drive (because they no longer come installed in computers) I can very much relate to what you are saying. It also reminded me of how painfully slow the buggers are. Now all I have left to do is find a way to get files off of a zip disk I have. Yay.

zebra, I'm sure everyone would like this as a resource. However, yours is a hollow victory unless you can will it to happen or pony up the money and resources to make it happen.

Daniel Corcos,

Performing single sperm sequencing in the same male individuals?

Good thinking, but I'm not sure that would help. Even if a single sperm contained enough DNA to sequence (it doesn't*) it will carry chromosomes randomly selected, so you wouldn't know if a gene was from a maternal or paternal chromosome. Also you would only sequence half the man's DNA, and sequencing multiple sperm to get the full genome would lead to the same problem, i.e. not knowing which genes came from which chromosome. Until we can isolate a single chromosome and extract enough DNA to sequence that, I see no way of overcoming this, yet.

* Genome sequencing requires 250 ng DNA. A single sperm contains only about 3 pg i.e. 0.003 ng of DNA. That's nearly 5 orders of magnitude difference, which will doubtless take a few years to overcome.

Krebiozen
http://www.cell.com/abstract/S0092-8674(12)00789-1

Not a troll #50,

"zebra, I’m sure everyone would like this as a resource."

Ummm.... no. Apparently several people think it would be a Bad Idea. Including Eric.

"Bad idea?" No, but it lands very much in the impractical category......

Daniel Corcos #52,
Wow! That's impressive. I wondered about PCR but dismissed it. It still doesn't really solve the problem though, since we still don't know which genes came from which copy of each chromosome. Or am I missing something?

It's the $5 billion cost that makes it a bad idea. I'm all for collecting data just in case it comes in useful, but not when it costs lots of money that could be spent on something with immediate practical uses.

Krebiozen
I would say that with enough sperm, you would be able to say that the genes come from the same chromosome and answer the question of whether several mutations are on the same chromosome or not.
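
A toy illustration of the counting argument, under loud assumptions: the data are invented, the two sites are assumed close enough that recombination between them is rare, and the function name is hypothetical. Each sperm is haploid, so the allele combinations that keep turning up together are the ones sitting on the same chromosome.

```python
from collections import Counter

def infer_phase(sperm_haplotypes: list) -> str:
    """Decide whether alleles A and B travel together (cis) or apart (trans).

    Each entry is the (site 1, site 2) allele pair observed in one sperm,
    from a man who is heterozygous A/a at site 1 and B/b at site 2.
    """
    counts = Counter(sperm_haplotypes)
    cis = counts[("A", "B")] + counts[("a", "b")]
    trans = counts[("A", "b")] + counts[("a", "B")]
    if cis == trans:
        return "undetermined"
    return "cis" if cis > trans else "trans"

# Invented data: 100 sperm, a few minority combinations from
# recombination or sequencing error
sperm = [("A", "B")] * 48 + [("a", "b")] * 47 + [("A", "b")] * 3 + [("a", "B")] * 2
print(infer_phase(sperm))
```

This answers whether the two mutations share a chromosome, but, as Krebiozen says, not whether that chromosome came from the man's mother or father.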

zebra@49

We would “be able to use” the data immediately–

No we wouldn't. For a number of reasons that have been explained already.

I also think your experience with magnetic tapes and consumer-type software is truly irrelevant– this would be a serious scientific endeavor backed by world governments and scientific institutions.

Eric Lund's comparison is apt. Recall the paper I linked to said I/O speed is likely to be a major bottleneck. Our current storage media are inadequate. To be able to make practical use of the data it would need to be moved onto faster media when it is available. This too has already been explained.

Krebiozen is absolutely correct in #56

It’s the $5 billion cost that makes it a bad idea. I’m all for collecting data just in case it comes in useful, but not when it costs lots of money that could be spent on something with immediate practical uses.

I would also add that doing the data collection now would also unnecessarily incur additional future costs to upgrade the software and hardware infrastructure as better technology becomes available. As I said before, it would make much better use of resources to take those funds and put them towards creating the necessary big data technologies before collecting the data.

capnkrunch #58,

We would “be able to use” the data immediately–

No we wouldn’t. For a number of reasons that have been explained already.

No reasons have been "explained" at all.

Are you saying that malia is some kind of psycho troll who is claiming to use genomes when in fact she isn't? Or any of the other "working geneticists" she mentions? You must be really out of touch with the 21st century, just like Eric appears to be.

But I remain puzzled as to why anyone would object to my suggestion, since I am offering what you appear to need.

Could you point me toward the part of Malia's comment @#43 that you thought showed an apparent need for sequencing every child born in the US for a year?

Because I don't see one. On the contrary. She seems to me to be saying that the data they already have is still way too much of a research imperative all on its own for there to be any need or use for more.

ann #60,

By "you", I am (obviously to me at least) referring to malia and all those working geneticists she invokes, and all future geneticists who might be able to do research because this data is freely available. As I very clearly pointed out, the costs saved on sequencing should allow for expanding the more substantive research activity.

[Yes, obvious to me, but of course, we can always distract from the topic by now discussing whether I should have said "you geneticists", or "y'all", or something else, and then go on to whether y'all is properly used as singular or plural, and so on. Or is there a hyphen in there somewhere?]

@zebra: There is a definite way that you can settle this dispute in your favor. To wit: write a proposal to the NIH or NSF (or equivalent body if you are outside the US) in which you will describe how you will collect the data, and what science question you will use the data to answer. You will need to convince the funding agency that you can do it within the constraints of the program to which you propose, and that your science question is of sufficient interest that the agency should fund your proposal rather than one of the competing proposals of comparable merit that they would otherwise fund. Then go out and achieve the proposed goal. If you can do this, you will have proved yourself right. The various people who are skeptical of your proposal, myself included, have given reasons why we think you won't be able to achieve the objective within the allotted resources.

Right now what you are proposing is an "underpants gnomes" scheme: 1. Collect large-scale genome data. 2. ??? 3. Science! I know from experience, as do several others in the commentariat, that funding agencies aren't going to fund proposals like that when there are already many more proposals with an explicit step two than they can afford to fund.

No reasons have been “explained” at all.

Just because you have hand waved them away doesn't mean we didn't explain. First there's the technology issues I've brought up numerous times. Read the PLoS Biology paper I linked to for more detail but here are some of the issues:

Protecting confidential data:

But in addition to tailoring genomics applications for the cloud, new methods of data reliability and security are required to ensure privacy, much more so than for the other three domains.

CPU speed:

Variant calling on 2 billion genomes per year, with 100,000 CPUs in parallel, would require methods that process 2 genomes per CPU-hour, three-to-four orders of magnitude faster than current capabilities.
...
Aligning all pairs of the ~2.5 million species expected to be available by 2025 amounts to 50–100 trillion such whole genome alignments, which would need to be six orders of magnitude faster than possible today.

I/O speed:

Moreover, the bigger bottleneck of Big Data analysis in the future may not be in CPU capabilities but in the input/output (I/O) hardware that shuttles data between storage and processors [44], a problem requiring research into new parallel I/O hardware and algorithms that can effectively utilize them.

Database size and search speed:

Similarly, efficient compression and indexing systems are critical to make the best use out of each available byte while making the data highly accessible.

On the other hand there's the issue that malia brought up in #13:

We just haven’t done enough of the (much harder) genetics work to figure out how to appropriately interpret the data.

Throwing more data at the problem isn't going to solve it. Just because you didn't understand malia's explanation in #43 doesn't make it wrong.

As I very clearly pointed out, the costs saved on sequencing should allow for expanding the more substantive research activity.

And as she very clearly pointed out, since there is no present need to spend any time, money, or energy sequencing more data, that actually wouldn't be saving costs. It would be a waste.

It's basically like saying:

"I don't have enough clothes. I need to buy some, which I can barely afford to do. But I know! I can save money by buying the clothes I'll wear in thirty years now!"

all those working geneticists she invokes, and all future geneticists who might be able to do research because this data is freely available.

Let me walk you through this.

(1) All those working geneticists she invokes have unfunded priorities right now.

(2) Having more data "freely available" would not help achieve them.

(3) It also wouldn't be free. You'd have to spend money creating the database and making it available.

(4) That money is presently needed for other priorities.

(5) So spending it on something else now would detract from rather than aid presently ongoing research.

(6) Furthermore, there's no way to even say whether the research being done now will result in findings that could be better translated to practical applications in the future if such a database were "freely" available.

(7) So the whole thing might easily be a great big waste of money, now and always. Because:

(8) Spending money supplying people with something for which there's no demand just about always is.

Eric Lund #62,

Since I never proposed getting funding from any of those agencies, such a test would be irrelevant. I merely did a first approximation a la Enrico, see #44 for the less ambitious version.

If you would read carefully, you would see that I consistently have implied financing from the general fund, by invoking a metric-- the much-maligned F-35-- which is often used for this kind of analysis. If the right people got the contracts, I could imagine even this US Congress finding a way to fund the project. Probably by cutting food stamps and not airplanes, but you never know.

So the real issue is whether such a database would be useful.

"Probably by cutting food stamps and not airplanes, but you never know."

"(4) That money is presently needed for other priorities."

You heard it here first: Let them eat genome database entries.

Basically correct on the first point. If we had the data tomorrow, it would be possible to begin preliminary sorting, which might well reduce computational cost later on as “sick” or otherwise characterized populations are identified over the decades.

"Sorting"? Sorting what into what?

You're magically* going to get 4 million human genomes, each with over 3 billion base pairs, and then.... do 8 trillion whole-genome comparisons? Why? Babbling about "computational cost," a subject that you may reliably be assumed to know nothing whatever about (what word size for the sequences?), doesn't cut it.
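For a sense of scale, here is a rough back-of-envelope sketch of the numbers in that comment. The 2-bits-per-base packing is an assumption made only for illustration (it ignores quality scores, read data, and metadata), and the genome length is taken as roughly 3 billion base pairs; `pairwise_comparisons` and `raw_storage_bytes` are hypothetical helper names.

```python
from math import comb

GENOMES = 4_000_000          # cohort size used in the comment
BASE_PAIRS = 3_000_000_000   # approximate genome length (assumed ~3 billion bp)
BITS_PER_BASE = 2            # assumption: A/C/G/T packed into 2 bits, nothing else stored


def pairwise_comparisons(n: int) -> int:
    """Number of distinct genome-vs-genome pairs among n genomes."""
    return comb(n, 2)


def raw_storage_bytes(genomes: int, base_pairs: int, bits_per_base: int) -> int:
    """Bytes needed to hold the bare sequences at the given packing."""
    return genomes * base_pairs * bits_per_base // 8


pairs = pairwise_comparisons(GENOMES)
storage = raw_storage_bytes(GENOMES, BASE_PAIRS, BITS_PER_BASE)
print(f"{pairs:.2e} pairwise comparisons")
print(f"{storage / 1e15:.1f} PB for the bare sequences")
```

Under these assumptions the pair count comes out just under 8 trillion, which matches the figure in the comment, and the bare sequences alone run to petabytes before any analysis is attempted.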

What sort of data structure do you imagine resulting from this exercise?

Issues of confidentiality don’t seem that big a problem to me, unless it gets hacked and published on the internet… oh wait… . But realistically, no, assuming reasonable care and stiff penalties for misuse, I can’t see any real risk.

That's because you're a simpleton. Remember:

You say you are underfunded, but here I am giving you (and geneticists everywhere) free access to all the data you could possibly use

No, the security would be on the level of that for access to the VSD. Everything has to be deidentified, which is no small feat when one has a lifetime of medical records.

Speaking of which, how precisely do you figure that's going to happen?

* You forgot about consent, now didn't you?

So the real issue is whether such a database would be useful.

And @#13, you have a person who's in a position to know telling you that it wouldn't, then explaining why @#43.

see #44 for the less ambitious version

Where you underestimated the size of the U.S. birth cohort by an order of magnitude?

Capnkrunch,

Listing out-of-context [out of context] quotes doesn't constitute an "explanation", and anyway, whether I call it a non sequitur or a strawman or a Gish Gallop, this has nothing to do with my suggested project.

And, saying "it's hard" is not an explanation. Nor is "it's not perfect".

And in particular, when we are talking about basic research, "but it might not yield useful results" is not just a poor argument, but actually stupid.

If this database existed, people would use it. It's absurd to suggest otherwise. People would choose genetics as a career exactly because of the opportunity. And almost certainly, it would spur the kind of innovation and development of tools that you are talking about. Kind of like DARPA's little experiment with connecting computers in different locations, you know...

I really do think a lot of people here are simply "stuck" with their attachment to 20th and even 19th century paradigms. You can't imagine a different way of doing things, or it makes you uncomfortable, or you feel threatened. Too bad.

@Krebiozen/Daniel Corcos

Single cell sequencing is a thing; you can sequence (and we have sequenced) a sperm, if one really wants to - and lots of animal science researchers want to. However, for humans, not super practical. First off, only useful in biosex males. Second off, those are germline cells, which are arguably different than the somatic cells (cells that make up the rest of your body) that you'd be interested in if you were doing diagnostic sequencing. We're getting closer to being able to do phasing - there are some nifty biological tricks we can use with standard sequencing, and there is an adorable little sequencer, called the MinION, that's in beta testing right now - our lab got 30 kb fragments off of it - but neither of these options is very cost- or time-effective... yet.

@zebra
Would such a database be useful in the real world?? Honestly, the answer is that we don't know, because we don't have enough of the biological groundwork to make that decision yet. Might it be useful in the future? Perhaps.
IF (and only if) money and time and storage space and processing power were not limitations - AND doctors took a thorough, objectively defined, descriptive medical history that was always coded properly in an EMR that followed each person through their life so we had good phenotype data on every condition on every human ever - sure, future human genetic researchers would theoretically love a resource that could interrogate the genomes of all humans everywhere.
However, @ann has a pretty good run-down of the ideas for why it's not being prioritized. Avenues of research that are arguably more fruitful are currently underfunded, and there are lots of technical hurdles that have been mentioned up-thread that would need to be innovated and properly managed in order to make such an endeavor feasible. Beyond that, there are some HUGE ethical considerations that are still being hammered out. Lots of people are categorically NOT OK with having their genome stored somewhere. Lots of insurance companies are looking to make unsubstantiated claims for denying coverage based upon preliminary genomic data as "pre-existing conditions". There are questions about whether adults should be making decisions about whether a baby's genome should be sequenced, rather than that baby when they reach majority. Etc, etc.

You can call something out of context but it doesn't make it so. That paper quite clearly explains the technological challenges involved in big data genomics.

Similarly, you can say people would use the data, and they certainly would want to, but neither the technology nor our understanding of genetics is at the level where this data would be useful. You have yet to provide any counterpoint to any beyond "I don't think so."

Our CPUs, storage media, and database technology are not fast enough. Our security is not good enough. And even if they were, we don't understand enough to make use of the data. I provided references about the technology, and malia is an expert in the field who told you how our fundamental genetics knowledge is lacking. You, on the other hand, have offered nothing in defense of your idea.

Should be "...provide any counterpoint to any criticism beyond..."

@malia - a question almost, but not completely, off topic if you please.

Do you have any opinions on the commercial DNA testing companies that scan for genealogical 'roots'? Any opinions on the validity of the results?

The word is that there is native American on both sides of my family tree, but as near as I can tell, it would be so far back, I'd probably be eligible for the Mayflower Society. I'd like to try to settle it one way or another.

Listing out-of-context [out of context] quotes

It's positively darling that Z. is now asshurt over this and doesn't understand the error.

Yes, obvious to me, but of course, we can always distract from the topic by now discussing whether I should have said “you geneticists”, or “y’all”, or something else, and then go on to whether y’all is properly used as singular or plural, and so on. Or is there a hyphen in there somewhere?

We've (tinw) been through this, but a careful prescriptivist would omit the apostrophe.

malia #72,

My comment about 20th century thinking wasn't intended for you, but..."pre-existing conditions"* ?? I thought we were eliminating that little canard here in the USA.

But I get the same sense of Nirvana Fallacy/Grandiose Strawman from you that I do from others. Let's say we collect the data as I described in 44, and sure there will be some opting out, and sure there will be errors-- as happens now, but you keep working.

What I really don't understand is why you think this only has utility for some great project involving all of humanity in the future. Assume the data is available; why will there not be young scientists (or old corporations) picking out sub-populations to study? I just don't see the constraints described in any concrete form. What do I need more than a statistically significant sample, and just enough computing power to handle it?

*(Or "preexisting", I'm sure Narad will be chastising you about that any time now.)

If this database existed, people would use it. It’s absurd to suggest otherwise. People would choose genetics as a career exactly because of the opportunity.

And thus comes the fantastic megalomania.

@Johnny

The current crop of genealogical tests are pretty good at giving you an idea of ancestry, so I don't see any reason why folks shouldn't try it if they're curious. They don't, however, have good demographics for some smaller outgroup populations (and some Native populations are included in this statement), either because of lack of genotyped members that can be used in the database, or because the group has an aversion to genetic testing (as is the case in some SW-USian Native populations).
I've personally never done one (because I'm cheap); but others in my family have, and the emotional responses to the results have, in all cases, been the most interesting parts. My step grandma called me in tears after hers, thinking she must have misread it because it only told her she was of German-ish descent, and she *already knew that*! And my Uncle's test started a small Facebook-family-feud because there was too much Celtic genetics, and not enough French/Gallic genetics. My family may need a new hobby.

(Or “preexisting”, I’m sure Narad will be chastising you about that any time now.)

That's a stylistic choice, not rank ignorance.

capnkrunch #73,

See my response to malia. But maybe you can try explaining this.

What exactly does "our understanding of genetics is not at the level where this data would be useful" refer to???

How can genetic data be "not useful" in the study of genetics???

Not to repeat myself, but...

And in particular, when we are talking about basic research, “but it might not yield useful results” is not just a poor argument, but actually stupid.

Not when it's a huge cost- and labor-intensive fishing expedition for something the potential uses of which nobody even knows how to recognize, identify, define or describe yet.

Because in that case, you'd just be stating one of the reasons that putting the cart before the horse is a bad idea by saying it.

It would actually be -- and is! -- stupid to pretend otherwise.

How can genetic data be “not useful” in the study of genetics???

When people can't use it to study genetics.

Assume the data is available; why will there not be young scientists (or old corporations) picking out sub-populations to study?

Because, as she already clearly stated @#43, they don't yet know what to look for and/or how to look for it in the data they already have.

Research is not being held up due to a lack of sequenced sub-populations. As she also already clearly stated, sequencing is the easy part. The present imperative is interpreting the data they already have.

If this database existed, people would use it. It’s absurd to suggest otherwise.

It's actually the other way around. Supply does not create demand. It's absurd to suggest otherwise.

People would choose genetics as a career exactly because of the opportunity.

What opportunity? People already -- right now! -- have more access to sequenced sub-populations than they know what to do with. Literally.

And almost certainly, it would spur the kind of innovation and development of tools that you are talking about. Kind of like DARPA’s little experiment with connecting computers in different locations, you know…

I don't see how. Please elaborate.

I really do think a lot of people here are simply “stuck” with their attachment to 20th and even 19th century paradigms. You can’t imagine a different way of doing things, or it makes you uncomfortable, or you feel threatened. Too bad.

Are you kidding? If there's a single idea in existence that's more 19th-century than definitively quantifying the genetic destiny of humankind in order to use it to build a better future through science, I don't know what it is.

Storing and using huge amounts of data is a major technical challenge, one that both big tech companies (like Google) and government agencies (the NSA) are working very hard on right now.

I can only think of two uses for a database of just genomes (as opposed to genomes + medical data): for identifying bodies, and for identifying criminals.

Would sequencing everyone answer some scientific and medical questions? Probably yes. Are we capable of analyzing that data now? No.

Because, as she already clearly stated @#43, they don’t yet know what to look for and/or how to look for it in the data they already have.

To expand upon this further: Even if the data were available, there are resource costs (labor, if nothing else--people don't work for free!) to searching the database. To pay for those costs, potential users would have to write a proposal demonstrating that (1) they have a well-posed research question, (2) the question is of sufficient importance to merit funding, and (3) the proposers have a methodology that they can show is likely to answer the question. At least that's how the funding agencies I am familiar with work, and I have no reason to think NIH is different. Ann's point is that researchers in the field don't know what they are looking for, so they cannot write a proposal that satisfies the third point. It would be worse than looking for a needle in a haystack, because at least we know what a needle looks like.

I have been on review panels. I know what happens to proposals that don't have a coherent methodology (or, as I described them above, "underpants gnomes" proposals). If you're lucky, you'll give the panel a good laugh for a couple of minutes or so before getting a rating that tells the program manager, "Don't even think of funding this proposal, even if you have infinite resources." For programs that practice proposal review triage, such as the NIH R01 program, you won't even get that satisfaction. So nobody gets funding to look at the database, which just sits there consuming resources (as I pointed out above, there is a nontrivial cost to maintaining the data once you have collected it).

Maybe in a decade or two, once scientists in that field have figured out how to ask meaningful questions that can exploit such a database, we can talk about building that database. In the near- to mid-term, that money is better spent looking at data already in hand and trying to figure out what we can do with it. Only then will we be able to supply the second step in that business model.

Eric Lund@87

To expand upon this further: Even if the data were available, there are resource costs (labor, if nothing else–people don’t work for free!) to searching the database.

Not to mention that every action performed on a database has resource costs. In normal situations you don't notice, which is probably why zebra apparently is unaware. However, for such a large database every single query will cost a significant amount of CPU cycles, disk reads, etc. Until there's a need to analyze such large amounts of data, it is far more efficient to work with smaller datasets.

From what malia said it seems like research has not progressed to the point where such large amounts of data are necessary or even usable. So, I would say that even if there were such a database, researchers wouldn't be scrambling to use it. It doesn't make sense to use a huge database with its inherent latency when a much smaller dataset would suffice and be much easier to manipulate.

Heh, redundancy. Important for databases, not so much comments.

To add on to capnkrunch and Eric Lund: it is entirely likely (even probable) that doing the kind of Big Data deep-database search we're talking about here will require not only new hardware and new software, but new mathematics.

Looking at data sets as large as the genome of every child born this year (for a small subset) will need new algorithms at the very least. My mathematician friends tell me this is an exciting area of research, but while math is faster than pretty much any other science, it will still take time to develop these new methods.

Eric Lund #87,

Let's review what malia actually says at #43:

Krebiozen’s post above, where he mentions epigenetics, talks about some of the reasons why it’s hard to correlate genotype and phenotype; but there are a myriad of others that we have to take into account – gene families where one gene can “rescue” another can hide a gene’s function, for instance, or genes that have such a subtle phenotype that we can’t pinpoint a change by looking.
– Some of it is looking at patterns in large cohorts of people – but again, a lot of that information can be masked by the same issues I mentioned above, and in many cases, when we start with “phenotype first”, there may be lots of different genes creating something that *looks* the same to us on the outside, which can confuse the issue.
These types of research aren’t sexy, nor do they use fancy machines, so they’re rather under-funded, to the frustration of every working geneticist, ever.

That doesn't sound like someone saying "we don't know what we're doing we just run around like headless chickens". It does sound like someone contending with agonizingly slow processing and horrendously "noisy" data and eye-straining poor resolution and stuff like that.

I find it difficult to reconcile this with what you are saying-- that "these people don't even know what they are looking for".

What I am suggesting is that if we pick up the tab for the sequencing-- at the cost of a few airplanes we don't really need-- and build a resource like the Census or NOAA/NASA and other such entities-- there will be competent people applying for grants to make the data more usable, at least, and others with a cogent proposal for basic research, and others with medical and other applications. You are free to disagree, but you have to offer more than generalities that fit with your experience, which simply may not be applicable here.

just a tech, capnkrunch:

"the kind of Big Data deep-database search we’re talking about here"

Since you are obviously not talking about the same thing I am, I guess we're both right.

As I said to malia, if I am looking for a correlation for example, I only need a statistically significant sample to work with. Do you guys not understand that it is trivial to extract that from the greater database?

The issue, which is probably above your heads, is what constitutes a significant sample, because of the noisiness of the genomic information. But, we are far more likely to figure that kind of thing out if we get multiple people working on multiple problems in the context I describe to Eric.

zebra@92

Do you guys not understand that it is trivial to extract that from the greater database?

Two issues here. First, why waste all the resources building this database when we only ever need a small subset?

Second, querying a database is not zero cost: there are costs for retrieving information from the logical data structure as well as for reading from the physical disk, both of which scale with size. With a database that large, even generating an arbitrary subset is not trivial. If you have specific parameters in mind it only gets worse. It's not as simple as cutting a slice of pie. Count computer science as another topic zebra has no grasp of.

capnkrunch,

Did you really say "why waste all the resources building this database when we only ever need a small subset" ??

I hope the rolling doesn't pop my eyes out of their sockets.

OK, tomorrow I will look for at least marginally rational responses from malia and Eric.

OK, let's back this train up a bit, because I feel like I am missing something.

zebra@92: Please tell me if I am understanding your proposal correctly. You want to search our genome database for a correlation. A correlation to what? Are you talking about searching a subset of the database (say, children with a very specific genetic disease) to find common gene patterns?

Or are you searching the whole dataset for a specific coding sequence?

And how can we know how big a statistically significant sample will be before we know what we are looking for? If it is something common we might need a relatively small sample, while if it is something very rare, we might need a very large sample.

Usually you decide the level of statistical significance you want, then from that calculate the power you need to find that, and that in turn will let you calculate the necessary sample size. Is that what you mean?
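The procedure described above can be sketched with a standard normal-approximation formula for comparing two proportions. This is a minimal illustration, not anyone's actual study design: the carrier frequencies below are invented, and `sample_size_per_group` is a hypothetical helper name. Note that the expected effect size has to be specified along with significance and power, which is exactly why the answer depends on what you are looking for.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p1: float, p2: float,
                          alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate per-group n to detect p1 vs p2 with a two-sided test.

    Uses the usual normal approximation:
    n = (z_{1-alpha/2} + z_{power})^2 * (p1(1-p1) + p2(1-p2)) / (p1 - p2)^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance / (p1 - p2) ** 2)


# Invented frequencies, purely for illustration.
common = sample_size_per_group(0.30, 0.20)   # common variant, large difference
rare = sample_size_per_group(0.002, 0.001)   # rare variant, small difference
print(f"common variant: about {common} per group")
print(f"rare variant:   about {rare} per group")
```

With these made-up inputs the rare case needs a sample roughly two orders of magnitude larger than the common one, and that is before any correction for testing many variants at once.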

What I am suggesting is that if we pick up the tab for the sequencing– at the cost of a few airplanes we don’t really need– and build a resource like the Census or NOAA/NASA and other such entities– there will be competent people applying for grants to make the data more usable, at least,

By doing what? What would they be proposing to do to the data, using what methodology to accomplish what?

and others with a cogent proposal for basic research,

Which is what, exactly? As in "I propose to establish the genetic basis for..." And how do you imagine they would be cogently proposing to establish it? By looking for correlations?

Between what and what else? How would they look? What would they be looking for?

and others with medical and other applications.

And you know this because you see it in your Magic 8 Ball?

Did you really say “why waste all the resources building this database when we only ever need a small subset” ??

I hope the rolling doesn’t pop my eyes out of their sockets.

Perfectly sensible question.

As I said to malia, if I am looking for a correlation for example, I only need a statistically significant sample to work with. Do you guys not understand that it is trivial to extract that from the greater database?

ORLY? How? I mean, it would be more straightforward to say what database?, given that all you have is yet more sophomoric posturing, but just for fun, show everybody what the query would be here.

The issue, which is probably above your heads, is what constitutes a significant sample

Seriously, go fυck yourself. Your dismal performances here have only demonstrated that you seem to know nothing whatever about any topic that you choose to spout off on. I can scarcely imagine what, if anything, you have postsecondary training in.

because of the noisiness of the genomic information.

OK, let's see what's over whose head. Define your terms.

^ Don't forget "statistically significant sample," either.

Count computer science as another topic zebra has no grasp of.

Oh, it's vastly worse than that – remember the "preliminary sorting" bit? He has no understanding whatever of data structures or algorithms. (Fun fact: multiple sequence alignment with sum-of-pairs scoring is NP-complete.)

I hope the rolling doesn’t pop my eyes out of their sockets.

OK, tomorrow I will look for at least marginally rational responses from malia and Eric.

Oh dear, hoofbeats fading into the distance. I'd say "not enough carrots" if only Z.'s psychological dynamics weren't so painfully obvious.

Behold, Z.'s comment 91:

What I am suggesting is that if we pick up the tab for the sequencing– at the cost of a few airplanes we don’t really need....

This, of course, is yet another repetition of his ciphering in comment 44:

Yearly births of 385,000 times 1,000 per genome is 385,000,000.
An F-35 (USAF, the cheaper model) is about 150,000,000.

So the US could buy 4 fewer of these (in a projected fleet of 450 plus) and easily cover data collection costs for our friendly neighbors to the north with their more rational health care system. Need more data, say from a larger country, lose a few more planes.

Note that the issue has been pointed out indirectly by others, and I did it explicitly in comment 70. Add "order of magnitude"* to the pile of things that Z. doesn't understand.

* And, ironically, cost analysis, given the context for this idée fixe of a "yardstick"; he in fact posits $4 billion as the total initial cost from inception to end of initial sequencing. Absolutely classic Z.
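The arithmetic being argued over is easy to check. The sketch below uses only figures quoted in the thread ($1,000 per genome, $150 million per F-35, a cohort of 385,000 from comment 44, and the 4 million cohort mentioned earlier); it covers sequencing alone and assumes nothing about storage, curation, or analysis costs. The function names are hypothetical.

```python
from math import ceil

COST_PER_GENOME = 1_000        # dollars, as quoted in comment 44
F35_UNIT_COST = 150_000_000    # dollars, as quoted in comment 44


def sequencing_cost(births: int) -> int:
    """Sequencing-only cost in dollars for one birth cohort."""
    return births * COST_PER_GENOME


def planes_forgone(births: int) -> int:
    """Whole F-35s needed to cover that sequencing cost."""
    return ceil(sequencing_cost(births) / F35_UNIT_COST)


for births in (385_000, 4_000_000):
    cost = sequencing_cost(births)
    print(f"{births:>9,} births: ${cost:,} = {planes_forgone(births)} planes")
```

On these figures the smaller cohort costs $385 million (3 planes, so comment 44's "4 fewer" has some slack), while a cohort of 4 million costs $4 billion, or 27 planes, roughly ten times as much.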

here I am giving you (and geneticists everywhere) free access to all the data you could possibly use

In the absence of any notion of what to do with it, a hundredth or a thousandth of the data would meet the same criterion.
All this big-data talk always brings to mind the Maxwell's Demon of the Second Kind from Cyberiad.
Spoiler alert: "The demon prints out this information on a long paper tape, but before the pirate realizes most of the information is completely useless (although strictly factual) he is buried under the endless rolls of tape, ceasing to bother anyone."

herr doktor bimler@103

In the absence of any notion of what to do with it, a hundredth or a thousandth of the data would meet the same criterion.

Don't say that, zebra will roll his eyes right out of their sockets.

On that note, here's an interesting case of zebra revealing his ignorance. In #93 I made two points. One was rather weak but required some knowledge of databases to rebut. zebra chose to ignore this one and instead opted to handwave away the much stronger, albeit simpler, argument (the same one herr doktor bimler made here).

zebra proving his lack of knowledge about a subject in one breath while insulting others in the next is nothing new. However, I think the way it came across in this case is somewhat novel.

it is entirely likely (even probable) that doing the kind of Big Data deep-database search we’re talking about here will require not only new hardware and new software, but new mathematics

The NP-hard problems are very likely going to remain NP-hard. New heuristics may be new mathematical applications, but I'm not so sure about the "new mathematics" part.

There's a list of problems that were in need of attention a decade ago here (and many of the reference links are broken, rotted, or both), but I just don't have the time at the moment.

Narad: Add “order of magnitude”* to the pile of things that Z. doesn’t understand.

I'm beginning to think he doesn't understand anything at all. I mean, I don't really understand computers, but I get the practical objections, not to mention the idea of obsolescence. I mean, does he seriously think computer tech is going to stand still for a hundred years? The computer I had ten years ago is vastly different from the one I have today, for example.

Eric and Justatech: I feel your pain. I'm trying to move files around, and even with two relatively close in age computers, it's a gigantic pain. And, hey, I still have some floppies around. Don't know what I'm going to do with them.

I have been a lurker here for years, yet never felt compelled to post before now. Sorry my first post is such a long rant. I realise that trying to dismount Zebra from his high horse is a futile task. But as a life-long geneticist/genomicist/molecular biologist or whatever it's called these days, I just want to add my support to those who have already tried to explain how futile/wasteful Zebra’s idea is, at present. Better idea is to work on getting the ethics/consents and funding to collect high quality DNA (or at least blood) and immortalised cell lines from all births (or as many as consent), then when both the sequencing technologies plus data storage and analysis infrastructure are more mature, you can do all the sequencing AND make some sense from it. By then you might even have some decent phenotype data from this birth cohort which will help you select good subsets of the data to answer specific questions. I would also suggest that re-collecting samples from the cohort over time will yield even more useful data to track changes in epigenetic, RNA and protein expression over time. Ideally you would also want to track environmental and sociological data to work on gene-by-environment questions. Now we are talking about really REALLY big data, but potentially much more useful than just sequencing everyone born using current technology which is going to improve greatly over time. I am pretty worn out from having to re-impute genomic data from stuff done in the past. Imputation should only be done when you have no other choice.

I also see Zebra does not have any grasp of how big the computational and statistical challenges are. Think about 3 billion base pairs (that’s 6 billion bases) in an average human. Now think that each human has about 10 million of the base pairs as SNPs, plus CNVs, methylation differences and a whole host of other potential genomic differences between individuals. Let’s leave phasing out of it for the moment, though this is also a big problem. Now you need to screen all 3 billion base pairs to get the basic data that Zebra is talking about (keep in mind that is maybe up to 100 billion reads of DNA for decent filtering and mapping), then you need to do multivariate analysis of all the cross-wise comparisons between each base of “normal” DNA and affected individuals (whatever the effect is you are looking for), and do an enormous amount of correction for multiple comparisons to maybe get an idea of genetic difference that may be driving the effect. The more genes involved in a particular phenotype, the smaller the effect sizes and this is just the start of an answer to why it just doesn’t make any sense to do this right now. Do it when you have a specific question to answer, using the best technology you have available (and can afford) at the time to answer that question. And never, ever assume it is just about the plain text sequence of 3 billion base pairs that you can store in a database. That is a defunct idea spread by the hype of the HGP, which even people like me bought into at the time.
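The multiple-comparisons point can be made concrete with the simplest correction, Bonferroni. This sketch treats each of the roughly 10 million SNPs mentioned above as one independent test, which is an assumption for illustration only (real variants are correlated, and real studies use more refined thresholds); the helper names are hypothetical.

```python
N_TESTS = 10_000_000   # one test per SNP, using the figure from the comment above
ALPHA = 0.05           # conventional family-wise error rate


def expected_false_positives(alpha: float, n_tests: int) -> float:
    """False positives expected if every test is run at alpha uncorrected."""
    return alpha * n_tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


print(f"uncorrected: ~{expected_false_positives(ALPHA, N_TESTS):,.0f} false positives")
print(f"Bonferroni per-test cutoff: {bonferroni_threshold(ALPHA, N_TESTS):.1e}")
```

Without correction, about half a million variants would look "significant" by chance alone; with it, each variant has to clear a cutoff of 5e-9, which small effect sizes can only reach with very large, well-phenotyped samples.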

Or maybe to put it in terms that Zebra can relate to, would you buy 5 million F-35s just because they were $1000 each, when you may not need them at all, or not for 30 years? Maybe you will need 100,000 of them before then, would you still buy all 5 million, just in case? Where will you store them, who will maintain them, and will they be of any use in 30 years’ time? Or will the war technology have changed so much by then that they are basically of little use except at air shows…Now keep in mind that genomic technology is evolving much faster than war technology at the moment, and you can see there are a few problems with just stock-piling for the sake of it, even if someone stumped up the money for it (which is unlikely right now). Using the F-35 analogy, wouldn’t it be better to put that money into buying what you need right now to fight your current wars, and investing some of it into research for future war technology improvements (if war was your business)? That is all genomic researchers are saying. There are better uses for the money and technology right now, in both diagnostics (fighting the current wars) and research (improving technology for the future).

Narad@105

New heuristics may be new mathematical applications, but I’m not so sure about the “new mathematics” part.

Personally, I like to dream that P=NP, but realistically you're probably right. That said, the work being done in compressive genomics is pretty interesting [1][2]. I don't know enough to differentiate "new mathematics" from "new application". I would guess that new compression algorithms are more on the side of new application, though.

Personally, I think the most interesting stuff is the work being done on homomorphic encryption [3]. That's some seriously cool cryptography. Again though, not sure it constitutes "new mathematics".

[1] h[]p://www.nature.com/nbt/journal/v30/n7/full/nbt.2241.html
[2] h[]p://m.genome.cshlp.org/content/21/5/734.full
[3] h[]p://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.309.1513

zebra chose to ignore this one and instead opted to handwave away the much stronger, albeit simpler, argument

The SDSS pipeline, as of 2001, represented 25% of the shebang. The raw data aren't really comparable (as the PLOS item notes; data noise has a dedicated garbage channel early in the chain), but, I mean come on, already.

I get the sense that people are projecting some complicated scenario onto a simple suggestion. You record the genome– admittedly a long bit of information– along with, say, social security number.

There's something simple here, and it ain't everybody else.

@Retro Pump:

I have been a lurker here for years, yet never felt compelled to post before now.

I'm confident that unanimity will be found as to your doing it more often.

I think the fallacy of Zebra resides in the airplanes' argument. Very few of us need these airplanes actually, and, personally, if I had the money of these airplanes, this would change my life; much more than if I had all these sequences, which I cannot say if they will be useful one day. Maybe we can use this blog to say that the money of the airplanes should be given directly to us. We certainly know better what to do with it (except maybe for Zebra), but I fear that some very powerful people really need these airplanes.

Retro Pump #107,

An excellent (and obviously knowledgeable) critique of the proposal. Before I respond, I’ll just ask-- why didn’t you jump in right at the beginning and say: “Zebra, I see where you’re going, but here’s how it would have to work.”

The whole point of putting the idea out there is to promote this kind of discussion. Why lurk when you have real information to contribute?

Anyway, as to the substance. I don’t recall that I ever suggested that we would get all the samples and then have a crash program to sequence them all in one year. Even I am not that crazy, and to suggest that is what I call a grandiose strawman. The point, which you appear to understand and have no problem with, is to establish the cohort.

So, what are the steps or components?

Getting the samples is trivial. Will some parents opt out? Stipulated. Will that skew the sample? Highly unlikely, except perhaps where you have a somewhat inbred religious or social group, and we would be aware of it.

Developing a registry of phenotypes? Well that’s why I suggested a country with a rational health care system and cooperative citizens like Canada. Yep, I will also stipulate that in the USA there would be all kinds of irrational paranoia about such a project. Bad idea to do it here, numbers aside.

Why do it sooner rather than later? OK, that’s where I think we disagree.

As I’ve suggested, the first reason would be to stimulate research and the advancement of tools and techniques and lowering costs. It’s a big contract.

But the real benefit to me would be getting as much done before people in the cohort start manifesting whatever we are interested in. Are you saying that it wouldn’t help to have the genomes to hand (whatever subset of the data has been processed so far) in order to inform further action and study? This I don't get.

If the data is there, and someone says "I think there may be a (genetic) correlation with x", this can be examined initially without any patient interaction at all-- that is, by people outside the clinical setting where confounding factors abound.

As to the analogy, no, it doesn't work. How about "why subsidize rooftop solar when everyone knows how to make and distribute electricity from fossil fuels, and solar technology will be better in 20 years anyway." (Or any number of examples like pc's and internet and electric vehicles and so on.)

You don't get the progress by sitting on your hands; solar panels et al are cheaper and better exactly because we started buying them even before they were "perfect". Which is why I mentioned the Nirvana Fallacy along with grandiose strawmen.

Zebra @2

we could sequence every child born in the US in one year

Zebra @112

I don’t recall that I ever suggested that we would get all the samples and then have a crash program to sequence them all in one year

D'oh!

The strategy in the UK, funded by the NHS (Government), is to target specific groups rather than just the general population.

"The 100,000 genomes project."

The project will sequence 100,000 genomes from around 70,000 people. Participants are NHS patients with a rare disease, plus their families, and patients with cancer.

Zebra @2

we could sequence every child born in the US in one year

I understood that to mean "a year's worth of births."

zebra will roll his eyes right out of their sockets

You say that as if it were a bad thing. The Dunning-Kruger is strong in this one.

As has been mentioned overnight, another area in which major advances would be needed before a large-scale genome database could be useful is algorithms for sorting and correlating the data. For instance, if the best available algorithm today has an O(N^2) run time, but an algorithm with an O(N log N) run time were possible at least in principle, it would be well worth spending money to develop the latter. For N of the order of the annual US birth cohort, the difference between N^2 and N log N is five orders of magnitude, or roughly the difference between having a query take one second (as our equine-pseudonym commenter seems to think) and one day (because 2^20 = 1048576). In a genome database, a more realistic N would be in the billions if not trillions--what would take the N log N algorithm one second would take the N^2 algorithm anywhere from a few years to a few thousand years.
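To put rough numbers on the N^2 versus N log N comparison, here is a minimal Python sketch. It assumes a birth cohort of about 4 million, a base-2 logarithm, and that the N log N algorithm takes exactly one second at each size; constant factors are ignored, so treat the results as order-of-magnitude estimates only.

```python
import math

SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def slowdown(n: int) -> float:
    """Ratio of N^2 work to N*log2(N) work, which simplifies to N / log2(N)."""
    return n / math.log2(n)


def quadratic_runtime_seconds(n: int) -> float:
    """Seconds the N^2 algorithm needs if the N log N one needs one second (assumed)."""
    return 1.0 * slowdown(n)


# Assumed sizes: one annual birth cohort, then a billion and a trillion items.
for n in (4_000_000, 10**9, 10**12):
    seconds = quadratic_runtime_seconds(n)
    print(f"N = {n:.0e}: ratio {slowdown(n):.2e}, "
          f"{seconds / SECONDS_PER_DAY:.1f} days, "
          f"{seconds / SECONDS_PER_YEAR:.1f} years")
```

Under these assumptions the cohort-sized case comes out at a couple of days rather than one second, and the billion-to-trillion range runs from about a year to the better part of a millennium.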

We can't count on Moore's Law to hold forever. There are physical limits: signals cannot travel faster than the speed of light (about one foot per nanosecond), and a logic gate cannot be smaller than a few atoms in size. Maybe we can do better with quantum computers than with conventional computers, but that, too, is a long development road--the current state of the art in actual quantum computing hardware is a benchtop experiment that is barely able to tell us that 15 = 3 * 5. (I'm not being quite fair here: that computer was intended to demonstrate a factoring algorithm which is much faster for large numbers--polynomial in log N, versus O(N^1/2) for naive trial division, where N is the number to be factored. Which is a big deal because widely used encryption algorithms depend on the fact that factoring large numbers with non-quantum computers is computationally hard.) And of course, even if quantum computers are developed to a level where they might help, the data migration problem strikes again, in spades.

As to the analogy, no, it doesn’t work. How about “why subsidize rooftop solar when everyone knows how to make and distribute electricity from fossil fuels, and solar technology will be better in 20 years anyway.” (Or any number of examples like pc’s and internet and electric vehicles and so on.)

Those are all products that were developed after the technology and understanding that made them possible, not before.

You don’t get the progress by sitting on your hands; solar panels et al are cheaper and better exactly because we started buying them even before they were “perfect”. Which is why I mentioned the Nirvana Fallacy along with grandiose strawmen.

The Nirvana Fallacy applies to things that are non-ideal, not things that are useless and unnecessary.

It just dawned on me that zebra's thinking is essentially that if you put a bunch of stuff that uses the same code together and connect it up, it will be like the internet. (Hence the 21st-century snark.)

Maybe everybody else already grasped that, though.

zebra@112

But the real benefit to me would be getting as much done before people in the cohort start manifesting whatever we are interested in.

Hmmm. This doesn't quite fit with what you were saying before.

zebra@31

You record the genome– admittedly a long bit of information– along with, say, social security number.

And once again, I'll bring up the confidentiality issue that you previously handwaved away:

zebra@23

Issues of confidentiality don’t seem that big a problem to me, unless it gets hacked and published on the internet… oh wait… . But realistically, no, assuming reasonable care and stiff penalties for misuse, I can’t see any real risk.

Just because you think something doesn't make it so. Are you beginning to see a theme here? Data breaches aren't even the biggest issue; it's about properly deidentifying the information. There's a reason genomics is driving some real cutting-edge crypto*. Since I apparently can't quote without it being out of context, here's some required reading so you don't embarrass yourself again: Routes for breaching and protecting genetic privacy.

*It's probably over your head but for extra credit go read the paper on homomorphic encryption I cited in #103.

capnkrunch,

That's a dead link in your last comment, please repost.

There are already some interesting privacy issues emerging from sequencing people's genomes. James Watson wanted his apolipoprotein E status (associated with Alzheimer's) withheld when his genome was published, but some geneticists pointed out that it was easy to figure out from the surrounding genes that had not been redacted. These and similar issues will doubtless become more prominent in the future.

capnkrunch #119,

No idea what you are getting at with your first two quotes. You need to elaborate.

But, since you appear to be part of the USA paranoid group I mentioned, let me ask something about "genetic privacy".

I have a list that matches social security number with a genome. I have a list that correlates some medical condition with social security number. The social security number is encrypted, if you like.

Now, if we go back on the momentum to eliminate the concept of pre-existing conditions as a way to deny insurance coverage or payment, I could imagine someone hacking the medical condition registry and selling the info to some nefarious insurance executive. I don't think this will happen, but arguendo.

But could you please explain what possible deranged reasoning makes you think someone would hack the genome data, which you keep freakin telling me is impossibly difficult to turn into useful information in any useful time span??

My eye rolling is nothing compared to my headshaking at this point. [head-shaking]

Me, I think it would be neat to wear a t-shirt printed with my genome in tiny little letters-- someday I expect one will be able to order such a thing from Amazon. What's the problem?

Krebiozen@120
Oops. Here's the link: http://www.nature.com/nrg/journal/v15/n6/full/nrg3723.html

Ann @115

I understood that to mean “a year’s worth of births.”

If zebra were sensible then your inference would be reasonable. However, since zebra was responding to a point made by literally nobody, whilst ironically complaining that it was a "grandiose strawman", I'm not convinced that your interpretation of his intentions is correct.

DrBollocks 123,

It was in response to Retro Pump pointing out (apparently obvious only to the two of us) that the genetic information is stored in the samples. He says basically "keep it stored that way until we reach some future capability" and I say no, let's get to work right away turning it into accessible data. For the reasons covered in 112.*

I never suggested, as some seem to keep implying, that we were going to have the entire database digitized immediately, and that we would then do analysis on the entire thing. Hence "grandiose".

*Note also that this is why all the nonsense about deteriorating media and changes in software is nonsense. The samples are there with the information to be extracted in the future, with future better tech, if the data gets corrupted.

So, I have to apologize, I was wrong about the mechanics of data storage.

According to my database systems engineer (I'll call him DSE), you can store 1 petabyte (PB) of data on one server rack of hard drives (a server rack being the size of an overgrown filing cabinet). If you don't care about your I/O, you can do that for about $100K. If you want a searchably useful I/O rate it will be more like a million dollars. (The DSE considered this chump change but then I reminded him this was science and not Silicon Valley.)

The much, much bigger problem in the opinion of my DSE is the issue of data entry. The USA does not currently have a unified EMR or EHR system (electronic medical record or electronic health record). While a universal EHR system has been on the WHO's to-do list since at least 2012, not a lot of progress has been made on that front.

So it doesn't matter that we can store the data and search the data; we can't get the data *in* in the first place.

So zebra, I apologize, I was wrong to say that the storage problems would make your plan unfeasible. It's the data entry that makes it unfeasible. Maybe a country like Canada could do it (if they had the money) but I still believe that there are issues beyond funding.

(A side note about starting with infants: for late-onset diseases like obesity, type II diabetes, CAD and dementia we would have to wait an awfully long time for diseases to manifest before we can determine anything about the genetic basis.)

zebra@121
That was my bad for botching the link. Do some reading. At the very least take a look at Figure 1. The issue isn't exfiltrating the entire database, it's that genomics can potentially be used to identify otherwise deidentified protected information. As to why anyone would want to do that, it doesn't particularly matter because HIPAA says you need to protect that information. As I said, there's a reason why there is serious cryptography research going on to allow queries without exposing the raw data.

As to the social security thing, I was under the impression you were suggesting that we only collect genome data because Lawrence said the database would get too complex if it included all the medical data. Apparently I misunderstood but your actual idea still doesn't make sense. Using SSN's to index the database unnecessarily exposes additional identifying information. Not to mention, it was the genome itself that Lawrence was referring to as being composed of huge amounts of variables (go reread Retro Pump's #107; try reading for comprehension this time).

Justatech #126,

Remember, we, the wealthy USA, and I would hope the rest of the world, are paying the [Canadians] to do this. I would throw in even a couple more F-35's if need be. And someone mentioned that the Brit NHS was doing something along these lines with phenotypically distinct populations. Rational healthcare systems do have advantages.

But your point about late-onset conditions is quite correct. Note that my initial comment referred to obesity and autism, the former unfortunately affecting more and more children. And that follows from Orac suggesting, IIRC, that genomics might be better applied to those kinds of less-acute-public-health-chronic issues.

So yes, maybe we will have figured out the genetic components of diseases of old and middle age before the cohort gets there. But I am also arguing that this project will help accomplish that indirectly by stimulating R&D.

capnkrunch 127,

I am really trying to comprehend what you are saying but it ain't working.

There's a government database of genomes indexed by ss#.
There's a government registry of conditions or diagnoses indexed by ss#.

If I want to do research on condition x, the government sends me a list of the genomes indexed by a completely arbitrary set of numbers which the government indexes to ss#, so I can follow up if some need arises, but they could even leave that out and I could still do my research.

Where do I go to use the genomic information to find out what about any individual?

zebra@123

If I want to do research on condition x, the government sends me a list of the genomes indexed by a completely arbitrary set of numbers which the government indexes to ss#, so I can follow up if some need arises, but they could even leave that out and I could still do my research.

Again, there's no reason to use SSNs as the index. The system you're proposing requires that every request include an additional column mapping the new arbitrary index to the SSN. There's no need to use the SSN. You just set up a hash table for the index, and you can then reuse the hashes as the index for a queried subset, so there's no need to map the subset's index to the master database's.

Where do I go to use the genomic information to find out what about any individual?

Again, I'll refer you to the paper I linked to in #123. There are attacks that can be used to reidentify deidentified genomes or that can link a known person's genome to their PHI. I'm not going to explain it all for you, but here's a very simplified explanation of one type of attack. Markers from a deidentified genome are referenced against information in publicly available genealogical databases. These databases contain personally identifiable information. Because you are requesting genomes from people with a certain condition, you have correlated PHI with a specific person. HIPAA demands that not be possible.

This is not entirely theoretical either:

An empirical analysis estimated that 10–14% of US white male individuals from the middle and upper classes are subject to surname inference on the basis of scanning the two largest Y-chromosome genealogical websites using a built-in search engine.

And researchers actually successfully used this method: Identifying Personal Genomes by Surname Inference (pdf).
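As a toy illustration of the surname-inference idea described above, here is a minimal Python sketch. Everything in it is invented for illustration: the marker profiles, the surnames, the arbitrary IDs and the function name are hypothetical, and it uses exact matching only, whereas a real attack would rely on fuzzy matching plus extra clues such as age and location.

```python
# Toy linkage attack: exact-match surname inference. All data below is invented.

# Hypothetical public genealogy records: Y-chromosome marker profile -> surname.
PUBLIC_GENEALOGY = {
    (13, 24, 14, 11): "Smith",
    (14, 23, 15, 10): "Jones",
}

# Hypothetical "deidentified" research release for one condition:
# arbitrary ID -> Y-chromosome marker profile. No names and no SSNs are included.
RELEASED_FOR_CONDITION_X = {
    "a91f": (14, 23, 15, 10),
    "c07b": (12, 22, 13, 12),
}


def infer_surnames(released, genealogy):
    """Return {arbitrary ID: surname} for each released profile found in the public records."""
    return {
        record_id: genealogy[markers]
        for record_id, markers in released.items()
        if markers in genealogy
    }


hits = infer_surnames(RELEASED_FOR_CONDITION_X, PUBLIC_GENEALOGY)
print(hits)
```

In this made-up data, record a91f comes back as a probable Jones who has condition X, even though the release contained no name or SSN. That is the kind of correlation between identity and PHI the comment above is warning about.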

Correction for #130
"The system you’re proposing requires that every request include an additional column mapping the new arbitrary index to the SSN."

I thought about it and there could also be just one additional column mapping the arbitrary index to the SSN. But then there's no need for the SSN, because this is the same thing I mentioned in #130 except it unnecessarily includes SSNs as well.

Zebra: what possible deranged reasoning makes you think someone would hack the genome data, which you keep freakin telling me is impossibly difficult to turn into useful information in any useful time span??

First of all, they are assuming this can be turned into useable data, which in turn makes it vulnerable to hackers. As to why hackers would try to break it..uh, because it's there? Because it could be used to embarrass and discredit people? I could imagine a lot of ways a break-in could get nasty fast.
Why would you even want your genetic data hanging out for all to see? What if you had schizophrenia or bipolar depression or a personality disorder running in the family?

There’s a government database of genomes indexed by ss#.
There’s a government registry of conditions or diagnoses indexed by ss#.

I would certainly hope that there are no such government registries or databases, because as capnkrunch@127 notes, it would be illegal to maintain such databases in the US. There is a law (HIPAA) which requires medical information used for any form of research to be stripped of any and all identifying information. Social security numbers constitute identifying information, because they are unique numbers associated with specific people. For instance, anybody who knows your SSN can obtain credit in your name. (This is one of the ways that identity theft works.) If your medical history were also available to somebody who knows your SSN, he could use it to, e.g., legally obtain certain drugs that are not available over the counter, and sell them at a profit. Which leads to a nightmare scenario in which you can't get the medication you need, because a program intended to sniff out prescription fraud has noticed that somebody using your identity is obtaining said medication in quantities well beyond what is needed for personal use.

Any database with financial information linked to individuals is a potential target for hackers. You need only pay attention to the news to see this: such databases are hacked on a regular basis. A database with everybody's social security numbers and genomes/medical history in it might as well have a big sign on it saying "HACK ME" in letters big enough that, were it physically located in Washington, you could read it from Perth without reading glasses. No criminal penalty could deter all hackers, because the reward is so great.

There would have to be *some* unique identifier connecting the individual with his/her genome; otherwise there'd be no way to connect the individual's medical records with his/her genome. Without that, all you have is a massive pile of bytes from which you could not learn anything about health.

But this whole project would require a database connecting every individual in the cohort with his/her medical records, and that *would* be a target for hackers even if the genome info were not useful. Imagine the blackmail possibilities for a hacker who knew about treatment for venereal disease, drug addiction, schizophrenia or bipolar depression or a personality disorder in the family, as PGP suggested.

capnkrunch et al:

The hospital knows the ss# of the baby.
The hospital sends the ss# and the sample to the government.
The government attaches an arbitrary number (xss) to the sample and to the digital form of the data. But it must keep a table of ss# and xss#.

The doctor knows your ss#.
The doctor sends the information ss# and diagnosis.
The government converts the ss# to xss using the table.
It stores the diagnosis indexed by xss.

So, as the person doing the research, when the government sends me a list of genomes associated with the condition I am studying, I have no contact with ss#, and as I said, the government doesn't even have to tell me the xss.
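The scheme described in the steps above can be sketched roughly as follows in Python. This is only a minimal illustration under stated assumptions: the class name, method names and sample values are hypothetical, the "government" is a single in-memory object, and the arbitrary number (xss) is a random token. It models the identifier handling only and does nothing about reidentification from the genome itself.

```python
import secrets


class CohortRegistry:
    """Hypothetical registry: holds the ss# -> xss table plus data indexed by xss."""

    def __init__(self):
        self._crosswalk = {}   # ss# -> xss (the table the government must keep)
        self._genomes = {}     # xss -> genome data
        self._diagnoses = {}   # xss -> set of diagnoses

    def _xss_for(self, ssn):
        # Assign a random, meaningless token the first time an ss# is seen.
        if ssn not in self._crosswalk:
            self._crosswalk[ssn] = secrets.token_hex(8)
        return self._crosswalk[ssn]

    def register_sample(self, ssn, genome):
        """Hospital step: the sample arrives with an ss# and is stored under xss."""
        self._genomes[self._xss_for(ssn)] = genome

    def report_diagnosis(self, ssn, diagnosis):
        """Doctor step: the diagnosis arrives with an ss# and is stored under xss."""
        self._diagnoses.setdefault(self._xss_for(ssn), set()).add(diagnosis)

    def genomes_with(self, diagnosis):
        """Researcher step: genomes for a condition, with neither ss# nor xss attached."""
        return [
            self._genomes[xss]
            for xss, found in self._diagnoses.items()
            if diagnosis in found and xss in self._genomes
        ]


registry = CohortRegistry()
registry.register_sample("000-00-0001", "ACGT...")   # placeholder strings, not real data
registry.register_sample("000-00-0002", "TTGA...")
registry.report_diagnosis("000-00-0001", "condition x")
print(registry.genomes_with("condition x"))
```

Note that the crosswalk table is itself a sensitive asset in this design: anyone who obtains it can undo the pseudonymization.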

Somehow, you think this is less secure than the traditional method you are defending, where the subjects are directly involved with the researchers? Where "everybody knows your name", and you can be secretly photographed or fingerprinted, and your credit card information and other easily readable health data is in the office, and your facebook page discusses your condition, and you order drugs online....

Yawn.

zebra@12

But it must keep a table of ss# and xss#.

Nope. You still don't understand what a hash table is. It stores the hashes without any need to keep the raw data (i.e. SSN). In any case, using the SSN is unnecessary. It makes more sense to have a unique identifier known to healthcare providers (think MRN) that is standardized. That way it minimizes consequences in case there is leakage.
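A minimal Python sketch of deriving an index without storing a lookup table of raw identifiers is shown below. It swaps in a keyed hash (HMAC-SHA256) rather than a plain hash, which is an assumption beyond what the comment says: a nine-digit number has only a billion possible values, so an unkeyed hash could be reversed by trying them all. The function name, key and sample identifier are hypothetical.

```python
import hashlib
import hmac


def pseudonym(identifier: str, secret_key: bytes) -> str:
    """Derive a stable index from an identifier using HMAC-SHA256.

    The same identifier and key always give the same index, so no table of
    raw identifiers has to be stored. The key itself must be kept secret.
    """
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key and identifier, for illustration only.
key = b"example-key-held-by-the-registry"
index = pseudonym("MRN-0000123", key)
print(index)
```

The same call can be repeated whenever a record arrives, so samples and diagnoses for one person land under the same index. The design still concentrates risk in the key, and it does nothing about reidentification from the genome itself.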

Somehow, you think this is less secure than the traditional method you are defending, where the subjects are directly involved with the researchers?

The traditional methods don't work here either. As I told you before even just a genome is potentially personally identifiable. Hence the work on homomorphic encryption that would allow comparisons without exposing raw data. This is not advanced enough nor are computers powerful enough for this to be practical on a large scale. We can argue about your misunderstanding of how database indexing works until we're blue in the face but this is the more difficult issue and it's the one you haven't even attempted to address.

Ooh! I just thought of a much better application of a genome database. I said earlier that there are issues with having to wait for diseases of adulthood to manifest in our hypothetical cohort.

Why not instead take an existing multi-generational study and sequence their genomes? Imagine the data you could learn from the Framingham Heart Study, with three generations of CV data! And since everyone there already has a study ID number (totally separate from any other identifying number), we eliminate all the SSN stuff.

I only see three issues with that: 1) a lot of the first generation are dead, so no genomic data, 2) getting people to volunteer their genomes, 3) it's not all that diverse a place (which has always been a problem with the FHS) so there are a lot of population groups you would miss.

Somehow, you think this is less secure than the traditional method you are defending, where the subjects are directly involved with the researchers?

There is only one way to make a database like that secure: put it on a computer that is never connected to the internet. That's how Los Alamos protects the computer that runs their bomb explosion simulations. But if you make it secure in that fashion, then it becomes a great deal less useful. If you try to make it useful to other researchers by putting it on the net, hackers will find a way to penetrate the security.

Admittedly, at the database sizes we are discussing, FedEx has more bandwidth than the internet. But you still have to pay the army of data technicians who input the queries and transfer the results to the hard drives which get shipped to researchers. And you have to pay FedEx (or UPS, or USPS, or whatever courier service is involved) to ship the disk to its destination. That gets expensive really fast--most likely, proposal budgets would have to include these things, which means less money to do the actual research. And when all is said and done, you have to make sure that the hard drives you are shipping to researchers all over the US (if not the world) don't have any identifying data that somebody might accidentally put on a networked computer.
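For a rough sense of the shipping-versus-network comparison, here is a minimal Python sketch. The figures are assumptions chosen for illustration: a 1 PB shipment (the per-rack figure mentioned earlier in the thread), 24-hour delivery, and a sustained 1 Gbit/s network link. Time spent copying data onto and off the drives is ignored.

```python
def effective_gbps(payload_bytes: int, transit_hours: float) -> float:
    """Average throughput, in Gbit/s, of shipping a payload that arrives after transit_hours."""
    return payload_bytes * 8 / (transit_hours * 3600) / 1e9


def transfer_days(payload_bytes: int, link_gbps: float) -> float:
    """Days needed to push the same payload over a network link of link_gbps Gbit/s."""
    return payload_bytes * 8 / (link_gbps * 1e9) / 86_400


ONE_PETABYTE = 10**15  # bytes, decimal petabyte (assumed shipment size)

print(f"Courier, 24 h: {effective_gbps(ONE_PETABYTE, 24):.1f} Gbit/s effective")
print(f"1 Gbit/s link: {transfer_days(ONE_PETABYTE, 1):.1f} days")
```

Under these assumptions the overnight box averages roughly 90 Gbit/s, while the 1 Gbit/s link needs about three months for the same petabyte.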

There are lots of problems like this that have to be solved. Feel free to argue that we (meaning NIH, NSF, and other funding agencies) should fund the research to address these issues--as noted in various posts, some of them have practical application in other areas, too. Don't waste our time and money collecting the data before you have solved these problems.

If I want to do research on condition x, the government sends me a list of the genomes indexed by a completely arbitrary set of numbers....

This is a truly breathtaking level of cluelessness. No, you don't get handed entire genomes for random conditions, because then – in your fantasy system – you can start doing reverse queries and not just reidentify them, but also piece together their complete medical histories in whatever noble land where adults have no control over their PHI.

There's more, but I'm on deadline. Somebody with more time might want to look at how Denmark handles the information that it keeps; e.g., here one finds the following:

"Access to Danish registry data and data linkage requires authorization by the Danish Data Protection Agency (Datatilsynet) and in some cases, additional authorization from the Danish Health and Medicines Authority, typically when medical charts are to be accessed,[46] and/or authorization from the National Committee on Health Research Ethics (Den Nationale Videnskabsetiske Komité) if biological specimens are to be used or if living persons are to participate in clinical studies. The Danish privacy laws on the use of personal data are stipulated in The Act on the Processing of Personal Data (Act Number 429; May 31, 2000).[47,48]"

And once again, the slow loris demonstrates he has no clue how things work. Let's go through these one by one.

1. in a research study, you absolutely cannot be photographed or fingerprinted without your consent. I have provided a few photos for research studies; in both cases, the doctors ASKED me, and assured me that my name would not be attached. Fingerprints were not taken, and very few medical studies would need them anyway.

2. Doctors do not accept credit cards, unless they're named Sears, Burzynski, or Gordon*. There's this thing called insurance, or, you know, Medicaid. In any other country, there are national health systems. As for health information, it's securely filed away. Yes, a sufficiently motivated hacker could get into those files, but I don't see why they'd want to, as it's generally dullsville and only of interest to other doctors. There's a huge glaring difference between office files and a big fat database sitting there on the web.

3. Most of the time, you don't order drugs yourself. You phone or click for a renewal and the pharmacist orders the drug. Unless you're dealing with some fly-by-night compounding outfit, and like with Gordon or Sears, one could argue that anyone dealing with those outfits is already being robbed anyway, so they shouldn't be surprised when additional chicanery occurs.

4. No one ever claimed Facebook was confidential. Even Facebook doesn't claim the pages are confidential, and there have been numerous warnings to that effect. If your Facebook page has info on your medical condition, that's because YOU put it there.

Anyone else like to weigh in?

capnkrunch 136,

I'm sure whatever involvement you have with data entry has interesting challenges, and requires some problem-solving skills, but you really are doing poorly at communicating. Which may be why I used the term non-sequitur earlier; I don't see how any of this is related to whether my proposal is a good idea or not.

I can't "address" some problem if you will not tell me what it is.

Anyone can access some random human DNA in any number of settings, and, again, accepting your unlikely premise arguendo, learn something about some individual. In fact, the workers in the hospital where our cohort babies are born could scoop up all kinds of bodily fluids, and run a parallel zebra database. Wasn't there a Thomas Pynchon novel along those lines, with a parallel post office? Is this the kind of thing that keeps you from sleeping at night?

PGP@140: Well, my doctor takes credit cards (for the co-pay). And yes, most medical files are super boring, but imagine if you wanted blackmail dirt on someone? "Does your wife know about your new case of herpes?" "Anti-retrovirals? Your mother will be so disappointed." Etc, etc.

There is also a law in the US that prohibits employers from sequencing you because that could be used to discriminate against you in hiring (or firing). "Oh, you have a very high likelihood of cancer? We can't have people here who won't be able to give it their all...clean out your desk."

And it is always possible that as our understanding of the human genome expands, what you can tell about a person from their genome will expand too. I'm not thinking GATTACA bad, but squicky.

This one is too good to pass up:

The hospital knows the ss# of the baby.

Hi, my name is Z., and I don't know how Social Security numbers are issued.

The previous one seems to have vanished into the ether, so I'll try again with minor variation.

I’m sure whatever involvement you have with data entry has interesting challenges, and requires some problem-solving skills, but you really are doing poorly at communicating.

Oh, Christ, not this sh*t again.

BTW:

My eye rolling is nothing compared to my headshaking at this point. [head-shaking]

Double fail by competent style authority. Clearly, Z. still fails to grasp the different nature of its original error.

I don’t recall that I ever suggested that we would get all the samples and then have a crash program to sequence them all in one year. Even I am not that crazy....

Why? That's the easy part, genius. What continues to perfectly elastically bounce off your skull is everything else.

The Crying of Lot 49

That's what I was thinking of. There is a vast conspiracy of neonatal nurses tucking away bits of bodily fluid to create a Parallel Zebra Genomic Database.

Or at least, that's what people should be more worried about than the actual Zebra Genomic Database, which is under strict (Canadian) government supervision [What could be more non-threatening than that, once they get rid of the current buffoons?]

It was called The Tristero Conspiracy, and it was either a parallel postal service or a paranoid fantasy.

In fact, the workers in the hospital where our cohort babies are born could scoop up all kinds of bodily fluids, and run a parallel zebra database.

Ah, I had previously missed the part where Z.'s fantasy involves the entire birth cohort being born in a single hospital. You can't make this kind of desperately confused flailing up.

Narad: Ah, I had previously missed the part where Z.’s fantasy involves the entire birth cohort being born in a single hospital. You can’t make this kind of desperately confused flailing up.

Not to mention the outliers who get born on the road or at home. What sort of country just has one maternity hospital? I think even Luxembourg has multiple hospitals.

Justatech: And yes, most medical files are super boring, but imagine if you wanted blackmail dirt on someone? “Does your wife know about your new case of herpes?” “Anti-retrovirals? Your mother will be so disappointed.” Etc, etc.

Yeah, but most offices are fairly well protected, and there are a lot of penalties for people who try that. Wasn't there a case recently where the law came down really hard on someone who was trying to sneak records from Planned Parenthood?

Yes, there are differing state laws regarding Social Security numbers as unique identifiers. I'm not clear on the federal law for medical records since I think SSNs are still being used as such in many podunk medical practices.

However, anything using SSNs would be a no go for me. A few years ago someone from the VA took a laptop containing veterans' medical data home* to work on during off-hours. It was stolen and the way the government was able to locate me to let me know this was by using my tax records with SSN. It was almost 20 years since I had been in the military.

Moral of the story. TMI applies here. Too much information is at hand with SSNs already. No need to add more.

*Which I totally disagree with but totally understand if they are anything like business where they expect you to work in your sleep.

**Apologies if I posted this here before; my memory fails me, and often.

Thank you, malia, that's the kind of answer I was looking for. I'll be off to visit family later this month, so it looks like a few of us will be spitting into test tubes.

There is only one way to make a database like that secure: put it on a computer that is never connected to the internet.

There is another way - you can connect it to an internet, just not the Internet. See https://en.wikipedia.org/wiki/SIPRNet for an example. There are several others.

I made quite a good living helping to build a few private networks back in the day. But your other point is very valid. We had a saying 'never underestimate the bandwidth of a truck full of mag tape'. There were times that a private armed courier with a pouch was faster and cheaper than 3 or 4 years of circuit charges. Ping times sucked.

Another point about Zebra's silly plan - most people don't have genetic anomalies, and beyond the number needed for a really, really good baseline, I could see how maybe 60 to 80% of the database would never need to be used (numbers pulled from my backside, maybe someone can offer up a better range if necessary). All the superfluous data would just sit there sucking up resources, both in the acquisition and the storage. The problem is that you don't know which part you can leave out. Much better to sequence what you need as you need it.

I'm of two minds about the other half of his plan - his medical records database. Clearly, it violates the HIPAA laws, and mostly I think HIPAA is a good idea. But I also like to go out and spend an afternoon poking holes in a piece of paper (and sometimes tasty animals) with a few friends. But every once in a while, some nut case will decide to shoot up a church/restaurant/town/policeman/school (and yes, you do have to be mentally broken on some level to use other humans for target practice). Everyone agrees that the mentally ill shouldn't be allowed to buy guns. All we need is to have the government keep a nice list of the 'afflicted', and Zebra has given us the rationale (such as it is).

and yes, you do have to be mentally broken on some level to use other humans for target practice

If by "mentally broken" you mean human, sure; I don't see how it has anything to do with mental illness necessarily, though, especially given that people with psychiatric conditions are no more likely to be violent than "normal" people.

It's an uncomfortable fact that pretty much any human is capable of atrocities, given the right set of circumstances, etc. The "just plain folks" soldiers in the Wehrmacht were just as awful, in fact, as the really committed SS guys. Then there are "crime of passion" impulsive murders, etc., which are also usually committed by "sane" individuals with a gun at hand during a critical moment.

Everyone agrees that the mentally ill shouldn’t be allowed to buy guns.

I mean, if we're going to profile people who buy guns based on how likely they are to commit a violent act, we shouldn't be selling guns to men, actually.

Johnny @150. Here's a better answer

http://real-psychiatry.blogspot.com/2015/08/anger-and-projection-are-no…

Everyone agrees that the mentally ill shouldn’t be allowed to buy guns.

That's all very well, but it gives the power to someone like me to decide if you're mentally ill and to take away what was purportedly a constitutional right.

The doctor knows your ss#.
The doctor sends the information ss# and diagnosis.
The government converts the ss# to xss using the table.
It stores the diagnosis indexed by xss.
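
The four steps quoted above amount to pseudonymization through a lookup table. A minimal sketch of that idea follows, assuming the "xss" is a random token and that the table lives in memory; the class name `PseudonymRegistry` and its methods are hypothetical, invented here for illustration, and are not part of anyone's actual proposal.

```python
import secrets


class PseudonymRegistry:
    """Hypothetical registry that swaps an SSN for a random token ("xss")."""

    def __init__(self):
        self._table = {}      # ssn -> xss: the sensitive lookup table
        self._diagnoses = {}  # xss -> list of diagnoses

    def _to_xss(self, ssn):
        # Issue a fresh random token the first time an SSN is seen,
        # and reuse it on every later submission.
        if ssn not in self._table:
            self._table[ssn] = secrets.token_hex(16)
        return self._table[ssn]

    def submit(self, ssn, diagnosis):
        # The doctor sends (ssn, diagnosis); only the xss is stored with it.
        xss = self._to_xss(ssn)
        self._diagnoses.setdefault(xss, []).append(diagnosis)
        return xss

    def research_view(self):
        # What a researcher would see: diagnoses indexed by xss, no SSNs.
        return {xss: list(items) for xss, items in self._diagnoses.items()}
```

Note that anyone who holds the table can reverse the mapping, so the scheme is only as private as the table itself.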

I'd estimate that, at a minimum, about fifty percent of the population will opt out when they hear the part about all their personal health information being sent to the government.

zebra@141

Here's what I think is the most daunting confidentiality issue. The genome itself can be personally identifiable (i.e. by cross referencing against public genealogy databases). Giving researchers raw sequences is no go, and only giving them certain markers is risky because it's not entirely certain what can be used to identify someone. This is an issue that already exists with our current data, as evidenced by how researchers were able to expose the identities of participants in the 1000 Genomes Project (see #130).

Right now there is no good solution to this problem. Homomorphic encryption is one potential solution. It allows comparisons to be done on encrypted data but with current technology the overhead is too high for it to be practical for large datasets. I'm sure there are other technologies being worked on but the problem of protecting genetic information is not solved.
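
To make "computing on encrypted data" concrete, here is a toy sketch of the Paillier cryptosystem, which is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. This is an illustration only. The primes are tiny and insecure, the function names are made up for this example, and it supports sums (for instance, counting carriers of a variant) rather than the general comparisons mentioned above. It assumes Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import secrets


def keygen(p=1009, q=1013):
    # Toy primes, far too small for real use.
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    # c = g^m * r^n mod n^2, with g = n + 1
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    # Multiplying ciphertexts adds the underlying plaintexts (mod n).
    return (c1 * c2) % (n * n)


# Usage: sum encrypted 0/1 carrier flags without decrypting any single one.
n, priv = keygen()
flags = [1, 0, 1, 1, 0]
total = encrypt(n, 0)
for f in flags:
    total = add_encrypted(n, total, encrypt(n, f))
print(decrypt(n, priv, total))
```

Every homomorphic addition here is a modular multiplication of numbers about twice as long as the key, which gives some feel for why the overhead becomes a problem at realistic key sizes and dataset sizes.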

I'll say it again, if you want to talk about confidentiality issues it would behoove you to at least skim the article I linked to in #123. Here it is again for convenience: Routes for breaching and protecting genetic privacy.

The Crying of Lot 49

That’s what I was thinking of. There is a vast conspiracy of neonatal nurses tucking away bits of bodily fluid to create a Parallel Zebra Genomic Database.

Z.'s grandiosity is again noted. Of all the things I recall of Lot 49, though, this isn't one of them.

Anyway, regarding my immediately preceding comment, maybe Z. didn't mean "sequencing" at all, but rather "annotating" or something.

Or maybe he had no f*cking idea what he was talking about in the first place.

I’d estimate that, at a minimum, about fifty percent of the population will opt out when they hear the part about all their personal health information being sent to the government.

The Zebra Genome Project has already offshored its implementation. The fashion in which it's "done so" is pretty embarrassingly funny, but I have to get back to work.

Somebody with more time might want to look at how Denmark handles the information that it keeps

I guess that Icelanders hold the record for the fraction of the population with sequenced genomes, but IIRC they sold all the IP rights to a private company, so the company can do all their research away from the Intertubes and never have to worry about sharing data outside.

Wasn't it the Mormons? IIRC they are busily collecting genealogical data so they can baptize everyone right back to Adam. Or is that a myth?

capnkrunch #155,

"This is an issue that already exists with our current data,"

Exactly.

If you were sincerely trying to answer my question (rather than offering a non sequitur because it's a zebra suggestion and you want to be oppositional), you would explain why you think my project changes anything in a negative rather than positive way.

"Giving researchers raw sequences is no go"

Then, how are malia and Retro Pump doing their research?

How is what I suggest different from what people are doing now?

I would argue that my approach provides more, not less, security, for the reasons I explained.

So, unless you can actually demonstrate a net negative security effect from my approach, I will consider the security issue closed.

jp #151,

"I mean, if we’re going to profile people who buy guns based on how likely they are to commit a violent act, we shouldn’t be selling guns to men, actually."

As a man, who has fired some guns in his time, I couldn't agree more.

ann#154,

"I’d estimate that, at a minimum, about fifty percent of the population will opt out when they hear the part about all their personal health information being sent to the government."

Especially the people on Medicare and Medicaid.

sigh

Right. Like I didn't think of that.

Sigh.

Medicare and Medicaid are covered by HIPAA. What you're proposing would require a waiver of that protection.

I do appreciate the concession of every single other point I've made however.

Speaking of which:

If this database existed, people would use it. It’s absurd to suggest otherwise.

Do you mean people like malia and Retro Pump? Because they've both suggested otherwise. As in:

I just want to add my support to those who have already tried to explain how futile/wasteful Zebra’s idea is, at present.

^^Retro Pump

And:

Would such a database be useful in the real world?? Honestly, the answer is that we don’t know, because we don’t have enough of the biological groundwork to make that decision yet. Might it be useful in the future? Perhaps.
IF (and only if) money and time and storage space and processing power were not a limitation –

^^malia.

And (following up on malia's "IF (and only if)"):

I also see Zebra does not have any grasp of how big the computational and statistical challenges are. Think about 3 billion base pairs (that’s 6 billion bases) in an average human. Now think that each human has about 10 million of the base pairs as SNPs, plus CNVs, methylation differences and a whole host of other potential genomic differences between individuals. Let’s leave phasing out of it for the moment, though this is also a big problem. Now you need to screen all 3 billion base pairs to get the basic data that Zebra is talking about (keep in mind that is maybe up to 100 billion reads of DNA for decent filtering and mapping), then you need to do multivariate analysis of all the cross-wise comparisons between each base of “normal” DNA and affected individuals (whatever the effect is you are looking for), and do an enormous amount of correction for multiple comparisons to maybe get an idea of genetic difference that may be driving the effect. The more genes involved in a particular phenotype, the smaller the effect sizes and this is just the start of an answer to why it just doesn’t make any sense to do this right now. Do it when you have a specific question to answer, using the best technology you have available (and can afford) at the time to answer that question. And never, ever assume it is just about the plain text sequence of 3 billion base pairs that you can store in a database.
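
A rough sense of what "an enormous amount of correction for multiple comparisons" means can be had from simple arithmetic. The sketch below assumes a Bonferroni correction, which is just one simple and conservative choice and is not named in the quote, and uses the quote's figure of about 10 million SNPs; the 0.05 significance level and the function name are likewise assumptions for illustration.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


n_snps = 10_000_000  # "about 10 million" SNPs, per the quote above

# Testing each SNP once against a phenotype.
single = bonferroni_threshold(0.05, n_snps)

# Testing every pair of SNPs for an interaction.
n_pairs = n_snps * (n_snps - 1) // 2
pairwise = bonferroni_threshold(0.05, n_pairs)

print(f"single-SNP tests: {n_snps:,}  threshold: {single:.1e}")
print(f"pairwise tests:   {n_pairs:,}  threshold: {pairwise:.1e}")
```

Under these assumptions a single-SNP scan needs p-values below roughly 5e-9, and a pairwise scan needs them below roughly 1e-15, which is why small effect sizes spread across many genes are so hard to detect.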

You haven't addressed the substance of any of that, except by straw-manning. As in:

That doesn’t sound like someone saying “we don’t know what we’re doing we just run around like headless chickens”.

^^

Nobody alleged that anybody was doing that.

It does sound like someone contending with agonizingly slow processing and horrendously “noisy” data and eye-straining poor resolution and stuff like that.

^^

There's not one word in the quote you're discussing about "agonizingly slow processing" or "eye-straining poor resolution."

"Horrendously "noisy" data" is arguably approximately true. But your proposal doesn't have the power to ameliorate that problem anyway. So moot point.

I find it difficult to reconcile this with what you are saying– that “these people don’t even know what they are looking for”.

Yes. But your inability to reconcile two of your own straw men with each other doesn't actually mean anything.

Malia did say that such a database would be useless, because they don't yet know what to look for or how to look for it. Several times. For instance, right here:

Honestly, the answer is that we don’t know, because we don’t have enough of the biological groundwork to make that decision yet.

And here:

We just haven’t done enough of the (much harder) genetics work to figure out how to appropriately interpret the data.

How much clearer could that possibly be? I mean, what use do you expect people to make of data they don't yet know how to interpret?

^^That second one's a serious question. Please answer it.

@ Krebiozen:

It is the Mormons.

My cousin, who is researching our *interesting* family and her father's (perhaps) even more intriguing clan, has used their resources for (at last count) 3 countries - US, UK and Ireland - amongst other free on-line material she found. As we already knew, both families have been in the distilled hootch business (gin and Irish whisky respectively). There are other businesses we knew about (haberdashery) but some oddities (a fruit merchant?). There may also be someone (her family) who went to NSW.

So far though, no Norman Conquest.

And about Denmark:

a prof I knew used it for research about depression in families c. 1975-80.

ann163,

OK. We'll use France or UK instead of Canada, and get about the same sample size.

Although I think you are incorrect that half of Canada's citizens would be uncooperative. I've already said that USA is a bad choice because it is full of paranoid kooks, but nations with rational universal health care approaches I doubt would have such a rate of rejection.

To be clear, the Danish material didn't involve the genome then.

AND I don't think that the Mormon story is a myth.
My atheistic ancestors would not be thrilled.

ann

You haven't made any points, so I can hardly answer them.
Cherrypicking quotes isn't "making a point".

As far as I know, the only commenter who has made a cogent response is Retro Pump.

And as far as I can tell, the only point about which we disagree is how quickly we would convert the data contained in the samples to digital format. I would love to have RP answer my query on that at #112. But neither you nor, apparently, anyone else has enough expertise to do that, as best I can tell.

Iceland, people, Iceland:

http://www.un.org/esa/population/publications/WFD%202008/Metadata/LB.ht…

You haven’t made any points, so I can hardly answer them.
Cherrypicking quotes isn’t “making a point”.

Malia says that they don't yet know how to appropriately interpret the data. That's not cherry-picking. It's her main point. She's stated it concisely twice. And she's also explained it at greater length twice.

You've ignored it. You've also misrepresented it by characterizing what she said as being about eye-strain, noisy data, and slow processing, none of which she mentioned. And when that's been brought to your attention, you've mischaracterized the objection as being about chickens with their heads cut off.

I asked a question that you absolutely can answer. It is this:

What use do you expect people to make of data that they don't yet know how to interpret?

This is also not cherry-picked. And it is a valid point:

Let me walk you through this.

(1) All those working geneticists she invokes have unfunded priorities right now.

(2) Having more data “freely available” would not help achieve them.

(3) It also wouldn’t be free. You’d have to spend money creating the database and making it available.

(4) That money is presently needed for other priorities.

(5) So spending it on something else now would detract from rather than aid presently ongoing research.

You would not be saving costs on presently ongoing research by spending money on something that researchers presently don't need and can't use.

That's a significant obstacle. To most people, it would be a dealbreaker, in fact.

How do you propose to overcome it?

Ann, I hesitate to respond quickly in case you are going to have one of your sequential commenting fits and stuff will be crossing on the wires, but anyway.

Look at the reference to the Icelandic study. Look at the number of "working geneticists" who apparently thought it was a good and useful idea. Go figure-- is it some peculiar Icelandic genetic derangement that caused such agreement, or is it just scientists who want to move their discipline forward?

Now, I've dealt with the funding issue more than once, so you will have to go back and read more carefully. The data will be provided free or at a nominal fee to qualified researchers, and it will not take funding away from existing funding agencies. (And I still haven't seen any evidence that operating budgets would be unmanageable at all.)

But as to malia and RP-- the data isn't for them specifically, it is for an expansion of research, so that people can figure out "what to do with the data" sooner. That's how basic science works, by lots of little grad students working on lots of little projects and figuring out better ways to do things and contributing to the whole. Malia and RP can go on with their 20th-Century approach, and we will see which bears more fruit.

@#168 --

Canada:

Genetic Privacy

Fifty-two percent of Canadians surveyed expressed strong concern that, if their doctors recommended that they undergo genetic testing, they might be asked to provide the results for non-health-related purposes. In total, 71% of those who expressed significant concern (scores of 5 -7 on a 7-point scale) said their concerns would likely affect their willingness to undergo genetic testing. More than half of these, 43% said that this was definitely the case.

Another 28% said that it would probably be the case.

Additionally, seven in ten Canadians think that protecting the personal information of Canadians will be one of the most important issues facing the country in the next ten years.

That was as of 2013.

PDF link.

Per an even more recent survey, a majority of Canadians are also just distrustful of government generally.

Iceland, people, Iceland:

What, the Iceland where the project to collect genetic data was ended thanks to precisely the kind of privacy issues you have dismissed out of hand?

Ann#175,

Since the information would not be available to the participants, so they couldn't be asked for it, what possible relevance does this have?

Once again, like capnkrunch, you are providing a non sequitur with respect to the specific plan I am proposing.

zebra@160

capnkrunch #155,

“This is an issue that already exists with our current data,”

Exactly.

So the solution is to expand our data gathering efforts when there are already confidentiality issues with our current data? How exactly does that make anything better?

Then, how are malia and Retro Pump doing their research?

Yup, I was wrong here. It's this data that can't be publicly disclosed, not that it can't be disclosed to researchers. Ideally, even researchers should not be accessing personally identifiable information. It's a tenuous system already. Greatly expanding the amount of data will only worsen things.

How is what I suggest different from what people are doing now?

For one, scale. On small scales some security measures work fine. For example there have been successful implementations of homomorphic encryption on datasets of 20 or so. Resource consumption scales with size. We've been telling you this the entire thread and somehow you still can't grasp the concept.

More importantly, it doesn't matter whether it is different. The current scheme has issues. You even agreed at the beginning of the post.

Look. I've given you plenty of references and encouraged you to read them multiple times. At this point it's clear your ignorance is entirely by choice.

Krebiozen@176
Very interesting.

Further @#168 --

The UK:

It's already failed.

They actually proposed it. But they have just as much opposition to governmental invasion of privacy there as we do here, and it's much better organized. Plus they have a very bad track record when it comes to keeping patient information private.

So the idea met with a lot of opposition due to widespread privacy concerns and expert opinion to this general effect:

The IVF pioneer, Professor Lord Robert Winston, told a literary festival over the summer that the hype around the human genome was "complete balls".

Dr Stuart Hogarth, of the department of social science, health and medicine at King's College London, pointed out that the UK already had a DNA database, the UK biobank, which has more than half a million records.

He added that the government did not have a good track record when it came to running major NHS IT projects. The £15bn patient record database was scrapped last year and many projects have come in over budget. "The reality is that much of the data won't be useful," Hogarth said.

"Before we start using genetic data in clinical practice we need to demonstrate that it's going to improve clinical outcomes."

So it never happened.

They do have a databank of half a million people, though.

zebra@177
Do you know what a non sequitur is? There are legitimate confidentiality concerns with current datasets. There are proof of concept attacks to re-identify genomic information. Starting a mass data collection campaign without solving these issues first is irresponsible in the extreme. And it makes even less sense to put the PHI of so many people at risk when you have not one but two experts in the field telling you they don't even need the data.

Since the information would not be available to the participants, so they couldn’t be asked for it, what possible relevance does this have?

My point was that almost half of the Canadian population has indicated that it would refuse to participate in the collection of genetic information due to privacy concerns, in which case they wouldn't be participants.

This is relevant to your argument that "paranoia" in the US wouldn't be an obstacle in countries where the government provides universal health coverage.

That's untrue in both the UK and Canada. There are very high levels of distrust and protest regarding the databanking of personal health information generally and genetic information in particular in both.

France:

The direct transmission of genetic information by health-care providers to other persons or institutions is forbidden, apparently without qualification.

So strike three. It wouldn't work there, either.

Any other ideas?

There's a 186-page .pdf summarizing pertinent national regulations worldwide here, if it helps.

capnkrunch,

Yeah, I think you are dodging the question.

Is it a bad idea to expand genetic research?

Is it a bad idea to expand genetic research using Zebra's Database?

And you can't say "both".

The fact that you can't directly respond to that question is why everything you've written is a non sequitur. It has no relevance to whether my way of doing it is better or worse for security.

If a researcher has N genomes, the security issue does not depend on whether she obtained them from the central database or by taking samples and sending them out for sequencing or doing the sequencing in her lab.

The fact that you refuse to acknowledge that demonstrates that you are not trying to make a sincere argument.

Also, wrt this:

Since the information would not be available to the participants, so they couldn’t be asked for it, what possible relevance does this have?

I'm not sure I even understand what you're saying.

If they wouldn't be asked to provide their personal health and/or genetic information, how could it be legally obtained by researchers?

If you're suggesting that the government just hand it over, it's precisely the fear of that kind of thing that makes almost half the Canadian population unwilling to get genetic tests.

So it's still a relevant point.

And to avoid any purposeful misunderstanding, that's "N genomes in digital format on her computer."

I've already indicated that I think my method is more secure than either of the others in total.

zebra --

Please explain how you plan to overcome the widespread popular objection to genetic databanking in the UK and Canada.

Please also explain how you plan to overcome the legal obstacles to creating such a databank in France.

Or, alternatively, please suggest a nation where your project would be feasible.

PS --

If a researcher has N genomes, the security issue does not depend on whether she obtained them from the central database or by taking samples and sending them out for sequencing or doing the sequencing in her lab.

Whether or not data is secure actually does depend on where it comes from, who collects it and where it goes.

zebra@183

Is it a bad idea to expand genetic research?

Or

Is it a bad idea to expand genetic research using Zebra’s Database?

And you can’t say “both”.

Both, you asshat. Until there are better privacy protections in place I think any mass collection of genomic data is irresponsible. Do I think we should stop using what data we have already? No. Do I think it would be reckless to collect additional sequences on a large scale? Hell yes.

If a researcher has N genomes, the security issue does not depend on whether she obtained them from the central database or by taking samples and sending them out for sequencing or doing the sequencing in her lab.

Scale, scale, SCALE! you twat. There's a difference between exposing a couple thousand people's (about what the biggest datasets currently are) PHI and exposing hundreds of thousands. Most of the current datasets were collected before the problem was well known. Those researchers had an excuse: no one knew the privacy implications of collecting genomics data. You have no such excuse.

There's also a difference between privately held data (i.e. a researcher sequences genomes in her own lab) and shared data (i.e. a database that needs to be accessible by many different organizations). The attack surface is much greater in the latter scenario.

Using our current datasets, people can already be re-identified using their genome. Why the hell would we want to expose hundreds of thousands more people to that? The fact that you choose not to answer that question means you are just being deliberately obtuse.

@185

I’ve already indicated that I think my method is more secure than either of the others in total.

Ah yes. I remember you saying that but you never explained why. I can't imagine why you think that given that the attack surface is much greater. Please do enlighten me, great security guru zebra.

This thread is getting very tiresome. Barring a reply with some substance from zebra I'm bowing out.

Look at the reference to the Icelandic study. Look at the number of “working geneticists” who apparently thought it was a good and useful idea. Go figure– is it some peculiar Icelandic genetic derangement that caused such agreement, or is it just scientists who want to move their discipline forward?

No. It's that you're wrong to say that.

Flatly wrong.

The national Icelandic database was a government-backed commercial endeavor, and it was not supported by scientists or doctors:

Even Hoffmann-LaRoche clearly distances itself from the database project. "deCODE pursues two distinctly separate and independent business plans," explains company spokesperson Peter Herrmann. The first plan consists of standard family-based searches for genetic markers, including informed consent and bioethics committee approval, with the data sent to pharmaceutical and academic partners to develop diagnostics and therapeutics. The second plan is the national database. "Hoffmann-LaRoche's research collaboration with deCODE is structured exclusively and entirely along the first business plan model, and is completely independent of the establishment of the overall database," Herrmann clarifies.

Got that? The national database was not the one that was going to be used for diagnostic/therapeutic research.

And it was opposed by the people who do such research:

Medical organizations, fearing loss of access to existing, small databases such as patient registries, are aligning behind Mannvernd. The Icelandic Medical Association (IMA) advises its members to refuse to submit patient records. "A large number of physicians have signed a declaration to that effect," reports Bogi Andersen, an assistant professor of medicine at the University of California at San Diego. And at a meeting in Santiago, Chile, in April, the World Medical Association officially backed the IMA. At the meeting, Tomas Zoega, chair of the IMA's ethics committee, warned that if the project were not halted, similar endeavors would begin here.

Link.

That's why the project failed, just as it did in the UK.

As to this:

Malia and RP can go on with their 20th-Century approach, and we will see which bears more fruit.

You have yet to produce an example of so much as one, single 21st-century thinker who wants such a database besides you.

All the attempts to start one have been corporate-government partnerships the (ostensibly) scientific justifications for which were decried by scientists in terms such as "complete balls."

Now, I’ve dealt with the funding issue more than once, so you will have to go back and read more carefully. The data will be provided free or at a nominal fee to qualified researchers, and it will not take funding away from existing funding agencies. (And I still haven’t seen any evidence that operating budgets would be unmanageable at all.)

I've read what you said. That the data would be provided free does not save costs unless people need and want but can't afford it. And there's not a whisper of a hint that that's the case. On the contrary, all the evidence suggests -- or even states -- that such a database would be useless for research purposes. For example, as the article linked above says:

This everything-but-the-kitchen-sink approach is at odds with traditional genetic research, which has focused on single-gene disorders, where Mendel's laws predict recurrence risks. In contrast are multifactorial or "complex" traits for which predictions are less precise, if not impossible, because of the input of multiple genes and the environment. Deriving information on such complex diseases from pedigrees, as one would for a single-gene trait, may be difficult, if not fundamentally flawed.

Got that? The kind of research you have in mind is presently impossible. So you would not be saving costs on it. It can't be done.

That leaves you with the expense of compiling a database to make useless information freely available.

And if the money would not come from existing funding agencies, where would it come from? Setting up and staffing a new agency would just add to the costs.

Given that zebra can't name a country where his idea is feasible or produce any evidence at all indicating that there's any need, wish, or use for his project I think that unless he does, its uselessness has been too fully demonstrated to require further proof.

So I too am bowing out until then.

Take it away, z. The field is yours.

capnkrunch:

If you oppose expanding all genetic research because of security concerns, that's fine, but you could have said that way back. I happen to disagree with your risk assessment, and that of the numbers of people ann cites as having concerns. It is often the case that people misperceive relative risks; I think that has been discussed here in the past.

I think I've expressed my take on this but let's see if I can briefly revisit it. There are "hundreds of thousands" (millions, actually) of medical records extant in many locations-- not genomes, but actual medical records with easily understandable information about individuals and their health issues. So the "attack surface", if I understand your meaning, of my database represents a trivial relative risk.

If there are any Willie Suttons out there looking to exploit and monetize medical information, they would be attacking Medicare and Medicaid, the VA, and the private insurance companies.

Why would they go after this not-very-liquid asset, which requires much more work before they can identify individuals, and which is tiny in "scale" compared to what they can get from any of the other large institutions, with many more avenues of access?

I've pointed this out in various forms more than once. I've also pointed out that the actual mechanics of my project keep things as arms-length as is possible, which ann is clearly befuddled by, and she needs to review how I've said it would work.

Relative risk-- people just aren't very good at thinking these things through.

Ann190,

I already mentioned the UK which has a large government-owned database which uses phenotypical grouping.

The Iceland study I talked about is here:

http://www.nature.com/ng/journal/v47/n5/full/ng.3247.html

Since your link doesn't work, I am not sure what you are referring to.

@#192 --

That study only did whole-genome sequencing of 2,636 Icelanders, not the whole population. If it demonstrates anything, it therefore demonstrates that the kind of database you're proposing isn't necessary for genetically homogenous populations.

However, as it happens, your idea was attempted in Iceland. And it fell apart for the same reasons it did in the UK. Scientists and physicians opposed it, and there were privacy concerns.

The busted link is here:

http://www.the-scientist.com/?articles.view/articleNo/19517/title/Icela…

^^As you might gather from that, the public was not initially opposed. This was in 1999. However (as explicated at krebiozen's link) the Supreme Court put the kibosh on it in 2003. And concerns about privacy have greatly increased since then.

krebiozen's link, in the event that you need it, is here:

http://www2.law.ed.ac.uk/ahrc/script-ed/issue2/iceland.asp

SHORTER VERSION: Iceland didn't need a national genome database to do that study. And when one was proposed, it failed due to the objections being raised here and everywhere else in the world that a corporate-government partnership tried to rustle one up.

In all cases I've seen so far, scientists and researchers didn't want or favor the project. That includes Iceland. This:

Look at the reference to the Icelandic study. Look at the number of “working geneticists” who apparently thought it was a good and useful idea. Go figure– is it some peculiar Icelandic genetic derangement that caused such agreement, or is it just scientists who want to move their discipline forward?

is therefore flatly wrong. That study sequenced fewer than 3000 people. It did not require a national database of genomes. And the working geneticists who did it apparently were perfectly well able to move their discipline forward without one.

which ann is clearly befuddled by, and she needs to review how I’ve said it would work.

Ha.

I'm not the one who suggested that a study in which working geneticists were able to reach conclusions about the population of Iceland by whole-genome sequencing a mere 2636 people constituted proof that scientists wanted and needed a population-wide genome database in order to move their discipline forward.

ann,

First, the UK project here

http://www.genomicsengland.co.uk/

is alive and well, or at least its website is.

There was a commercial Iceland project that was sold off multiple times after not making money, but that is nothing like what I am suggesting, and I don't know what connection there might have been with the Nature paper I referenced. That was the first thing I came across.

But as I said earlier, you don't seem to grasp the concepts very well.

I am not suggesting that the entire population of Canada should be included in the database.

Nor have I ever suggested that any individual study would use the entire database.

I already went over this multiple times and refer to these ideas as grandiose strawmen, which accusing me of suggesting is ridiculous even by the standards of this blog's comment threads.

You obviously haven't comprehended what are essential elements, like the fact that the genetic information is completely isolated, including from the people contributing it.

You seem to think that if half the population refuses to participate, we can't collect the data from the other half. It's like gay marriage, ann, if you don't believe in it, the law doesn't require you to participate, but you can't stop other people from doing it. And anyway, you and capnk keep complaining that there is too much data-- ok, you should be happier with half as much, right? That's something around 190K genomes.

So if you just want to keep repeating this stuff which is either wrong or irrelevant, I don't know how to further help you.

First, the UK project here

http://www.genomicsengland.co.uk/

is alive and well, or at least its website is.

Yes, I know. The one that failed was the one that was like what you're proposing -- government sponsored, all citizens, cross-referenced through the NHS, etc.

Read the link.

There was a commercial Iceland project that was sold off multiple times after not making money, but that is nothing like what I am suggesting, and I don’t know what connection there might have been with the Nature paper I referenced. That was the first thing I came across.

It's the same project. The company is called DeCODE. It made a tremendous amount of money hyping itself, then crashed, leaving individual shareholders with trash.

For example:

Before the cyclist hit him in 1996, Hinrik Jonsson was a strong, skilled, energetic craftsman, a 35-year-old with a young family, whose hand-made furniture could be seen in houses all over Reykjavik. In an instant, the accident destroyed his livelihood. He suffered brain damage which left his mind fully functional but his body reluctant to obey its commands. Some days, he cannot walk. It took him two and a half years to wring compensation out of a web of insurance companies while his wife took multiple jobs to feed the family.

He was finally awarded 23m Icelandic kronas - about £180,000 at today's exchange rates, not much for 35 lost working years (Icelanders retire at 70). Anxious to invest it wisely, he thought carefully, discussed the matter with his family, and decided to put a hefty chunk of the money - five million - into the stock market. In the spring of 2000, he walked into a state-owned bank, the National Bank of Iceland, where a broker swapped his money for shares in the hottest Icelandic company on the country's thriving but unregulated grey market, Decode Genetics, at $56 a share.

Earlier this year, Jonsson returned to the bank to sell his shares. They were worth not much more than a 10th of what he had paid for them - just under $6 apiece. Jonsson had lost tens of thousands of pounds of the compensation that had to see him through a lifetime of disability.

http://www.theguardian.com/science/2002/oct/31/genetics.businessofresea…

But as I said earlier, you don’t seem to grasp the concepts very well.

Ha.

You may now troll yourself.

I'm bowing out.

IDK, zebra, maybe you need to ask those here to look at your genome db proposal while on...weed.

You're beginning to sound like a communist. "Yeah, communism historically has caused untold misery but you just haven't tried my communism yet."

And then you take it personally that they are against it because you mentioned it. You know, I originally thought the scientists here would take a database of this information if it auto-magically dropped in their laps. Not that this was ever in anyone's power to do anyway, but they made a good case why they wouldn't even want that. Data is not information. Throw in all the other unpractical reasons and this idea is dead in the water.

But don't worry, it always was because there is no will, money or defining interest in it yet. The efforts are better spent elsewhere. Like, maybe, feeding people whose genes may or may not hold ticking time bombs. But as I have learned in resisting the never ending worry about my hypertension and cholesterol by MDGPs and naturopaths, it doesn't matter to me what is going to kill me in ten years if I can't survive in the present time.

zebra,
OK. There's no "there" there. I'm following ann. Some parting comments:
-I'd challenge you to reexamine this belief that "I don't agree" constitutes a valid argument.
-Comparing relative risk of storing personal medical data and of centralized collection of research data is meaningless. I'll leave it as an exercise for you to figure out why (hint: you're using the wrong risk assessment tool).
-The prevailing (and most prudent) paradigm in information security is to minimize all unnecessary exposure, not that additional exposure is ok so long as it is trivial compared to another.

But as to malia and RP– the data isn’t for them specifically, it is for an expansion of research, so that people can figure out “what to do with the data” sooner. That’s how basic science works, by lots of little grad students working on lots of little projects and figuring out better ways to do things and contributing to the whole.

Which, of course, is why Z. proposes a monolithic government project. Add in completely whimsical, magically free data access, and Problem Solved!

Malia and RP can go on with their 20th-Century approach, and we will see which bears more fruit.

Oh dear, oh dear, Z. has demoted poor Malia from "real expert" to "20th century thinker." How will her NGS lab go on?

(For those keeping score, I'd rank the juxtaposition of these two quotes as being a genuine non sequitur.)

It's truly amazing the amount of hubris that one person who has no practical or theoretical understanding of any of the scientific disciplines involved* – indeed, who can't even recognize a basic arithmetic error – much less the issues surrounding privacy and ethics in human subjects research can single-handedly generate.

* "capnkrunch 136, I’m sure whatever involvement you have with data entry has interesting challenges, and requires some problem-solving skills...." Sweet Jesus.

So the issues with the "birth cohort genome database" are:
1) We can't get the medical records we would need to know if any given gene actually has a relationship with a disease.
2) We don't have the processing bandwidth (technical or human) to actually analyze the data.
3) The cost of storing 1 cohort's data is about $1 million. Yes, I know Z said we could get it by not making F35s, but since when has DoD ever given up a single penny of their budget to someone else? See bake sales for bombs.
4) It would take generations to be able to learn anything about adult-onset diseases.
5) Acquiring the genetic material to sequence requires both an approved study and the informed consent of every single person in the cohort (or their parents, and given how protective people are of their children I think that a 20% enrollment rate would be outstanding).
6) We already know that we cannot keep the data private, as is required *by law* for any human-subject study in the US (and most other countries).

So aside from all that, it's a brilliant plan.

There was a commercial Iceland project that was sold off multiple times after not making money [...] and I don’t know what connection there might have been with the Nature paper I referenced

Perhaps you should have read the paper.

So the issues with the “birth cohort genome database” are

That list has some omissions IMHO, but I'm certainly not going to fault anyone with losing interest in this Z. misadventure.

Not a Troll @198: I can't speak for anyone else in the commentariat, but I suspect it would take LD50 levels of tetrahydrocannabinol to get me high enough to think that our equine friend's database might be a good idea. Several commenters have raised a number of objections, any one of which should have been sufficient to sink the project, yet he remains as impervious to rebuttal as an old-fashioned vinyl record. (Or CD, in case you are too young to remember vinyl records.)

He reminds me not so much of Communists, as of political figures like Reagan and G. W. Bush. They would get certain ideas in their heads, and no amount of evidence to the contrary could get them to abandon those ideas. He seems to disdain those of us in the "reality-based community", to use a phrase coined by an official in Bush's administration.

There was a commercial Iceland project that was sold off multiple times after not making money […] and I don’t know what connection there might have been with the Nature paper I referenced

Perhaps you should have read the paper.

Indeed, and not just for the punchline. Reference 18 looks totes kewl, as well, even if it is seven years old.

But this episode highlights, for me at least, Z.'s serious, ah, "oppositional" take on "20th century" notions regarding the order of carts and horses.

This one seems to mark the beginning of Z.'s realizing that Malia was getting uppity, or something:

My comment about 20th century thinking wasn’t intended for you, but…”pre-existing conditions”* ?? I thought we were eliminating that little canard here in the USA.

But I get the same sense of Nirvana Fallacy/Grandiose Strawman from you that I do from others. Let’s say we collect the data as I described in 44, and sure there will be some opting out, and sure there will be errors– as happens now, but you keep working.

What I really don’t understand is why you think this only has utility for some great project involving all of humanity in the future. Assume the data is available; why will there not be young scientists (or old corporations) picking out sub-populations to study?

Then again, the whole single-cohort routine does at least do away with the hoary 17th century dualist notion of comparing the size of the cart and the horse and thinking about maybe putting one on top of the other instead.

For those of you who think that z*bra is unable to learn may I point out a fact which TOTALLY destroys that hypothesis: z*bra changed from writing "non-sequitur" to "non sequitur" midway in this thread! Nothing else has penetrated its addled pate but s/he is demonstrably better informed than when this conversation began.

And yet, he's dazzled unto blindness by the 21st-century-ness of an idea that people have been using to fleece suckers, marks and governments since 1999 -- eg, DeCODE -- precisely because it elicits that response.

Nothing else has penetrated its addled pate but s/he is demonstrably better informed than when this conversation began.

You're leaping to the assumption that he's figured out why.

A note on the quality of the responses, ann at 196

First, the UK project here

http://www.genomicsengland.co.uk/

is alive and well, or at least its website is.

Yes, I know. The one that failed was the one that was like what you’re proposing — government sponsored, all citizens, cross-referenced through the NHS, etc.

1) This is government sponsored and government owned and obviously they have access to the NHS data.

2) If anyone is smoking or imbibing something, it seems to be ann, who still hasn't figured out-- despite the fact that I just said it-- I never suggested anything like "all citizens". Grandiose strawman.

And various others presenting various strawmen and gish gallops too numerous to mention.

The good news is that RP, who is actually qualified, "got it", and that makes it worth the trouble.

zebra,

The good news is that RP, who is actually qualified, “got it”, and that makes it worth the trouble.

You wrote something so idiotic it annoyed a long-term lurker enough to delurk and write:

I realise that trying to dismount Zebra from his high horse is a futile task. But as a life-long geneticist/genomicist/molecular biologist or whatever it’s called these days, I just want to add my support to those who have already tried to explain how futile/wasteful Zebra’s idea is, at present.

Yet you somehow interpret this to mean RP agrees with you? Astonishing.

Question for non-zebra posters, what would the ethical implications of such a database be? I'd imagine that collecting medical information past the age of consent would require a new consent from the patient themselves but at that point could they request that information gathered with their parent's consent be deleted? It seems a little questionable to me to allow a parent's consent to be enough to collect the information and store it for an arbitrary amount of time. Maybe if it was collected with parent consent and couldn't be used until the patient was at the age of consent and provided it.

Also, forgot to say it before, thanks malia and Retro Pump for chiming in. If nothing else, zebra at least stimulated some new people to delurk. I agree with Narad's sentiment in #110.

Krebiozen #210,

"Yet you somehow interpret this to mean RP agrees with you."

And yet another bizarre strawman, based on yet another cherrypicked quote.

Thanks for demonstrating my point.

I could have done without the high horse insults but I actually learned a great deal from this discussion. And for those who had a fair grasp of the subject before, I hope it strengthened your thinking on it.

1) This is government sponsored and government owned and obviously they have access to the NHS data.

Yes, I know.

But you're proposing a government-funded DNA database of the entire national birth cohort for one year, with a view to creating a resource that's large and nationally representative enough to provide researchers with the subpopulation sample of their choice.

The UK proposed something equivalent. And as I said @#179, it failed.

Read the link I provided. Here it is again.

The project you're linking to now is also a government-sponsored large DNA database. But its exclusive research focus is cancer and rare disease. It recruits exclusively from a pool of patients who have already been diagnosed, plus their relatives.

Fergus mentioned it back @#115. And you ignored it. Because it's not remotely like what you're suggesting.

2) If anyone is smoking or imbibing something, it seems to be ann, who still hasn’t figured out– despite the fact that I just said it– I never suggested anything like “all citizens”. Grandiose strawman.

The entire point of your project is to assemble a database that represents something like all citizens -- an entire birth cohort.

You directly admitted that it was akin to a population-wide*** project yourself. ("Iceland, people. Iceland.")

***Or actually what you wrongly believed was one. But same difference.

zebra @#174:

Malia and RP can go on with their 20th-Century approach, and we will see which bears more fruit.

And zebra @#209:

The good news is that RP, who is actually qualified, “got it”, and that makes it worth the trouble.

Ann

"Equivalent" is just your (yet again) ridiculous strawman. But putting that aside:

"The UK proposed something equivalent. And as I said @#179, it failed."

In 1992 or so, the Clinton administration tried to institute a universal healthcare system. It failed.

So I guess that for you, that would be evidence that universal healthcare is a terrible idea. And that no progress could possibly be made towards that goal, under different conditions and incorporating lessons learned from the first attempt.

As I said, strawmen, non sequitur, gish gallop, yadda yadda.

But you’re proposing a government-funded DNA database of the entire national birth cohort for one year, with a view to creating a resource that’s large and nationally representative enough to provide researchers with the subpopulation sample of their choice.

After they somehow are able to a priori define meaningful subpopulations based on combinations of subsequence queries.* Unless, of course, the "results" are going to require genetic screening of everybody to have clinical utility.

* Again, nobody's going to be handing out entire genomes willy-nilly based on fishing-expedition–level proposals.

Question for non-zebra posters, what would the ethical implications of such a database be?

There look to be a lot more questions than answers. Pages 323–24 here (PDF) are on point. (The Elger & Kaplan is here; "supra note 133" in n.143 is a typo for 135.)

One key issue is that the sample donation, by definition, does not provide any benefit to the donor, although it does entail ongoing risk. For that matter, if things go swimmingly, the end result will likely be a private company reaping a handsome financial reward from all the donated samples.

^^ In fact, now that I think about it, Z. seems to have been awfully short on that "clinical utility" thingamabob in general.

So I guess that for you, that would be evidence that universal healthcare is a terrible idea. And that no progress could possibly be made towards that goal, under different conditions and incorporating lessons learned from the first attempt.

Since it's your bright idea and you're invoking them, what would those "lessons" be?

capnkrunch,
In the UK the Nuffield Council on Bioethics has done a lot of work in this area. It is difficult for many reasons, for example because a person's genome also gives information on their relatives' health, thus impinging on their privacy too. There are also issues around informed consent, as we don't really know yet what information might emerge from the genome in the future. Some embarrassing or even criminal characteristics might be connected to some constellation of genes - imagine if someone found an association with pedophilia, for example (as unlikely as that may be).

Narad #220,

Better politics. Duh.

Let's see--

Death panels, keep the government's hands off my medicare (ann might have been one of those), your giant corporate employer will drop your coverage, blah blah blah...

Here it's some kind of ninja codebreakers....

1. Accessing the digitally encoded genome of a small random part of the population, either by getting into a secure database (or I think someone was worried about Fedex? trucks carrying hard drives being hijacked?)

2. Figuring out the medical information contained in the genome, which our experts say is excruciatingly difficult.

3. Determining who the individuals are from the genome.

4. Somehow turning all that into a profitable venture....

5. Instead of paying off some minimum-wage data entry person to get them access to medical records at the insurance company.

So yes, just like one set of irrational concerns was overcome, good politicians convincing enough people not to be stupid is what you need. Just like with the vaccination exception problem, as an even more recent example.

Re #222

I think we are up to ~221 posts trying to convince you not to be stupid. It's no easy task when we are confronted with a D-K sufferer who is showing all signs of being in the top 1/10 of 1% of all those so affected.

“Equivalent” is just your (yet again) ridiculous strawman. But putting that aside:

As well you might, since it is equivalent, by your own admission.

“The UK proposed something equivalent. And as I said @#179, it failed.”

In 1992 or so, the Clinton administration tried to institute a universal healthcare system. It failed.

So I guess that for you, that would be evidence that universal healthcare is a terrible idea. And that no progress could possibly be made towards that goal, under different conditions and incorporating lessons learned from the first attempt.

Why, no. It would not. Because there's abundant evidence that universal healthcare is a good idea, which can and does succeed time and time again.

But speaking of gish-galloping, cherry-picking, non sequiturs, and straw men:

As I'm sure you know, when I first raised the point it was in the context of pointing out that there were, are and have been such numerous and widespread objections to your tired old "21st-century" idea everywhere in the world that anyone's ever tried introducing it for the last sixteen years that that's how long it's been failing!

As I said, strawmen, non sequitur, gish gallop, yadda yadda.

And as I said: Ha.

Narad@218 and Krebiozen@221
Thanks. That's exactly what I was looking for. I made it through Narad's links and have been skimming Krebiozen's. It definitely seems like an interesting area in bioethics. Interesting (to me at least) is that there doesn't seem to be any issue with parents consenting to their children's data being stored, as long as the child has the ability to opt out once they reach the age of consent.

I thought that would be more of a problem, especially in light of what Narad said:

One key issue is that the sample donation, by definition, does not provide any benefit to the donor, although it does entail ongoing risk.

I always thought the principle behind allowing parents to consent for children was that they could make decisions that are best for the child but the child wouldn't make for themselves (i.e. no child would choose to get injections). In this situation where the benefit to the child is more unclear (there are benefits for society but not necessarily the individual) I had figured the same principle might not hold. I'm not an ethicist though and it seems like the experts think otherwise.

So I guess that for you, that would be evidence that universal healthcare is a terrible idea. And that no progress could possibly be made towards that goal, under different conditions and incorporating lessons learned from the first attempt.

Since it’s your bright idea and you’re invoking them, what would those “lessons” be?

Better politics. Duh.

You would have done yourself a favor to pretend that you had never seen the question.

I've already openly wondered here what on earth your college or university training was in; in point of fact, I often have a hard time convincing myself that you're not a none-too-bright high-school student.

There was a commercial Iceland project that was sold off multiple times after not making money, but that is nothing like what I am suggesting, and I don’t know what connection there might have been with the Nature paper I referenced.

I confess that when I pointed to the Iceland red herring back in comment #158, it seemed too much to hope for that anyone would take the bait.

Opus,

Re #222
I think we are up to ~221 posts trying to convince you not to be stupid. It’s no easy task when we are confronted with a D-K sufferer who is showing all signs of being in the top 1/10 of 1% of all those so affected.

That's my comment at #222 - I do hope you aren't referring to me.

HDB,

You did kindly point out that bear trap, but zebra just blundered on in anyway.

I do find it hilarious that zebra is now simply dismissing the enormous problem of privacy that hundreds of the best minds in genomics and bioethics are currently wrestling with.

Krebiozen @ 229

My comment was to z*bra's at 223.

Memo to self: check for typos before noting someone else's stupidity.

Narad #227,

I was really surprised that an experienced drive-by troll like yourself would pitch me such a softball.

You would have done yourself a favor to pretend that you had never seen the question.

Although if it's better politics he's looking for, I'd suggest business-friendly Qatar. It's an absolute monarchy without much in the way of civil liberties and a partnership with Weill-Cornell Medical College. The politics couldn't be any better, really.

There's even a regional precedent of sorts:
Kuwait, people. Kuwait.

Krebiozen #230

Why would any mature, rational (non-paranoid), non-Authoritarian adult do anything but dismiss this kind of silliness about "ethics" and "privacy"?

I also don't pay much attention to the ruminations of very bright, very well educated Roman Catholic theologians, who are equally discussing meaningless concepts. Is that also "hilarious"?

To the extent that society is actually democratic, decisions can only be made on the basis of group self-interest. Crudely put, something like greatest good for the greatest number.

(I have no moral/ethical position like Utilitarianism, that's just a familiar form of expressing a pragmatic approach.)

Sorry. That link should have been:

Kuwait.

Why would any mature, rational (non-paranoid), non-Authoritarian adult do anything but dismiss this kind of silliness about “ethics” and “privacy”?

If you'd like to ask some who don't, the names of the ones who serve on the advisory panel of national leaders in medicine, science, ethics, religion, law, and engineering over at the Presidential Commission for Bioethical Issues are right here.

Ann @236,

Ah. The answer to a non-Authoritarian thinking something is unimportant is Appeal to Authority.

Well OK, I guess that's settled.

Ann, you need to take a breath and think. Once in a while you demonstrate that you can write a complete paragraph that makes sense, but this kind of thing inclines one to not bother reading your comments.

Kuwait: I'm not going to waste my time researching the intricacies of the Kuwaiti political system, but I think it isn't very democratic. Also, the actual citizens (not foreign labor) tend not to care that much because they are petro-wealthy. But maybe I will be corrected.

You should read about the very long battle to institute Universal Health Care in the US and the forces that opposed it. You should also, again, pay attention and think-- 1992 was only 23 years ago, and everyone said UHC was never going to happen. And then there was don't ask don't tell, which was considered the best possible compromise at about the same time, and look how far we've come with respect to marriage equality. Not one person would have predicted where we are today.

So, I wouldn't be so impressed that some attempts to do genetic research and collect data have been blocked; it's early days.

It is also the case that GMO have been blocked in Europe. Does that mean GMO are a Bad Idea, and they will never be accepted?

For someone who complains about non-sequiturs....Zebra throws out quite a few of them, doesn't he?

zebra,

Why would any mature, rational (non-paranoid), non-Authoritarian adult do anything but dismiss this kind of silliness about “ethics” and “privacy”?

I suspect explaining anything that a "mature, rational (non-paranoid), non-Authoritarian adult" might do would go right over your head. Let's try this: how would you like it if your genome was publicized and it was revealed that you had a constellation of genes associated with tiny-penis-syndrome or intractable-pig-headed-disease?

Concerns about privacy are not about being "Authoritarian", they are about people being rightly worried that their privacy might be compromised by government and big business. That's the opposite of "Authoritarian".

What you seem to have a problem with is the idea that some experts know better than you do. Perhaps you have never studied any subject in enough depth to realize just how little you know and the need to have people specialize so they can advise those of us who have other areas to specialize in. I spent two years studying genetics, passed exams on the subject and have read dozens of books on the subject over the years, enough to know just how little I know. I would never have the hubris to accuse two geneticists of 20th century thinking, I am quite certain they know better about that area than I do. You appear to have no concept of the vastness of your ignorance.

I also don’t pay much attention to the ruminations of very bright, very well educated Roman Catholic theologians, who are equally discussing meaningless concepts. Is that also “hilarious”?

You seriously think that privacy concerns about data that can be used to predict a person's medical future and have real tangible consequences are "meaningless concepts" of a similar ontological status as souls and sins? If you do you are even dimmer than I thought. Once again you equate scientists with other groups, as if science is just like religion, or woo. Have you drunk from the PoMo cup?

To the extent that society is actually democratic, decisions can only be made on the basis of group self-interest. Crudely put, something like greatest good for the greatest number.

What are you babbling about now? Sequencing the genome of millions of people isn't going to give us useful data, for several reasons that have been explained. That isn't group self-interest or the greatest good for anyone. Any normal person would have admitted they were wrong and joined a discussion about the best way to go forward instead of embarking on a moronic pantomime as is your usual MO. Not for the first time I find myself seriously wondering what is the matter with you?

(I have no moral/ethical position like Utilitarianism, that’s just a familiar form of expressing a pragmatic approach.)

You have been told, by two people who work in the field, why your suggestion is utterly impractical, yet your ego won't let you admit you were wrong.

Not only that, but you are rude to people who have been perfectly polite to you, and accused those whose explanations have been easily understandable and make good sense of being bad at communicating, or of being on drugs. In short you are a twit of the highest order.

Ah. The answer to a non-Authoritarian thinking something is unimportant is Appeal to Authority.

No. The answer to blind solipsistic bias is exposure to the thoroughly considered views of others, conveniently available online in .pdf form.

Also, I suggested that you ask them, not that you take their mere authoritative existence as proof of anything.

Well OK, I guess that’s settled.

No, once again you're using a straw man to avoid addressing the substance of the objections.

Ann, you need to take a breath and think. Once in a while you demonstrate that you can write a complete paragraph that makes sense, but this kind of thing inclines one to not bother reading your comments.

You wound me to my very core.

Kuwait: I’m not going to waste my time researching the intricacies of the Kuwaiti political system, but I think it isn’t very democratic.

Was the fact that it has compulsory DNA testing your first clue?

Also, the actual citizens (not foreign labor) tend not to care that much because they are petro-wealthy. But maybe I will be corrected.

Technically, it's semi-democratic, which -- if you ask me -- means it's an autocracy that can afford to allow a superficial and largely cosmetic veneer of freedoms in some areas.

The problem with talking about what the actual citizens care about is that it presumes that the concept of citizenship plays a part in the way Kuwaitis think about themselves, others, politics and/or where they live. It's not that kind of place.

However, since I'm not sure what you're saying they don't care about, exactly, I guess it doesn't matter anyway.

You should read about the very long battle to institute Universal Health Care in the US and the forces that opposed it.

I'm a politically active person who has not only lived through but actually fought in some parts of that battle, which is -- incidentally -- not actually over, because guess what?

We don't actually have universal health care in this country.

You should also, again, pay attention and think– 1992 was only 23 years ago, and everyone said UHC was never going to happen.

Yeah, well. I'm an optimist myself.

But I can see how 23 years of dealing with implacably ill-informed, self-regarding, and overly complacent bozos who evidently don't even know what UHC is could make a person kind of dispirited.

And then there was don’t ask don’t tell, which was considered the best possible compromise at about the same time,

Not by me.

and look how far we’ve come with respect to marriage equality. Not one person would have predicted where we are today.

I think you mean "not one person who knows so little about the fight for marriage equality that he thinks "don't ask/don't tell" had something to do with it would have predicted where we are today."

Because although the outcome of Obergefell v. Hodges wasn't certain, it was among the reasonably foreseeable options.

Speak for yourself, IOW.

So, I wouldn’t be so impressed that some attempts to do genetic research and collect data have been blocked; it’s early days.

Enjoyable as it's been batting them around, those analogies are straw men. There's no constituency fighting for the right to establish a DNA database of the entire birth cohort for one year. There's no popular political demand for one. There's no scientific demand for one. And not only that:

Widespread, well-grounded and rational political and scientific objections are actually what's prevented private-sector corporate interests seeking profits from doing an end-run around democracy by partnering up with advocates for a surveillance state in order to set one up.

So get back to me when you can make a case for it being a cause worth fighting for that's a little more persuasive than "I can see the future because I say so."

Or make it now, if you have one.

It is also the case that GMO have been blocked in Europe. Does that mean GMO are a Bad Idea, and they will never be accepted?

No. But my point has never been that your proposal is a bad idea because it's been blocked. It's that it's been blocked because it's a bad idea.

So that's just straw plus straw.

I was really surprised that an experienced drive-by troll like yourself would pitch me such a softball.

Your somehow managing to hit it directly into your foot was certainly impressive.

On the other hand, the fact that your grab-bag of language failures includes the word 'troll' is merely dirt-common.

^ Actually, it was closer to deciding on a sacrifice bunt with no runners on.

Concerns about privacy are not about being “Authoritarian”, they are about people being rightly worried that their privacy might be compromised by government and big business. That’s the opposite of “Authoritarian”.

I tried to make this point back in comment 139, but Z. must have been too busy with some other facet of his brilliance to notice.

Krebiozen 239,

The issue is with "privacy" and "ethics" as concepts-- they are just as vacuous as souls and sins. How is "being unethical" different from "sinning"? In either case, some authority has created a set of rules which don't have legal consequences. If they do have legal consequences, we call them...you know...laws.

If you are asking whether I would vote for laws that restrict access to my medical information, that's a different question. Of course I would, as a practical matter. But other than through the interpretations of SCOTUS in the USA, I have no "right to privacy" that I can characterize-- I would have to go to court with standing in a specific case to find out what that would entail. Just like I suppose I would have to die to find out if I have an immortal soul.

Does that help with understanding my ontological perspective?

And of course, as should be clear from multiple comments on the topic, my vote concerning genomes would be influenced by what I consider the relative risks of any particular program. Since my clearly understandable medical information, not to mention my penis size, is on record with multiple entities, I consider the relative risk from those ninja hacker codebreaker criminal geneticists I described in 223 to be negligible. If you have concrete evidence (not speculation) to the contrary, let's hear it.

By the way, if you don't want to be characterized as befuddled (for whatever reason), or just being childishly dishonest yet again by taking quotes out of context and MSU:

You seriously think that privacy concerns about data that can be used to predict a person’s medical future and have real tangible consequences are “meaningless concepts” of a similar ontological status as souls and sins? If you do you are even dimmer than I thought. Once again you equate scientists with other groups, as if science is just like religion, or woo. Have you drunk from the PoMo cup?

See, I never said any of those things. I equated ethics with morality. Nothing about scientists. Nothing about the practical matter of access to medical records being restricted. Grow up.

Ann

Kuwait: I’m not going to waste my time researching the intricacies of the Kuwaiti political system, but I think it isn’t very democratic.

Was the fact that it has compulsory DNA testing your first clue?

This is what I mean. What does one have to do with the other?

But other than through the interpretations of SCOTUS in the USA, I have no “right to privacy” that I can characterize– I would have to go to court with standing in a specific case to find out what that would entail. Just like I suppose I would have to die to find out if I have an immortal soul.

Does that help with understanding my ontological perspective?

The howlers just keep coming.

If you are asking whether I would vote for laws that restrict access to my medical information, that’s a different question. Of course I would, as a practical matter.

If you're not a member of congress, he probably wasn't asking you whether you would vote for any laws.

But other than through the interpretations of SCOTUS in the USA, I have no “right to privacy” that I can characterize– I would have to go to court with standing in a specific case to find out what that would entail. Just like I suppose I would have to die to find out if I have an immortal soul.

How -- how, ffs, how? -- is dying like going to court to find out if you're safe from illegal search and seizure in your own home?

I mean, that you'd have to [do something] in order to find [something] out is really not a unique enough distinction to be categorical.

Does that help with understanding my ontological perspective?

No. I'm not sure I even see any ontology.

And of course, as should be clear from multiple comments on the topic, my vote concerning genomes would be influenced by what I consider the relative risks of any particular program.

Unless you're only weighing the risks to yourself, that would be an ethical consideration, by definition. BTW.

Since my clearly understandable medical information, not to mention my penis size, is on record with multiple entities,

...

Well. That voting-for-laws thing makes more sense if you're Anthony Weiner.

I suppose you might also be Iggy Pop. But it seems unlikely. He's very articulate.

And there are, no doubt, other possibilities, too. Still. If you don't mind my asking:

Exactly what entities keep records of your penis size? And in what form? (Meaning "in what form are the records?" not "in what form is the size of your penis recorded?")

I consider the relative risk from those ninja hacker codebreaker criminal geneticists I described in 223 to be negligible. If you have concrete evidence (not speculation) to the contrary, let’s hear it.

There's no such thing as concrete evidence that a straw man you invented yourself @#223 does or does not pose a negligible risk. And can't be.

Ontologically speaking.

@#245 --

A law requiring compulsory universal nationwide DNA testing with no exceptions, no informed consent, and no regulatory limits is so completely incompatible with the basic democratic principle of respect for human rights that any country that has one is, necessarily and self-evidently, not very democratic.

Human rights? Democracy? Vacuous concepts.

Human rights? Democracy? Vacuous concepts.

I couldn't really tell whether his point was that all concepts are vacuous per se or whether it's just the ones that can't be instantiated he rejects or...I don't know. I was baffled by that.

Because this part here...

The issue is with “privacy” and “ethics” as concepts– they are just as vacuous as souls and sins. How is “being unethical” different from “sinning”? In either case, some authority has created a set of rules which don’t have legal consequences. If they do have legal consequences, we call them…you know…laws.

If you are asking whether I would vote for laws that restrict access to my medical information, that’s a different question. Of course I would, as a practical matter.

...seems to suggest that because laws serve a practical purpose they are, though conceptual, not vacuous.

However, the same not only could be said of ethics, it would almost have to be if it was also said of law. They're not entirely conceptually discrete.

I'm not going to get into whether that also applies to religious morality, because murkiness. But a case could be made.

The really confusing thing, though, is that starting here...

But other than through the interpretations of SCOTUS in the USA, I have no “right to privacy” that I can characterize– I would have to go to court with standing in a specific case to find out what that would entail. Just like I suppose I would have to die to find out if I have an immortal soul.

...it all of a sudden turns out that it's the concepts that have legal consequences -- or, you know, "laws," as we call them -- that are as vacuous as the wages of sin.

So I guess I would say that zebra just scorns the concept of concepts, generally. But I know that can't be true. Because:

But as I said earlier, you don’t seem to grasp the concepts very well.

So apparently the thing that distinguishes a vacuous concept from the good kind is whether zebra is capable of conceiving of it unaided. Which is conceptually problematic, from an ontological perspective.

The whole thing left me befuddled, I don't mind frankly confessing.

The issue is with “privacy” and “ethics” as concepts– they are just as vacuous as souls and sins. How is “being unethical” different from “sinning”?

Speeches like this really need to be delivered in a laboratory, in the bowels of a castle or an extinct volcano. With lightning crashing overhead.

YOUR PETTY HUMAN 'LAWS' and 'MORALITY'! THEY CANNOT BLOCK THE ADVANCE OF SCIENCE!!

There's something oddly appealing about zebra's child-like naïveté around human nature (and human dignity). Yet since he is an adult, a case could be made that it is more like childish ignorance.

Ann,

I'll leave you with this and check back tomorrow to see how you do.

Prior to the US Civil War, it was "illegal" to aid an escaped slave--punishable by imprisonment and a hefty fine.

Was it "unethical" or "immoral" to aid an escaped slave?

Prior to the US Civil War, it was “illegal” to aid an escaped slave–punishable by imprisonment and a hefty fine.

Was it “unethical” or “immoral” to aid an escaped slave?

What the everloving f*ck is this even supposed to mean in this context? Z. is trying to be all didactic about the difference between "legal" and "ethical" after just having declared the entire concept of ethical and unethical actions as being equivalent to "sin" in the religious sense, and therefore empty and meaningless? The mind boggles at such inanity.

What I'm wondering, actually, is just who Z. imagines his achingly tiresome sh!tfit of a performance is for. Is he trying to "educate" the lurkers, of whom a couple* have delurked specifically to explain just how wrong and ignorant Z. is? Is he trying to "educate" the regulars? Who does he imagine is laughing with him at home?

It all betrays a "See Noevo show" level of idiotic grandiosity.

*who happened to actually know what the f*ck they were talking about, to boot.

^Multiple tag fails are parse-able, I hope.

Another non-sequitur from Z, why am I not surprised?

Zebra,

I get your point that ethics, sin and law are malleable. Although I happen to be one who thinks that certain truths are self-evident, you can go on with your idea because you don't get to say what my inalienable rights are.

Interestingly enough (or not) a couple of months ago I provided consent and some of my DNA for a study. However, I knew going in what it was for, that it is with one company, and under all the rules and regs regarding study participant's information. But even I wouldn't go in on your 'master' database. And not because it was your idea. It is because of all the other reasons commenters have raised here.

And, I'll leave you with this riddle. If you convince the politicians to pass a law for my genetic material for your DB and I refuse to provide it, is that immoral or unethical or just illegal?

@#254 --

Unless you can tell me why we're suddenly having the kind of deep conversation about ethics, morals and the law that most people leave behind when their YA-fiction-reading years are over, I'm not sure I see how it's relevant.

I said "not entirely conceptually discrete" not "invariably one and the same, with no possibility of conflict either in theory or in fact ever, ever."

In case that's where this is coming from.

Does that help with understanding my ontological perspective?

No. I’m not sure I even see any ontology.

I do suddenly wonder whether someone has earned an advanced degree by elaborating the modern history of the use of scare quotes, given that they would have been of obvious value here to distract from the painful mimicry.

YOUR PETTY HUMAN ‘LAWS’ and ‘MORALITY’! THEY CANNOT BLOCK THE ADVANCE OF SCIENCE!!

"They're nihilists, Donny, nothing to be afraid of."

Come on guys, don't give him too hard of a time. We all know ethicists are just secular theologists and the Universal Declaration of Human Rights is their version of the Bible. When you think about it, isn't the concept of human rights just as vacuous as transubstantiation?

That's a good idea.

I sometimes idly wonder why reflexively throwing "teleological" in front of nouns like "perspective" and "argument" has never gotten vogue-ish. I mean, I personally prefer to use the word "f*cking" when I feel the need to preface my nouns with a couple-few empty adjectival syllables. But there's certainly nothing wrong with "teleological."

I'm a fool for idle thought, though.

^^That was @Narad, #261.

Kuwait: I’m not going to waste my time researching the intricacies of the Kuwaiti political system, but I think it isn’t very democratic.
Was the fact that it has compulsory DNA testing your first clue?

This is what I mean. What does one have to do with the other?

Better politics. Duh.

But there’s certainly nothing wrong with “teleological.”

I've been know to use the T-word.

^ "known"

Not-very-democratic though it might be, you know what Kuwait's got that we don't?

Universal health care. That's what.

I’ve been know[n] to use the T-word.

Sometimes it's T-word-ologically justified. That's not what I meant.

I just read through this thread. Zebra, your ignorance and naivete are astonishing, and not in a good way. You know Jack Freaking Spit about IT. In #223 you say:

Here it’s some kind of ninja codebreakers….

1. Accessing the digitally encoded genome of a small random part of the population, either by getting into a secure database (or I think someone was worried about Fedex? trucks carrying hard drives being hijacked?)

2. Figuring out the medical information contained in the genome, which our experts say is excruciatingly difficult.

3. Determining who the individuals are from the genome.

4. Somehow turning all that into a profitable venture….

5. Instead of paying off some minimum-wage data entry person to get them access to medical records at the insurance company.

Here's the thing. Substitute your Point 5 for Point 1. The idea of a team of super-hackers breaking systems and stealing data has been wrong for years. It is far faster to bribe or blackmail someone who has access to the database.
As capnkrunch mentioned in #187, the size of such a database combined with the number of researchers who would want access means a huge attack surface. Sooner or later, a determined criminal would locate someone with access who was open to bribery or at risk of blackmail. From there, it would be a simple matter to get the data. In addition, in order for data to be useful, it would have to be in an easily processed format. It would take a fair amount of digging, but eventually one could gain enough to identify (and blackmail) specific individuals.
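
To put a rough number on that attack-surface argument, here is a back-of-the-envelope sketch, not a real threat model. It assumes each of n people with database access can independently be bribed or blackmailed with some small probability p. The function name and the figures in the example call are made up purely for illustration.

```python
def p_any_insider_compromised(n_insiders: int, p_each: float) -> float:
    """Chance that at least one of n independent insiders is open to
    bribery or blackmail, given a per-person probability p_each.

    Independence between insiders is an assumption of this sketch.
    """
    if n_insiders < 0:
        raise ValueError("n_insiders must be non-negative")
    if not 0.0 <= p_each <= 1.0:
        raise ValueError("p_each must be between 0 and 1")
    # Complement of "nobody is compromisable": 1 - (1 - p)^n
    return 1.0 - (1.0 - p_each) ** n_insiders


if __name__ == "__main__":
    # Hypothetical inputs: 1,000 people with access, 0.1% chance each.
    print(round(p_any_insider_compromised(1000, 0.001), 3))
```

Under those assumptions, even a tiny per-person probability adds up quickly as the number of people with access grows, which is the sense in which a larger user base means a larger attack surface.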

Not A Troll #259,

Since I am not suggesting a law mandating that anyone, much less everyone, contribute DNA, it would definitely not be illegal.

I can't very well be asked to apply labels like unethical or immoral since I have stated my position that they are vacuous.

But it would be like not vaccinating your child if you perceived that your child had a negligible risk of contracting the disease in question, and perceived a higher risk from the vaccination itself.

(You would be acting in your perceived self-interest, as all humans do.)

Since in the vaccine case, the perceived negative outcome (physical harm from the vaccine) is of greater negative consequence than the perceived negative outcome from submitting your DNA*, one might argue that you would be acting less rationally than the anti-vaxxer.

But as you say, this kind of decision is "malleable" or subjective. Who am I to judge?

*A perceived harm for which there is zero evidence, unlike the perceived harm from the vaccine.

capnkrunch #263

We all know ethicists are just secular theologists and the Universal Declaration of Human Rights is their version of the Bible. When you think about it, isn’t the concept of human rights just as vacuous as transubstantiation?

Well yes, we all know that if we look at reality, but some engage in self-delusion, just like the people who think that they have an immortal soul.

Neither your soul nor your "human rights" mean anything to a firing squad. Your "right to privacy", which is such a big issue for you apparently, means nothing when you are stripped naked in a cell and a broomstick is shoved up your ass.

But carry on, someone has to man the barricades against the possibility that one's genome will reveal embarrassing information. And then there's the threat of Reiki to deal with...

Zebra @272:
"Who am I to judge?"

That's an easy one. As a buffoonish, petulant, truculent blatherskite, you're the best excuse I've had in a while to combine some of my favourite words.

So thank you for that.

Neither your soul nor your “human rights” mean anything to a firing squad.

(a) If that's the criterion for conceptual vacuity, creating a DNA database for an entire year's birth cohort is a vacuous idea.

(b) If you believe you have an immortal soul, it doesn't matter what it means to a firing squad. That's not the point.

(c) Execution by firing squad is not necessarily a human rights violation. I'm 100 percent opposed to capital punishment in all circumstances. It's one of the issues I feel most strongly about. And even I don't think so.

Your “right to privacy”, which is such a big issue for you apparently, means nothing when you are stripped naked in a cell and a broomstick is shoved up your ass.

In my experience -- which does not include the precise scenario described above, I admit -- the reverse is true. Human rights never mean more than they do when they're being violated. Or when they have been.

I mean, being subjected to sustained, systematic atrocities with no hope of escape or relief for a long period of time sometimes breaks people so that they no longer care. But assuming that there's a semblance or possibility of human rights in the picture to begin with, what you're saying is just false.

For example:

Abner Louima, whose experience does include something like what you describe, became an activist against police brutality because he survived it.

Not that it's really material to the instant point, but fwiw, he evidently also found meaningful aid in his belief in an immortal soul, according to wiki:

In a rare interview, Louima said he's convinced he can make a difference in his impoverished homeland: "Maybe God saved my life for a reason, I believe in doing the right thing."

And finally:

A DNA database of an entire year's birth cohort genuinely does mean nothing when you are stripped naked in a cell and a broomstick is shoved up your ass.

So what's your point?

zebra@273
What the heck are you even talking about? My right to not be a slave is also a big issue to me and is similarly meaningless at gunpoint. For a guy who loves to complain about non sequiturs you sure make good use of them.

zebra # 272 says that not contributing your DNA for genome studies
"... would be like not vaccinating your child if you perceived that your child had a negligible risk of contracting the disease in question, and perceived a higher risk from the vaccination itself"

and you are right if you believe that

"...the perceived negative outcome (physical harm from the vaccine) is of greater negative consequence than the perceived negative outcome from submitting your DNA".

This sort of reasoning probably would be justified if the person who decides is the same person who must bring the consequences of a wrong choice. Maybe this is understandable in a climate of "my son, my property". But in today's civilized countries this is no longer considered acceptable.
Different ethics, perhaps?

Ann, now you are joining Krebiozen in out-of-context quotes with silly word games and irrelevancies.

I was responding to capnkrunch. The meaning is clear. There is no difference between UDHR and a bible. Neither matters. What matters are the choices of those who have control of what happens, which ultimately means the sovereign entity. "God is on the side of the big battalions" or some such quote. Napoleon?

And I never said people who delude themselves about souls and rights don't benefit from the delusion; it is exactly my point that feeling better is obviously why they do it. That doesn't make it less of a delusion.

#227 perodatrent,

"This sort of reasoning probably would be justified if the person who decides is the same person who must bring the consequences of a wrong choice."

I think you meant "bear the consequences."

But I don't see how this relates to the reasoning itself. I am just comparing the two negative outcomes. In one case, there is a small chance of a bad reaction having a serious physical effect on the child.

In the other, there is some fantastic scenario (that not one person has been able to describe in even a tv-plot fashion) that leads to some future embarrassment or "loss of privacy".

In both cases, the parent does make the decision for the child; but again, it is the perceived consequences that are being compared.

Like I said before, calling something out of context doesn't make it so. If that's how you want to dismiss something, you should provide the proper context and explain how the meaning of the quote has changed. It's not difficult to do, and significantly strengthens your argument. Actually, dismissing something simply by yelling "out of context" or "cherry picking" just makes it seem like additional context actually has no bearing on the meaning and you simply don't want to address inconvenient quotes.

Also, I have provided several references describing how confidentiality of genetic information can be breached, including one actual proof of concept attack. The appropriate way to compare risks is cost/benefit comparisons. Benefits of vaccines far outweigh any risk. Several experts have told you that mass collection of genomes is useless at this time. Even if risk was purely theoretical (it isn't; deidentification attacks have been proven as I already referenced), because there's no benefit there's no reason to take the risk. This is what I was alluding to before when I said you were using the wrong risk assessment tool. Comparing relative risk of storing personal medical records (or vaccinations) to a centralized research database is apples and oranges. When you compare cost/benefit it's clear your idea carries unnecessary risk, while in the other two benefit greatly exceeds risk.

capnkrunch,

I just did that. #278

Saying I didn't doesn't make it so.

See also Gish Gallop. The common practice here is to provide incomplete quotes-- just phrases-- and then riff on some interpretation of some word in the quote, or just start pontificating on something tangential to the meaning of the original comment. The result is that one would spend lots of time trying to refute all those irrelevancies, distracting from the core topic.

Another ploy is drive-by criticism, like what you just did, to create distraction.

"But it would be like not vaccinating your child if you perceived that your child had a negligible risk of contracting the disease in question, and perceived a higher risk from the vaccination itself."

That's a remarkably piss poor analogy. With vaccines there is a risk to the individual but also a known benefit to him/her and society, whereas your birth year genome DB has no demonstrable benefit to a living individual. Sure, you can do something for the advancement of science (and that's about all I'll get out of my study + 25 dollars), but that is no one else's business but the individual's.

For one who says that what matters is the choices of those who have control of what happens, I don't understand your mythical belief in no downsides. It isn't like something such as eugenics never happened or that designer babies aren't a thing or that the non-productive in society aren't discriminated against or that even now there are those who speak of an age when people are expendable. I would love to see it (not) when they say those with certain genes are.

There's better politics and there's worse politics and neither of us are mind readers to know how that is going to play out. So for those who want to donate their DNA have at it and for those who don't you can leave them alone. Unlike the unvaccinated who are a risk to others there are no issues with DNA non-donators. If the science gets better then I expect more people to be on board with it but to expect people to do so now is kind of a joke.

I'm sure you meant 277, but that reminds me of how much I agree with Narad@227. Your nonsense sounds like the kind of half baked ideas I'd expect out of a student who just got an A on their first test in intro to philosophy.

No, you neither provided additional context nor explained how the meaning of the quote changed. ann already used the quote in the context of it being a reply to me; otherwise the reference to vacuity doesn't make any sense. If anything your additional comments make ann's point even stronger. If all that matters is the choices of those in power, then anything that doesn't directly work towards gaining or maintaining power is a vacuous idea.

# zebra 279
"Bear the consequences", of course.
My point wasn't about which is the right choice in this question (in my personal view: to vaccinate and not to provide DNA).

But about who has the right to make the choices.

Giving my own DNA has consequences only for me, so the choice is very personal.
Not to vaccinate involves other people, too (our own children or non immune lay people). So it can be argued that the choice should be made not by parents, but by a public agency.

Speaking of creating distraction, I don't think I've yet seen you make an actual response to a single criticism of your idea. It's always out of context this, non sequitur that, I don't agree, 20th century thinking, where's the evidence (ignoring multiple references provided of course). More recently it's this pomo everything's the same nonsense. Look, if you don't think protecting privacy is important, that's fine. HIPAA says otherwise and you seem to care about enforceable laws. How will your database protect confidential health information when genomes can be personally identifiable?

capnkrunch,

What you haven't provided is any plausible scenario in which the project I described has any risk of harm to any individual. That would involve a specific narrative, which could even be quite hypothetical-- as I said, like some tv crime or spy plotline. Nothing.

Your interpretation of what "several" experts have said is as irrelevant as your beliefs about what constitutes human rights. I've answered Retro Pump way back at 112. The disagreement is over the rate at which the data is converted to digital format.

I've also pointed out that there are obviously experts who don't agree that large, centralized databases of genetic information are useless at this time. If you want to argue the specifics of my idea v the Brit 100K genome project, that would be fine, but you've already said that you are against any further collection of genetic data. Which isn't feasible anyway.

perodatrent,

OK. But the public agency could also decide that DNA must be given, as in the Kuwait example ann provided.

So, we are back to politics, and the power of the sovereign, whether that is a democratic entity or a king.

perodatrent,

I just realized that maybe you arrived late and were not clear on what the actual project was that I suggested. DNA would be collected from all the children born in a specific year, so again it would be a parental decision in both cases.

I was responding to capnkrunch.

So?

The meaning is clear. There is no difference between UDHR and a bible. Neither matters. What matters are the choices of those who have control of what happens, which ultimately means the sovereign entity. “God is on the side of the big battalions” or some such quote. Napoleon?

I got that that was your point.

Mine was that if the criterion by which you decide whether a concept is vacuous or not is whether or not it saves you from a firing squad and/or being stripped naked in a cell and having a broomstick is shoved up your ass, the concept of establishing a DNA database for an entire year's birth cohort is completely vacuous.

The concept of human rights, on the other hand, provides some protection to most people who live in a society that respects them, as well as some recourse to those whom they don't protect.

IOW: You're using what you call a grandiose strawman. Human rights also don't save people from dying in car accidents. That doesn't mean they're completely useless.

Ann, now you are joining Krebiozen in out-of-context quotes with silly word games and irrelevancies.

I honestly don't see how that was either out of context or silly or irrelevant. I addressed your entire argument, which was that ethics and human rights are vacuous because if you're in front of a firing squad or being stripped naked in a cell and having a broomstick is shoved up your ass, they won't help you.

In the event that my meaning wasn't clear: While that may be narrowly true if you're already in front of a firing squad or being stripped naked in a cell and having a broomstick is shoved up your ass, there's not much about which the same couldn't be said. And it's entirely false if your aim is to avoid ending up in that kind of situation. The concept of human rights does more to keep people out of it than anything else, by greatest-good-for-greatest-number standards.

Ann,

"We all know ethicists are just secular theologists and the Universal Declaration of Human Rights is their version of the Bible. When you think about it, isn’t the concept of human rights just as vacuous as transubstantiation?"

Well yes, we all know that if we look at reality, but some engage in self-delusion, just like the people who think that they have an immortal soul.

Neither your soul nor your “human rights” mean anything to a firing squad. Your “right to privacy”, which is such a big issue for you apparently, means nothing when you are stripped naked in a cell and a broomstick is shoved up your ass.

That's what I said in response to capnk, where his statement is in quotes within the blockquote.

I didn't say that people's beliefs didn't give them comfort.
I didn't say that the delusional beliefs of other people have no effect on the situation in which the individual finds himself.

So you are indeed the one strawmanning.

In the event that my meaning wasn’t clear: While that may be narrowly true if you’re already in front of a firing squad or being stripped naked in a cell and having a broomstick is shoved up your ass, there’s not much about which the same couldn’t be said. And it’s entirely false if your aim is to avoid ending up in that kind of situation. The concept of human rights does more to keep people out of it than anything else, by greatest-good-for-greatest-number standards.

No, exactly no. What best keeps people from in front of the firing squad is self-interest. It is the experience of Martin Niemoller. It is when people stop following that lesson, and start using delusional, arbitrary, "beliefs" to guide their society, that problems arise.

You know, "an eye for an eye" pretty much justifies that death penalty you don't like.

Ann, now you are joining Krebiozen in out-of-context quotes with silly word games and irrelevancies.

This is ridiculous. Not long ago I pulled up zebra for inaccurately paraphrasing people and putting those inaccurate paraphrases in quotation marks. He was apparently unaware that quotation marks generally denote a direct quote; he even claimed that putting paraphrases in quotes is common practice (it isn't) and told me to "get with the modern era".

Now he is complaining about accurate paraphrases of his comments that are not in quotes? Perhaps zebra would like to specify where I, or anyone else, have changed the meaning of his words by quoting him out of context (difficult when the context is right above on the same page), or where I, or anyone else, have made any argument that does not follow from its premises (i.e. a non sequitur). What zebra writes is quite ridiculous enough that no one has to change the meaning of his words to criticize them.

It seems to me that zebra is simply too dim to understand it is possible to paraphrase someone's words without changing their meaning, and lacks the basic scientific knowledge required to see how some arguments do follow from their premises. Either that or he still hasn't grasped what "out of context" and "non sequitur" mean.

Silly word games? I plead guilty, though not on this thread, other than a reference to PoMo and a hypothetical example of data that some might like to keep private as tiny-penis-syndrome, neither of which really count as word games.

As for irrelevancies, zebra has treated us to a plethora of those, from an irrelevant Pynchon novel to someone being physically assaulted in a cell, and other bizarre analogies that make no sense whatsoever.

Back to the issue under discussion: I don't know how large the risks are because I don't understand the details or scope of the problem well enough, which is why many countries have set up committees of experts to look into this, assess the risks and formulate necessary legislation. I don't think we should ignore potential risks and just hope they come to nothing. We have already had some tastes of the possible problems we may face; here's an example (adapted from a real case) of genomic information being used to discriminate against a person:

Jacob, a boy who carries a gene for a disorder called Long QT Syndrome (LQTS), was denied coverage under his father's health insurance policy because of his pre-existing condition. LQTS is a rare and little-known genetic disorder that sometimes triggers sudden cardiac death. Those who carry the gene may be healthy until they suffer an attack without warning, but carriers can control their risk of cardiac arrest with preventive beta-blocker therapy. Jacob's father wanted Jacob to be insured, but even after their state enacted a law prohibiting genetic discrimination, Jacob's insurance company still refused to cover him.

There are other examples of this kind of discrimination on the same page. Isn't it best to figure out how to avoid more of this sort of thing in the future before we start sequencing the genomes of millions of people?

Speaking of the 100,000 Genomes Project and one such group of experts who Krebiozen referenced previously:

We recommend that broader public consideration should be given to whether GeL provides the most appropriate model for the ethical use of genomic information generated in health services for public benefit before it becomes the de facto infrastructure for future projects.

Krebiozen@293
You and your 20th century punctuation.

See also Gish Gallop. The common practice here is to provide incomplete quotes– just phrases– and then riff on some interpretation of some word in the quote, or just start pontificating on something tangential to the meaning of the original comment.

Add "Gish Gallop" to the list of things that Z. doesn't understand.

The result is that one would spend lots of time trying to refute all those irrelevancies, distracting from the core topic.

The "core topic" is that at no point in the proceedings have you demonstrated the slightest understanding of any real-world aspect of your Bright Idea.

^ Well, that's an odd blockquote failure mode.

"Well, that’s an odd blockquote failure mode."

You just had an errant one that resulted in nested ones. Btw, did you figure out your issue with the winking face emoticon?

Okay, reading this long thread, I'm getting the impression zebra is unaware of exactly how much genetic information a human being has.

RE: JP #151

I've been away dealing with real life, but -

I wasn't proposing that we should have a law to prevent the mentally ill from possessing firearms. We already have a suite of state and federal laws that say so, and it's my understanding that the Supremes have said they are an acceptable restriction (HDB take note), so if you want to fight it, your argument is with your Representatives, not me (but I do think these laws are just fine).

You also may have the impression that I think all mental illnesses are the same, but both me and the law are well aware that there are many different types of mental illness, and not all (and maybe not even most) should lead to restrictions of any rights or freedoms.

Your link doesn't link. I agree that most mentally ill people are mostly harmless most of the time, and some are completely harmless all of the time, but I don't agree that “the mentally ill” are no more violent than “normal” people. According to this http://www.bjs.gov/content/pub/pdf/mhppji.pdf ,

At midyear 2005 more than half of all prison and jail inmates had a mental health problem, including 705,600 inmates in State prisons, 78,800 in Federal prisons, and 479,900 in local jails. These estimates represented 56% of State prisoners, 45% of Federal prisoners, and 64% of jail inmates.

The same source says 61% of the state prison population who had current or past violent offenses had mental problems. This source ( http://www.nimh.nih.gov/health/statistics/prevalence/any-mental-illness… ) says the prevalence of any mental illness among adults is 18%, so it would seem that the mentally ill are more likely to be violent than “normal” people. Or maybe the mentally ill are just more likely to get caught.
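To make the comparison in the paragraph above concrete, here is a minimal sketch of the arithmetic, using only the three percentages quoted from the two sources. The function and variable names are illustrative, not from either report. It assumes the two sources define "mental health problem" and "any mental illness" comparably, which may not hold, and it measures over-representation in prison, not a propensity to violence.

```python
def odds(p):
    """Convert a proportion (0 < p < 1) into odds."""
    return p / (1 - p)


def odds_ratio(p_group, p_reference):
    """Odds of the condition in one group relative to a reference group."""
    return odds(p_group) / odds(p_reference)


# Figures quoted in the comment above (as proportions).
state_prisoners = 0.56          # state prisoners with a mental health problem
violent_state_prisoners = 0.61  # state prisoners with violent offenses who had mental problems
general_adults = 0.18           # prevalence of any mental illness among adults

or_all = odds_ratio(state_prisoners, general_adults)
or_violent = odds_ratio(violent_state_prisoners, general_adults)

print(f"state prisoners vs. adults: odds ratio {or_all:.2f}")
print(f"violent offenders vs. adults: odds ratio {or_violent:.2f}")
```

On these figures the odds ratios come out to roughly 5.8 and 7.1. That quantifies the gap between the two percentages, but it cannot distinguish "more violent" from "more likely to get caught" or from differences in how the two sources count mental illness.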

But none of that was the main point that I was trying to make, which seems to have been lost. I was trying to offer a scenario where the existence of a complete medical database could (depending on your point of view) be used or abused. Let me try again.

We have laws that restrict some mentally ill from buying and possessing firearms. However, these laws are mostly toothless, because there is no mechanism for any gun seller to know if the purchaser is amongst those restricted by these laws. Zebra's metadatabase would have the necessary information to enforce the law. Part of me thinks it would be a good thing to use this information to prevent some portion of violent crimes.

But a larger part of me thinks it would be bad. HIPAA laws make it clear that medical information should only be used to provide medical care. I like this law just fine, too. If we allow an “except for” to come in, we can expect others to follow. The government does have a record of using the data it holds to do bad things (census data and the Japanese internments come to mind). But Zebra seems to think that HIPAA should be darned to heck, and if y'all can't convince him otherwise, I'm sure I can't. Y'all have probably made this argument somewhere along the way (I hope to chew thru this in the next day or so) so I apologize for any redundancy.

Zebra may say that this is an argument against UHC. But again, it's HIPAA that would be our protection, and it seems to me we should likewise protect it. Our medical information should only be used for the benefit of the patient, unless the patient gives clear, informed consent otherwise.

"Or maybe the mentally ill are just more likely to get caught."

In addition, it may just be that treatments and social support for mental illness are vastly underfunded and they end up in prisons because they are nuisances and nobody else is taking them in.

but I don’t agree that “the mentally ill” are no more violent than “normal” people.

Here is my link; with any luck, it'll work.

"Mental disorder" is awfully broad and probably includes substance abuse disorders. People with mental illness who don't have a substance disorder are, according to various studies, either no more likely or only slightly more likely to commit violent acts compared to the general population.

We're talking about SMI here, though, also. There is *one* mental disorder which does correlate with increased violence, and that is antisocial personality disorder (or sociopathy), but that is not considered SMI.

In any case, sex (male), age (youth), and socioeconomic status (poor), correlate much more strongly with violent crime than mental illness does.

^ Substance disorders are considered "mental disorders" even on their own, not concurrent with an "Axis I" disorder. (I think psychiatrists have lately stopped using the Axis model, but you get the idea.)

I wasn’t proposing that we should have a law to prevent the mentally ill from possessing firearms. We already have a suite of state and federal laws that say so....

I forget how this one got started, and I'm on deadline again, but I'll briefly note that in my state, as I recall the FOID application, one has to have been involuntarily committed to be denied a license. This is an obvious, verifiable standard, but it represents a very small subpopulation, indeed.

I didn’t say that people’s beliefs didn’t give them comfort.
I didn’t say that the delusional beliefs of other people have no effect on the situation in which the individual finds himself.

So you are indeed the one strawmanning.

No, that's just you creating a straw man out of what I said by cherry-picking and paraphrasing. As I said:

I got that that was your point.

Mine was that if the criterion by which you decide whether a concept is vacuous or not is whether or not it saves you from a firing squad and/or being stripped naked in a cell and having a broomstick is shoved up your ass, the concept of establishing a DNA database for an entire year’s birth cohort is completely vacuous.

^^See that right there? Please address it. It's my main point.

What best keeps people from in front of the firing squad is self-interest. It is the experience of Martin Niemoller. It is when people stop following that lesson, and start using delusional, arbitrary, “beliefs” to guide their society, that problems arise.

But let it be duly noted that it's no less true to say...

Your “self interest”, which is such a big issue for you apparently, means nothing when you are stripped naked in a cell and a broomstick is shoved up your ass.

...that what you did say is. Because that's my point. As a standard for establishing the vacuity of concepts, that's a useless construction. All concepts fail it equally.

You're therefore still one reasonable argument short of demonstrating that ethics and human rights are vacuous.

Please ante up.

FWIW, I don't disagree that it's in everybody's self-interest to see that each and all are equally protected from abuse. But that's the principle on which the laws and ethical standards you're objecting to are based. Respect for persons, for example.

You know, “an eye for an eye” pretty much justifies that death penalty you don’t like.

Pretty much. But since that particular concept goes back 4000 years and has never (afaik) been absent from any human society on record***, it's probably not the best example you could choose of the kind of delusional arbitrary belief that leads to trouble when people start being guided by them. In order to know whether that was true, they'd have to stop first.

***Before you tell me that there are societies that don't practice capital punishment: I know. I said "the concept." It has numerous other iterations. Too numerous to count, really.

Anyway, back to the topic at hand for a moment, allow me to hearken back to carts and horses.

Has anyone figured out why Z. thinks a birth cohort is some sort of especially interesting sample per se? It can't be pure size, given that other countries can randomly be plugged into the Master Plan (and given that Z. is demonstrably unclear about the size of the U.S. one).

You're not going to capture any particularly granular environmental data specific to a generation unless, oh, say, the medical records are also going to geolocate the genome over time as it wanders to and fro, so, why?

The only thing I'm coming up with offhand is normalizing actuarial models.

so it would seem that the mentally ill are more likely to be violent than “normal” people. Or maybe the mentally ill are just more likely to get caught.

Or maybe people who are incarcerated are likelier to have been evaluated and diagnosed than people who aren't.

Or maybe being convicted of crimes that (in many more cases than one would like to think) you may not have committed and then locked up for years has a bad effect on your mental health.

Or maybe being in prison =/= being violent. There are a lot of people in prison for dealing drugs, for example.

Or maybe all of the above.

Except not maybe. JP is entirely correct. The mentally ill are no likelier to be violent than anybody else is.

More things zebra doesn't get:
Huge government databases are hacked all the time. See the ever-expanding government employee personal information database hack.
If the medical records are linked to the genomes, that information alone is valuable enough to hack. And if any of it were attached to SSN? It could be used to blackmail or influence the parents of a child as well.

Ethics are not optional in human subject research. Not in the US, or the UK, or really anywhere in the world. There are standards and treaties and laws about these things. IRBs are not optional, and they are not optional for a reason.

Zebra: do you know why ethics and human rights must be considered *first* in all human subject research? If you don't, and can't find it in a 2-minute search, then you are a buffoon, a liar, or a monster.

But a larger part of me thinks it would be bad. HIPAA laws make it clear that medical information should only be used to provide medical care.

And for public health surveillance (at least). I've gotta get my ass in gear, but I've already mentioned the Vaccine Safety Datalink. I don't know whether the plan participants of the HMOs who are the data stream are represented on an opt-in, opt-out, or "neither" basis, but it strikes me as a real-world example that Z.'s trip should at least be able to make specific contact with.

Not "big picture" enough, though, I suppose, or something.

ann,

Mine was that if the criterion by which you decide whether a concept is vacuous or not is whether or not it saves you from a firing squad and/or being stripped naked in a cell and having a broomstick is shoved up your ass, the concept of establishing a DNA database for an entire year’s birth cohort is completely vacuous.

But how is that my criterion? This is a truly bizarre interpretation, which is perhaps why I think you are strawmanning?

What makes "rights" vacuous is that they cannot be demonstrated, which is why I "equate" them with "souls".

Sorry if that wasn't clear enough to begin with.

You only have "rights" in the context of the legal system, (stipulating, of course, that the legal system operates the way it is supposed to.) Your "rights" are granted by the sovereign entity.

If you don't get this I am happy to discuss it further.

“Well, that’s an odd blockquote failure mode.”

You just had an errant one that resulted in nested ones.

I meant the inversion. It's neither here nor there; I just wanted to note the error.

^ I'm going to take that additional blockquote fail as a clear signal from the perceived world that this is the wrong time to "multitask."

but I don’t agree that “the mentally ill” are no more violent than “normal” people.

IIRC the mentally ill are far more likely to be victims of violence than perpetrators. That said, some years ago, as I have related here before, I had an unfortunate experience with a close friend who had some serious mental problems (delusions) but since he was not presenting a threat to himself or others, and despite my best efforts, I couldn't get any doctors interested.

Eventually I contacted his mother in Scotland and paid his train fare to spend some time with here, thinking a bit of time out of London would do him some good. Sadly while he was there his condition worsened and he killed his mother, rather horribly and then calmly called the police.

I did visit him in the secure mental hospital he was placed in (the Scots sent him back to London), and kept in touch for a while after he was released. I never felt quite comfortable with him after that, even though he was on depot anti-psychotics and asymptomatic, and have since lost touch.

That has left me with an unease around people with SMI that I know isn't entirely rational, but it's still hard to shake off.

^ "some time with her"

@Krebiozen:

That's terrible; I'm sorry to hear that. I know of a similar situation involving the older brother of a school friend, but I wasn't nearly as close to what was going on as you were.

The point is that mental illness doesn't cause violence over and above what you'll find in the general population; it doesn't mean the mentally ill aren't ever violent, just like other people are sometimes violent. Young men have a particular risk of it in general.

As far as the SMI label goes, I mean, I'm pretty sure I'm probably in there. Severe recurrent depression and a nice case of PTSD counts as "serious" or "severe," I'm pretty sure, even if I'm not having, say, command hallucinations. (Which also don't incline people to be more violent, just by the by.)

I mean, most of my friends are crazy, too. I guess I sort of have a hard time grokking anybody who isn't "broken" in some sort of profound way.

Justatech #309,

I offered to capnkrunch that he could make up even a far-fetched, tv crime or spy show scenario about how all this blackmailing and stuff would happen, but he couldn't hack it. (pun intended)

I make you the same offer. I've watched my share of that kind of thing, and I can't figure out what dark or deranged imagining is going on in your minds. How would it work? Why would someone go to my database instead of just bribing the person working for the local health insurance company, or the doctor's office?

jp #316,

So why don't you like me, phony?

"Hey, what's up with that zebra, why doesn't he want to just belong?" (#255)

Or is all your drama just middle-class spoiled brat social-media angst?

Zebra @317: Sure, I'll invent you a crazy scenario involving a diplomat with an unknown genetic allergy to antibiotics and a terrorist group that wants him dead...
But ONLY after you answer my question about human subject research and ethics.

It can be answered in one word, if you're lazy.
Do it.

So why don’t you like me, phony?

Because you're not terribly likable? I generally like people who are nice and smart and know what they don't know.

Someone who refers to his wife as "perfectly serviceable" does not immediately strike me as friendship material either, just by the by.

Or is all your drama just middle-class spoiled brat social-media angst?

Oh, Zony, stop it. My sides, they're killing me.

zebra
All the pieces are there, I shouldn't have to weave a f*cking narrative for you but here it is anyways. Insurance company (via some lab) requests information for, say, diabetes. They do legitimate whatever with it but simultaneously use various methods to reidentify the genomes. This information is then used as a kind of blacklist, people they won't sign. It's illegal to deny coverage for pre-existing conditions but now they have access to information that they shouldn't and mountains of plausible deniability. "We didn't deny coverage because of the pre-existing condition. There's no way we could have known," (recall the data is ostensibly deidentified).

How about in the reverse. Some researchers suspect they have discovered a mutation that greatly increases risk of leukemia. To help verify they request the histories of everyone with said mutation. Somehow an insurer gets a hold of the information and reidentifies it and hikes up everyone's rates.

Heck, hackers compromise healthcare servers (insurers, providers, etc) all the time; I don't know what they do with the information but it's certainly nothing good. Despite what you seem to think, I imagine bribing an employee probably puts the attacker at far greater risk than a spearphishing campaign; I don't recall any recent attacks where bribery was the method of entry. Why give hackers another target? BCBS was too secure? Try the genomics database.

You may have watched your share of TV shows and movies but, as Julian Frost noted before, you appear to know jack sh!t about security. It's common knowledge Hollywood rarely does any technology justice. You're a fool to think that means anything.
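For anyone wondering what the "reidentify" step in the scenarios above could look like mechanically, here is a minimal sketch of one generic approach: linking "deidentified" records to a named list on shared quasi-identifiers. This is a plain record-linkage illustration, not the genome-based reidentification discussed elsewhere in the thread. Every record, field name, and the `link` helper below is hypothetical and made up for illustration.

```python
from collections import defaultdict

# Hypothetical "deidentified" research extract: no names, but some
# quasi-identifiers (birth date, ZIP code, sex) are still attached.
deidentified = [
    {"birth_date": "2015-03-14", "zip": "48201", "sex": "F", "finding": "variant A"},
    {"birth_date": "2015-03-14", "zip": "48201", "sex": "M", "finding": "none"},
    {"birth_date": "2015-07-02", "zip": "48226", "sex": "F", "finding": "variant B"},
]

# Hypothetical named list obtained from some other source.
named = [
    {"name": "Child One", "birth_date": "2015-03-14", "zip": "48201", "sex": "F"},
    {"name": "Child Two", "birth_date": "2015-07-02", "zip": "48226", "sex": "F"},
    {"name": "Child Three", "birth_date": "2015-07-02", "zip": "48226", "sex": "F"},
]

KEYS = ("birth_date", "zip", "sex")


def link(deid_rows, named_rows, keys=KEYS):
    """Return {row index: name} for deidentified rows that match exactly one named row."""
    index = defaultdict(list)
    for row in named_rows:
        index[tuple(row[k] for k in keys)].append(row["name"])

    matches = {}
    for i, row in enumerate(deid_rows):
        candidates = index.get(tuple(row[k] for k in keys), [])
        if len(candidates) == 1:  # a unique match means the record is reidentified
            matches[i] = candidates[0]
    return matches


reidentified = link(deidentified, named)
print(reidentified)
```

In this toy data only the first record is uniquely matched; the second has no counterpart in the named list and the third is ambiguous between two names. How often real records match uniquely depends on which fields survive deidentification, which is exactly the kind of detail that would need to be specified for any proposed database.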

In any case, a "TV show scenario" involving private health information isn't hard to think up at all. We can even use SMI as an example.

What if, as has been brought up several times, scientists figure out how to read the genome for things like schizophrenia? I do in fact know a couple people who have, yes, schizophrenia, and are not "out" about it, either at work/school or with non-close friends. They have every right to keep that information private, given the kind of stigma it carries. I can well imagine that being "outed" would be an awful experience for them on a lot of levels.

Why would somebody use that information against somebody else? I mean, if they're jerks, and enjoy the idea of making other people miserable and intimidating them out of their money, why wouldn't they? It's more information out there, your entire genetic code. Why wouldn't people go after it, regardless of whether they also go after other health information? Again, we don't know right now just how much information about a person is in their DNA.

^In any case, I gotta jet. I must bike to the Asian Market (I guess) and find a fruit I haven't tasted in 9 months before I go to a Rosh Hashanah thing.

I looked for something recent on the subject of violence and mental illness and found what I think is one of the best overviews of the subject that I've ever seen here

Krebiozen, I'm sorry that all involved went through that. Truth be told I am also leery of those plagued by psychosis. But it is no more than I would feel with anyone else who I sense is unpredictable and/or acting erratically. That includes the police, business managers and angry people. It is just something instinctual and I doubt I will ever be able to change it.

I started wondering about Plomin, and his confident claims back at the end of the last century in 1993 and again in 1998 that he was about to discover the genes for intelligence, Real Soon Now. He seems to have found a fortunate niche where being wrong (and spending millions of dollars and decades of time) only increases his reputation, and his most recent, largest search for the genes for intelligence admitted total failure in 2010.

So he has sent the genomes to a Chinese colleague to be sequenced safely away from IRBs, and as of 2013 he was promising success within the year.
http://www.nature.com/news/chinese-project-probes-the-genetics-of-geniu…

But how is that my criterion? This is a truly bizarre interpretation, which is perhaps why I think you are strawmanning?

What makes “rights” vacuous is that they cannot be demonstrated, which is why I “equate” them with “souls”.

Sorry if that wasn’t clear enough to begin with.

Of course it wasn't clear. One of your examples was of a right being violated; and the other didn't necessarily implicate any rights at all.

Neither illustrated the proposition that your rights can't be demonstrated. And it's not true that they can't. For example:

Are you expressing your beliefs and opinions about the vacuity of rights on this very thread without fear of reprisal from the state?

Well, there you go. You've just demonstrated that one of your rights is alive and well.

You only have “rights” in the context of the legal system, (stipulating, of course, that the legal system operates the way it is supposed to.)

Yes. You also only have citizenship in the context of the legal system. Ownership of property, too. It's also what prevents landlords from shutting off the heat in the middle of the winter to save money and the reason why you don't have to work a seventy-hour week to keep a job.

There are lots and lots of perks and benefits you only have in the context of the legal system. What of it?

Your “rights” are granted by the sovereign entity.

In this country, your rights are granted by the (wait for it) Bill of Rights. And the US Constitution. Assuming that "sovereign entity" means "the state," that's not really the same thing.

But for the sake of argument, let's say that it is. So what?

If you don’t get this I am happy to discuss it further.

That's okay. I'm good, actually. The "So what?" and "What of it?" were just rhetorical questions.

Anyway, if the thing that makes rights and ethics vacuous is that they (putatively) can't be demonstrated, you need to make a case for it. Because you haven't yet. So let's just stick to that.

On further reflection, I realized that even if the vulnerability (as it were) used to compromise PHI was bribery, zebra's data still greatly increases that attack surface. There's a whole new set of people with access to confidential data who might be susceptible to bribery. No one at UHC bites? Try the genomics guys. Of course, this bribing the peons discussion is vacuous for a number of reasons, not the least of which is that the idea is actually more Hollywood than elite ninja codebreakers or whatever zebra was going on about.

ann@326

In this country, your rights are granted by the (wait for it) Bill of Rights. And the US Constitution. Assuming that “sovereign entity” means “the state,” that’s not really the same thing.

I think what zebra is going for is that your rights only exist so long as those in power choose to allow them to. I get the feeling his AP History class just finished their section on Hobbes.

I think what zebra is going for is that your rights only exist so long as those in power choose to allow them to. I get the feeling his AP History class just finished their section on Hobbes.

In the same sense that your running water does, sure. But you know. Your life only exists so long as someone with the power to kill you chooses to allow you to live. Other people are always around somewhere, having power to affect you some way. Can't be helped. C'est la vie.

Details -- such as how realistically likely it is that those in power will cancel the Bill of Rights, or be able to -- really matter therefore.

capnkrunch,

Don't quit your day job and try to become a scriptwriter. (But I appreciate that you made the attempt.)

The thing is, I asked you to do that for my project. That would be (let's say, assuming lots of opting out) a round figure of 200K samples, derived from a particular year's births.

I may be misinterpreting the specifics of what you suggest, but it sounds like an insurance company bribes a qualified academic researcher to submit a request for a set of genomes to be used in a "cover" project.

That data set would be limited to those individuals in the birth cohort with the specified ("cover") medical condition, and probably a fraction of those if the number is large. Let's say 5K to be generous.

So we are talking about all kinds of criminal risk in order to identify the fraction of 5K individuals with condition A who also happen to have the genetic risk factor for condition B? And who might be applying for insurance with that particular company? I just finished logic101, so I think I will call that reductio ad absurdum. (sarcasm)

There just is no way, no matter how many times you repeat "threat surface", to claim that participating in my project adds anything to the existing risk of harm from unauthorized access to an individual's health information. It just makes no sense, even if you could access the entire thing, to bother with such a restricted and really trivial amount of information for nefarious purposes.

And even though I don't think the identification-through-the-genome route is a big problem, the existence of my database would reduce the risk to the general population. That should be obvious.

I offered to capnkrunch that he could make up even a far-fetched, tv crime or spy show scenario about how all this blackmailing and stuff would happen, but he couldn’t hack it. (pun intended)

That's because you don't understand the basic issues, which in turn devolve to your wholesale failure to describe in even rudimentary fashion how the Bright Idea is supposed to work in practice, aside from "Computer, give me the entire genomes of all persons with condition X that satisfy this random list of natural-language terms."

How many times do you have to be told that no real-world, arbitrarily whole-cohort, data bank linked to complete medical records is just going to barf up whole genomes? Why can't you answer the simple question of what the interface is supposed to look like? With a genome in hand, can one "freely" pull the medical records? There might be something in there somewhere!

It's not as though you've had a shortage of time or lack of genuine input with which to put together something resembling a contentful response, but instead you're just jabbering incoherently about weird jail fantasies and so on.

I mean, yah, I think I've already noted that you're Gumby, dammit, but WTF? If you're going to whine about "the core topic" – and if the "core topic" is something other than Z. is so bright you have to wear shades – why are you so enthusiastically running the hell away from it?

ann,

"There are lots and lots of perks and benefits you only have in the context of the legal system. What of it?"

"Anyway, if the thing that makes rights and ethics vacuous is that they (putatively) can’t be demonstrated, you need to make a case for it. Because you haven’t yet. So let’s just stick to that."

"Your life only exists so long as someone with the power to kill you chooses to allow you to live."

OK, so you answered it yourself, gratis capnkrunch. You can't demonstrate that you have a "right to life".

Note to capnkrunch: If I have to end up sounding like philosophy101, perhaps it's because everyone else sounds like they never took that class, and it is necessary to (re-)state the obvious?

JP,

I like to think she married me primarily because of my sense of humor, (as well as compassion and intelligence, of course). Not just that other thing.

Once again, zebra displays his ignorance of both IT and security risks.

That would be (let’s say, assuming lots of opting out) a round figure of 200K samples, derived from a particular year’s births.

Given the size of each DNA sample, we are talking petabytes, if not exabytes, of data. That will need storage space, electricity to run, and people to maintain both the physical hardware and administrate the software. Queries would take a very long time. It's not yet practical.
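A rough sketch of that storage arithmetic follows. The per-genome figure of about 100 GB of raw sequencing data is an assumption, not a number given anywhere in this thread, and the total would be much smaller if only compressed variant calls were kept.

```python
# Back-of-envelope storage estimate for a birth-cohort genome database.
# ASSUMPTION (not from the thread): roughly 100 GB of raw data per genome.
GENOMES = 200_000        # cohort size quoted in the thread
GB_PER_GENOME = 100      # hypothetical raw-data footprint per genome


def total_petabytes(genomes: int, gb_per_genome: float) -> float:
    """Return total storage in petabytes (decimal units: 1 PB = 1,000,000 GB)."""
    return genomes * gb_per_genome / 1_000_000


if __name__ == "__main__":
    print(f"{total_petabytes(GENOMES, GB_PER_GENOME):.0f} PB of raw data")
```

Under that assumption a single year's cohort lands in the tens of petabytes before any backups, indexes, or linked medical records are counted.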

I may be misinterpreting the specifics of what you suggest, but it sounds like an insurance company bribes a qualified academic researcher to submit a request for a set of genomes to be used in a “cover” project.

That's one of several possibilities. Another is a gang who blackmails or bribes either a researcher or an administrator into giving them certain data. Remember, in a database that big, the number of users and admins is huge, meaning a HUGE attack surface.

That data set would be limited to those individuals in the birth cohort with the specified (“cover”) medical condition, and probably a fraction of those if the number is large. Let’s say 5K to be generous.

Or the data could be something like who meets the genetic risk criteria for a certain disorder like e.g. schizophrenia.

There just is no way, no matter how many times you repeat “threat surface”, to claim that participating in my project adds anything to the existing risk of harm from unauthorized access to an individual’s health information.

Except that your project is far less secure than you assume, and that the risk of the project getting hacked is far greater than the risk of someone's medical files getting exposed and the data used nefariously.

And even though I don’t think the identification-through-the-genome route is a big problem, the existence of my database would reduce the risk to the general population. That should be obvious.

It isn't obvious. If anything, it is obvious that your proposed database significantly increases the risk to the general population.

If I have to end up sounding like philosophy101, perhaps it’s because everyone else sounds like they never took that class, and it is necessary to (re-)state the obvious?

If I recall correctly, there have been very few retractions from the GCN Circular.

That would be (let’s say, assuming lots of opting out) a round figure of 200K samples, derived from a particular year’s births.

Sweet G-d, do you still think that the U.S. birth cohort is "385,000"?

I may be misinterpreting the specifics of what you suggest, but it sounds like an insurance company bribes a qualified academic researcher to submit a request for a set of genomes to be used in a “cover” project.

Oh, wait, insurance companies aren't able to employ "qualified academic researchers"? The wholly unaddressed regulatory structure becomes more baroque by the moment.

zebra@331
Seriously? What don't you get about additional targets equals more exposure? If hackers are interested in stealing PHI from hospitals they will be interested in stealing it from your database.

Regarding insurers, there have been cases of rates being increased due to genetic risk. Certainly the largest database of genomic information would be attractive but it is unlikely. That said, the reason I originally said these narratives don't matter is that the onus is on you to ensure the information remains confidential, not on others to not abuse it. Regardless of risk, HIPAA says that legally you need to offer protections that our current technology cannot provide for a database that size.

If your counterpoint is the 100,000 Genomes Project, there are two problems. The first is that privacy concerns are alive and well in that project (see the Nuffield Council link Krebiozen provided earlier). The second is that England has far more lax confidentiality laws.

@333

Note to capnkrunch: If I have to end up sounding like philosophy101, perhaps it’s because everyone else sounds like they never took that class, and it is necessary to (re-)state the obvious?

Or maybe it's that most people graduate from that kind of sophomoric babbling with high school. Don't misunderstand. You don't sound like philosophy or whatever 101. You sound like someone part way through the course with just enough knowledge to make yourself look like a fool and not enough to realize why.

Narad #337

Krebiozen doesn't like me suggesting that people are CUI (whatever the kind of influence) but sometimes it is difficult not to conclude that.

You have sputtered and blustered about the number multiple times, so let's read the reference you gave: (#44)

Your reply is greatly appreciated. But I remain puzzled as to why anyone would object to my suggestion, since I am offering what you appear to need. Let’s do this with Canada as a more manageable source of data:

Yearly births of 385,000 times 1,000 per genome is 385,000,000.
An F-35 (USAF, the cheaper model) is about 150,000,000.

So the US could buy 4 fewer of these (in a projected fleet of 450 plus) and easily cover data collection costs for our friendly neighbors to the north with their more rational health care system. Need more data, say from a larger country, lose a few more planes.

I take the rest of your blustering and sputtering to also be the result of whatever it is, and not worth wasting my time.

Julian Frost@335

Queries would take a very long time.

You're wasting your time. zebra thinks queries are zero-cost actions and that selecting subsets with specific criteria is "trivial". I gave up trying to explain the hardware and software limitations after he hand-waved away my references as cherry-picked and refused to even look at the sources I provided. Of course he's done the same for the privacy concerns, so I guess I'm just as guilty of time wasting.

I'll provide my references once more, mostly because I think Julian Frost might find them interesting in case he missed them earlier but also on the off chance that this time zebra will decide to at least make an attempt to educate himself.
Big Data: Astronomical or Genomical?
Routes for breaching and protecting genetic privacy

I work with a database for a department of a state government which only contains limited information on a small subset of the state's population, and it takes quite a bit of time and effort to get information, as there are still millions of records. Imagine scaling that up to complete information on the population of an entire country.

OK, so you answered it yourself, gratis capnkrunch. You can’t demonstrate that you have a “right to life”.

That a right can be violated =/= it can't be demonstrated. How does that even make sense? A right is a legitimate claim to consideration by others and the state wrt some aspect of individual self-interest, not a magic spell.

Also -- and I don't mean this in a nasty way -- "gratis" does not mean "courtesy of" or "thanks to" (or anything else I can think of that would make sense as you used it). So unless you intended to say "OK, so you answered it yourself, capnkrunch at no additional charge," I think you probably mean something else.

You have sputtered and blustered about the number multiple times, so let’s read the reference you gave: (#44)

Your reply is greatly appreciated. But I remain puzzled as to why anyone would object to my suggestion, since I am offering what you appear to need. Let’s do this with Canada as a more manageable source of data:

Yearly births of 385,000 times 1,000 per genome is 385,000,000.
An F-35 (USAF, the cheaper model) is about 150,000,000.

So the US could buy 4 fewer of these (in a projected fleet of 450 plus) and easily cover data collection costs for our friendly neighbors to the north with their more rational health care system.

Oho! You're correct; this level of derangement failed to sink in early on. Then again, I may have been conditioned by your comment 2:

But for more general application– if this 1K figure is indeed realistic, we could sequence every child born in the US in one year for the cost of a few dozen F-35’s or similar tradeoffs.

There does remain the glaring issue of your failure to genuinely cost the Bright Idea out, of course. Barfing up the $1K genome doesn't quite cut it.

But, so, just to quantify the terms, wouldn't it make more sense to just trade F-35's to Canada in exchange for all the biomedical data in perpetuity of a cohort from their, ah, more "manageable" population of "cooperative citizens"?

Am I really the only person who's still curious to learn more about the multiple registries of penis size in which zebra's records are kept?

Maybe everybody else just knows about them already. I guess I've led a sheltered life.

capnkrunch, thanks for those links. The second one is really fascinating. My comment was more about if a database like the one zebra proposes could be hacked and how easily (yes, and fairly). Nevertheless, it shows how wrong zebra is about how useless the data would be to a hack team.

Am I really the only person who’s still curious to learn more about the multiple registries of penis size in which zebra’s records are kept?

I'm going with illuminated tracings on vellum and private collection.

ann,

"A right is a legitimate claim to consideration by others and the state wrt some aspect of individual self-interest, not a magic spell."

"A legitimate claim to consideration" is yet another vacuous expression.

Sorry, but you're not making sense with that. The right doesn't exist until the sovereign entity (the state) grants it. The question is, do you have a "right to life"? If you have to "make a claim", then you don't have it.

And yes, "courtesy of" usually signifies gratis, but by itself gratis doesn't signify "courtesy of".

Which reminds me, just for future reference, although I've explained it before, "freely available" means "gratis" in my project design, so thanks for bringing this up-- I will remember to use gratis in the future.

One could just as easily argue the perfect freedom is the natural state of human beings, and governments are formed when people decide to willingly give up certain rights.

capnkrunch 339,

It's OK, lots of people give up when they are faced with making quantitative arguments.

If you think about USA plus Canada, that's beginning to approach 400,000,000 actual health records that could be targeted. Being one of 200,000 individuals with genome information on file in fact represents a negligible incremental risk. Truly negligible.

zebra@348

...but by itself gratis doesn’t signify “courtesy of”.

Which is how you used it. Are you pathologically incapable of admitting that you got something wrong?

Gray Falcon@349
zebra's class just started their social contract unit. You can't expect him to understand everything yet.

capnkrunch:

Are you pathologically incapable of admitting that you got something wrong?

Probably. In a previous thread, when it was pointed out that his comment was directly contradicted by the article, he insisted he meant something else, and refused to tell us what, even going so far as to suggest we were too stupid to understand what we said.

http://scienceblogs.com/insolence/2015/05/15/the-benefits-of-the-measle…

It didn't work.

zebra@350

It’s OK, lots of people give up when they are faced with making quantitative arguments.

It's ok, the difference between distributed and centralized isn't that difficult but I'm not surprised you continue to fail to see the difference. There's no centralized database of 400,000,000 records. If hackers are interested (they are) in compromising hospitals that might not contain even 300,000 archived records they will certainly be interested in your database. Plus, there's no central collection of more than several thousand genomes which means your database represents a unique asset making it all the more valuable.

Sorry, 200,000.

Zebra, I see that you have still not answered my question from #319: Why are ethics and human rights at the forefront of human subject research?

I'll give you a hint: it involves the major event of 20th century world history.

Tell me that and I'll give you a lovely story about hacking your imagined database. It'll be a right cracking tale!

On that note I remembered a continued failure to understand "order of magnitude".

zebra@195

And anyway, you and capnk keep complaining that there is too much data– ok, you should be happier with half as much, right? That’s something around 190K genomes.

Justatech 355,

How would I know if ethics or human rights (which I've clearly stated are vacuous concepts in my thinking) are "at the forefront" of anything?

If I knew what "at the forefront" meant in this context, I might try, vacuity aside.

"Am I really the only person who’s still curious to learn more about the multiple registries of penis size in which zebra’s records are kept?"

There are certain medical conditions where it may be recorded. Could even be in a text book. But I'm not really interested in knowing.

"It’s ok, the difference between distributed and centralized isn’t that difficult..."

I didn't think so either, and I thought my little vignette made that clear.

Not a Troll@358
I always chuckle when the review of systems says "Genitourinary: unremarkable".

And yes, “courtesy of” usually signifies gratis,

No it doesn't. It suggests or implies it -- ie, if something with a cost is provided gratis, it would be reasonable to infer that it's courtesy of someone. But in order to signify that it has been, you would have to include some signifiers to that effect -- ie, "semantic hairsplitting provided gratis, courtesy of ann."

but by itself gratis doesn’t signify “courtesy of”

We agree. It's a red-letter moment in commenting history. Let's rock out.

Which reminds me, just for future reference, although I’ve explained it before, “freely available” means “gratis” in my project design, so thanks for bringing this up– I will remember to use gratis in the future.

And people will comprehend you when you do. Because "gratis" doesn't just mean "freely available" in your project design. That's what it means.

The right doesn’t exist until the sovereign entity (the state) grants it. The question is, do you have a “right to life”? If you have to “make a claim”, then you don’t have it.

I thought your argument was that it didn't exist because it couldn't be demonstrated. Have you abandoned that position? Because if not, you still haven't made a case for it.

And if so, assuming that your position is now that rights don't exist because they're legitimate claims and if you have to claim something, you don't have it:

Just because something is a claim, that doesn't mean you have to make a claim in order to have it. That presumes the claim is not being honored.

So you're basically just reverting to "rights can be violated, therefore they're not real." And how the effing eff does that make sense? Please elucidate.

I think we can add one more item to the growing list of topics that z*bra knows little about: basics of governmental contracting for high-cost items.

If the average cost of 100 F35 fighter planes is $150 million, one doesn't save $150 million by reducing the order from 100 to 99. The actual cost would probably be $5.75 billion for the first plane and ~$75 million for each additional plane.

It's no big deal: just reduces the funding for the Magical Mystery Genome Project by 50%.
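The 50% figure can be checked with the numbers quoted in this thread. This is a minimal sketch: the $75 million marginal cost is the commenter's own guess ("probably"), not a sourced price, and the variable names are only illustrative.

```python
# Figures as quoted in the thread; none are independently verified here.
BIRTHS = 385_000                    # yearly births (zebra's Canada example)
COST_PER_GENOME = 1_000             # the "$1,000 genome"
AVERAGE_PLANE_COST = 150_000_000    # zebra's per-plane figure
MARGINAL_PLANE_COST = 75_000_000    # commenter's guess for each additional plane
PLANES_CUT = 4


def savings(planes_cut: int, cost_per_plane: int) -> int:
    """Money freed by ordering `planes_cut` fewer planes at a given unit cost."""
    return planes_cut * cost_per_plane


project_cost = BIRTHS * COST_PER_GENOME
at_average = savings(PLANES_CUT, AVERAGE_PLANE_COST)
at_marginal = savings(PLANES_CUT, MARGINAL_PLANE_COST)

if __name__ == "__main__":
    print(f"sequencing cost:            ${project_cost:,}")
    print(f"savings at average cost:    ${at_average:,}")
    print(f"savings at marginal cost:   ${at_marginal:,}")
    print(f"fraction actually freed up: {at_marginal / at_average:.0%}")
```

On these numbers, cutting four planes at the marginal price no longer covers the sequencing bill, and that bill covers sequencing alone.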

ann,

Obviously, one cannot violate or take away something that doesn't exist, and obviously that's not my suggestion.

Do you have a "right to life"?

If yes, you would have to give me some way to test for the existence of this entity "a right". Some way to demonstrate the difference between "having a right to life" and "not having a right to life". I see no way for you to demonstrate that difference.

If I kill you, agreed, that's not a valid test. But, nor is it a valid test if I don't kill you.

You are making the claim for existence, so you must provide the test.

Zebra @357: I didn't ask you *if* human rights must be considered first in human subject research. I asked you if you know *why* those laws are in place.

I have told you in several threads that any human subject research must respect the autonomy of all subjects, and that there are strict regulations and laws about this.

So, to re-phrase my question: Do you, zebra, know *why* there are rules and laws about the treatment of humans involved in human subject research? Do you understand how these rules came to be and their historical context?

The answer can be lazily summed up in one word.

(WRT claims, their inherent properties or lack of same, for the sake of example.)

I claim to be an American citizen. And in the event that a demonstration of that is required, here you go:

I'm an American citizen.

When returning to the United States from another country, however, I not only have to make that claim, I have to produce some evidence of it -- ie, a passport, which I only have to begin with courtesy of the state's recognition of my legitimate claim to one, on which its evidentiary value is entirely contingent.

So let's say I go to Canada, inadvertently forgetting my passport. Because I've actually done that. And this is what happened:

Canada was so cool with it that I didn't actually have to make any citizenship claims in order to get in. I just said, "Oops, forgot my passport. But I'm going to be late! It's a business thing! I have to be there! Let me in!" And they did.

On the way back, however, I had to both claim that I was an American citizen and back it up in order to be readmitted to the country.

Does that mean my American citizenship isn't vacuous in Canada, although I don't have it in the United States? Or is it fully pre-vacuous in its very nature because it's a claim the legitimacy of which is contingent on recognition by the state? Or would I have it as long as I never actively claimed it? Or what?

Another thing that doesn't exist by this logic: mathematics. How would one test for the existence of a purely conceptual framework?

Justatech #363,

There are laws about human subject research (or any other thing) because the sovereign entity establishes them. In a democracy (assuming it functions somewhat as we normally understand it should), that would be the result of citizens voting for their own perceived best interests.

You seem to be looking for some event that would influence that perception. There are plenty of instances where there were negative consequences to subjects-- perhaps you are thinking of the Tuskegee Syphilis Study, which is pretty famous. I don't know that that would be "the major event of 20th century world history", though. I'm certainly not going to further waste my time trying to read your mind-- "guess what number I'm thinking" is a pretty childish game.

And just a note-- you hardly need something of that magnitude to get laws passed; just look at all the paranoia, including yours, surrounding my innocuous suggestion here.

ann,

#364 way too confusing in terminology and organization.

If you can answer #362, you're done, so I will await that.

Ignoring #361 to focus on vague philosophical ruminations? Why am I not surprised. . .

You are making the claim for existence,

Not really. I've implicitly made that claim in passing in the context of rebutting yours, which is that rights don't exist because they're vacuous concepts because...Well. I'm not completely sure at this point. But I think it's either "they can't be demonstrated" or "claims are also inherently vacuous for reasons not yet stated."

Correct me if I'm wrong, although I suppose I might as well go ahead and note that if it's the latter, it's fully self-negating by virtue of being itself a claim. Which you're making.

In any event: Not really. I'm mostly rebutting your claimed grounds for the vacuity of rights and/or ethics and/or abstract concepts, as the case may be. (Again, I'm not completely sure. But it seems to be something in that general neighborhood. So I hope I've succeeded in covering it.)

so you must provide the test.

No, I most certainly must not.

You contend that rights and ethics are vacuous concepts. Your arguments so far in support of that contention don't make sense. And that would still be true no matter what I thought about it or why. So. On what do you base that assertion?

ann,

#364 way too confusing in terminology and organization.

Fine. Let's streamline it:

Is American citizenship a vacuous concept? If so, why? And if not, why and how does it differ from rights and ethics?

If you can answer #362, you’re done, so I will await that.

If you can tell me why I should disprove something you haven't proved, your waiting will not be in vain. But otherwise it will.

So please support your contention that rights and ethics are vacuous concepts in a way that makes sense. Because you still haven't done that.

Ann,

This is extremely simple and clearly stated:

"Do you have a “right to life”?

If yes, you would have to give me some way to test for the existence of this entity “a right”. Some way to demonstrate the difference between “having a right to life” and “not having a right to life”. I see no way for you to demonstrate that difference.

If there's no demonstrable difference between having a "right to life" and not having a "right to life", the concept is meaningless.

Sorry if you are unwilling to acknowledge this, but I guess I'm the only one here willing to admit I got something wrong.

Zebra, you like to make things complicated don't you?

There is a right to life and the demonstration of that is that someone or some thing is living. Then men and/or nature take that away for various reasons. Yet it still exists independently of either.

zebra- Do you know who else was famous for his efforts to advance medicine without concern for such things as ethics and rights of the patient? Josef Mengele. Think about that.

Ann,

This is extremely simple and clearly stated:

“Do you have a “right to life”?

It's clearly stated. But I don't see how any question about such a weighty subject could really be simple, except to the simple-minded.

But maybe you just mean that it's easy to understand what you're asking. And if so, I agree that it is.

Be that as it may.

If yes, you would have to give me some way to test for the existence of this entity “a right”. Some way to demonstrate the difference between “having a right to life” and “not having a right to life”. I see no way for you to demonstrate that difference.

And I don't see why I am or should be obligated to disprove a claim I didn't make in terms that I haven't used, when the only reason I'm discussing it at all is that you made it but were unable to support it with an argument that made sense.

You say that rights are vacuous concepts. The reasoning behind that is..?

If there’s no demonstrable difference between having a “right to life” and not having a “right to life”, the concept is meaningless.

OK. Are we now back to "rights are vacuous because they can't be demonstrated"?

Because it's not true that they can't. We've already gone over that. But we can do so again. Are you a commenter here at Respectful Insolence of your own volition, because you freely chose to be? Are you free to cease being one at any time you so choose? And in both cases, are you free to make your choice without fear of reprisal from the state?

Assuming that the answers are yes, yes, and yes, we have just demonstrated that you have the right of free association. Because you're exercising it.

Indeed, your apparent unawareness of that fact comes pretty close to proving that you're so securely convinced of your ability to do so that it's never crossed your mind that it's not among the natural powers you were born with. That's a pretty damn impressive demonstration.

Sorry if you are unwilling to acknowledge this, but I guess I’m the only one here willing to admit I got something wrong.

That's very unfair. I admit that I'm wrong much more readily than the internet average. I just have to be wrong first.

In any event. I'm perfectly willing to acknowledge any point you're capable of supporting with a sensible argument. But you haven't yet made one for the claim that rights are vacuous concepts. So get to it.

There is a right to life and the demonstration of that is that someone or some thing is living. Then men and/or nature take that away for various reasons. Yet it still exists independently of either.

Come on, now. How does a living thing demonstrate the right to life?

Life, in itself, does not demonstrate that the right to it does exist any more than death/killing demonstrates that it doesn't. Be serious.

^ By virtue of its existence. A dead thing has no right to life. Only something living can have a right to life.

ann, please, you really are not being serious

I have acknowledged multiple times that rights exist in the context of the sovereign entity (SE) or state. Do I really have to go back and show you?

I said that they are meaningless as I described them, as qualities or attributes inherent to the individual. They only exist as granted by the state.

If there is no SE with a code of laws, and a legal system that works as intended, then "rights" is an empty concept. But it is a perfectly useful concept (kind of a shorthand, the way I see it) when there is the SE and legal system. Your problem is not being able to keep these two contexts separate.

Ding ding ding! Gray Falcon @373 has it in one: Why do we have major international declarations on what constitutes human rights? Nazis.

Zebra, in more than one thread I have tried to explain research ethics and the laws, standards and regulations surrounding human subject research. So far you have categorically refused to acknowledge that any of it exists.

Please go read the Declaration of Helsinki and think about how that applies to your proposed research database, as well as your ongoing argument with ann about the concept of rights.

And re: " the paranoia, including yours, surrounding my innocuous suggestion here" - It's not paranoia if there is ample evidence for past malfeasance. Or, is it still paranoia if they really are out to get you?

Of course human rights and ethics are not Platonic entities that exist "out there"; they are the result of agreement and consensus. Ethics are the basis of laws and professional standards that are enforced and have real consequences. How can anyone be so utterly confused about this?

The point of all the many committees and steering groups internationally that are wrestling with the thorny problem of genomic privacy is to establish an agreed ethical framework and to establish what legislation, if any, is required. Here's a list of the different state statutes dealing with this issue that have been introduced over the past eight years.

Sorry if you are unwilling to acknowledge this, but I guess I’m the only one here willing to admit I got something wrong.

You owe me a new irony meter. 20 posts or so ago you were unable to admit that you were wrong about your use of gratis even as you agreed with ann about the definition (a definition that doesn't make sense in the context you used the word).

Zebra @371:
Please, I admitted I was wrong in #126.

All you want to do is look like you're better than the rest of us.

If there is no SE with a code of laws, and a legal system that works as intended, then “rights” is an empty concept. But it is a perfectly useful concept (kind of a shorthand, the way I see it) when there is the SE and legal system. Your problem is not being able to keep these two contexts separate.

Gosh. However did I get the idea that you weren't originally making that distinction?

But other than through the interpretations of SCOTUS in the USA, I have no “right to privacy” that I can characterize– I would have to go to court with standing in a specific case to find out what that would entail.

But OK. As long as your position is now that human rights are not actually a vacuous empty concept in countries that have them, so be it. I'll make more of an effort to keep them separated.

@capnkrunch --

20 posts or so ago you were unable to admit that you were wrong about your use of gratis even as you agreed with ann about the definition (a definition that doesn’t make sense in the context you used the word).

As I said earlier, I'm an optimist. So I could be wrong. But at least as I read it, the reason he said that is that he was admitting he was wrong. I mean, it really wasn't all that big of a deal. Lots of remorseful detail wasn't called for.

^ By virtue of its existence. A dead thing has no right to life. Only something living can have a right to life.

No argument here. My point was that its being dead doesn't, ipso facto, prove that there is no such right. Including if the death was caused by an unjustified act of killing in a human-rights-having country. I mean, if unjustified acts of killing are recognized and punished as such, that's actually proof that the right exists.

I guess I should really say "generally recognized and punished as such." There's always room for improvement.

Never let it be said that we don't live in a country that protects the fundamental right of citizens in a free society to write curse words on a speeding ticket, at least in the Southern District of New York, which is probably more liberal wrt such things than some.

/all-but-completely off-topic.

@Not a Troll --

Hey, Not a Troll!

Please accept my apologies for that comment @#375. It came out more brusque and snippy than I meant it to or than it needed to be.

Ann, no worries. We're good :)

Gray Falcon #300 “Okay, reading this long thread, I’m getting the impression zebra is unaware of exactly how much genetic information a human being has”

JustaTech #309 “Ethics are not optional in human subject research. Not in the US, or the UK, or really anywhere in the world. There are standards and treaties and laws about these things. IRBs are not optional, and they are not optional for a reason.”

JP # 322 “we don’t know right now just how much information about a person is in their DNA”

Many good points have arisen in this thread, including the above. My 20th century thinking, as Zebra calls it, is based on much recent NGS data trudging, including the use of the Oxford Nanopore MinION, which is really the next NGS (about as far from 20th century thinking as you can get in genomics). I have been involved in the genomic study of a birth cohort, of 97% of people born in a single year in a single city. Note: the cohort is over 35 years old now (and has been tracked for health and sociological outcomes since birth), and we did not get genomic data from all of them. Consents and ethics are a big source of loss, as are death and inability to track individuals over their lifetime. However, even with this rich set of data from tracking a birth cohort over 35 years, there is really not much you can glean from the genetic data, aside from ethnicity and other stuff that makes a lot of people reluctant to participate in these kinds of genomic projects. The reasons for a general lack of clear actionable genomic outcomes are very complex. I could give you a long list, or you could just take my (and other contributors’) word for it.

Our group is hardly alone in this dearth of actionable data. There are very large international consortia with very large data sets. COGS is one which is relevant to ORAC’s post (which was a very good synopsis IMO). Samples from 239,832 individuals from 167 research groups from all over the world (breast, ovarian and prostate cancer) have been used. In spite of this huge database, there are very few instances where a clinician can say, based on genetics alone, that an individual WILL get this cancer, or that a particular treatment WILL cure this person’s cancer. That’s because cancer, like most diseases, is multifactorial: genetics at birth may contribute, but it is often overshadowed by lifestyle factors and just bad luck. Genetics gets even more complicated when you are looking at diseases such as autism, schizophrenia, depression, obesity etc. And genomics is not just about a person’s DNA sequence, which can be recorded in a 3 billion base pair text file derived from a much bigger NGS run. As Malia alluded early on, we just don’t know enough yet to make any sense of the data we already have. Generating even more just “because we can” makes no sense, is not ethically tenable (even if you don’t believe in ethics), and is not without risk to the individuals.

However, I will say that on some level I understand where you are coming from. I think most of us working in the field dream of a day when we can take blood from a person at birth, and use genomics to help guide that person in their lifestyle choices, prevention and screening programs, tailor drug treatments and generally help them to lead a healthy and long life. However, mass genomic screening of people who are not yet sick, and may not be sick for many decades if ever, is not the way to go right now. We are working towards it (maybe). The hurdles are truly enormous. Generating the DNA sequence is NOT the bottleneck. Doing the genetic groundwork to make the data useful, consent and ethics, data storage, access, security and a whole host of other things are the current bottlenecks. So maybe I share your dream, but I don’t share your ideas on how to get there. I know the likes of Kaiser Permanente’s HUGE project are working towards this too. So it’s not a lack of people already working on it that is holding things up. It is the fact that humans are complicated, and there’s so much more than genetics that goes into it. It’s that there is never enough money to fund every blue-sky idea that pops up. And even given enough money, ethics and privacy will always raise their ugly heads (unless more countries move to a totalitarian approach to the problem). It seemed a good idea at the end of the 20th century. Now we need to work smarter, not generate even more data that we don’t know what to do with and that many people don’t want us to have. If you think data security isn’t an issue, maybe you should brush up on Wikileaks or Edward Snowden to remind yourself how vulnerable sensitive data can be.

Why would I say that genomics is complicated data, and then say that your genome is sensitive information? At the moment, it is mainly due to a plethora of really poor studies in the past that have inferred an association between a single gene or a few genes and certain health or social outcomes. A perfect example of this came up in our lab, where we used to have honours students extract their own DNA, then gave them some ‘mystery primers’ to amplify and sequence different genes. They then went away to identify their mystery gene. One of the students happened to get primers that amplified his DRD2 gene. He correctly identified his gene, then saw he had a variant that in past literature was associated with schizophrenia. Just so happened his mother had schizophrenia. He was beyond distraught, and we’ve since had to change our lab practice to make sure students only use de-identified DNA. Not because there is any real association between DRD2 and schizophrenia, but because spurious genetic efforts to interpret DNA sequence in the past had led to publications claiming that such an association existed. We now refer to most diseases like schizophrenia as MAGOTS. Many Associated Genes Of Tiny Significance. And this is most diseases.

And this opens a whole new can of worms. If we generate complex NGS data, who owns it? Can the person who donated the sample access this data and trawl through it to find all sorts of worrying associations? And if we don’t let the individuals have their sequence data, is there an onus on those who control the data to let the individuals know if they have any actionable, pathogenic or incidental genetic information, assuming we even have the ability to recognise these? So to recap, there are basic genetic and environmental knowledge issues, data storage and processing issues, privacy, ethical and ownership issues, cost issues, and ummmm, issues galore to be sorted through.

Such projects require a perfect combination of artificial intelligence and human stupidity.

Retro Pump,

Thanks for acknowledging that I was making a positive suggestion.

I have to ask again-- why didn't you jump in sooner? If you had said you participated in exactly the kind of data collection I propose, it might have saved an awful lot of the silliness. At least they would have been reluctant to compare me to Mengele, since you would have been tarred with the same brush.

Just let me clarify the 20th century point, and an apparent misunderstanding. I didn't say we need more data, although we might. I say we need more researchers, doing basic research, not necessarily directed towards finding medical cures.

Making the large amount of data available gratis would aid that.

The availability of that much data at a central source should, if I understand what you are saying correctly, reduce the obvious problems that you list. There would be uniform standards and uniform data, rather than a hodge-podge of sample collection and storage options and security options and privacy options and so on.

So it has nothing to do with what equipment you are using, and there is no criticism of your capabilities. But lots of things are changed by technology, and the structural and organizational aspects of the enterprise should reflect that.

This would be a use of government funding with a substantial multiplier effect.

Pointing out that a lack of ethical guidelines in medical research can lead to the horrors perpetrated by Mengele does not constitute comparing zebra to Mengele.

ann,

Sure, when I admit an error you have no trouble at all interpreting my words....

But I do appreciate your agreement.

I would note that the traffic ticket story is spot on with respect to what you quoted; for me this is where it gets interesting. The most convincing evidence that you have an actual right (granted or constructed by the SE) is when the issue is adjudicated, and even more so when there is a record of rulings that go both ways. In a sad way, the death penalty is ultimate confirmation of the claim that citizens have a "right to life", if it is applied fairly.

I note also that there are obviously people who still don't get it, who believe that "ethics" are the determinants of law. Just like the people who believe that the US constitution is derived from "Christianity". Oh well.

zebra, if you don't want to be compared to Mengele, then stop dismissing the concepts of rights and ethics.

zebra, # 289
I am not really able to follow your arguments about ethics.
I thought you could be happy to live in a country where
"...We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness.--That to secure these rights, Governments are instituted among Men, deriving their just powers from the consent of the governed..."

# 290
I was really lurking from the start, and I was aware of your proposal.
I wasn't aware of the complexity of the matter either, and I am grateful to the knowledgeable people who laid it out.

But I think something is missing from the discussion, and it is only surfacing from Retro Pump's last post.

As I suggested in Orac's last post about ductal carcinoma in situ, many medical procedures (in that case breast screening) are implemented without knowing all relevant consequences.
A proposal to gather DNA for genomic screening would bring the unwanted consequences Retro Pump found in his student: the discovery of pieces of information which rarely could benefit the donor, but more frequently would damage his psychological health.
And when humans have information at hand, they feel compelled to use it, even when the information is useless for their stated purposes.
This is the reason it is so hard to convince women not to have a biopsy after a positive mammography report.

Just like the people who believe that the US constitution is derived from “Christianity”.

It would be silly to say that the US constitution was merely Christianity transposed to another context. But there's really not a whole lot of western thought that isn't derived from Christianity. It's kind of the you're-soaking-in-it matrix of the contemporary west. For example: Prior to the rise of Christendom, the law -- or, more specifically, Frankish law -- did not distinguish between murder and manslaughter and when somebody got killed, the killer made restitution to their clan in whatever amount the loss of that person's life represented. IOW, all lives were not regarded as equally valuable simply by virtue of being lives. That idea didn't become current until c. the Holy Roman Empire, because it derives from the belief that since everyone has a soul like everyone else, everyone is an autonomously valuable individual of fundamentally equal worth.

So. The constitution wasn't written until almost a thousand years after that. But inasmuch as it expands on the same concept, it is derived from Christianity. It's also why most of the west rejected slavery on its own turf and eventually everywhere.

That's not to say that all western thought is religious. Obviously it's not. But Christianity is a very pervasive (if attenuated) influence. Omnipresent, pretty much.

The idea of government by an institutional hierarchy where the powers of a position and the respect due to it are vested in the office rather than the officeholder, who occupies it by appointment/election and not hereditary privilege -- ie, the office of the presidency, etc. -- is also derived from Christianity. Or actually, the Church.

Not purely. There's some partial precedent in classical antiquity, too. But you know what I mean.

BTW:

Sure, when I admit an error you have no trouble at all interpreting my words…

^^That's not willfully selective. I swear.

Perodatrent, welcome back.

What you quote would correctly be described as "propaganda".

What makes me happy to live in this country is the Constitution itself and the development over the centuries of a structure of law that has, if imperfectly, granted and constructed that could be described in those terms.

It was not The Creator that gave the slaves life and liberty, or gave women the right to vote and own property, or gay people the right to be married-- to pursue happiness.

It was the people who sacrificed and died to achieve those things. And they are only unalienable by virtue of eternal vigilance, as they say.

Saying that the rights are supernaturally extant, like souls, and we are "securing" those rights by forming a government, is a sop to those who need some kind of Authority figure around as a security blanket.

Your concern about the existence of the knowledge is understood. How does my proposal change anything? If you can get a genome done for $1,000 or even much less eventually, people will get them. Are you going to declare a War On Genomes, like the War On Drugs?

#395 "what could be described in those terms"

ann,

All valid. Once again I was using shorthand; Christianity in quotes as a stand-in for garbled misinterpretations of mostly the Old Testament that our resident Taliban types promote.

But you know, we could have another long (sophomoric according to capnkrunch I'm sure) discussion, about whether the Church was more influenced by Rome than Rome by the church. And Rome influenced by Greece, and then there's the Eastern Church, and ....

Another time perhaps.

@zebra --

As long as I'm at it, since you mentioned it @#292:

For example: Prior to the rise of Christendom, the law — or, more specifically, Frankish law — did not distinguish between murder and manslaughter and when somebody got killed, the killer made restitution to their clan in whatever amount the loss of that person’s life represented.

^^The penalties for killing were calculated based on an-eye-for-an-eye principles, IOW. And that wasn't because the Franks were reading the bible. Or even reading, for the most part. They were pre-Christian, and there weren't really books yet.

Because "an eye for an eye" isn't actually a religious sentiment and it doesn't actually originate in the Hebrew bible. It's there because it was a tenet of Babylonian secular law -- eg, the Code of Hammurabi -- that Jewish law picked up. But it's also not inconceivable that it arose independently in other clan/tribe-based societies in other parts of the world subsequently.

Long story short: It sounds like a religious prescription to a contemporary western sensibility because most people in the contemporary west know it from the bible. But it's not creator-mandated or anything like that. It's just law that happens to be in the bible. And it's been around a long time. I don't know why that is. It just has a very enduring popular appeal, seemingly.

Maybe that's dull, I don't know. But if so, I apologize. I find that kind of thing interesting.

zebra, do you really want to work with a medical system that considers ethical behavior "twentieth-century thinking?"

Question for you, zebra: What, precisely, was wrong with the Tuskegee Syphilis Study? Answer that without bringing up ethics or the "right to life".

I've heard the real understanding of "an eye for an eye" is that it was addressing the culture of extracting more from a perpetrator than what was lost "two eyes for one eye".

I can't say I vetted this yet but it wouldn't surprise me if that is what was happening with vengeance and all but it is not something I'll get into a debate about at this point.

ann,

I of course find these things interesting. I just don't want to offend sensibilities much more than I already have, and spend too much time OT.

I don't think there are many if any things that you could properly call "religious" in this context. As I suggest WRT the US constitution (#395), societies establish laws pragmatically, and attribution to some "higher power" or secular "moral/ethical" construct is simply an attempt to better achieve consensus and compliance.

garbled misinterpretations of mostly the Old Testament that our resident Taliban types promote.

As a non-Christian, I feel obligated to point out that the Old Testament is not the same thing as the Hebrew bible, even though it is.

Meaning: The popular belief that those books of the bible are all vengeful and wrathful and full of harsh, bloody and unforgiving eye-for-an-eye conflict is derived from Christianity, which has a much more literal and historicist approach to reading sacred texts than Judaism does. Plus, you know. Back in the origins-of-Christianity day, being an improvement on Judaism was an important brand attribute.

That's not how Judaism reads or understands the same texts, however.

or secular “moral/ethical” construct is simply an attempt to better achieve consensus and compliance.

Well. That's not necessarily a bad -- or ftm, vacuous -- thing, assuming that the consensus isn't unduly restrictive and the compliance isn't coerced.

I also don't know about the "simply." Ethics facilitate the functioning of civil society. And a functional civil society facilitates the preservation, maintenance and practice of fundamental freedoms.

It's all pragmatic, in one way or another. Praxis is complicated. And also error-prone. If it wasn't, everybody would just make a road-map to nirvana and settle in immediately.

ann,

Jesuitical or rabbinical; had to be one or the other....

But 404 doesn't hold up to those standards. What exactly constitutes a "functioning civil society"? Societies that hold slaves and oppress women "function" just fine.

Grey Falcon 404,

I really don't want to spend my time being a philosophy 101 professor, but you are insistent.

I can't tell you what was "wrong" with the study because "right" and "wrong" are empty terms.

I can tell you why I would vote for laws prohibiting such practices-- because the next time it could be me.

I don't know enough detail to speculate on what laws it might have violated at the time, but that would be about whether it was illegal, rather than "wrong".

Jesuitical or rabbinical; had to be one or the other….

They're very similar. As it happens, I didn't have that kind of religious education, though. Or any kind, in fact. It's just the turn of mind I was born with.

Annoying, but not without its advantages.

But 404 doesn’t hold up to those standards. What exactly constitutes a “functioning civil society”?

One in which the non-governmental institutions and individuals that make up a society are able to function, within reasonable parameters. At a minimum, that requires a general ethical commitment to fairness and....I'm not sure what to call it. Something like "care/harm avoidance."

Those obviously aren't the only requirements. There has to be enough baseline security/stability wrt the resources necessary for survival first. For instance. But assuming that's in place:

You really need those ethics to keep stuff up and running. It's kind of like:

You can take the individuals out of the tribal/feudal society, but you can't really take the tribal/feudal inclinations out of the individuals. Because at the individual level, that's how self-interest tends to incline, due to the scope of consequences that are immediately within any individual's sightlines being very narrow.

Ethical guidelines help. They're a form of practical wisdom, basically.

Societies that hold slaves and oppress women “function” just fine.

Yeah, but they're unfair.

zebra # 395
A long time ago I was taught that there were two sorts of Rights, Natural and Positive.
Constitutions were examples of positive rights. But I think they are based on some kind of natural right which is not completely vacuous, as you seem to believe. After all, people writing constitutions have to base their written words on shared beliefs.
As for your example, if Kuwaitis like it that way, who am I to argue?

As for the "right" to have one's genome sequenced, I think it should be done only after specialist counseling, as already happens, for example, for HIV tests. Not a job for a family physician, but for a geneticist.

I have the "right" to have my genome sequenced. I do not have the *desire* to do so. And I would resent being forced to do so. I would resent any effort to make me have my children's genome sequenced mandatorily. And certainly I wouldn't give consent at their birth to have it taken and kept somewhere. At least not as things stand today.

In 50 or 100 years, my descendants might have a different view/desire. But I would never agree for it to be required.

I really don’t want to spend my time being a philosophy 101 professor

Oh, G-d, it's delusional. First the jaw-droppingly primitive performance about rights, and now ethics.

You're arguing with a dining-room table, folks.

ann,

We're just going to have to disagree.

Replacing a genuine, expansively empathic and compassionate psychology with the kind of Authoritarian rulebook you are talking about is both dangerous and degrading.

"Fairness is 'good' " is trivially replaced with "greed is good", as we have seen over the last few decades. I would prefer an effort to help people to escape adolescence, as much as possible.

I do enjoy this latter part of our conversation, but unless someone wants to discuss the on-topic Zenome project who can maintain an equivalent level of civility and knowledge as you have demonstrated on this, I'm done.

zebra, would you oppose anything like Tuskegee studies if you were assured you wouldn't be subject to experimentation?

perodatrent,

If you never heard of "natural 'right' to life", you would still put laws in your constitution prohibiting murder, and arbitrary execution by the government, so you would not be walking around in fear all the time.

Also, I have to say that thing about "get with the 21st century" again-- you can buy HIV tests OTC for USD 39.99.

zebra@413: Or the ruling class could put in rules prohibiting murder and arbitrary execution of their own class, and to heck with everyone else.

Grey Falcon, you have to read my other comments and figure it out; I'm not going to keep repeating.

I would oppose them because I empathize with people suffering, and because, even if I am not directly involved, it causes disruption in the society. You can't really be isolated from things-- it's like when the cop shoots the black guy for no good reason. Everyone picks up the tab, in multiple ways.

Done now.

I have to ask again– why didn’t you jump in sooner? If you had said you participated in exactly the kind of data collection I propose, it might have saved an awful lot of the silliness. At least they would have been reluctant to compare me to Mengele, since you would have been tarred with the same brush.

Now there's a failure of reading comprehension.

^ The "they" is particularly precious.

I would oppose them because I empathize with people suffering, and because, even if I am not directly involved, it causes disruption in the society. You can’t really be isolated from things– it’s like when the cop shoots the black guy for no good reason. Everyone picks up the tab, in multiple ways.

Perhaps zebra is unaware that the philosophical study of how we can minimize people's suffering and reduce disruption in society is called 'ethics'.

From 'The Ethical, Legal, and Social Implications Program of the National Human Genome Research Institute: Reflections on an Ongoing Experiment' published last year:

The importance of the ethical, legal, and social dimensions of genetics and genomics research—acknowledged in the initial assessment of the plans for the Human Genome Project —was given formal recognition in 1990 with the establishment of the Ethical, Legal, and Social Implications (ELSI) Program, a component of the extramural genomics research program of the National Institutes of Health (NIH). The program began, and in many ways continues, as an experiment. It was legislatively instantiated in the National Institutes of Health Revitalization Act of 1993, when Congress, in establishing the National Center for Human Genome Research [the predecessor to the National Human Genome Research Institute (NHGRI)], mandated that “not less than” 5% of the NIH Human Genome Project budget be set aside for research on the ethical, legal, and social implications of genomic science. More than 20 years later, the need to pay close attention to such issues is almost universally appreciated, and the terms ELSI and ELSI research—coined initially simply as bureaucratic shorthand for a particular NIH funding program—have become staples in the lexicon of the genetics and genomics field.

I note that "the need to pay close attention" to "the ethical, legal, and social implications of genomic science" is "almost universally appreciated". Perhaps zebra didn't get the memo.

Zebra @ 388
“I have to ask again– why didn’t you jump in sooner?”

Well, mostly I don’t have enough spare internet time to participate in these discussions, so I rarely even keep up with them. I usually hit them from my TOC alerts when they first come out, read the comments, then don’t check back again. When I first read this post there were not that many comments, and to address some of the questions raised adequately would have taken a book-length post, and still not covered everything. Plus after 30 years working in the field I still don’t know close to everything. I know a whole lot less now than when I first learned as an undergraduate that DNA>RNA>Protein>human (which is now clearly not true). I have only touched on what I consider the tip of the iceberg.

I don’t understand how I could be tarred with the same brush as Mengele, but then we perceive our own faults less than others perceive them. The student I mentioned was informed not only that he used his own DNA, but that there is also the possibility of incidental findings. He was also informed that the genes we were handing out had no real, established link with diseases. Part of the exercise was in fact for the students to trawl through the literature on their gene and realise how many false positive associations there are in the literature when you look at it in the light of more advanced research. And obviously, this person was doing a project in a genetics lab, so arguably had better knowledge to understand incidental findings in the context than most lay people.

To this day, twin studies are considered the gold standard in teasing out genetics and heritability from environmental factors. The difference today is ethics and consent. We have to make sure that any study participant (or patient sent for genetic testing) OR their parents, not only consents, but understands the possible positive and negative impacts of participating in the study. We have to take into consideration ethnic, cultural, religious and personal preferences. We never know all the potential pros and cons, but the concept of ‘first do no harm’ is as incumbent on researchers these days as it is on medical doctors. This is the 21st century thinking anyway.

Just for fun, let’s stick to schizophrenia. It is, judging by twin and family studies, the most highly heritable mental disorder we know of. But read this article http://psychcentral.com/lib/schizophrenia-and-genetics-research-update/
and have a good hard think about what Malia said about doing the basic groundwork. We are so far from having any cogent answers to most complex diseases that the open questions pile up: what do we do with this data, what are the benefits versus harms to the participants, who owns the data, how does a child who comes of age and decides they don’t want their data held get it removed, and who gathers not just the genetic data, but all the phenotypic data which will be needed to maybe make sense of the data someday? Hint: These targeted studies are a much better way forward to answer gene x environment questions, as much of the phenotype data needs people to actually go looking for it. You can’t just count on it being captured by your family doctor. So yes, we need more researchers on the job at ground level, but not necessarily for archiving scads of data that may be close to useless without closely monitored phenotypic data. But we also, in this day and age, need to have systems in place that give study participants the information and reassurances they need to feel comfortable participating in these kinds of studies. Please understand that in a civilised society THIS IS FIRST AND FOREMOST. We have to be very clear up front about who owns and can use the data, what types of analysis or studies it can be used for, how the data will be protected, what happens to the data and samples if the study winds down for whatever reason, and so on. For instance, from my own personal experience, we had consent to collect DNA from our cohort for genetic analysis for a broad range of questions, but when we needed to send the DNA to another country to get the genetic analysis done cost-effectively, we had to go back to every person and get new consent. And some of them declined. Most interestingly, this consent was to send the DNA to the USA (who then generated the data), and some people just didn’t feel at all good about that.
With the size of the study you are proposing, it will almost certainly need international collaboration. How many Canadians in your proposal might say go ahead with your plan, as long as the samples and data stay in Canada? Quite a few would no doubt object to sending it to a US government body. You may think this is just nonsensical paranoia, but what if one of the conditions you want to study is paranoia? Though more than ever I don’t think it is paranoid for people to worry about third parties getting hold of their data.

Last of all (for now) just think about the amount of data you are talking about. It can be trawled for so many things, and whether a researcher has legitimate or nefarious reasons for doing so, you can massage this vast amount of data to imply ethnic links with intelligence, violence, or socioeconomic outcomes, genetics of sexual preference, or pretty much anything else your heart desires. If you feed this kind of data back to participants, you set in place the perfect storm for eugenics or ethnic cleansing once again. We have to tread very carefully going forward.

Just this week in the BMJ was an article that is relevant to ORAC’s subject. It is about cancer genetics: http://www.bmj.com/content/351/bmj.h4805?etoc=
… cautioned oncologists on their use of multigene panel testing, which can turn up variants whose clinical significance has not been well established. “Because of the current uncertainties and knowledge gaps, providers with particular expertise in cancer risk assessment should be involved in the ordering and interpretation of multigene panels that include genes of uncertain clinical utility and genes not suggested by the patient’s personal and/or family history,” the statement said.
“Increasingly, we will be the recipients of data that we did not anticipate or, perhaps, even seek to know,” they wrote. “Accordingly, we will be in the uncomfortable position of reacting to that data on the basis of an immature and incomplete understanding of what [those] data mean.”

Retro Pump @419: "I don’t understand how I could be tarred with the same brush as Mengele, but then we perceive our own faults less than others perceive them."

As one of the people who resorted to that simile to try to explain the need for IRBs to zebra I want to be very clear that I do *not* think the study you are involved with is unethical in any way. Obviously you have IRB approval, and based on your story about the student and the schizophrenia gene you really care about ethical conduct and the possibility of harm.

Thank you for posting here, it's always great to hear from an expert! :)

Replacing a genuine, expansively empathic and compassionate psychology with the kind of Authoritarian rulebook you are talking about is both dangerous and degrading.

Wait. What?

Fairness and care are universal moral fundamentals in all cultures at all times. Small children understand and appreciate that they're good and useful qualities. It's hardly a dictatorial imposition to suggest that bearing them in mind is generally conducive to successful societal functioning.

I mean, within very broad parameters, people should be free to go to hell their own way, if they like. (By which I don't mean "hell'; it's just a figure of speech.) Goes without saying.

But some basic acquaintance with the outlines of civics and ethics is useful. You never know when it might come in handy.

“Fairness is ‘good’ ” is trivially replaced with “greed is good”, as we have seen over the last few decades. I would prefer an effort to help people to escape adolescence, as much as possible.

Yeah, well. If it all has to be reduced to a three-word motto, it's never going to mean much to anybody. But it doesn't. So moot point.

Adolescence has its pros and cons.

I hope I never completely stop being naive, personally. It's hard to keep that alive sometimes, though. Times is tough.

I do enjoy this latter part of our conversation, but unless someone wants to discuss the on-topic Zenome project who can maintain an equivalent level of civility and knowledge as you have demonstrated on this, I’m done.

Thank you. I enjoyed and appreciated it too.

I do enjoy this latter part of our conversation, but unless someone wants to discuss the on-topic [sic] Zenome project who maintain an equivalent level of civility and treats me with the "respect" that I "deserve" has knowledge as you have demonstrated on this [beg pardon?], I’m done exercising my "rights" to not be offended and to go away.

FTFY. Leaving aside the fact that "the latter part of [the] conversation is completely on-point, you've had plenty of entirely practical questions about the Zanadugnome Project, which have been completely ignored.

Describe the vaporware that "is" the interface.

^ "and has" and close quote after "[the] conversation."

Retro Pump,

Again, your response is greatly appreciated.

WRT Mengele-- perhaps you are not familiar with the dynamics of this group, so let's get that out of the way first. While I hesitate to compare myself with a young person for whom I have the greatest respect, what I experience on this blog can be compared to what our US president experiences-- I think of it as ZDS, or zebra derangement syndrome.

So, when you participate in a birth cohort study, it is all well and good, but when I suggest a similar, completely voluntary program, with all the same or greater safeguards, it becomes a fantasy about Jack-Booted-Thugs forcibly taking samples from helpless infants. Perhaps they also think I'm in cahoots with Barack, and we're going to store all the guns he confiscates in the same place as my DNA samples.

Can we move on to reality now?

What I would find extremely helpful is numbers. How many individuals in the cohort study you mention? What would be an ideal sample size to study schizophrenia? And so on.

When I read your description of the difficulties of the research, and your reference, I think: "how can I help"? My answer is that if we collect all these samples, and digitize some portion of them, and also create a registry of conditions, we *take away* many of the problems you describe.

If someone withdraws, for whatever reason, or dies, we can replace her. If someone objects to giving the data to a US researcher, we can replace him. The people doing the research do not have to deal with anything *but* the data; all the permissions and notifications are part of the centralized system...

Now, I could go on, but I'll wait and see if you are able to engage further on this. I understand that your research project is extremely difficult, but as another pretty fair US president said-- we aren't going to the moon because it is easy.

Zebra, no matter what you are A. Always a victim B. Always claiming victory. Even in those rare moments when you are being kind. That's quite a skill set to have in the business world, and it's one of the many reasons I'm no longer in it.

When I read your description of the difficulties of the research, and your reference, I think: “how can I help”? My answer is that if we collect all these samples, and digitize some portion of them, and also create a registry of conditions, we *take away* many of the problems you describe.

This is spectacular.

Narad@410

You’re arguing with a dining-room table, folks.

Don't insult dining room tables. At least they have a leg to stand on!

zebra
Do you not see the difference between Retro Pump's birth cohort study (one city) and yours (one country)? Why do you think your idea solves problems that Retro Pump specifically said it doesn't? How on earth does collecting more data (that even according to the experts we can't use) solve storage, ethics, consents, security, etc problems?

Regarding Mengele, no one called you that. You couldn't answer questions regarding privacy and ethics (no surprise, the experts can't either) so you dismissed them as vacuous (then really made an ass of yourself by saying the same of human rights). The Nazis were just brought up as an extreme example of why ethics are so important. You can call this topic off topic (it is) but I'd note that you were all too happy to continually harp on it while on the other hand you were all too happy to drop any discussion of confidentiality issues beyond "I don't agree" or "I don't think...".

The closest you've come to an actual defense is that 200,000 is only a small fraction of the total 400,000,000 population. This is nonsense: there is no centralized medical record database of 400,000,000 people. Also, 200,000 is several orders of magnitude greater than any current database, so even if there were a central repository of medical records it would still represent a unique asset. Of course I already made and you already ignored both those points.

You keep saying that you don't think your database poses a risk to the patients but have yet to provide any evidence to support that claim. There have been several people (myself included) with experience in infosec as well as one person with genomics experience telling you otherwise. I've provided several references explaining how confidentiality can be breached. If you'd like I can find references about hospitals being breached as a demonstration that hackers would be interested in the PHI of 200,000 but I'm not particularly inclined to look given that you've summarily ignored every reference I've provided so far.

When several experts in a field tell you "we don't need that right now. In fact, it's a bad idea at this point in time," the rational response would be to cut your losses, not double down. You're "the only one here willing to admit [you] got something wrong," my ass. To quote Narad, "[s]eriously, go f[*]ck yourself."

Put another way, several issues with your idea exist solely because of the size of the database (storage, processing) while others are made worse because of the size (consents, privacy, security). How exactly does gathering large amounts of data solve any aspect of any of those issues?

“So, when you participate in a birth cohort study, it is all well and good, but when I suggest a similar, completely voluntary program, with all the same or greater safeguards, it becomes a fantasy about Jack-Booted-Thugs forcibly taking samples from helpless infants.”

Umm, I never said and certainly don’t believe there is anything wrong with birth, or any other, cohort studies. I am just saying that you have to have proper ethical and other considerations in place, and these considerations need to be flexible into the future as ethics, technology and society change.

“how can I help”

You can volunteer to be in genetic studies that are already underway. If you don’t (yet) have any glaring health problems, you could always just be a healthy control.

“If someone withdraws, for whatever reason, or dies, we can replace her. If someone objects to giving the data to a US researcher, we can replace him.”

So this is no longer a birth cohort??

“My answer is that if we collect all these samples, and digitize some portion of them, and also create a registry of conditions, we *take away* many of the problems you describe.”

Well, the data is digitised to begin with, so not sure what you mean. That is what NGS data is. How will this registry of conditions be maintained? It takes a great amount of resources to follow a birth cohort and “update” their conditions as they develop. It needs to be done proactively.

More importantly, there is so much more to human wellbeing than just raw (digitised) DNA sequence. Properly designed studies will give you so much more bang for your buck than just collecting a whole bunch of random data. In the case of schizophrenia for example, you really need family based studies. That’s because there is not one gene, or even one set of many genes that leads to schizophrenia. It can be different in each family. One estimate is that you would need up to 50,000 sibling pairs to have 80% power to detect a locus accounting for 5% of variance in liability to schizophrenia at α = 0.001. And the best evidence is that a lot of the risk for schizophrenia is laid down in the prenatal period due to environmental factors (e.g. mum is infected with a certain virus at a critical in-utero period). So a best case scenario is that if you had these 50,000 sibling pairs, you *might* be able to detect a gene that *may* contribute up to 5% of the genetic risk of schizophrenia in the general population. Note the data from these sibling pairs is unlikely to come from a birth cohort. Unless the siblings are twins, they won’t be born in the same year, and even then you have to wait until both siblings are adults so you can assess whether or not they develop schizophrenia. So, better to spend the money finding these adult siblings with and without schizophrenia and work backwards from there. Now keep in mind that a 5% risk in one gene, is not saying that if you have this one gene you will get schizophrenia. It just means you are statistically slightly more likely to develop the disease. And even now, we can’t say for sure that THAT gene difference is the cause of your increased risk. That requires a lot more work in genetics at the level of epigenetics, RNA, protein and other more difficult and expensive tasks. The most we can hope for in complex diseases like schizophrenia is to try and understand the basic underlying biology and develop better treatments. 
Or maybe it’s better to spend the money on understanding and mitigating known environmental risk factors.

Good research quite simply requires good experimental design. You need to ask and answer the questions about what you intend to do with your data. What is the outcome you hope to achieve? How does it help to know that out of 200,000 kids born in a single year, 50% of them will grow up to be obese, and that about 30% of this risk of becoming obese is due to variation in maybe 300 genes? What if you have 150 out of 300 of these deleterious genes? What are the implications for you (or your children)? What do we decide is the cutoff for number of deleterious genes before we recommend ‘intervention’ to prevent you/your child from growing up obese?

Maybe a more concrete example from my own past will help you understand the genetic and ethical difficulties we have. After much genetic and phenotypic groundwork, a variant of the MAOA gene is now pretty well established as a risk for abused or neglected kids (male children as it is on the X-chromosome) growing up with a variety of antisocial behaviours. Do we screen all male children at birth, and tell parents “don’t abuse or neglect THIS child as you may ruin them as adults”? Or do we just skip the genetics altogether and tell all parents of all children not to abuse or neglect any child? Or do we do prenatal diagnostics and recommend terminating this child or adopting them out unless you intend to look after them properly? There have been court cases where this gene was used as a defence for antisocial behaviour. There was an ethnically divisive and dubious publication that implied a high rate of this variant in a Polynesian population explained their higher rates of convictions for violence and incarceration (dubbed “The Warrior Gene”). Do we really want to allow “my genes made me do it” as a defense, or “they” can’t help it, because ethnicity? There are so many legal and ethical considerations that need to go into how this kind of data is used. If we find your child has 150 out of 300 of the risk genes for obesity, do we say “make sure THIS child eats properly and gets plenty of fresh air and exercise”? Do we flag children with an increased risk of developing depression and let parents know that they should try and minimize life adversities for their child? Do we send children with an increased risk of developing alcoholism to be raised in a strict Muslim country?

Or do we spend the money on bettering the environments for our children? Do we try harder to mitigate the harms from increased food availability and screen time, drugs and alcohol, socioeconomic disparities, environmental contamination and degradation and all the other things that we know for sure can make a difference? Do we invest more in storing cord blood for future treatments that may become available? It may come as a surprise to hear this sort of thinking from a life-long geneticist, but so much of the disease burden we face now, though there is undeniably a genetic component, is better addressed through public health and social approaches. Some diseases could be prevented or treated better through a genetic approach, but these will be in the minority. We still don’t know what to recommend for women who have one of the known and major genes for breast cancer. Basically you still have to weigh up the costs of the Zebra project with the potential harms, and then address what benefits you envision will outweigh these costs and harms. You have to be convincing enough and show enough benefits, to get the US DoD to part with those fighter jets and sink the money into your project instead. You have to convince the parents of the children you want to recruit into your project, not with exaggerations or lies about the potential risks and benefits, but with some cogent and realistic reasons to participate (or not).

For Zebra and anyone else who wants a glimpse at just how hard the groundwork is for the kind of genomics in the Zebra project, I was just coincidentally reading this paper. I *think* it is publicly available...basically it reiterates the idea that it is not raw DNA sequence that is holding us back, but really nailing the link between this DNA info and the ultimate results on phenotypes:

http://www.pnas.org/content/112/37/E5189.long

Comparison of predicted and actual consequences of missense mutations

Significance

Computational tools applied to any human genome sequence identify hundreds of genetic variants predicted to disrupt the function of individual proteins as the result of a single codon change. These tools have been trained on disease mutations and common polymorphisms but have yet to be tested against an unbiased spectrum of random mutations arising de novo. Here we perform such a test comparing the predicted and actual effects of de novo mutations in 23 genes with essential functions for normal immunity and all possible mutations in the TP53 tumor suppressor gene. These results highlight an important gap in our ability to relate genotype to phenotype in clinical genome sequencing: the inability to differentiate immediately clinically relevant mutations from nearly neutral mutations.

Abstract
Each person’s genome sequence has thousands of missense variants. Practical interpretation of their functional significance must rely on computational inferences in the absence of exhaustive experimental measurements. Here we analyzed the efficacy of these inferences in 33 de novo missense mutations revealed by sequencing in first-generation progeny of N-ethyl-N-nitrosourea–treated mice, involving 23 essential immune system genes. PolyPhen2, SIFT, MutationAssessor, Panther, CADD, and Condel were used to predict each mutation’s functional importance, whereas the actual effect was measured by breeding and testing homozygotes for the expected in vivo loss-of-function phenotype. Only 20% of mutations predicted to be deleterious by PolyPhen2 (and 15% by CADD) showed a discernible phenotype in individual homozygotes. Half of all possible missense mutations in the same 23 immune genes were predicted to be deleterious, and most of these appear to become subject to purifying selection because few persist between separate mouse substrains, rodents, or primates. Because defects in immune genes could be phenotypically masked in vivo by compensation and environment, we compared inferences by the same tools with the in vitro phenotype of all 2,314 possible missense variants in TP53; 42% of mutations predicted by PolyPhen2 to be deleterious (and 45% by CADD) had little measurable consequence for TP53-promoted transcription. We conclude that for de novo or low-frequency missense mutations found by genome sequencing, half those inferred as deleterious correspond to nearly neutral mutations that have little impact on the clinical phenotype of individual cases but will nevertheless become subject to purifying selection.

The rest of it goes into the nitty gritty, but the summary is "it's complicated"

Retro Pump,

If you don't mind sharing, what is your dream genetic research project?

"what is your dream genetic research project?"

I don't think I can answer that question. When I first started in genetics in the late 80s, it was as a plant geneticist. I was working on pea genetics, and thought this was my dream job...in the footsteps of Mendel. The idea was one I loved, to use genetics to inform plant breeders in making and selecting better plants to help feed the world. Then the anti-GMO fever took hold (even though I wasn't personally making GMOs, just helping breeders) and the whole field seemed to sour a bit for me. I still think using genetics to breed plants and animals for better, more nutritious yield in poorer soils using less water etc...is a noble goal. I have moved into human genetics (decades ago) and I have gone from thinking that if we could get a handle on genetics, we could prevent or cure whatever ails us (the hype of the Human Genome Project infected most of us in the field), to now feeling just plain inundated with data. We have too much of some data (such as the data Zebra wants more of), and nowhere near enough of other data (clinical, phenotypic, in vitro, in vivo, environmental, social).

So maybe my ideal genetic research project would be going back into the field of crop and food genetics. It is a field where the phenotypes, outcomes and endpoints are much easier to measure than in humans. It is really hard to control the environment and breeding of humans!!

Retro Pump@429
Speaking of raw sequences, if you were to request data from, say SHIP, would they give you the raw sequences? Is the ability to reidentify pseudonymous genomes (i.e. through surname inference) a concern? I'm curious if you can give some insight about how people in the field feel about attacks on genetic privacy. Like is it a concern but maybe tempered by controlling access to the data and IRB/ethics committee oversight or is it more regarded as like tinfoil hat paranoia?

What do you think of the 100,000 Genomes Project? The Nuffield Council paper that Krebiozen linked to before said that it was rushed because of political pressure and so the ethics weren't given the time they should have been.

@430
The full text is publicly available. It's quite over my head though.

It is really hard to control the environment and breeding of humans!!

LOL. Indeed.

I don't know how much of a role teaching plays in your profession now or how much of an interest it is to you, but you certainly demonstrate a talent for it. In any case, I wish you well in your endeavors whatever they may be.

Retro Pump,
Thanks for the added information. It must be weird to be in a field that promised so much that hasn't panned out.

I *think* it is publicly available…basically it reiterates the idea that it is not raw DNA sequence that is holding us back, but really nailing the link between this DNA info and the ultimate results on phenotypes:

That was interesting, thanks, though I must admit I skimmed the methods and results sections. The lack of a one to one relationship between genes or constellations of genes and specific conditions is precisely what I was getting at in my original comment at #4 that zebra apparently didn't understand.

I guess we could conclude that one mouse's (and man's) mutation is another mouse's polymorphism (sort of).

Retro pump,

You've provided lots of information, which again demonstrates the difficulty of genomic research in general, but you still seem to be "stuck" on what the actual proposal is.

This is not a "study". It is simply a way of making the data available to researchers who will execute the study. It deals with many of the issues you discuss, so that researchers can concentrate on the experimental design. There is no "experimental design" beyond establishing what goes into the phenotype databases, which can be fairly broad.

I wonder, when you suggest that replacing one individual's record with another would make it no longer a birth cohort, whether you have really thought it through. Can you explain just this point? Perhaps there is a disconnect because I am thinking of information/data more abstractly? Exactly what difference would it make?

And I really am curious as to the size of the city-cohort you were involved with.

436 To avoid misunderstanding:

This is not a “study”. It is simply a way of making the data available to researchers who will execute a study of their choosing.

And again, since misunderstanding is common:

This is not a “study”. It is simply a way of making the data available to researchers who will execute a study of their choosing with some subset of the data.

This is not a “study”.

Good luck convincing an IRB or ethics committee that sequencing the genomes of hundreds of thousands of people isn't a study.

Zebra doesn't even know what constitutes a "study."

Guess what, collecting data is a "study."

It deals with many of the issues you discuss

How exactly? Other than dismissing those issues as vacuous concepts, I mean.

zebra # 413
I know some people are very jealous of their privacy, and use every means to not associate their name with HIV test results.
Or other people "donate" blood to have an HIV test done, and get a few bucks in their pocket.
But in my view this should be strongly discouraged. For example by printing a well readable phrase on the packet/form: HIV (or Genome) testing can seriously damage your health. Just as for cigarettes.

perodatrent,

We agree that excessive screening is a bad idea. (I think your warning message is a bit extreme if you are being literal.) But again, this does not relate to my project, which is about basic research.

And really all I was trying to say, and I wish I knew the French for it, is "the cat is out of the bag" or "that train has left the station".

I think even Not a Troll? reported taking USD 25 for a genetic sample. And RetroPump, who tells me that there is more data than they know what to do with, then also suggests that I can help by donating my genome.

I suppose zebra thinking that an Italian speaks French shouldn't surprise me. It's oddly apposite, actually.

Zebra, Retro Pump meant donating for specific studies not for casting a wide net. And, I'm unsure what the $25 has to do with people being inclined to donate. It wasn't my reason because the time involved makes it a very poor investment.

As someone who is directly involved in the precision medicine initiative, I can tell you there are a lot of erroneous arguments presented here, both by zebra and against zebra. I'm not particularly interested in discussing those points, as I do enough of that during my day job, but just wanted to give everyone a heads up that there will be a major announcement from the White House regarding PMI in a few hours so stay tuned ;)

In Italian it would be: it's useless to close the stable after the oxen have escaped.
A stable is best kept closed, and a "right to know" is best made a little hard. In my opinion.

And it just dropped:

http://acd.od.nih.gov/reports/DRAFT-PMI-WG-Report-9-11-2015-508.pdf

I recommend that any further conversation on the viability of precision medicine refer to what the initiative actually aims to accomplish as described by this document.

Guys, zebra didn't say it's not a study, he said "study". Not sure if those are scare quotes or a direct quote. Given his history of abusing quotation marks it would be rash to assume that "study" has the same meaning as study.

AdamG@447
Interesting. I hope it's the draft privacy guidelines.

Ah. Cross posted there. Thanks for the link AdamG. Will put off posting in this thread until I've had a chance to peruse it.

Thanks, AdamG

OK, it's official-- I'm pathetic. 200,000? Hah! Piker!

And Canada? Who needs 'em.

Although I wish my concept had more support from experts in the field...

Zebra @436: "I wonder, when you suggest that replacing one individual’s record with another would make it no longer a birth cohort, "

Generally when epidemiologists speak about a birth cohort it means "all of the people born between these dates". Birth cohorts are inherently closed cohorts - you cannot be added to a birth cohort because it depends on when you were born.

Now, when you suggest adding people if some members of the study are lost, you are talking about an open cohort. That can be totally fine (it is a not-uncommon type of study group). I think what you mean (and please correct me) is that if someone from the study population dropped out, you would replace them with someone from their birth cohort.

Although I wish my concept had more support from experts in the field…

Well, now that you have an actual proposal to hand, you've got a framework and all the time in the world to specifically demonstrate its inferiority to your own thoughtless babbling.

It was always just a dream for zebra. He couldn't have pulled it off, but he will certainly do a happy dance around the blog. Which I'm fine with. That was going to happen regardless.

AdamG,

I very much like that the participants have a say in defining the rules of how their information will be used. But I'm not really a fan of this. I've already donated my DNA for one discrete study and may do so again, yet for this initiative count me out.

But I’m not really a fan of this.

Care to say why? Not interested in debate, just genuinely curious.

Heh, I had missed this gem:

While I hesitate to compare myself with a young person for whom I have the greatest respect, what I experience on this blog can be compared to what our US president experiences– I think of it as ZDS, or zebra derangement syndrome.

This is a classic. Apparently fed up with being aptly likened to a college sophomore, Z. is apparently now in his 70s or something, as 54 isn't "young" by any stretch of the imagination. (Hint: try asking one.)

Even better, though, what the president, ah, "experiences" is named after Z.

Justatech,

Why would you think anything but that I would replace them from the birth cohort?

This is not a “study”. It is simply a way of making the data available to researchers who will execute a study of their choosing with some subset of the data.

The database of 200,000 is a database, not a "study". You would provide a replacement from the database of the Zenome Project to the study the researchers at university X are doing.

How many different ways would you like me to say that?

Why would you think anything but that I would replace them from the birth cohort?

Perhaps for the same reason that RP did. Recall:

My answer is that if we collect all these samples, and digitize some portion of them, and also create a registry of conditions, we *take away* many of the problems you describe.

If someone withdraws, for whatever reason, or dies, we can replace her. If someone objects to giving the data to a US researcher, we can replace him.

"Replace" for what? With what? Your comment was pure blobovianism. You seem to have meandered to this remark from schizophrenia, but if the relevant members of the cohort are completely interchangeable, why wouldn't one be using all of them?

Your own brillian concept appears to be so ill-formed in your own head that you can't even figure out what your own defenses are supposed to mean.

^ "brilliant"

The database of 200,000 is a database, not a “study”.

So, that whole "sorting" thingamabob – i.e., data reduction and analysis – has been thrown overboard?

Perhaps you'd like to "define" your "terms."

zebra@458

The database of 200,000 is a database, not a “study”.

It might not be a "study" but collecting and sequencing genomes is certainly a study. You would need an IRB (one concern addressed in the PMI paper AdamG provided) and you would need to follow all the same legal and ethical rules as any other study.

On that note, I realize that you see this as nothing but a victory, but if you take the time to at least skim the PDF you'll see that significant consideration was given to various issues that you outright dismissed. For example, privacy (including reidentification) and security. That's the difference. You obstinately refused to acknowledge that there were any problems with your idea whereas the NIH produced a 105-page paper addressing these concerns. Even then there are some areas where their solution is along the lines of "we will work with experts to figure out solutions."

@AdamG
I can't speak for Not a Troll, but I wouldn't participate mostly because the government has a very poor track record in infosec. I also think that access control and penalties for reidentification aren't the be all end all. Especially when there is work being done to technologically provide better privacy guarantees.

capnkrunch 462,

The only thing I would consider a "victory" would be if people stopped making up bizarre strawmen and made an effort to have an adult debate that moved understanding of the issues forward.

So far, a couple of participants have demonstrated that capability.

I'm actually beginning to wonder if I can tell how much is ZDS and how much is just various forms of impairment.

My approach is probably more private and secure than BAU. It's that simple. Apparently, others with actual expertise would not strongly disagree.

Zebra@458: I wanted to make sure that we were all on the same page and had clarified our terminology.

For people who work with large study populations, like epidemiologists, it sounds wrong to say that you would add people to a birth cohort after the study has started.

You weren't saying that. There was simply lack of clarity on the difference between the birth cohort of the proposed study population and the birth cohort from which that population is drawn. Now that we have cleared that up, we can address other questions.

Given that we are talking about an open-cohort design, that will complicate the calculations of things like incidence rate, that depend on person-time.
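To make the person-time point concrete, here is a minimal sketch, assuming a handful of made-up participants who enter and leave follow-up at different times (the `Participant` class, the field names, and all the numbers are hypothetical, not from any real study). In an open cohort each person contributes only the years they were actually under observation, so the denominator is summed person-years rather than a simple head count.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    entry_year: float  # when follow-up starts for this person
    exit_year: float   # when follow-up ends (event, withdrawal, or end of study)
    had_event: bool    # whether the outcome of interest occurred


def incidence_rate(participants):
    """Return events per person-year of follow-up."""
    person_years = sum(p.exit_year - p.entry_year for p in participants)
    events = sum(1 for p in participants if p.had_event)
    return events / person_years


# Hypothetical open cohort: two people present from the start,
# two added later (for example, as replacements for withdrawals).
cohort = [
    Participant(0.0, 10.0, False),
    Participant(0.0, 4.0, True),
    Participant(3.0, 10.0, False),
    Participant(5.0, 8.0, True),
]

rate = incidence_rate(cohort)  # 2 events over 24 person-years
```

Counting heads instead (2 events in 4 people) would ignore the fact that the late entrants were observed for less time, which is the complication an open design introduces.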

463 And, to avoid more wasted bandwidth, that's: "how much of the bizarre commenting is ZDS ....", not "how much of the adult discourse"

"infosec"

Yeah, "infosec", that's the ticket. I'm convinced.

justatech 464,

You would have to elaborate. Everyone is in the same birth cohort. Are you talking about the difference of <1 year?

zebra@463

The only thing I would consider a “victory” would be if people stopped making up bizarre strawmen and made an effort to have an adult debate that moved understanding of the issues forward.

Ha. Forgive me if I find this hard to believe. You've dismissed privacy as vacuous and security a minimal risk. I think the fact that the NIH specifically addressed these concerns shows that your dismissal was unwarranted.

My approach is probably more private and secure than BAU.

You keep asserting that. Try defending it.

Apparently, others with actual expertise would not strongly disagree.

Where is this coming from? Your approach was vague and you constantly refused to elaborate on how you would protect privacy or security. Instead you chose to say they were unimportant.

You would need an IRB (one concern addressed in the PMI paper AdamG provided) and you would need to follow all the same legal and ethical rules that any other study.

Table 5.3 is somewhat helpful here, although Z. has "opted out" from replying to questions about his vaporware interface.

463 And, to avoid more wasted bandwidth, that’s: “how much of the bizarre commenting is ZDS ….”, not “how much of the adult discourse”

“infosec”

Yeah, “infosec”, that’s the ticket. I’m convinced.

1. You are now talking to yourself.

2. You don't know what "bandwidth" means.

HTH. HAND.

^ The nonbroken part of the link resides in "463."

@capnkrunch #43

“if you were to request data from, say SHIP, would they give you the raw sequences”

I have no idea what the acronym SHIP stands for, sorry.

“ Is the ability to reidentify pseudonymous genomes (i.e. through surname inference) a concern? I’m curious if you can give some insight about how people in the field feel about attacks on genetic privacy. Like is it a concern but maybe tempered by controlling access to the data and IRB/ethics committee oversight or is it more regarded as like tinfoil hat paranoia?”

All of these things are a concern. Most data is “leaky” at some level. If someone is determined enough, or someone is a bit slack at their job, there is scope for some pretty big screw ups. I see genomics data as being a bit like the internet. Don’t ever put something on the internet that you *might* someday regret or wish to retract. I think that as regulatory bodies and researchers become more aware of the issues, things will improve. But it will never be fail-safe.

“What do you think of the 100,000 Genomes Project? “
I like the idea of the project, I’m not informed about the ethical considerations. Its focus on rare diseases is a good way to find out lots of new things about, well, lots of things. We often get clues about the underlying biology of more common diseases from studying rare diseases, and rare diseases are more often phenotypically and genetically distinct enough that you can find that needle in the haystack, especially if you have familial genomes as well.

@ Not a Troll

I remembered last night that there is a dream genetics project in humans that I would love to do. I have always been fascinated by the possible genetics of the placebo response. A lot of pharmacology is muddied by this pesky but uniquely human trait, and I think knowing more about it could help inform the field of pharmacogenomics.

Teaching is a big part of my job, but not as a lecturer. I am more a mentor for grad students working in the lab. It is the aspect of my job that I love the most. Thanks for your kind words.

@Krebiozen

“I guess we could conclude that one mouse’s (and man’s) mutation is another mouse’s polymorphism (sort of).”

That’s a good way of thinking about it. Some gene variants are deleterious in certain environments, advantageous in others, or just plain neutral. Other gene variants are always deleterious. Sifting through this complexity in humans is daunting.

@ Zebra

“This is not a “study”.

Sigh. As others have pointed out, this IS a study. You can’t just collect/generate the data because you feel like it and have lots of money. Well, you probably could in some countries, but I’m working on some basic assumptions here. Maybe that was wrong of me.

“I wonder, when you suggest that replacing one individual’s record with another would make it no longer a birth cohort, whether you have really thought it through. Can you explain just this point? “
Well according to the original Zebra proposal as I understood it, you wanted to capture sequences from everyone born in a certain year. Presumably you would *somehow* be able to approach the parents of all these births, and consent as many into the study as you could. If some of these withdraw later on, do you plan on going back and harassing the parents who did not originally consent to join their kiddies up? I am not going to divulge the size of the study I worked on as I don’t want anyone here to identify the study through too much information, but suffice it to say that it was a medium sized city, and at the time the majority of births were in a single (socialised) health care system and most were not home births. To scale this up to a whole country would in itself be daunting. To follow them (I’m assuming you want this to be a longitudinal cohort) over their life span and record information about health, educational achievements, drug, alcohol and tobacco use, incarceration, abuse and neglect as a child, teen pregnancy, and many other things that would never show up in even the best of healthcare data, was even more daunting. But honestly, it is the richness of the data that has made this, and other birth cohorts like it, so powerful. Trying to do something like this in the US, where healthcare disparities mean you would miss out on even a lot of basic health data, would just generate rubbish. And trying to gather all this extra data is expensive and labour intensive. You may in theory get money to collect DNA samples and generate some sequence data at the start of the study, but it would take an enlightened government with a long term view to keep it going. It would cost a lot more than a few F-15s.

Now if you want to do an open study that is a different proposal.

“And RetroPump, who tells me that there is more data than they know what to do with, then also suggests that I can help by donating my genome.”

Umm, most well designed disease studies use a case-control design. You could find a genomics study that is looking for healthy/diseased people of a certain gender, age, ethnicity etc…that you fit into. You could volunteer for the PMI initiative, if and when that gets up and running. Or you could give some of your money to something like 23andme and help them build their data. 23andme is not NGS technology, but they are always happy to take people’s money and use their data. If you’re interested in genetics and ancestry, you might find the $99 a good investment. Genetic data is always more fun when it’s your own.

AdamG @456

It's sharing too much of me for too little in return - for both me and society. For me, literally too much to share without a goal in sight.

For society, rightly or wrongly, I base my thoughts on the points raised here, and on the specific example of Dr. Thomas Insel at the NIMH and his search for biomarkers in mental illness. Psychiatric care has been severely negatively impacted by lack of funding yet this type of research appeared to be their main focus. It isn't that it shouldn't be done. Dr. Insel has gone on to Google and I say let him burn up their money on this; I wish them well in finding biomarkers if they exist. But as long as there is no investment in decent treatments and supportive assistance, whether there is a biomarker or not won't make much difference in the daily lives of the people affected.

If PMI turns out to be more a project of that type while other more immediate needs go unaddressed then I don't wish to participate in it. Obviously I may change my opinion after seeing how it plays out over time but I'm not interested in being an early adopter. I'm sure you will have plenty though.

Retro Pump@471

I have no idea what the acronym SHIP stands for, sorry.

My bad, I accidentally left a couple words out. It should have read "if you were to request data from one of the existing datasets, say SHIP, would they give you the raw sequences?" It probably would have made sense to say "the 1000 Genomes Project" but SHIP (Study of Health in Pomerania) came to mind instead because I had just been looking at a study that used data from it*.

Based on your response my guess is that the answer would differ based on where the data came from. It was a poorly worded, indirect question. I guess what I was trying to get at is what kind of protections are generally offered against reidentification. The PMI paper said that:

Unauthorized re-identification or recontacting of participants should be expressly prohibited in agreements for the use of specimens and data, and NIH should pursue legislation penalizing such actions.

Is that pretty standard?

*For the sake of completeness the study was Genome-Wide Association Study with Targeted and Non-targeted NMR Metabolomics Identifies 15 Novel Loci of Urinary Human Metabolic Individuality. Not that it matters since I can't quite remember how or why I got to that study.

Retro Pump,

The search for the genetics of the placebo response would be a very cool* study. As someone who doesn't seem to have much of this type of response (rather more the null response), I'd volunteer for it.

*Americanism I still use.

zebra@465

“infosec”

Yeah, “infosec”, that’s the ticket. I’m convinced.

Would "information security" be more acceptable? Either way, it doesn't change the government's poor track record. This is exactly what I mean about dismissing and handwaving. You could cut the condescension with a knife... if it wasn't surrounded by an impenetrable layer of stupidity.

Regarding security:

The PMI-CP should establish a Security Subcommittee of the Steering Committee composed of leading experts on cyber security and the management of large amounts of data to ensure that the PMI-CP is incorporating cutting edge security measures and actively monitoring the strength of the data systems.

Security is one of those things where the NIH said they'll ask the experts. I'm curious what the subcommittee will look like and what their recommendations will be but it's a much better response than "[b]eing one of 200,000 individuals with genome information on file in fact represents a negligible incremental risk. Truly negligible."

This is what happens when you can't admit that you don't have every answer in areas where you have little to no expertise. You end up needing to dismiss things because you don't know enough to solve the problem and you have too much hubris to admit that your half-formed thought is anything but perfect.

One thing I do like about the PMI paper is this:

Participants need to be aware that there is a risk that their information may be disclosed or used inappropriately. Educating prospective participants about the extent to which existing laws protect them from misuse of their information, and the extent to which this legislation does not protect against misuse will allow them to make a more informed choice about participation.

This is a good way to do it. Personally I think the risk is too high but there's certainly people who feel otherwise. As long as people are educated enough to make an informed decision I think this is a perfectly fine way to go about it.

zebra@331

And even though I don’t think the identification-through-the-genome route is a big problem,

Another point where the NIH disagrees (and provides a solution instead of ignoring the problem).

Data obtained by investigators from the PMI cohort will, in most cases be de-identified, which offers a degree of protection to research participants. However, there is a growing field of literature demonstrating myriad ways that individuals can be re-identified using “de-identified” information of various types, particularly through combining large amounts of information from multiple sources, and often involving genomic data. In order to obtain such data, investigators should be required to agree not to re-identify or to attempt to recontact individuals. Unfortunately, the mechanisms available to the PMI-CP for enforcement of these agreements could be limited, especially if the data users are not funded by NIH. In order to provide the strongest enforcement mechanisms, legislation establishing penalties for violation of these agreements will be needed.

Like I said, not strong enough protection for me to give my data but much stronger than your non-answers. I'd hope that as the crypto technology advances they implement technological protections alongside the planned legislative ones.

NaT, thanks for your response. I hope you take some comfort in the fact that the PMI workgroup's most heated debates have been about exactly the infosec issues you and others have raised here, and that this document is far from finished regarding these issues.

I urge you to take part in the Twitter chat happening next week: http://www.nih.gov/precisionmedicine/index.htm

We need more voices like yours (knowledgeable potential participants who have reservations regarding participation) to speak up about these issues. The committee takes these comments very seriously, as a primary goal of the initiative is to change the relationship between researchers and 'subjects.'

As an aside, I know for a fact there are several folks involved with PMI at a high level that read this blog, but I'm the only one I know of who reads the comments. Please voice your opinions about PMI directly to NIH! Given the usual morass of red tape that defines the NIH, folks may be surprised with how important public feedback is for this project's future.

Man, copy and paste from PDF's is broken in such a weird way for me.

@capnkrunch
“Based on your response my guess is that the answer would differ based on where the data came from. It was a poorly worded, indirect question. I guess where I was trying to get at is what kind of protections are generally offered against reidentification.”

It definitely depends on who supplies the data. As does what kinds of protections are in place to prevent reID. Some data will only be made available in analysed and annotated form. Other providers will send you only the raw data and leave it up to the researcher to do analysis and annotation. From the POV of making research reproducible and fully open, it is best to have the raw data. From the POV of making life easier for the end user, the analysed/annotated data is best. Think of it as the difference between someone trying to find better ways to analyse the data, versus a clinician or other end user who really just wants a synopsis. The raw data can come in so many different forms. The paper you cited was really just using a whole bunch of variants (about a million) based on array technology, as opposed to sequencing, so that data would look quite different to raw sequencing data.

I don’t know if the PMI approach to penalising reID is ‘standard’. The standards are constantly being refined and hopefully improved. At the very least, the people who collect personal information and allocate a study ID should have no access to the genomics data, and vice versa. Even better if a third and independent party holds a ‘key’ that links personal info with data, and has no other knowledge about the participants or their data.
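As a rough illustration of that separation, here is a minimal sketch, assuming three hypothetical parties: a recruiter's registry holding personal details, a genomics store keyed only by a random study ID, and an independent key holder that knows nothing except the link between the two IDs. All the names (`KeyHolder`, `registry`, `genomic_store`) and the data are invented for the example; real systems add access control, auditing, and encryption on top.

```python
import secrets


class KeyHolder:
    """Hypothetical independent third party: stores only the ID linkage."""

    def __init__(self):
        self._link = {}  # study_id -> person_id, nothing else

    def issue_study_id(self, person_id):
        # Random ID with no relationship to the person's details.
        study_id = secrets.token_hex(8)
        self._link[study_id] = person_id
        return study_id

    def resolve(self, study_id):
        # Re-linking is only possible through the key holder.
        return self._link[study_id]


# Recruiters see personal details but never the sequence data.
registry = {"P001": {"name": "Jane Example"}}

key_holder = KeyHolder()
study_id = key_holder.issue_study_id("P001")

# Analysts see sequence data under the study ID but no personal details.
genomic_store = {study_id: "ACGTTGCA"}
```

This only addresses linkage through record identifiers. It does nothing about the separate problem that the sequence data itself may be identifying.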

However it is still theoretically possible for full scale sequencing data to be used to identify individuals or families. It would take a lot of work, but some people seem to have too much spare time, and if there is enough of a money motive you can never say never. And there is always a risk that with social media, your aunt/brother/mother may post a link to some of their genetic/genomic results and in some way 'implicate' you and your genetics, and your children's etc... This is a different kind of security concern, but no less real.

@Not a Troll

I think I would be a placebo non-responder too. I certainly don’t seem to respond well to homeopathy or acupuncture, or many other more SBM drugs. Please don’t give me paracetamol for any kind of pain or inflammation. I simply will not respond, though I’m sure it will still be hard on my liver. The flip side of the placebo response that I would like to include in my study is the nocebo effect. The power of suggestion is indeed powerful for both good and bad outcomes!

@Adam

I am glad to hear the PMI workgroup is thinking through these things carefully and taking public feedback. In this day and age, most people really want assurances that their information will be as secure as possible.

Man, copy and paste from PDF’s is broken in such a weird way for me.

It does that for everybody.

Narad@479

It does that for everybody.

It's odd. There's a character there and I can select and copy it but pasting it into Google doesn't work but in text fields that allow line breaks it causes one. But my text editor doesn't recognize it as a carriage return or a line feed. Curious.

Would “information security” be more acceptable?

My observations of its habits in the wild strongly suggest that Z. appears to consider concepts to be noumenalized and fully specified when he assigns them acronyms (vide "sovereign entity" supra), so you could try that, although I can't remember the Hungarian version of the barn-door routine.*

* Which may have a distinctly different interpretation compared with the Romance language versions, despite the similar setups.

I have one concern: what if these sequences goes in the hands of INTELLIGENT people, able to interpret them? It is not possible to patent the sequences. Foreign (probably alien) scientists could make money by finding drugs against diseases common in the US, like obesity, creationism and quackery.

"go" not "goes"

zebra @452:

Although I wish my concept had more support from experts in the field.

You're amazing, and not in a good way. Don't you understand what is blindingly obvious to the rest of us? If the "experts in the field" are not supporting your idea, it's probably because said idea is a poor one.

Retro Pump #471,

I think you might have been reading some of the strawmen and quotemines as correctly reporting my actual comments. Correcting:

I think I only suggested US as the location once; I then started using the more manageable venue of Canada, and I also pointed out that it would be much more viable within Canada's more rational health-care system.

As to whether it is a "study" or a database. That is a very easy distinction. The function of the project is not to answer any specific scientific question, so it is not a study to me. My goal is to be a facilitator for those who do have specific scientific questions.

Now, I don't see the utility in saying "nyah-nyah that's not what 'study' means to me", as long as we both understand the intent-- do we need to invent new words to communicate?

So, as to the specific question of replacement, as I already answered Justatech: Depending on the size of the database, and the prevalence of the phenotype or condition being studied, the researcher would likely only be using a subset, randomly selected, of the available relevant cases.

I think of a replacement drawn from the excess as still "being part of the birth cohort the researcher is studying". If you don't, that's fine, but again, I'm interested in how you think that would affect results, not what we call it.

Related to that, I am having a hard time understanding the reasoning about acquiring all this other information. How does being abused or not as a child inform us about the genetic correlation, or not, with some physiological condition that develops later?

Those are the issues that I'm really interested in better understanding through your expertise, but I just want to point out that my position on privacy and security is being misrepresented in a ridiculous way, perhaps because ann and I had a more philosophical discussion about morals and laws that was too difficult for some to understand. I would vote, in so far as I have a vote, for very stringent restrictions to prevent misuse of the information, and for resources to be committed to aid in that. That was asked and answered more than once, so any report to the contrary is not to be believed.

zebra,

As to whether it is a “study” or a database. That is a very easy distinction. The function of the project is not to answer any specific scientific question, so it is not a study to me. My goal is to be a facilitator for those who do have specific scientific questions.

Now, I don’t see the utility in saying “nyah-nyah that’s not what ‘study’ means to me”, as long as we both understand the intent– do we need to invent new words to communicate?

The problem is that as soon as you stick a needle into a patient with the intention of collecting blood for your database it becomes human research and you have a legal requirement to get IRB approval. It's those pesky ethics again I'm afraid.

Related to that, I am having a hard time understanding the reasoning about acquiring all this other information. How does being abused or not as a child inform us about the genetic correlation, or not, with some physiological condition that develops later?

Well, now, isn't that the sort of question a researcher might be interested in? Why do some abused children wind up in prison or an early grave, or struggle with substance abuse, while others shake off the worst of it and grow up as healthy productive adults? Just because you aren't interested doesn't mean no one is.

And there are other aspects of life too. One of my friends recently lost two aunts to a rare hereditary cancer; a third has just entered hospice for the same thing. His mother's large family considers it significant that those three remained in their polluted home city whereas all the so-far-untouched siblings left while still young. If they are right, then to tease out the effect of that genetic inheritance from your vast cohort would require knowing where every member of the cohort lived throughout their lives.

For other genetic effects you might need their diet over their lifetime (more protein in childhood has what effect? More protein late in life has what effect?), their socio-economic status (stress due to economic uncertainty in formative years has what effect?), and so on.

Retro Pump,

I really would like to get your input on the questions I asked on the science part of this, but I just thought I would take the opportunity to illustrate why you need to read carefully, concentrating on where I directly reply to you, to avoid the wheat/chaff problem.

Almost instantly after I said this in 485...

I would vote, in so far as I have a vote, for very stringent restrictions to prevent misuse of the information, and for resources to be committed to aid in that. That was asked and answered more than once, so any report to the contrary is not to be believed.

...we are treated to the phenomenon I have described before, of someone taking an unrelated quote from the comment, and in some illogical construct, responding to it with something like "but what about laws protecting individuals from misuse of their genetic and health information?"

Perhaps we should study whether there is some genetic predisposition to ZDS; we can relate it to ODS and see if the President and I give off similar pheromones or something, to which sufferers respond.

zebra, just give it a rest. I know it's a scary thought but this kind of research is already in the hands of people who are much smarter than you. None of the ideas you've presented here are novel or particularly good. If you're actually interested in learning how this research is conducted, go read a textbook: http://amzn.com/0815341490

How does being abused or not as a child inform us about the genetic correlation, or not, with some physiological condition that develops later?

As Retro Pump said, abuse in conjunction with a variant of the MAOA gene is a predictor of antisocial behaviors.

But if it has to be physiological:

Abuse in conjunction with a variant of the MAOA gene alters NE, 5HT, and DA neurotransmitter systems, resulting in neural hyper-reactivity to threat, resulting in antisocial behaviors, such as violence.

LW 487,

Way back at the beginning, I addressed the possibility of some skewing of the data by, for example, a local somewhat inbred population that coincidentally refused to participate for religious reasons. I'm sure there are lots of interesting things to think about, and I'm sure there are lots of young scientists eager to explore them, particularly if there is genomic information available gratis.

But I don't think you are answering my question. You can't look for the needle until you find the haystack. Which Orac said in the first place, or so I interpret his post.

If your friend's aunts have some genetic predisposition established by a legitimate study (of family history, twins, or whatever) how does that relate to my database? Just use that family as part of a conventional study, and acquire their genomes (voluntarily! voluntarily!) and work it out.

The MAOA story is a very interesting one. For a history of the genetic basis of this association, see http://omim.org/entry/309850#0002

Further to 490 --

FWIW, psychogenic dwarfism is also an example of emotional maltreatment in childhood being expressed physiologically. I don't know if there's a genetic predisposition to that. But I wouldn't be surprised if there were.

It does kind of illuminate the complexity of the task to consider, though.

I mean, theoretically, emotional stressors in childhood might play a role in the expression of lots of things later on, but at a less dramatic level than extreme abuse, they're very hard to quantify and track.

zebra,

Almost instantly after I said this in 485…
I would vote, in so far as I have a vote, for very stringent restrictions to prevent misuse of the information, and for resources to be committed to aid in that. That was asked and answered more than once, so any report to the contrary is not to be believed.

…we are treated to the phenomenon I have described before, of someone taking an unrelated quote from the comment, and in some illogical construct, responding to it with something like “but what about laws protecting individuals from misuse of their genetic and health information?”

You repeatedly claimed that collecting blood samples and sequencing their genomes is "not a study". I pointed out that legally it is a study since it is classed as human research. If it is human research it is subject to all the rules that RT was referring to that you dismissed by writing:

This is not a “study”. It is simply a way of making the data available to researchers who will execute the study. It deals with many of the issues you discuss, so that researchers can concentrate on the experimental design. There is no “experimental design” beyond establishing what goes into the phenotype databases, which can be fairly broad.

Since this is clearly human research you would require IRB approval, and to get IRB approval you would have to demonstrate a robust experimental design, show that it would benefit the subjects, assess risks etc. etc..

How is that "unrelated" and what "illogical construct" are you referring to?

AdamG,

The only "idea" I have is that it would be helpful to have a large database, and I obviously don't think that it is novel. Why you would describe it as a bad idea is beyond me if you are working on one.

Now, I'm counting on the fact that this research is in the hands of people far more capable than I am. But we are talking about political decisions, and we all know that the public isn't going to read the textbook, so it always seems worthwhile to me to get information out in a form that might be more accessible.

You are obviously free not to participate, as is everyone else, thanks to the Constitution and legal system and so on. Do you have a problem with that?

ann #490,

Yeah but that doesn't answer my question either. We're not talking about what we know, but what we might find out. Haystack first.

Why do I need the information about the abuse to detect the genetic correlation?

zebra, your very first comment said

we could sequence every child born in the US in one year for the cost of a few dozen F-35’s or similar tradeoffs.
Now that would be a study.

This is a terrible idea.

AdamG #497,

When someone pointed out that this would involve very large numbers, I immediately refined the concept. I guess you could also point out that I "nyah-nyah-nyah said 'study' " in that first offhand comment.

If, in your expertise, you have something to contribute to the discussion where it is rather than where it began-- help out by educating people. I suspect even the very educated people who first thought of your project started out with a napkin or envelope and a pencil stub. Or a tablet and a stylus, perhaps, these days.

help out by educating people

Believe me, I do. It's part of my job. I'm just not particularly interested in educating you given your failure to understand the importance of the obvious distinctions between a 'database' and a research study.

AdamG,

In fact, could you help out with the question I asked ann at 496?

AdamG, 499

I'm the one making that distinction.

In fact, could you help out with the question I asked ann at 496?

If you really want to learn, you have to do the work. These are complex topics. The textbook I linked you to is what I use to introduce these concepts to undergrads, and is very accessible. You can obtain a used copy for $10.

502

Aww, bluff WRT claim of expertise is called.

I will see if ann can apply her rabbinical inclinations to the question.
