Language Log is Stealing My Business

Mark Liberman at Language Log has been posting on genetics recently. A couple of days ago he tried to track down the origins of the components of the gene name BTBD9. The letters and numbers in the name stand for broad complex-tramtrack-bric-a-brac domain 9, which are hijacked from Drosophila nomenclature. Liberman then tried to figure out the origins of the names tramtrack and bric-a-brac using FlyNome (a cool webpage that I hadn't seen before) and FlyBase. In the end, he couldn't track down the (clever) story behind either one.

I was amused by that post, and I was further impressed by Liberman's range in this post on the news coverage of genome-wide association studies. These studies look for single nucleotide polymorphisms (SNPs) that show a statistically significant association with a phenotype of interest. Liberman mentions two studies that mapped a genetic basis of restless legs syndrome, questioning the coverage of these papers in the popular press. Without presenting the frequency of the associated SNP, newspaper articles leave readers with the impression that individuals carrying one allele have a particular disease/syndrome/whatever, while individuals with the other allele do not. But when one looks at the actual data, that is not the case -- affected individuals may have an allele frequency of 80%, while the general population has an allele frequency of 70%. Liberman wants these allele frequencies to be presented in the articles discussing the results.
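To put rough numbers on that, here is a minimal Python sketch using the illustrative 80% and 70% figures above (they are made-up round numbers, not data from either study). It assumes the general population can stand in for the control group, and the function name `allelic_odds_ratio` is just something I made up for the example. It compares the odds of carrying the allele in cases against the odds in controls.

```python
def allelic_odds_ratio(freq_cases: float, freq_controls: float) -> float:
    """Odds of carrying the allele in cases divided by the odds in controls."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


# Illustrative frequencies only (assumption: general population = controls).
freq_cases = 0.80
freq_controls = 0.70

odds_ratio = allelic_odds_ratio(freq_cases, freq_controls)
frequency_ratio = freq_cases / freq_controls

print(f"allelic odds ratio: {odds_ratio:.2f}")     # prints 1.71
print(f"frequency ratio:    {frequency_ratio:.2f}")  # prints 1.14
```

Either way you slice it, the allele is only modestly enriched in affected individuals, and most unaffected people carry it too.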

And if you aren't a regular reader of Language Log, bookmark it, add it to your RSS reader, or tattoo the URL on your forehead. It's one of the few must-read blogs in these internets.

There's no room for comments on his blog as far as I can tell, so I will post one genome-wide association study-related addendum here. Point taken about reporting study size, but I disagree on his recommendation to report absolute allele frequencies. The importance of an absolute allele frequency is not clear even to those in the field, and if common frequencies are reported, it would only confuse laymen more ("Wait, 70% of HEALTHY people have this? Why don't more people have the disorder, then?"). The point--that these "genes" are not causative, but only INFLUENCE risk--is less confusingly and just as accurately made by reporting the relative difference between cases and controls.

The importance of an absolute allele frequency is not clear even to those in the field, and if common frequencies are reported, it would only confuse laymen more

not necessarily a bad thing.

oh noes!, the poor confused laymen when they get a whiff of complexity or incomplete causal determination. :) just kiddin. But i'd prefer confusion rather than a false sense of understanding.