The reification of the gene

Razib Khan poked me on Twitter yesterday on the topic of David Dobbs' controversial article, which I've already discussed (I liked it). I'm in the minority here; Jerry Coyne has two rebuttals, and Richard Dawkins himself has replied. There has also been a lot of pushback in the comments here. I think they all miss the mark, and represent an attempt to shoehorn everything into an established, successful research program, without acknowledging any of the inadequacies of genetic reductionism.

Before I continue, let's get one thing clear: I am saying that understanding genes is fundamental, important, and productive, but it is not sufficient to explain evolution, development, or cell biology.

But what the hell do we mean by a "gene"? Sure, it's a transcribed sequence in the genome that produces a functional product; its activity is dependent to a significant degree on the sequence of nucleotides within it, and we can identify similar genes in multiple lineages, and analyze variations both as a measure of evolutionary history and, often, adaptive function. This is great stuff that keeps science careers humming just figuring it out at that level. Again, I'm not dissing that level of analysis, nor do I think it is trivial.

However, I look at it as a cell and developmental biologist, and there's so much more. That gene's transcriptional state is going to depend on the histones that enfold it and the enzymes that may have modified it; it's going to depend on its genetic neighborhood and other genes around it; it's not just sitting there, doing its own thing solo. And you will cry out, but those are just products of other genes, histone genes and methylation enzymes and DNA binding proteins, and their sequences of nucleotides! And I will agree, but there's nothing "just" about it. Expression of each of those genes is dependent on their histones and methylation state. And further, those properties are contingent on the history and environment of the cell — you can't describe the state of the first gene by reciting the sequences of all of those other genes.

Furthermore, the state of that gene is dependent on activators and repressors, enhancer and silencer sequences. And once again, I will be told that those are just genetic sequences and we can compile all those patterns, no problem. And I will say again, the sequence is not sufficient: you also need to know the history of all the interlinked bits and pieces. What activators and repressors are present is simply not derivable from the genes alone.

And I can go further and point out that once the gene is transcribed, the RNA may be spliced (sometimes alternatively) and edited, processed thoroughly, and be subject to yet more opportunities for control. I will be told again that those processes are ultimately a product of genes, and I will say in vain…but you don't account for all the cellular and environmental events with sequence information!

And then that RNA is exported to the cytoplasm, where it encounters microRNAs and finds itself in a rich and complex environment, competing with other gene products for translation, while also being turned over by enzymes that are breaking it down.

Yes, it is in an environment full of gene products. You know my objection by now.

And then it is translated into protein at some rate regulated by other factors in the cell (yeah, gene products in many cases), and it is chaperoned and transported and methylated and acetylated and glycosylated and ubiquitinated and phosphorylated, and assembled into protein complexes with all these other gene products, and its behavior will depend on signals and the phosphorylation etc. state of other proteins, and I will freely and happily stipulate that you can trace many of those events back to other genes, and that they respond in interesting ways to changes in the sequences of those genes.

But I will also rudely tell you that we don't understand the process yet. Knowing the genes is not enough.

It's as if we're looking at a single point on a hologram and describing it in detail, and making guesses about its contribution to the whole, but failing to signify the importance of the diffraction patterns at every point in the image to our perception of the whole. And further, we wave off any criticism that demands a more holistic perspective by saying that those other points? They're just like the point I'm studying. Once I understand this one, we'll know what's going on with the others.

That's the peril of a historically successful, productive research program. We get locked in to a model; there is the appeal of being able to use solid, established protocols to gather lots of publishable data, and to keep on doing it over and over. It's real information, and useful, but it also propagates the illusion of comprehension. We are not motivated to step away from the busy, churning machine of data gathering and rethink our theories.

We forget that our theories are purely human constructs designed to help us simplify and make sense of a complex universe, and most seriously we fail to see how our theories shape our interpretation of the data…and they shape what data we look for! That's my objection to the model of evolution in The Selfish Gene: it sure is useful, too useful, and there are looming barriers to our understanding of biology that are going to require another Dawkins to disseminate.

Let me try to explain with a metaphor -- always a dangerous thing, but especially dangerous because I'm going to use a computer metaphor, and those things always grip people's brains a little bit too hard.

In the early days of home computing, we had these boxes where the input to memory was direct: you'd manually step through the addresses, and then there was a set of switches on the front that you'd use to toggle the bits at that location on and off. When a program was running, you'd see the lights blinking on and off as the processor stepped through each instruction. Later, we had other tools: I recall tinkering with antique 8-bit computers by opening them up and clipping voltmeters or an oscilloscope to pins on the memory board and watching bits changing during execution. Then as the tools got better, we had monitors/debuggers we could run that would step-trace and display the contents of memory locations. Or you could pick any memory location and instantly change the value stored there.

That's where we're at in biology right now, staring at the blinking lights of the genome. We can look at a location in the genome — a gene — and we can compare how the data stored there changes over developmental or evolutionary time. There's no mistaking that it is real and interesting information, but it tells us about as much about how the whole organism works and changes as having a readout that displays the number stored at x03A574DC on our iPhone will tell us how iOS works. Maybe it's useful; maybe there's a number stored there that tells you something about the time, or the version, or if you set it to zero it causes the phone to reboot, but let's not pretend that we know much about what the machine is actually doing. We're looking at it from the wrong perspective to figure that out.

You could, after all, describe the operation of a computer by cataloging the state of all of its memory bits in each clock cycle. You might see patterns. You might infer the presence of interesting and significant bits, and you could even experimentally tweak them and see what happens. Is that the best way to understand how it works? I'd say you're missing a whole 'nother conceptual level that would do a better job of explaining it.

Only we lack that theory that would help us understand that level right now. It's fine to keep step-tracing the genome right now, and maybe that will provide the insight some bright mind will need to come up with a higher order explanation, but let's not elide the fact that we don't have it yet. Maybe we should step back and look for it.


It's always good to look at, and perhaps question, all aspects of a scientifically arrived-at 'given', especially when it's an idea that is as relatively new and esoteric (to a layperson) as gene function. Go, PZ.

Dobbs also makes the issues more confusing by confounding the two meanings of gene (as locus and as allele).

By Rosie Redfield (not verified) on 08 Dec 2013 #permalink

In oversimplified layperson's terms:

Dawkins et al. are found to be engaged in something roughly akin to erroneous reductionism by attributing all phenotypic causality to the gene alone. This in turn is based on the error of isolating the gene from its chemical environment and broader context. (These may not have been errors when Dawkins was writing, but have been revealed as such through more recent research.)

What we find in numerous fields of science, by contrast, is that one can't isolate specific phenomena from their broader contexts, e.g. the behavior of an animal from its social and ecosystem contexts. In retrospect, there's no good reason for genetics to diverge from this generalization.


I found the "grasshopper/locust" example to be particularly convincing of the point that the "old paradigm" is obsolete and a new paradigm is needed. And it's to be expected that those who have the greatest personal investment in the old paradigm will defend it vociferously to the end.

One name: Deinococcus radiodurans.

What's up with Deinococcus?

By David Marjanović (not verified) on 10 Dec 2013 #permalink
