Instruction and information

Many words in English come directly from Latin, or indirectly from Latin via French, and they have a meaning in English that is sometimes quite different from their etymology, occasionally leaching back into French.

Two such words are instruction and information, and both have peculiar meanings when used in the context of genetics.

Instruction is particularly interesting. The OED tells us that it has the following etymology:

[f. L. instruct-, ppl. stem of instruĕre to build, erect, set up, set in order, prepare, furnish, furnish with information, teach, f. in- (IN-2) + struĕre to pile up, build, etc.: see STRUCTURE, and cf. F. instruire. The history in Eng. does not correspond with the sense-development in L.]

Likewise information:

[a. F. informe (15th-16th c. in Godef. Compl.), ad. L. informis shapeless, deformed, f. in- (IN-3) + forma FORM.]

So if we were to go by etymology, we would say that to instruct someone is to furnish them with a shapeless something... so much for argumentum ab lexicon.

An instruction in computing is effectively the smallest program unit that causes action to be taken. Changing the content of a register is an instruction. In an informal sense, any unit of a programming language counts as an instruction - a line of C++, or a piece of assembly code, or a call to a function. An "instruction" is something that causes a change in the state of the CPU or peripheral intelligent device.
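To make the computing sense concrete, here is a minimal sketch in Python of a toy machine. Everything in it (the `ToyCPU` class, the `LOAD` and `ADD` operation names, the two registers) is made up for illustration and does not correspond to any real processor. The only point is that each instruction does nothing except change the state of the machine.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCPU:
    """A hypothetical two-register machine; not any real architecture."""
    registers: dict = field(default_factory=lambda: {"A": 0, "B": 0})

    def execute(self, instruction):
        """Apply one instruction, changing the machine's state."""
        op, *args = instruction
        if op == "LOAD":
            # LOAD reg, value: put a constant into a register
            reg, value = args
            self.registers[reg] = value
        elif op == "ADD":
            # ADD dst, src: add the contents of src into dst
            dst, src = args
            self.registers[dst] += self.registers[src]
        else:
            raise ValueError(f"unknown instruction: {op}")


def run(program):
    """Execute a list of instructions on a fresh machine; return its registers."""
    cpu = ToyCPU()
    for instruction in program:
        cpu.execute(instruction)
    return cpu.registers


program = [("LOAD", "A", 2), ("LOAD", "B", 3), ("ADD", "A", "B")]
final_state = run(program)  # {"A": 5, "B": 3}
```

Each tuple in `program` is an instruction in the narrow sense used above: the smallest unit that, when executed, alters a register.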

Why should this matter in biology? Well, one of the oldest and most widely used rhetorical tropes regarding genes, and latterly DNA, is that it/they are instructions. They cause the body to be built according to a program. Other metaphors include recipes, blueprints, code, and so on, but they all rely on this basic notion of instruction: DNA is a process of imparting information to be used in the construction of the organism. [Note the recurrence of the Latin term structura here. It means an arrangement, particularly in building. The building metaphor is ubiquitous.]

And information, which we all think we have a handle on, is what the instructions comprise. There is a slew of different uses here, not always or even often distinguished: information about traits, information about the environment, information about evolutionary strategies, information about signals, the list goes on. People will move immediately from molecular information, about the specificity of the expressed DNA sequence to produce protein primary sequences, to ecological information, in which the DNA is supposed to "encode" responses to the environment that have worked in the past.

Information is often a semantic notion, in which the informing instruction means or is about some other thing. So we say a gene is a gene for a particular trait or behaviour. So we get God Genes, Gay Genes, and Altruistic and Selfish Genes. [Note: This sense of "selfish" is different to Dawkins' in important ways.] Or we say that a certain gene is about a propensity for a particular outcome, such as breast cancer (the BRCA genes, for example). But there is an intentional aspect to semantics - for something to "refer" to some other state, it needs to have an intent. This is not to say that it has to do the intending itself - the name "John Wilkins" does not intend to pick me and the other people by that name out - but something or someone has to intend that it does this. It might be the conventions of the language community, or the rulings of the Académie française, or, as is often claimed, the outcome of selection. Selectionist accounts of meaning are called teleosemantic, and the "intent" that a given gene has is what it was selected for in the past. Following Ruth Garrett Millikan and others, this is sometimes called its Proper Function.

However, I think that this is all periphrasis for non-informational and non-instructive facts. In short, it is a useful way to think about genes, but not something that has any deep metaphysical import. I am one of a very few who think this - so far only Giovanni Boniolo and Predrag Šustar have argued for it in print, and possibly also Sahotra Sarkar:

Boniolo, Giovanni. 2003. Biology without Information. Hist. Phil. Life Sci. 25:255-273.

Sarkar, Sahotra. 2000. Information in Genetics and Developmental Biology: Comments on Maynard Smith. Philosophy of Science 67 (2):208-213.

Šustar, Predrag. 2007. Crick's notion of genetic information and the 'central dogma' of molecular biology. Br J Philos Sci 58 (1):13-24.

What I want to do now is consider the underlying metaphysics of instructions and information, rather than give my argument against information in genes. If accepted, that will be published soon. Suffice it to say for now that I think that every use of the term information in respect of genes can either be replaced with the notion of causality, or can be ignored as adding nothing to the debate, or worse, as confusing and unhelpful.

The very idea of information presupposes something like this: There is a substrate that lacks a particular form, which, when informed, has that form and whatever functions follow from the possession of it. This is a modern kind of hylomorphism, a view which goes back to Aristotle and earlier. There is something that doesn't intrinsically have form, and which gets it when instructed by something else. There is an alternative view, also going back to the Greeks - atomism, which is badly named, for what physical nature lacks (the ability to be further divided) rather than for what it has (a determinate nature). Atomism in Greek terms is roughly this view: All substrates have innate properties that determine how things that are composed out of them will behave. In short, hylomorphism, or form-substance dualism, requires that the properties of things are at least partly not due to the stuff of which they are made. Atomism requires that the properties of the parts fix all the properties of the wholes.

In modern debate terms, this is very much like - but not identical with - the reductionism/holism, or more recently, the reductionism/emergence dichotomy. One important difference from the Greek debate is that everyone now agrees that the properties we care about are at least partly caused by the properties of the parts (the particles) of which things are composed. The main argument now is whether any important properties - causally effective properties - are due to something like form or structure.

The idea that genes impart structure is both uncontroversial - for gene sequences do impart structure from the sequence of the DNA, via mRNA, to the structure of the primary sequence (the sequence of amino acids) of a polypeptide - and controversial - for it is unclear how the structure of, say, a body correlates to the structure of the DNA. Certainly DNA is causally important. For instance, genes direct the expression of growth proteins, and of the enzymes that produce signalling molecules involved in cell death, like retinoic acid. This is causal. Is it information?
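The uncontroversial part, the mapping from DNA sequence via mRNA to a polypeptide's primary sequence, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the codon table below is a small excerpt of the standard genetic code, the function names `transcribe` and `translate` are my own, the example sequence is invented, and the input is taken to be the coding strand read from its first base. Real gene expression involves far more than this lookup.

```python
# Excerpt of the standard genetic code: only the codons used in the example.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCU": "Ala",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}


def transcribe(coding_strand):
    """Return the mRNA matching a DNA coding strand (T is replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Read the mRNA three bases at a time and return the amino acid sequence.

    Stops at the first stop codon. Raises KeyError for any codon that is
    missing from the partial table above.
    """
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "STOP":
            break
        peptide.append(residue)
    return peptide


dna = "ATGTTTGGCGCTTAA"  # invented example sequence
primary_sequence = translate(transcribe(dna))  # ["Met", "Phe", "Gly", "Ala"]
```

Nothing in the sketch goes beyond sequence-to-sequence correspondence. It says nothing about how the shape of a body relates to the DNA, which is where the controversy lies.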

Here is the ambiguity. If we think of causality as the impartation of a mark, as Salmon and Dowe do in the Conserved Quantity Theory of Causality, we can think of it as the instruction of information. But this is not the sense of information that matters in genetics, because if true, it applies in each and every case of causality, not merely those to do with genes. So there has to be something different about the causal relations of genes to development and phenotype.

I think that the underlying assumption, an intuition if you like, is that form needs to be imposed on substance. That is to say, we haven't yet shrugged off Aristotle's metaphysics. We are atomists, but not completely. Why can we not say that genes are causal actors in a molecular play? What more needs to be said? Why bring intentional language into this at all?

The answer is, of course, that we are intentional actors, and so we conceive of the world in those terms. It's hard not to talk about natural selection "choosing" or "intending" or having "goals", even though we know natural selection is not an agent but a dynamic. It's hard not to talk about evolutionary strategies, although we know that no organism strategises. And it's hard not to talk about genes as imparting information. This is the basic human flaw of anthropomorphism, seeing the world in human terms. Of course, only a little bit of the world is in fact couched in human terms (the human bit of it), but it helps us think. And sometimes it retards our thinking and research.

The world, particularly the biological world, is not composed of formless gunk that gets its properties only when information is pressed into it. The parts of the system give the system its properties. Genes cause processes to occur, and not alone - they are one causal element of the entire process of being alive. They do not instruct us how to live and grow. They do not impart information. They cause developmental properties to occur, just as ribozymes and proteins do. They are the stable elements in that system, and that is what counts.

Basically, this is the message of Developmental Systems Theory - that genes are developmental resources. It is not to deny the specialness of genes in living systems, for they are crucial in most processes. But it is to deny that genes are magical elements in a form-substance world.

As someone with a B.S. in English Literature, this is a good analysis of an etymological paradox. As someone with a B.S. in Math and M.S. in Computer & Information Science, who did PhD research in interdisciplinary Biology, I think that the problem goes deeper.

In the Hard Sciences, there is a complicated connection between "Information" and "Entropy" [see ScienceBlogs' basics on Entropy].

In Computer Science, the notion of "instruction" is more complicated than summarized above, in part because of the distinction between Source Code and Object Code, which is vaguely analogous to the distinction between Phenotype and Genotype; and partly because of the distinctions between [see the recent Good Math, Bad Math Basics thread] parallel, concurrent, and distributed.

Further, the paradigm of Biology has been changed by the central role of computing -- and the change is accelerating.

I am not able to summarize my thoughts on this in a short space, as in a blog, as it is the subject of several hundred pages of Mathematical Biology papers that I've written.

But I do think that the subtleties are more numerous and more "gnarly" than your summary can say.

Of course, you were, I suspect, intentionally opening the door to a plethora of more complicated arguments, and the usual percentage of oversimplified arguments.

I was. But I think the relation between information and entropy is one way; all information is entropic, but not all entropy is information. This means pretty much the same thing as the point about causation (I don't follow Weber and Depew on this [Sorry, John Collier!]), and so we still need to know substantively what "information" might be in respect of genes.

So I reject the "information paradigm" claim. Computational modelling is little more than applied logic, and we had logic long before we had computers, and did our biology anyway. All computational power now permits us to do is figure out the implications of our models in a much shorter time, like within the age of the universe.

Biology is very little like a computer in any real sense. At least, until we have an analogue massively parallel computer that can connect to a slew of same, with a bandwidth both within and between computers that approaches quantum complexity (that is, where the number of quantum states employed for information transfer approaches the physical maximum). Then we will have a computational paradigm shift. Might just be easier to do it in wetware, though.

There is no "instruction" unit in biology akin to a machine code unit. Not any. Alas for the computational biologist, but there it is.

Sustar, Predrag. 2007. Crick's notion of genetic information and the 'central dogma' of molecular biology. Br J Philos Sci 58 (1):13-24.

Is it just me or is it totally hot when someone uses the word 'dogma' in a philosophy paper?

By jimmychkMcC (not verified) on 28 Mar 2007

I suppose it hardly needs mentioning, the propagandistic abuse that ID/Creationists regularly make of the gene-as-information metaphor. That alone is a good reason to banish it, in favour of a teleology-free causality conception (OK, not really -- we shouldn't let the loons drive the choice of permissible terms in serious discourse. But if I ever manage to have a reasonable discussion with a Creationist, I will certainly try to make the point: what's really happening is chemistry; "information" is an abstraction we impose on it).

(And as it happens, I've just gotten to the dev-sys chapter of Sterelny & Griffiths).

By Eamon Knight (not verified) on 28 Mar 2007

I think that if we were to banish anything, it would be to banish computer analogies. It seems to me that the more one understands the workings of computers and the more one attempts to apply that knowledge to biology, the less one is able to understand that the two are so different as to make any analogy pernicious.

By Grant Canyon (not verified) on 07 Sep 2007