A brief overview of Hox genes


In previous articles about fly development, I’d gone from the maternal gradient to genes that are expressed in alternating stripes (pair-rule genes), and mentioned some genes (the segment polarity genes) that are expressed in every segment. The end result is the development of a segmented animal: one made up of a repeated series of morphological modules, all the same.


Building an animal with repeated elements like that is a wonderfully versatile strategy for making an organism larger without making it too much more complicated, but it’s not the whole story. Just repeating the same bits over and over again is a way to make a generic wormlike thing—a tapeworm, for instance—but even tapeworms may need to specialize certain individual segments for specific functions. At its simplest, it may be necessary to modify one end for feeding, and the opposite end for mating. So now, in addition to staking out the tissues of the embryo as belonging to discrete segments, we also need a mechanism that says “build mouthparts here (and not everywhere)”, and “put genitalia here (not over there)”.

Many people have at least heard of the particular set of genes, the Hox genes, that are responsible for assigning specific regional identities to body parts (Ed Lewis won the Nobel for his work on them, for one thing). I’ll just try to give a rough overview of them here, but if you want more details, check out Thomas Bürglin’s Homeobox Page.

First, some terminology, working from the general to the specific. We tend to throw around vaguely similar sounding terms like homeotic and homeobox and Hox, but they don’t mean precisely the same thing.

Homeotic gene: a gene which defines a region or position in the embryo. Mutations in homeotic genes lead to transformations of one structure into another; the classic example is antennapedia, a mutation that turns the antenna of a fly into a leg.

This class of transformations was first recognized in 1894 by William Bateson, who coined the term homoeosis for it in his book, Materials for the Study of Variation. “Homeotic” is a general category that covers a variety of different kinds of genes.

“The case of the modification of the antenna of the insect into a foot, of the eye of a crustacean into an antenna, or a petal into a stamen, and the like, are examples of the same kind. It is desirable and indeed necessary that such variations, which consist in the assumption by one member of a meristic [involving a number of similar parts] series, of the form or characters proper to other members of the series, should be recognized as constituting a distinct group of phenomena…I therefore propose…the term Homoeosis…for the essential phenomenon is not that there has merely been a change, but that something has been changed into the likeness of something else.”

Homeotic genes are defined by their phenotype. When molecular genetics came into vogue, and scientists started looking at the molecules responsible for homeosis and determined their sequence, they noticed a similarity: the genes encoded transcription factors that regulated the activity of other genes, and many of them contained a common motif, a stretch of DNA they named the homeobox.


Homeobox and Homeodomain: The homeobox is a 180-basepair sequence of DNA that has been found in many regulatory genes. The homeodomain is the 60-amino acid stretch that corresponds to the translated homeobox. This part of the protein is a DNA-binding domain; it forms three helices that nestle neatly into a groove formed by the DNA spiral, and the amino acids in these regions confer binding specificity for particular sequences in the DNA.
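
The arithmetic connecting those two numbers is just the triplet code: three bases per amino acid, so 180 bases give 60 amino acids. Here is a minimal Python sketch of that relationship. The sequence in it is a placeholder made of one repeated codon, not a real homeobox, and the function name is invented for this illustration.

```python
HOMEOBOX_BP = 180  # length of the homeobox in base pairs (from the text)
CODON_BP = 3       # each amino acid is encoded by three bases


def split_codons(dna: str) -> list[str]:
    """Split a coding sequence into consecutive three-base codons."""
    if len(dna) % CODON_BP != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [dna[i:i + CODON_BP] for i in range(0, len(dna), CODON_BP)]


# Placeholder sequence of the right length; NOT a real homeobox.
placeholder_homeobox = "GCT" * (HOMEOBOX_BP // CODON_BP)

codons = split_codons(placeholder_homeobox)
homeodomain_length = len(codons)
print(homeodomain_length)  # 60
```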

The homeodomain sequence is highly conserved. The DNA sequence is roughly 75% identical when comparing different Hox genes within the fly; comparisons of the homeodomain sequence show even higher similarity, since most of the nucleotide differences conserve the same amino acid. This similarity is conserved between phyla, as well as within a species. Some vertebrate homeodomain sequences are more similar to their fly homologs than those fly homologs are to other Hox genes in the same species. There is variation, of course: it is these small differences that give the different homeodomain genes different properties.
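
To see why the protein comparison can come out higher than the DNA comparison, here is a small Python sketch. Both sequences are toy strings invented for this example, not real Hox sequences, and the codon table is deliberately partial: it covers only the codons that appear in them. Most of the base changes are synonymous, so they vanish once the DNA is translated.

```python
# Partial codon table: only the codons used in the toy sequences below.
CODON_TABLE = {
    "CGT": "R", "CGC": "R", "AGA": "R",  # arginine
    "AAA": "K", "AAG": "K",              # lysine
    "GGA": "G", "GGT": "G",              # glycine
    "CAG": "Q",                          # glutamine
    "CAT": "H",                          # histidine
}


def translate(dna):
    """Translate a coding sequence using the partial table above."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def identity(a, b):
    """Fraction of aligned positions at which two sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Toy sequences (hypothetical, for illustration only).
gene_a = "CGTAAACGTGGACGTCAG"
gene_b = "CGCAAGAGAGGTCGTCAT"

dna_identity = identity(gene_a, gene_b)
protein_identity = identity(translate(gene_a), translate(gene_b))

print(translate(gene_a), translate(gene_b))  # RKRGRQ RKRGRH
print(round(dna_identity, 3))                # 0.667
print(round(protein_identity, 3))            # 0.833
```

Six of the eighteen bases differ, but only one of those differences changes an amino acid, so the two toy proteins match at five of six positions.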


Not all genes that contain a homeobox are Hox genes, however! Hox genes are a subset of homeotic genes that contain a homeobox, and are also found in a homeotic complex.

Homeotic complex (HOM-C): Many of the homeotic genes that have been identified so far have another shared property, in addition to having a homeobox. They are all linked together in a sequential cluster on a chromosome, and their position in the cluster corresponds roughly to the spatial pattern of their expression. What this means is that if you look at the chromosome that contains the Hox genes, they are all lined up in order on the strand of DNA. In the diagram below, you can see an array of 9 genes in the fly, from the orange labial (lab) gene on the left, or 3′ end of the DNA, to the blue Abdominal-B (Abd-B) gene on the right, or 5′ end. What’s really cool about this array is that it also corresponds to the spatial pattern of expression in the fly: the orange gene is turned on at the very front end of the fly, and the blue gene is turned on in the most posterior part.
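
That correspondence between chromosomal order and body position can be written down as a simple check. The Python sketch below lists the eight genes conventionally named as the fly's Hox genes, in their usual 3′ to 5′ drawing order; that list is an assumption of this sketch and may not match the gene count in any particular diagram. The expression positions are invented numbers in arbitrary units, there only to make the comparison concrete.

```python
# Fly Hox genes as conventionally drawn along the chromosome, 3' end first.
CLUSTER_ORDER_3_TO_5 = ["lab", "pb", "Dfd", "Scr", "Antp", "Ubx", "abd-A", "Abd-B"]

# HYPOTHETICAL anterior limits of expression, in arbitrary units along the
# body axis (0 = front of the animal). Made up for illustration only.
ANTERIOR_LIMIT = {
    "Ubx": 6, "lab": 1, "Abd-B": 8, "Scr": 4,
    "pb": 2, "abd-A": 7, "Dfd": 3, "Antp": 5,
}


def expression_order(anterior_limit):
    """Return gene names sorted from most anterior to most posterior."""
    return sorted(anterior_limit, key=anterior_limit.get)


def is_colinear(cluster_order, anterior_limit):
    """True if chromosomal order (3' to 5') matches front-to-back order."""
    return expression_order(anterior_limit) == list(cluster_order)


print(is_colinear(CLUSTER_ORDER_3_TO_5, ANTERIOR_LIMIT))  # True
```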


We can also now look at the Hox genes in many different animals (isn’t scientific progress wonderful?) and see how they compare from species to species. Here is a chart of the homologous Hox genes that have been identified in various arthropods.

Arthropod Hox genes. Arthropods for which at least four Hox genes have been reported are shown. Drosophila melanogaster, with the best-characterized Hox cluster, is shown at the top. Other taxa are arranged by subphylum. Lines connecting genes indicate known linkage relationships. Black borders around genes indicate that all or most of the sequence of the 60 residue homeobox region has been reported. Boxes without borders represent short fragments only. Box colors indicate homology within columns. Some short fragments are identifiable as central class genes (i.e., Antennapedia-like) but cannot be classified as any one gene. These are shown in two colors. Gene duplications are shown as additional, slightly offset boxes in each row.

Another extremely cool thing about them is that they are universal in animals. We vertebrates also have Hox genes, and they have the same properties: each gene contains a homeobox, and they are organized on the chromosome in the order of their expression from front to back. Here is a similarly color-coded version of the homologous Hox genes in the mouse:


There are differences, of course. We mammals have more than one Hox cluster—we’re the recipients of several duplications of the whole shebang, so we have multiple, overlapping sets of Hox genes. There have also been individual duplications of a few genes within the clusters, so we have some extras at the caudal end. There are also a few gaps; with duplication comes redundancy, and the possibility of deletion without detriment, and so we also see some examples of culling duplicates in our history.

Speaking of history, one of the things we can do with a phylogenetic analysis of the Hox cluster is see fascinating aspects of our ancient history. Since the genes are conserved, we can map correspondences between them within a lineage and in comparison with other lineages. We can surmise where duplications and deletions occurred, and most interestingly, since the genes are associated with morphological regions of the organism, we can speculate about how new additions to animal morphology occurred.

For example, we can piece together the relationships between individual genes in the complex. The anteriormost and posteriormost genes in the complex are the most different from one another, so we can surmise that they diverged earliest, and have had the most time to accumulate differences. The sequence differences also tell us that Sex-combs-reduced (Scr) is almost as old, so it must have arisen by an early duplication. We can use the pattern of differences to assemble a history of the genes, as shown below, where an ancestral Antennapedia (Antp) gene has undergone multiple duplications, followed by the development of variations, to yield a larger array of genes.
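
The logic of "most different means diverged earliest" can be illustrated with a crude Python sketch. The three aligned sequences below are toy strings invented for this example, and counting raw differences is only a rough stand-in for a real phylogenetic analysis, which would use explicit substitution models and tree-building methods.

```python
from itertools import combinations

# Toy aligned sequences (hypothetical, for illustration only).
TOY_GENES = {
    "anterior":  "ACGTACGTACGT",
    "middle":    "ACGTACGAACTT",
    "posterior": "TCGAACGAACTA",
}


def differences(a, b):
    """Count positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


# Pairwise difference counts for every pair of genes.
pairwise = {
    (g1, g2): differences(TOY_GENES[g1], TOY_GENES[g2])
    for g1, g2 in combinations(TOY_GENES, 2)
}

# Under the simple assumption that differences accumulate with time,
# the most different pair is the one that split earliest.
most_divergent = max(pairwise, key=pairwise.get)

print(pairwise[("anterior", "middle")])     # 2
print(pairwise[("middle", "posterior")])    # 3
print(pairwise[("anterior", "posterior")])  # 5
print(most_divergent)                       # ('anterior', 'posterior')
```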


Since we also know in which regions of the body these different genes are expressed, we can put together models of morphological evolution. We start with a simple, ancient arthropod ancestor that has segments, but only primitive specializations: there’s a head end (with stuff for eating and sensing) and a tail end (with genitals for…well, you know), and a series of identical bits in between.


The genes suggest what happened next. An ancestor acquired a duplication of the gene responsible for patterning the middle part, which now allowed it to add extra specializations to the front end, just behind the head. In this case, this gave a few limbs at the front a special identity, and they could then acquire new features that made them useful as mouthparts without simultaneously modifying every single limb in the animal’s body.

Another gene duplication would have subdivided the body into two parts, again each with its own molecular address, allowing the two parts to evolve unique characters. This arthropod division would have produced an animal with an abdomen and thorax. If the abdominal Hox genes then acquired a property that suppressed limb formation, which they could now do without shutting down limbs in the entire body, you’d then have something like a primitive insect.

The last figure also suggests more complex levels of regulation. The little circles represent subdomains of Hox gene expression—one of the things the organism can also do is express sub-segmental patches of a different Hox gene, bringing in the whole subsequent module of gene recruitment to that area and adding a bit of abdominal character, for instance, to a particular feature of that segment. We vertebrates add a whole ‘nother level to the potential complexity here by having duplicated Hox genes with overlapping domains of expression. That means that regional identity can be a product of a combinatorial pattern of Hox gene expression.
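
A quick way to see what a combinatorial code buys you is to count the distinct combinations. The Python sketch below uses three made-up genes with overlapping expression domains along six arbitrary body positions; none of the names or domains correspond to real data. A region's identity is taken to be the set of genes expressed there.

```python
# HYPOTHETICAL genes and expression domains (positions 0-5 along the body).
DOMAINS = {
    "gene1": range(0, 4),  # positions 0-3
    "gene2": range(2, 6),  # positions 2-5
    "gene3": range(3, 5),  # positions 3-4
}
N_POSITIONS = 6


def hox_code(position):
    """Return the set of genes expressed at a given body position."""
    return frozenset(g for g, domain in DOMAINS.items() if position in domain)


codes = [hox_code(p) for p in range(N_POSITIONS)]
distinct_codes = set(codes)

for p, code in enumerate(codes):
    print(p, sorted(code))

print(len(distinct_codes))  # 5
```

Three genes with no overlap could mark at most three kinds of region; with overlap, these three mark five, and in principle n genes allow up to 2^n on/off combinations.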

Akam M, Dawson I, Tear G (1988) Homeotic genes and the control of segment diversity. Development 104 (supplement): 123-134.

Cook CE, Smith ML, Telford MJ, Bastianello A, Akam M (2001) Hox genes and the phylogeny of the arthropods. Current Biology 11:759-763.