Synteny -- A Semantic Debate

There's a post up at Pharyngula describing the concept of synteny in comparative genomics (Basics: Synteny). The definition given by PZ Myers will sound pretty familiar to those of you who have read some of the genomics literature. The problem: it's not quite correct. It's actually the definition that I think most comparative genomics folks would give if they were asked to define synteny. But they keep using that word, and I don't think it means what they think it means. What's the definition? Here it is in PZ's own words:

Synteny is the conservation of blocks of order within two sets of chromosomes that are being compared.

I disagree. While this is what many genomicists mean when they write or talk about synteny, they are wrong. Instead, I would argue that synteny merely means that genes are found on the same chromosome. Synteny says nothing about the order of genes. What gives me the right to say this?

Let's take a quick journey through the literature. In a paper comparing the genomes of various mammals, Joseph Nadeau and colleagues wrote the following:

Synteny refers to the occurrence of two or more genes on the same chromosome, whereas conserved synteny refers to two or more homologous genes that are syntenic in two or more species, regardless of gene order on each chromosome, i.e., synteny but not necessarily gene order is conserved (Figure 2; see also NADEAU 1989). Conserved linkage pertains to the conservation of both synteny and order of homologous genes between species (Figure 2; see also NADEAU 1989). A disrupted synteny refers to circumstances where a pair of genes are located on the same chromosome in one species but their homologues are located on different chromosomes in another species, i.e., the genes are syntenic in only one of the two species.

[Figure: synteny_nadeau.gif]

In the Nadeau framework, if genes are found in the same order in two species, we say there is conserved linkage. Is there any precedent for this terminology? Well, here's the relevant passage from Nadeau's 1989 paper (doi:10.1016/0168-9525(89)90031-0):

Conserved syntenies are homology segments composed of two or more pairs of homologous genes located on the same chromosome, regardless of gene order. These represent the first formal evidence for conservation.

Conserved linkages are the most rigorously defined segments because both synteny and gene order must be conserved. Distinctions between the three characterizations, which are illustrated in Fig. 1, are important for understanding the extent and nature of conservation and for assessing progress towards saturated maps of linkage and synteny homologies.

In the figure, genes A, B, and C are used to illustrate conserved synteny and conserved linkage. In panel A, there is no conserved synteny. Panel B shows conserved synteny, but not conserved linkage. And Panel C shows conserved linkage (which implies conserved synteny).
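If it helps to see the distinction spelled out, here is a minimal Python sketch of the three cases. Everything in it is my own illustration rather than anything from the Nadeau papers: the genome layout (a dict from chromosome name to an ordered gene list), the function names, the example gene arrangements, and the choice to treat a fully reversed order as the same order, since chromosome orientation is arbitrary.

```python
def locate(genome, gene):
    """Return (chromosome, position) for a gene in a genome map."""
    for chrom, genes in genome.items():
        if gene in genes:
            return chrom, genes.index(gene)
    raise KeyError(gene)


def is_syntenic(genome, genes):
    """True if all of the genes sit on a single chromosome."""
    return len({locate(genome, g)[0] for g in genes}) == 1


def gene_order(genome, genes):
    """The genes sorted by chromosomal position (assumes they are syntenic)."""
    return sorted(genes, key=lambda g: locate(genome, g)[1])


def classify(genome1, genome2, genes):
    """Classify a set of homologous genes compared across two species."""
    if not (is_syntenic(genome1, genes) and is_syntenic(genome2, genes)):
        return "no conserved synteny"
    order1 = gene_order(genome1, genes)
    order2 = gene_order(genome2, genes)
    # Assumption: a fully reversed order counts as the same order.
    if order1 == order2 or order1 == order2[::-1]:
        return "conserved linkage"
    return "conserved synteny"


# Hypothetical arrangements in the spirit of the three panels.
species1 = {"chr1": ["A", "B", "C"]}
panel_a = {"chr1": ["A"], "chr2": ["B"], "chr3": ["C"]}
panel_b = {"chr1": ["A", "C", "B"]}
panel_c = {"chr1": ["A", "B", "C"]}

for name, species2 in [("A", panel_a), ("B", panel_b), ("C", panel_c)]:
    print(name, classify(species1, species2, ["A", "B", "C"]))
```

Note that the order check only runs once synteny has been established in both species, which is the sense in which conserved linkage implies conserved synteny.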

Why does any of this matter? Well, instead of things like micro- and macro-synteny, which Myers and various others talk about, there is only synteny. This clarifies the terminology a bit. We have a separate term for the conservation of gene order of syntenic genes -- conserved linkage. A uniform and clear vocabulary would make the literature and discussion within the comparative genomics community more precise. Precision is good, right?

Is anyone in the comparative genomics community using this terminology? Yes, there are a few people, but it's a small group that doesn't carry much weight: Drosophila geneticists. But, hey, Drosophilists invented modern genetics, so they are a bit of an authority on the topic. The second sequenced Drosophila genome, that of D. pseudoobscura, provided an opportunity to compare gene order between that species and D. melanogaster (doi:10.1101/gr.3059305). In this paper, the conserved linkage terminology was used.

Okay, but what the heck are syntenic blocks? Yeah, that's a bit of an oddity. It appears that syntenic block has become the term to refer to syntenic genes with conserved linkage. I'd prefer "conserved linkage blocks", but those of us who favor clear terminology may have lost this war. So, it now seems like there are syntenic genes (those found on the same chromosome) and syntenic blocks (groups of syntenic genes found in the same order in the species being compared).
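To make the syntenic genes versus syntenic blocks distinction concrete, here is a rough Python sketch that splits one chromosome into runs of genes that are also adjacent and in the same order on the homologous chromosome of a second species. The function name, the `min_genes` parameter, and the example chromosomes are all hypothetical, and it is deliberately simplified: it only recognizes blocks in the forward orientation and requires strict adjacency, which real block-finding tools do not.

```python
def syntenic_blocks(chrom1, chrom2, min_genes=2):
    """Return maximal runs of genes from chrom1 that are adjacent and in the
    same (forward) order on chrom2. Both arguments are ordered gene lists."""
    position2 = {gene: i for i, gene in enumerate(chrom2)}
    blocks, current = [], []

    def flush():
        # Keep the finished run only if it is long enough to call a block.
        if len(current) >= min_genes:
            blocks.append(list(current))
        current.clear()

    for gene in chrom1:
        if gene not in position2:
            # No homologue on chrom2, so any run in progress ends here.
            flush()
            continue
        if current and position2[gene] == position2[current[-1]] + 1:
            current.append(gene)  # extends the run: next gene on chrom2 too
        else:
            flush()
            current.append(gene)  # starts a new run
    flush()
    return blocks


# All six genes are syntenic in both species, but order is only
# conserved within two blocks.
species1_chr = ["A", "B", "C", "D", "E", "F"]
species2_chr = ["D", "E", "F", "A", "B", "C"]
print(syntenic_blocks(species1_chr, species2_chr))
```

In this example every gene is on the same chromosome in both species, so synteny is fully conserved, while gene order is conserved only within each of the two blocks.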


Ehrlich et al. 1997. Synteny Conservation and Chromosome Rearrangements During Mammalian Evolution. Genetics 147: 289-296

Nadeau 1989. Maps of linkage and synteny homologies between mouse and man. Trends Genet. 5: 82-86 doi:10.1016/0168-9525(89)90031-0

Richards et al. 2005. Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene, and cis-element evolution. Genome Res. 15: 1-18 doi:10.1101/gr.3059305

Hmmmm. One paper from 1989, one from 1997, and one with, shall we say, a non-neutral author list. I'm not sure PZ is going to buy that he has it wrong if he is using the terminology the way it has been used most recently and most frequently.

"Yeah, that's a bit of an oddity. It appears that syntenic block has become the term to refer to syntenic genes with conserved linkage. I'd prefer "conserved linkage blocks", but those of us who favor clear terminology may have lost this war. So, it now seems like there are syntenic genes (those found on the same chromosome) and syntenic blocks (groups of syntenic genes found in the same order in the species being compared)."

The problem I see is that "linkage" immediately evokes images of classical genetics and mapping, which is pretty misleading in the vast majority of organisms, where classical genetics was never used.

You are right. PZ is wrong.

The term "synteny" means on the same thread. Thus, orthologous genes on human and mouse chromosomes can't be said to be "syntenic". They can, however, be said to be in regions of conserved synteny.

The term was originally devised to describe genes that could be assigned to the same chromosome but for which there was no linkage data, that were more than ~35 centiMorgans apart, or whose assignments were made using FISH or analysis of somatic cell hybrids.

I have used the term a lot, and like Joe Nadeau, I work with mice.

I have to say, though, that the term is currently used incorrectly more than it is used correctly. We have to recognize that words, and language, evolve, so this is likely to be a losing battle.

The definition thing had me confused too for a while. A good explanation is in a paper on the C. briggsae genome, where they did a lot of work comparing it to the C. elegans genome. They state:

"With the bulk of the C. briggsae genome placed along chromosomes, the conservation of synteny (using synteny here in the originally defined sense of genes on the same linkage group or chromosome) and colinearity (meaning the order of genes along the chromosome) between C. elegans and C. briggsae could be investigated directly across the whole genome."

See L. W. Hillier et al. in PLoS Biology.

I confess, as the Academic Editor for this paper, I originally thought this paper was not interesting because I was using the "modern" definition of synteny. But they actually had found something VERY interesting, which was that synteny (which chromosome genes were on) was very highly conserved even when order on the chromosome (colinearity) was not.