20,000 genes a surprise? Heck, this guy knew that long ago

John Hawks bumps into a prescient estimate of the total gene number in humans:

While doing some other research, I ran across a remarkable short paper by James Spuhler, "On the number of genes in man," printed in Science in 1948. We've been hearing for the last ten years how the low gene count in humans -- only 20,000 or so genes -- is "surprising" to scientists who had previously imagined that humans would have many more genes than this. So here's the next to the last line of Spuhler's article: "On the basis of these speculations there are then some 19,890-30,420 gene loci in man." He actually estimated the total gene number in two ways. The first, based on estimates of chromosome length in Drosophila and humans, coupled with Bridges' estimate of fruit fly gene number (5000), led to an estimate of 42,000 genes in humans. This means of estimation was probably closer to those that later suggested a high gene number in humans.

I love this. The history of science is almost always richer and more varied than we imagine.

Hawks amends:

That estimate also gives the lie to the idea that geneticists always expected a very high gene count in humans. What's remarkable to me is that the entire means of estimation required no knowledge of gene sequences or DNA; the estimates required only epidemiology coupled with cytological estimates of chromosome lengths.

More at Hawks' blog, including the linkless old-school ref:

Spuhler JN. 1948. On the number of genes in man. Science 108:279-280.


That is early! I bet Larry Moran would be interested in that reference, as he has collected many more early papers that predict a smaller number of genes. Some of those references are so mainstream and so well known that claiming 20,000 was a surprise is either disingenuous or hugely uninformed.

Prescient, or maybe a lucky guess? Regardless, during the first decade or so of the HGP, it was commonly reported, even by scientists "in the know", that the decoded human genome was expected to contain between 80 and 100 thousand genes. Yes, there certainly were scientists who thought otherwise, but many _were_ surprised when the "final" tally was revealed.

Very neat to have estimated between 20K and 30K gene loci in H. sapiens. The 30K number may turn out to be the more accurate value once microRNA genes and all sorts of other genetic loci that fit the broader definition of a gene are taken into account.

The estimate is based on 5000 genes in Drosophila, but the fruit fly genome encodes approximately 14,000. Thus, Spuhler was a bit fortunate that the estimate in D. melanogaster was off by ~3-fold. Spuhler hit upon a good or very close number by a less than accurate means.
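To make that arithmetic concrete, here is a rough sketch rather than a reconstruction of Spuhler's actual method. It assumes the chromosome-length estimate scales linearly with the fruit fly gene count, which is my assumption and not something taken from the paper. Only the first estimate (42,000) is described above as resting on Bridges' figure, so that is the one rescaled here; the function and variable names are made up for illustration.

```python
# Figures quoted in the post and comments above.
BRIDGES_FLY_GENES = 5_000        # Bridges' fruit fly gene number used by Spuhler
MODERN_FLY_GENES = 14_000        # approximate count cited in the comment
SPUHLER_FIRST_ESTIMATE = 42_000  # chromosome-length estimate for humans


def fold_difference(modern: int, historical: int) -> float:
    """How many times larger the modern count is than the historical one."""
    return modern / historical


def rescale_estimate(estimate: int, old_fly_count: int, new_fly_count: int) -> int:
    """Rescale a human estimate to a new fly gene count.

    ASSUMPTION: the estimate is directly proportional to the fly gene count.
    """
    return estimate * new_fly_count // old_fly_count


fold = fold_difference(MODERN_FLY_GENES, BRIDGES_FLY_GENES)
rescaled = rescale_estimate(SPUHLER_FIRST_ESTIMATE, BRIDGES_FLY_GENES, MODERN_FLY_GENES)

print(f"Fly gene count was low by a factor of {fold:.1f}")
print(f"First estimate rescaled to the modern fly count: {rescaled:,}")
```

Under that proportionality assumption, plugging in the modern fly count would push the first estimate further from 20,000, not closer, which is consistent with the point that the close match owed something to luck.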

Having read only Hawks' account and not the original paper, I find that Spuhler's methods sound very crude (as they would almost have to be in 1948), such that he may have stumbled upon a good result via sheer chance and not any good science.
It may also be worth noting that Spuhler was by training a physical anthropologist, not a biologist (or maybe that is irrelevant!).