Race or Whatever

If you read evolgen, you've probably been following the race riots that Wilkins started. It's pretty much died down now, and it was more a debate about semantics than an actual scientific disagreement. This is usually the case in evolutionary biology -- take, for example, the neutralist-selectionist debate or the recent junk DNA fun we had here at evolgen. I have refrained from offering my opinion on Wilkins's post due to my poor understanding of human population genetics (as evidenced by my attempt to discuss marketing BiDil to African Americans), but I have a few comments I would like to make. You can find them below the fold.

Before my opinion, a quick recap. It all started when John Wilkins wrote his opinion about race in humans, arguing that it is not biologically meaningful (defending the Lewontin thesis). PZ Myers then posted in agreement with Wilkins, and Jason Malloy disagreed in the comments. Razib, who knows more about human population genetics than any armchair scientist should, provided a nice argument for why Lewontin is wrong, and the Contingency Table took a look at some of the articles Wilkins cited to show they don't actually support his argument. My conclusion (based on something Razib wrote): what one person is willing to call race, another wants to label as population structure. Oh, and some people think Lewontin is an idealist who believes in a homogeneous population of all humans. I'm going to try to stay as neutral as possible. This post is mostly an attempt to request feedback from my readers and encourage a discussion of the data, rather than opinions.

First of all, I'm going to refer to populations rather than races. If you want, you can think of them as metapopulations or some other nested population structure. Either way, we're using the word population. Let's assume we have no a priori assumptions regarding our populations. Now, it's not very practical to identify populations based on fixed alleles between the populations because there won't be very many of those. Instead, we should employ a probabilistic approach where certain alleles tend to be found in one population versus another. If we genotype individuals at multiple loci, we can construct our populations using an assignment test. This is what Jonathan Pritchard's Structure program does. In this algorithm, the most probable model tends to be over-split into many populations, but we can constrain the method to find no more than 3, 4, 5 or any other number of populations.
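To make the assignment idea concrete, here is a minimal sketch of a frequency-based assignment test. It is not the Structure algorithm (Structure infers the clusters themselves with a Bayesian model); it is the simpler case where allele frequencies for each candidate population are already known. The population names, the frequencies, and the genotype below are all made up for illustration, and the sketch assumes biallelic, independent loci in Hardy-Weinberg proportions.

```python
import math

def genotype_log_likelihood(genotype, freqs):
    """Log-probability of a multilocus genotype given one population's
    reference-allele frequencies, assuming Hardy-Weinberg proportions
    and independent loci. Each genotype entry is 0, 1, or 2 copies."""
    total = 0.0
    for copies, p in zip(genotype, freqs):
        # Keep p away from 0 and 1 so the logarithms stay finite.
        p = min(max(p, 1e-6), 1 - 1e-6)
        total += (math.log(math.comb(2, copies))
                  + copies * math.log(p)
                  + (2 - copies) * math.log(1 - p))
    return total

def assign(genotype, pop_freqs):
    """Return the most likely population and the score for every population."""
    scores = {name: genotype_log_likelihood(genotype, freqs)
              for name, freqs in pop_freqs.items()}
    return max(scores, key=scores.get), scores

# Hypothetical allele frequencies at four loci in two populations.
pop_freqs = {
    "pop_A": [0.8, 0.7, 0.6, 0.9],
    "pop_B": [0.3, 0.4, 0.5, 0.2],
}
individual = [2, 1, 2, 2]  # copies of the reference allele at each locus

best, scores = assign(individual, pop_freqs)
print(best, scores)
```

No single locus is fixed between the two populations here, yet the evidence adds up across loci, which is why genotyping more loci makes the assignment more reliable.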

I think everyone involved will agree that we can recover our predefined "races" using an assignment test. The races have been reproductively isolated (to some extent) allowing for the neutral alleles to reach different frequencies in the different populations. Determining whether these differences are meaningful beyond population structure requires a bit more work. First off, we must define what we mean by meaningful. I would argue that meaningful requires some sort of physiological, anatomical, or other phenotypic difference between the populations.

Are there such differences? Well, duh. Human populations have different skin colors, tolerance to types of food, resistance to pathogens, facial characteristics, body types, etc. None of these differences seem contentious, but when we start dealing with behavior or intelligence, supremacist undertones permeate the conversation. Most of the aforementioned differences are due to natural selection, and that makes them poor markers for recovering population structure -- if selection regimes are similar in multiple populations, allele frequencies may be similar due to convergent or parallel evolution rather than recent common ancestry. But these loci are what make our populations biologically different.

So, we must first determine our populations using neutral loci, and then examine non-neutral alleles to see which ones differ between the populations. We can examine the polymorphism around a locus to determine whether alleles differ between populations because of selection or population structure. Also, genes that suggest different population structure than most of the other loci in the genome may have their pattern due to selection.
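One rough way to look for loci that "suggest different population structure" than the rest of the genome is to compute a per-locus differentiation statistic and flag the outliers. The sketch below uses a simple two-population F_ST based on expected heterozygosity and an arbitrary cutoff (a multiple of the median). The frequencies, the function names, and the cutoff are assumptions for illustration; a real scan would compare against a null distribution from simulations or from putatively neutral loci.

```python
from statistics import median

def fst(p1, p2):
    """Two-population F_ST at a biallelic locus, (H_T - H_S) / H_T,
    weighting both populations equally."""
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)                      # pooled heterozygosity
    if h_t == 0:
        return 0.0                                     # locus is monomorphic
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    return (h_t - h_s) / h_t

def flag_outliers(freqs_a, freqs_b, factor=5.0):
    """Return indices of loci whose F_ST exceeds factor * median F_ST.
    The factor is an arbitrary illustrative threshold, not a formal test."""
    values = [fst(a, b) for a, b in zip(freqs_a, freqs_b)]
    cutoff = factor * median(values)
    return [i for i, v in enumerate(values) if v > cutoff], values

# Hypothetical allele frequencies: four similar loci and one very different one.
freqs_a = [0.50, 0.55, 0.45, 0.60, 0.90]
freqs_b = [0.45, 0.50, 0.50, 0.55, 0.10]

outliers, values = flag_outliers(freqs_a, freqs_b)
print(outliers, [round(v, 4) for v in values])
```

A flagged locus is only a candidate: high differentiation can also come from drift, so the surrounding polymorphism still needs to be examined, as described above.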

That's what I think; sadly, I'm not familiar enough with the data from humans to tell you what they reveal. My (limited) understanding is that the majority of human genetic diversity can be found in African populations. African populations differ from each other at both neutral alleles (revealing population structure within Africa) and non-neutral loci (revealing biologically meaningful differences). My argument has always been that the current races (as defined in the United States) are inappropriate because they marginalize the diversity found within Africa. If each race were to contain the same amount of diversity, we would need to split Africans into multiple races. Those who are more familiar with these areas of research, please correct me where I am wrong.

Nice recap and conclusion. But I said from the start I was merely musing on the "received view" out of ignorance. I got educated, yes I did...

I'd also like to point out that Wilkins has taken this all in stride, openly admitting his errors. That post also provides a much more philosophical conclusion on the debate, which causes practicing biologists like myself to be confused about what science is.

Evolgen writes:

"My argument has always been that the current races (as defined in the United States) are inappropriate because they marginalize the diversity found within Africa. If each race were to contain the same amount of diversity, we would need to split Africans into multiple races. Those who are more familiar with these areas of research, please correct me where I am wrong."

That is true primarily for junk genes, the DNA that doesn't do much of anything. Junk genes are highly useful to population geneticists tracing the genealogies of racial groups, but they don't affect anything in the real world.

Then, are black Africans highly diverse physically? Well, that depends upon who you are lumping together. There are indeed some highly unusual peoples in Africa, but almost none of them were brought to America as slaves. The most genetically distinct people in sub-Saharan Africa are the Khoisan. These are the yellowish-brown, tongue-clicking Bushmen and Hottentots of the Southern African wastelands, the remnants of a great race that once dominated most of Africa before the blacks ethnically cleansed them from the more desirable lands. The most striking contrast in Africa is between the tiny Pygmies and the ultra-tall herding tribes of East Africa. But except for the 7'7", 190-pound basketball novelty Manute Bol, few of either group made it to America. In contrast, the West African tribes that did provide the vast majority of American slaves are relatively homogeneous. The great population geneticist Cavalli-Sforza sums up the situation on the ground like this, "... differences between most sub-Saharan Africans other than Khoisan and Pygmies seem rather small."


We're all absolutely sure that extended families exist, even though nobody can specify exactly how many extended families they personally belong to or exactly where they end or what the unambiguous name of each extended family is.

We can also be absolutely sure that some extended families are more coherent and long-lasting than others because they possess some degree of inbreeding in their genealogy. And that's basically what those racial groups you read about in the newspapers every day are: partly inbred extended families.

If you think about it, you'll find that this is both a more technical version of a very old definition of race -- "a lineage" -- and a useful way to think about what people casually refer to as races.

Another useful definition is to define an "ethnic group" as a group of people who share some common characteristic that is frequently passed down within biological families, but that doesn't have to be -- e.g., language, religion, last names, feelings of fraternity, cuisine, etc.

For more on these very handy ways to better understand the world, see:


My point is that neutral loci are good for elucidating population structure. Studying loci under selection is a lot more interesting when you can overlay it on genealogical relationships between populations.