Just about every post on human population clusters tends to shift into a discussion of whether human variation is clinal, or whether one can make assertions about discrete groups. I think it is fair to note that most of the populations sampled have been skewed toward one locale. For example, “French” might mean a few hundred patients from hospitals in the Paris area. “Belgian” might be a few hundred patients in hospitals in Brussels. The “gap” between the French and Belgian clusters may simply reflect the fact that the samples are not representative of their nationalities. Surely as the genetic data get more fleshed out and the sample sizes increase to the point where there are no “Here Be Dragons” spaces on the maps, many of the clusters will begin to exhibit some continuity with each other. On the other hand, I do think that treating this purely as changes in allele frequency removes some important information.
Consider an idealized circumstance where you have 11 demes positioned in a sequence.
Gene flow can only occur between adjacent populations. So Deme 3 exchanges genes with Deme 2 and Deme 4. Now let’s chart the frequency for allele A on gene 1 across these populations.
As you can see, the rate of change between demes is constant. Obviously these are discrete demes, so by definition you can categorize them as individual populations, but if you want to divide them into two classes based on allele frequencies it will have to be an arbitrary classification. There just isn’t a “jump” in this sequence. But, now let’s look at the population sizes….
As you can see, the population sizes vary quite a bit. There is a population “desert” in the middle of the sequence and two broad spatial conglomerations of demes. Now, let’s take the change in allele frequency from deme to deme, going from deme 1 to 11 (-0.1 for every step), and multiply it by the absolute population. I’ll start with deme 2, since there is no deme 0 to precede deme 1, so every value is the population of the deme times the difference in allele frequency between that deme and the previous deme in the sequence.
As you can see there’s now some “funny business” in the middle of the sequence. Though the change in allele frequency is constant from deme to deme, the change in population means that the real density of the alleles varies quite a bit, and the change in that density is not constant. If this were continuous I’d obviously be speaking in higher order derivatives. The point is that the density of humans across geographical expanses matters a great deal. This does not mean that variation is not clinal, but taking density into account allows one to conceive of “natural breaks.”
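The calculation above can be sketched in a few lines of Python. The original charts are not reproduced here, so the population sizes below are made-up values chosen only to give a “desert” in the middle, and the allele frequency is assumed to run from 1.0 in deme 1 down to 0.0 in deme 11; the function names are likewise just illustrative.

```python
# Hypothetical inputs: the original charts are not reproduced in the text,
# so both lists below are illustrative assumptions.
N_DEMES = 11

# Allele A frequency falls by 0.1 per deme, assumed to run from 1.0 to 0.0.
freqs = [1.0 - 0.1 * i for i in range(N_DEMES)]

# Made-up population sizes with a "desert" in the middle of the sequence.
pops = [1000, 1200, 1100, 900, 300, 50, 300, 900, 1100, 1200, 1000]


def weighted_changes(freqs, pops):
    """Population of each deme times its frequency change from the previous deme.

    The first entry corresponds to deme 2, since deme 1 has no predecessor.
    """
    return [pops[i] * (freqs[i] - freqs[i - 1]) for i in range(1, len(freqs))]


def successive_differences(values):
    """Difference between each value and the one before it."""
    return [b - a for a, b in zip(values, values[1:])]


weighted = weighted_changes(freqs, pops)
# How much the weighted change itself shifts from one deme to the next.
jumps = successive_differences(weighted)

for deme, value in enumerate(weighted, start=2):
    print(f"deme {deme:2d}: {value:8.1f}")
```

With any set of sizes that dips in the middle, the per-step frequency change stays at -0.1 while the weighted values shrink toward zero in the desert and grow again on the far side, and `jumps` is the crude discrete stand-in for the higher order derivative mentioned above.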
There’s a pretty obvious way population size might matter: the relative strength of selection vs. stochastic factors scales with population size. The smaller the population, the more likely random noise is to make selection irrelevant. The larger the population, the more random noise will be dampened from generation to generation in relation to selection. A very small population along the clinal gradient between two clusters may serve as a bottleneck to gene flow of adaptive alleles. Conversely, a large constant population along the clinal gradient should facilitate rapid gene flow of selectively favored alleles.
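One way to put rough numbers on this is Kimura’s diffusion approximation for the probability that a single new copy of an allele with selective advantage s goes to fixation in a diploid population of size N. This is a textbook formula brought in for illustration, not something the argument above depends on, and the sketch assumes that the effective population size equals the census size. The selection coefficient and deme sizes are hypothetical.

```python
import math


def fixation_probability(n, s):
    """Kimura's approximation for one new copy (initial frequency 1/(2n)).

    Assumes a diploid population whose effective size equals its census size n.
    """
    if s == 0:
        return 1.0 / (2 * n)  # neutral case: pure drift
    # (1 - exp(-2s)) / (1 - exp(-4ns)), written with expm1 for accuracy
    return math.expm1(-2 * s) / math.expm1(-4 * n * s)


def advantage_over_neutral(n, s):
    """How many times likelier fixation is than for a neutral allele."""
    return fixation_probability(n, s) / (1.0 / (2 * n))


S = 0.01  # hypothetical 1% selective advantage
for n in (10, 100, 1000, 10000):  # hypothetical deme sizes
    print(f"N = {n:5d}: {advantage_over_neutral(n, S):8.2f}x the neutral rate")
```

In the smallest deme the favored allele does barely better than a neutral one, while in the largest the advantage is in the hundreds, which is the sense in which a tiny deme in the middle of a cline can act as a bottleneck for adaptive alleles.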