The more than 25,000 blood samples collected already make it possible to conduct various background studies. For example, comparing the genetic data of Estonians with other European nations has revealed that Latvians, Lithuanians, Poles and some Russians are genetically much more similar to Estonians than the Finns with whom Estonians share a similar language.
The genetic maps I post now and then are really popular (invariably they are the ones that sites like reddit pick up), but the sample sizes aren’t that big. Often “France” means some patients in a study from Bordeaux and Paris. The goal with the Estonian Genetics Project is to collect 100,000 samples. As it is, there are only 1.25 million speakers of Estonian, so if the project is limited to ethnic Estonians we’re talking about ~10% of all Estonians. For most questions at the scale of historical populations I doubt that there is any difference in power between a sample size of 100,000 and 1 million, assuming that the sample is modestly representative.
In any case, the finding that Estonians might be closer genetically to their Latvian and Lithuanian neighbors is not particularly surprising. Gene flow has a way of equilibrating and eliminating variation across adjacent populations. Similarities with Russians and Poles might also illustrate how water barriers build up genetic differences. Imagine for example that 3,000 years ago a group of Finnic-speaking peoples crossed the Gulf of Finland. To the south, west and east are a host of non-Finnic peoples. These people are so numerous in relation to the proto-Estonians that one can think of them as nearly infinite in population size. 3,000 years is about 100 generations. Assume that there is gene flow both from the proto-Estonians and into them for the past 3,000 years. Since the non-Finnic peoples are so numerous, the proto-Estonians have negligible impact on them. But consider the possibility that 1% of the genes in any generation are introduced from non-Finnic peoples. Iterate. After 3,000 years only about 1/3 of the genome content should be distinctively Finnic. The other 2/3 will be introgressed from the surrounding populations.
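To make the arithmetic concrete, here is a minimal Python sketch of the iteration. It assumes a constant 1% of genes replaced per generation, one-way flow from an effectively infinite neighboring population, and roughly 30 years per generation (which is what gives about 100 generations in 3,000 years). The function name is just illustrative.

```python
def finnic_fraction(generations: int, migration_rate: float) -> float:
    """Share of the gene pool still tracing back to the founding group.

    Each generation a fraction `migration_rate` of genes is replaced by
    genes from the (effectively infinite) surrounding population, so the
    founders' share shrinks by a factor of (1 - migration_rate) every time.
    """
    share = 1.0
    for _ in range(generations):
        share *= 1.0 - migration_rate
    return share


# Assumption: ~30 years per generation, so 3,000 years is 100 generations.
generations = 3000 // 30
remaining = finnic_fraction(generations, migration_rate=0.01)

print(f"Founder share after {generations} generations: {remaining:.3f}")
print(f"Introgressed share: {1 - remaining:.3f}")
```

The founder share comes out a little above one third, which is where the 1/3 versus 2/3 split comes from. Nudging the migration rate up or down changes the number quickly, which is one more reason to treat this as a toy.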
This “toy” model is probably wrong, but it illustrates how continuous gene flow over time can eliminate differences. Unlike genes, language and cultural identity can be passed asymmetrically through parents, and a change of language can be quite abrupt and discrete. In other words, though outsiders can change genes imperceptibly from generation to generation, they may still have little cultural impact. The fact that Estonians speak a Finnic language despite being similar genetically to their neighbors may simply be a matter of “First-Settler-Effect.” There are fruitful analogies that can be made between linguistic and biological evolution, but the disjunctions are also notable and of interest.
Finally, I would like to introduce one alternative model which I find plausible. Before the rise of the Indo-European language group, the center of gravity of the Finnic languages was almost certainly much further to the south and east, and Finland itself was on the perimeter. By this model the Indo-Europeanization of most Finnic-speaking peoples left the Estonians as a rump. The Finns to the north of the Gulf of Finland were already distinct from the Estonians and other southerly peoples before this process occurred. The reason I present this model is that there is evidence that Russification up to the present day has come at the expense of the Finnic language groups.