A recent study of dog genetics, published in PLoS, seeks to improve the quality of genetic research by better understanding the underlying patterns of genetic variation at the level of specific dog breeds.
Sometimes we are interested in the evolutionary relationship between two “species” or populations, and genetics can be helpful. The more different the genetic sequence between two populations, the more distantly related they are (on average) and thus we can construct phylogenies (“family trees” of species or groups).
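To make the "more different means more distantly related" idea concrete, here is a minimal sketch that computes the proportion of differing sites between aligned sequences. The sequences and population names are invented for illustration, and real phylogenetic work uses far more data and corrected distance models.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


# Hypothetical aligned sequences, made up for this example only.
sequences = {
    "pop_A": "ACGTACGTAC",
    "pop_B": "ACGTACGTAT",
    "pop_C": "ACGAACTTGT",
}

# Pairwise distance for every pair of populations.
distances = {
    (x, y): p_distance(sequences[x], sequences[y])
    for x, y in combinations(sequences, 2)
}

# The pair with the smallest distance would be joined first in a simple tree.
closest_pair = min(distances, key=distances.get)

for pair, dist in distances.items():
    print(pair, dist)
print("closest pair:", closest_pair)
```

A tree-building method would then group the closest pair first and work outward from there.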
Sometimes we are interested in finding genes that are linked to particular phenotypes, like the gene for this or that disease. Finding a gene usually involves having a “probe,” which is essentially a molecule that can locate a particular DNA sequence under the proper conditions. Probes tend to be small compared to the whole genome, and the genome is very big (and generally uninteresting at the detailed level). For this and other reasons, it is not the case that there are probes for just any “address” in the genome. One tends to work with the probes that exist where possible, using nearby addresses (nearby some area of interest) to navigate the actual genome in a particular sample.
Both of these efforts would be easiest if there were very little variation in the genetic makeup across individuals within a given population. If all cats had the same genome, and all rats had the same genome (at the most detailed level), then any one cat would be useful to inform us of all the genetic details of all cats, same with the rats, and the rat-cat relationship would be simple to work out. But of course, there is variation within species or populations, and that variation can be both large and patterned. By this I mean that it is not simply a matter of “more” or “less” variation … there may be patterns to the variation that apply to one population that don’t apply as much to some other population.
In other words, lack of a detailed understanding of the structure of genetic variation in a particular population leaves a fair amount of uncertainty. A better understanding of this structure would allow for the application of more appropriate analytical techniques, more secure results, and overall more useful research.
One example of patterned variation is the location of genes in relation to other genes and their alleles. (An allele is a variant of a gene … different alleles may result in different products, say a “normal” one vs. one connected with a disease.) For any pair of genes, there is a certain probability that during reproduction the specific alleles will be inherited together … this is called linkage. The pattern of inheritance for genes on different chromosomes is typically thought of as random … there is no link between them. However, if two genes are right next to each other on the same chromosome, there is a pretty good chance that they will be inherited together. The farther apart they are on the same chromosome, the more ‘random’ the inheritance pattern is, due to crossing-over.
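One standard way to put numbers on "the farther apart, the more random" is Haldane's map function, which converts the distance between two loci into the chance that a crossover separates them. This is a textbook model that assumes crossovers happen independently along the chromosome. It is not something taken from the dog study, and the distances below are arbitrary.

```python
import math


def haldane_recombination_fraction(map_distance_morgans):
    """Haldane's map function: r = 0.5 * (1 - exp(-2 * d)), with d in Morgans."""
    if map_distance_morgans < 0:
        raise ValueError("map distance cannot be negative")
    return 0.5 * (1.0 - math.exp(-2.0 * map_distance_morgans))


# Arbitrary example distances, chosen only to show the trend.
for d in (0.0, 0.01, 0.1, 0.5, 1.0, 3.0):
    r = haldane_recombination_fraction(d)
    print(f"{d:>5.2f} Morgans apart -> recombination fraction {r:.3f}")
```

A recombination fraction near 0 means the two alleles almost always travel together. A value approaching 0.5 means they are inherited as independently as genes on different chromosomes.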
The same is true of the linkage between a genetic marker that may be used (with a probe) to find a particular area of interest, and the actual DNA sequences of interest. A marker and a gene of interest should be very close to each other, or they may get passed on randomly … that would be a very inaccurate marker.
In both cases, as DNA sequences change over time, with the insertion or removal of sections of junk, or the movement of genes in relationship to each other, the linkage patterns of genes and of genes and their markers change. Say there is a gene that causes Disease X in many species of carnivores. There is little reason to expect that a marker for this gene in raccoons would be useful in finding this gene in distantly related pandas. But would the marker serve reliably within raccoons, or within pandas? It depends. You get the idea.
This is an example of fairly complex patterning in the DNA of a given population. If one is using markers to find disease-connected alleles, one would ideally have information on population-level patterning of linkage. Non-random behavior of genes (a particular allele being selected for or against, for instance) is often revealed by examining the linkage-related measures. So, understanding the pattern of linkage within a population is important.
What is needed is a better understanding of the nature of genetic variation within populations or subpopulations.
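Two widely used linkage-related measures are the linkage disequilibrium coefficient D and its normalized form r². The sketch below computes both from haplotype counts for a marker (alleles A/a) and a nearby gene (alleles B/b). The counts are hypothetical, and the function name is my own.

```python
def linkage_disequilibrium(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, r_squared) from counts of the four two-locus haplotypes."""
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total               # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / total       # frequency of allele A at the marker
    p_B = (n_AB + n_aB) / total       # frequency of allele B at the gene
    d = p_AB - p_A * p_B              # excess of A-B over random pairing
    denominator = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denominator == 0:
        raise ValueError("both loci must have two alleles present")
    return d, d * d / denominator


# Hypothetical counts: marker and gene usually travel together.
tight_d, tight_r2 = linkage_disequilibrium(45, 5, 5, 45)

# Hypothetical counts: marker tells us nothing about the gene.
loose_d, loose_r2 = linkage_disequilibrium(25, 25, 25, 25)

print("tightly linked:", tight_d, tight_r2)
print("unlinked:", loose_d, loose_r2)
```

An r² near 1 means the marker is a good stand-in for the gene in that population, and an r² near 0 means it is useless there. The same marker can give quite different values in different populations.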
Now we come to the part about the dogs…
A new paper in PLoS, “Canine Population Structure: Assessment and Impact of Intra-Breed Stratification on SNP-Based Association Studies” by Quignon et al., explores this issue.
In gene studies of dogs, the problem of variation within breeds is usually managed by sampling a number of individuals with a particular gene or condition of interest along with a number of individuals as controls. One way to increase the utility of these studies is to sample (within a breed) individuals from different geographic areas. The separation in time and space between these individuals makes the individual sampling points more independent, which makes the statistical analysis more powerful. However, these practices, of sampling treatment and control groups evenly, or of using geographically distinct populations, are based on (reasonable) assumptions about how the genetic structure underlying the actual dogs looks. The present study looks more closely at the reality of the underlying genetic patterning, to replace assumption with measured observation where possible.
These researchers looked only at a small selection of common breeds recognized in the U.S. and Europe: In particular, the Rottweiler, the Bernese mountain dog, the flat-coated retriever, and the golden retriever. These all have a genetic susceptibility to a certain class of cancer (e.g. malignant histiocytosis in the Bernese). They looked at a particular set of genetic data on one chromosome (canine chromosome 1) across 119 dogs.
“We showed that each population is characterized by distinct genetic diversity that can be correlated with breed history. When the breed studied has a reduced intra-breed diversity, the combination of dogs from international locations does not increase the rate of false positives and potentially increases the power of association studies. However, over-sampling cases from one geographic location is more likely to lead to false positive results in breeds with significant genetic diversity. … [thus] … These data provide new guidelines for [statistical] studies using purebred dogs that take into account population structure.”
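To see how over-sampling cases from one location can manufacture a false positive, consider a toy example with made-up allele counts. Within each of two locations the marker allele is equally common in cases and controls, so there is no real association, but the two locations differ in allele frequency and most cases come from the first one. This is only a sketch of the general mechanism using a plain Pearson chi-square test. It is not the analysis the authors ran.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column needs at least one count")
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical allele counts as (marker allele, other allele).
# Marker frequency is 0.8 in location_1 and 0.2 in location_2,
# identical for cases and controls inside each location.
strata = {
    "location_1": {"cases": (128, 32), "controls": (32, 8)},
    "location_2": {"cases": (8, 32), "controls": (32, 128)},
}


def association_statistic(table):
    return chi_square_2x2(*table["cases"], *table["controls"])


# Test each location separately.
within = {name: association_statistic(t) for name, t in strata.items()}

# Ignore location and lump everything together.
pooled_table = {
    group: tuple(sum(t[group][i] for t in strata.values()) for i in range(2))
    for group in ("cases", "controls")
}
pooled = association_statistic(pooled_table)

CRITICAL_1DF_5PCT = 3.841  # chi-square cutoff for p = 0.05 with 1 degree of freedom

print("within each location:", within)
print("pooled:", pooled, "significant:", pooled > CRITICAL_1DF_5PCT)
```

In this toy case neither location shows any association on its own, yet the pooled data cross the significance cutoff easily. The apparent signal comes entirely from where the cases were collected.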
One question that comes to mind immediately for me is the difference between breeds that are, essentially, offshoots of some basic stock vs. breeds that are amalgams of multiple breeds. One could say that to some extent both are true of all breeds, but I think that would be wrong. For instance, the mountain dogs such as the Bernese and the Pyrenees are probably bred from Tibetan mastiffs more or less directly, thus involving a reduction in genetic variation within the breed. In contrast, the Newfoundland is also bred from a mastiff stock but possibly with another very distantly related breed added in for special effect (thus offsetting the variation). The Doberman is one of the most complex breeds of recent times, with several different breeds used to achieve a true-breeding, highly specialized form. Breeds that derive mainly from divergence should have different patterns (genetically) than breeds derived from combinatorial breeding.
Quignon, P., Herbin, L., Cadieu, E., Kirkness, E.F., Hédan, B., Mosher, D.S., Galibert, F., André, C., Ostrander, E.A., Hitte, C., Awadalla, P. (2007). Canine Population Structure: Assessment and Impact of Intra-Breed Stratification on SNP-Based Association Studies. PLoS ONE, 2(12), e1324. DOI: 10.1371/journal.pone.0001324