Alternate post title: Why Charles Jackson is a tool who can quote papers, but doesn't understand what he is reading.
I get this question all the time, and it's totally valid:
How do you tell the difference between an endogenous retrovirus that is shared because of common descent, and a retrovirus that was endogenized independently in two species?
A follow-up paper to the one I wrote about here (Me, and you, and Zaboomafoo) provides a lovely example of how we do this!
So here's the backstory: There are a ton of different retroviruses. It's not one big homogeneous group of ‘virus kind’; each family of retrovirus has its own genetic features and personality quirks. For a nice comparison, so you all can appreciate the diversity of retroviruses: you are as ‘related’ to green algae as HIV-1 is ‘related’ to a family of ERVs, HERV-K.
Lentiviruses, like HIV-1, apparently don't like to endogenize very much. We've looked in humans, other primates… we can't find any. We found one in a particular species of bunnah. And they just found another one in Cheirogaleus medius, a kind of lemur.
Neato!
Well, another lab looked at different species and genera of lemur… and found another endogenous lentivirus! In another genus of lemurs, Microcebus! Additionally, these lemur ERVs are 93-96% similar to one another, while HIV-1 viruses are only 80-85% similar to one another, max.
On the surface, this might make it look like Microcebus and Cheirogaleus share this ERV because of an endogenization event in a common ancestor: ‘same virus’, two different genera. Right?
Wrong!
You need to remember that where retroviruses insert in a genome is random. Like I've said a hundred times before, yes, some like to insert near active genes, and some like quiet genes, but exactly where a retrovirus inserts (near which active genes, exactly which nucleotides are up/downstream) is random.
However, there is no complete ‘lemur’ genomic sequence. How can you ‘tell’ where an ERV has inserted if you don't even have a genomic map?
To identify putative lentiviral ERVs in lemur genomic sequences, Gilbert's lab compared the RELIK sequence (the bunnah lentiviral ERV) to lemur sequences (however little we have). Areas where the sequences matched, they called putative lentiviral ERV sites. Even if you don't have complete lemur genome sequences, you can still see what sequence is directly upstream and downstream from the ERV. In this drawring I made, A is a ‘normal’ sequence, B is the sequence after an ERV has inserted, and C is what happens when an ERV ‘pops out’: the pink parts, the viral LTRs (promoters), line up during cell division, and the gene portions of the ERV just pop out, leaving a ‘solo LTR’ footprint of where the complete ERV used to be.
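If it helps to see the ‘look upstream and downstream’ idea as code, here is a toy sketch. It is not what the lab actually ran: a real search uses similarity tools that tolerate mismatches, while this uses an exact string match, and the function name `find_flanks`, the `flank_len` parameter, and the sequences are all made up for illustration.

```python
def find_flanks(contig, erv_probe, flank_len=20):
    """Return the host sequence on either side of an ERV-like match.

    Toy assumption: the probe matches the contig exactly. Real ERV
    searches score similarity instead of requiring identity.
    """
    start = contig.find(erv_probe)
    if start == -1:
        return None  # no ERV-like sequence in this contig
    end = start + len(erv_probe)
    upstream = contig[max(0, start - flank_len):start]
    downstream = contig[end:end + flank_len]
    return upstream, downstream


# Made-up sequences: host DNA on both sides of a pretend viral stretch.
contig = "AAAACCCC" + "GATTACA" + "GGGGTTTT"
print(find_flanks(contig, "GATTACA", flank_len=4))
```

The point is only that a short genomic fragment is enough: as long as the fragment runs past the edge of the viral sequence, you get host sequence to design primers against.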

Because Gilbert knew what sequence was upstream and downstream of the LTR, he could design primers specific for that location, and PCR amplify that region of the lemur's genome. IF there was not and had never been an ERV at that site in the genome, he expected a 250 base pair product. IF there used to be an ERV there, and there was only a solo LTR left (which is what he thought he saw in his RELIK/genome comparisons), he expected a 670 bp product (just an example, he did a few this way).
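The logic of reading that PCR result can be sketched in a few lines. The 250 bp and 670 bp sizes come from the example above; the `tolerance_bp` wiggle room and the function name `classify_band` are my own assumptions, since nobody reads a gel band to the exact base pair.

```python
EMPTY_SITE_BP = 250  # from the example: no ERV has ever been at this site
SOLO_LTR_BP = 670    # from the example: a solo LTR footprint is left behind


def classify_band(observed_bp, tolerance_bp=30):
    """Match an observed PCR product size to the expected outcomes.

    tolerance_bp is an assumed fudge factor for reading sizes off a gel.
    """
    if abs(observed_bp - EMPTY_SITE_BP) <= tolerance_bp:
        return "empty site"
    if abs(observed_bp - SOLO_LTR_BP) <= tolerance_bp:
        return "solo LTR"
    return "unexpected size"


for size in (250, 665, 1200):
    print(size, "->", classify_band(size))
```

In this example the two expected products differ by 420 bp, which is roughly what the leftover footprint adds between the primers.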

You can see how different these sizes look in a gel in Figure 5.
He determined that even though the lentiviral ERV sequences were super similar, they are not located in the same genomic regions. Sites that gave ‘present’ PCR fragment sizes in Microcebus species gave ‘absent’ fragment sizes in Cheirogaleus (he also tested this in similar, but different ways, like Southern blots and other basic molecular genetics tools. I'm not explaining them in this post, lol).
He found that with every test, with this particular family of ERVs, there were no orthologous sites of insertion between the two genera. By using molecular clocks, Gilbert determined that though this lentivirus endogenized in two different genera, it happened at about the same time, about 4.2 million years ago. So the ERV can be found in the same location in various species of the genus Microcebus, meaning common descent within that genus. However, it is in a different location in species of the genus Cheirogaleus, meaning the Microcebus and Cheirogaleus infections were two independent events.
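Here is the same reasoning as a toy sketch: score each insertion site as ‘present’ or ‘absent’ in each species, then ask whether any site is occupied in both genera. The locus names, species labels, and calls below are invented for illustration and are not data from the paper; `orthologous_loci` is a hypothetical helper name.

```python
def genera_with_insertion(calls_for_locus):
    """Genera in which at least one species has the ERV at this locus."""
    return {
        species.split()[0]
        for species, call in calls_for_locus.items()
        if call == "present"
    }


def orthologous_loci(calls, genus_a, genus_b):
    """Loci where the insertion is present in both genera (same site)."""
    return [
        locus
        for locus, by_species in calls.items()
        if {genus_a, genus_b} <= genera_with_insertion(by_species)
    ]


# Invented presence/absence calls, shaped like the result described above.
calls = {
    "locus_1": {"Microcebus A": "present", "Microcebus B": "present",
                "Cheirogaleus A": "absent"},
    "locus_2": {"Microcebus A": "absent", "Microcebus B": "absent",
                "Cheirogaleus A": "present"},
}

print(orthologous_loci(calls, "Microcebus", "Cheirogaleus"))
```

An empty list is the ‘two independent events’ pattern: sites shared within a genus, none shared between the genera. A site occupied in both genera would instead point to an insertion in a common ancestor.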
We can differentiate between common descent and two independent events.
And they didn't even need a complete ‘lemur’ genomic sequence to figure this out.