Alternate post title: Why Charles Jackson is a tool who can quote papers but doesn’t understand what he is reading.

I get this question all the time, and it’s totally valid:

How do you tell the difference between an endogenous retrovirus that is shared because of common descent, and a retrovirus that was endogenized independently in two species?

A follow-up paper to the one I wrote about here (Me, and you, and Zaboomafoo) provides a lovely example of how we do this!

So here’s the back story: There are a ton of different retroviruses. It’s not one big homogeneous group of ‘virus kind’; each family of retrovirus has its own genetic features and personality quirks. For a nice comparison so you all can appreciate the diversity of retroviruses: you are as ‘related’ to green algae as HIV-1 is ‘related’ to a family of ERVs, HERV-K.

Lentiviruses, like HIV-1, apparently don’t like to endogenize very much. We’ve looked in humans, other primates… we can’t find any. We found one in a particular species of bunnah. And they just found another one in Cheirogaleus medius, a kind of lemur.

Neato!

Well, another lab looked at different species and genera of lemur… and found another endogenous lentivirus! In another genus of lemurs, Microcebus! Additionally, these lemur ERVs are 93-96% similar to one another, while HIV-1 viruses are only 80-85% similar to one another, max.

On the surface, this might make it look like Microcebus and Cheirogaleus share this ERV because of an endogenization event in a common ancestor– ‘same virus’, two different genera. Right?

Wrong!

You need to remember that where retroviruses insert in a genome is random. Like I’ve said a hundred times before, yes, some like to insert near active genes, and some like quiet genes, but exactly where a retrovirus inserts– near which active genes, exactly which nucleotides are up/down stream, is random.

However, there is no complete ‘lemur’ genomic sequence. How can you ‘tell’ where an ERV has inserted if you don’t even have a genomic map?

PCR!

To identify putative lentiviral ERVs in lemur genomic sequences, Gilbert’s lab compared the RELIK sequence to lemur sequences (however little we have). Areas where the sequences matched, they called putative lentiviral ERV sites. Even if you don’t have complete lemur genome sequences, you can still see what sequence is directly upstream and downstream from the ERV. In this drawring I made, A is a ‘normal’ sequence, B is the sequence after an ERV has inserted, and C is what happens when an ERV ‘pops out’– the pink parts, the viral LTRs (promoters) line up during cell division, and the gene portions of the ERV just pop out, leaving a ‘solo LTR’ footprint of where the complete ERV used to be.

Because Gilbert knew what sequence was upstream and downstream of the LTR, he could design primers specific for that location, and PCR amplify that region of the lemur’s genome. IF there was not and had never been an ERV at that site in the genome, he expected a 250 base pair product. IF there used to be an ERV there, and there was only a solo LTR left (which is what he thought he saw in his RELIK/genome comparisons), he expected a 670 bp product (just an example, he did a few this way).
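
To make the present/absent logic concrete, here is a minimal sketch of how you might call a site from a PCR product size. The 250 bp and 670 bp figures are the example sizes from this post; the tolerance and the function names are my own assumptions for illustration, not anything from the paper.

```python
# Example sizes from the post; real loci will each have their own expected sizes.
EMPTY_SITE_BP = 250   # no ERV at this site, now or ever
SOLO_LTR_BP = 670     # a solo LTR footprint is left at the site
TOLERANCE_BP = 30     # hypothetical allowance for reading a band off a gel


def classify_product(size_bp):
    """Label an observed PCR product size (in base pairs) for one locus."""
    if abs(size_bp - EMPTY_SITE_BP) <= TOLERANCE_BP:
        return "empty site"
    if abs(size_bp - SOLO_LTR_BP) <= TOLERANCE_BP:
        return "solo LTR"
    return "unclassified"


def extra_sequence_bp():
    """Sequence added at the occupied site relative to the empty site."""
    return SOLO_LTR_BP - EMPTY_SITE_BP


for observed in (255, 670, 1200):
    print(observed, classify_product(observed))
```

The two expected sizes differ by 420 bp in this example, which is why the bands are easy to tell apart on a gel.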

You can see how different these sizes look in a gel in Figure 5.

He determined that even though the lentiviral ERV sequences are super similar, they are not located in the same genomic regions. Sites that gave ‘present’ PCR fragment sizes in Microcebus species gave ‘absent’ fragment sizes in Cheirogaleus (he also tested this with Southern blots and other basic molecular genetics tools– I’m not explaining them in this post, lol).

He found that with every test, with this particular family of ERVs, there were no orthologous sites of insertion. By using molecular clocks, Gifford determined that though this lentivirus endogenized in two different genera, it happened at about the same time– about 4.2 million years ago. Thus it can be found in the same location in various species of the genus Microcebus, meaning common descent. However it is in a different location in species of the genus Cheirogaleus, meaning Microcebus/Cheirogaleus infections were two independent events.
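
The comparison itself boils down to a simple rule, sketched below. The species labels and the present/absent calls are invented placeholders (not data from the paper), and the function names are hypothetical; the point is only to show the decision: an insertion found at the same site in every species of one genus, and at that site in no species of the other, looks like inheritance within the genus and an independent event between genera.

```python
def genera_with_insert(calls):
    """calls maps (genus, species_label) -> 'present' or 'absent' at ONE locus."""
    return {genus for (genus, _), call in calls.items() if call == "present"}


def shared_between_genera(calls):
    """True if the same insertion site is occupied in more than one genus."""
    return len(genera_with_insert(calls)) > 1


def present_in_all_species(calls, genus):
    """True if every sampled species of the genus has the insert at this locus."""
    genus_calls = [call for (g, _), call in calls.items() if g == genus]
    return bool(genus_calls) and all(call == "present" for call in genus_calls)


# Placeholder calls for one locus whose primers flank a Microcebus insertion.
locus_calls = {
    ("Microcebus", "sp. A"): "present",
    ("Microcebus", "sp. B"): "present",
    ("Cheirogaleus", "sp. A"): "absent",
}

print(present_in_all_species(locus_calls, "Microcebus"))  # same site within the genus
print(shared_between_genera(locus_calls))                 # not shared across genera
```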

We can differentiate between common descent and two independent events.

And they didn’t even need a complete ‘lemur’ genomic sequence to figure this out.

Comments

  1. #1 Anton Mates
    April 22, 2009

    cprs,

    Given criteria as rigorous, the orthologous CERV 1 and 2 sites for man and orangutan are all indisputably missing without trace, and these for viruses with subfamilies of estimated ages of 5, 7.8, 14 and 21 Mya.

    But by those same rigorous criteria, the CERV 1 and 2 inserts aren’t found to be orthologous between any of the relevant primate species, and the age estimates are found to be unreliable.

    If ERVs inserted in the germline of common ancestors how have all the traces of all of these disappeared?

    If the ERVs didn’t go to fixation, then by definition they would only have existed in some of our common ancestors. Humans and chimps, after all, are descended from a common ancestral population. If humans don’t have those ERVs, it would simply mean that the relevant stretches of our DNA had been inherited from that chunk of the ancestral population which didn’t have them either.

    But, again, this isn’t actually necessary to explain the pattern we see here. It could just be that chimps picked up these ERVs after we split.

    The nonshared families are numerically large and reportedly contain numerous orthologous sites in other species (>100 CERV1, most heterologous and ?14 CERV2, based on fig.4’s LTR tree http://genomebiology.com/2006/7/6/R51/figure/F4).

    I don’t think this is correct. The families are large, but according to these papers, none of their insertion sites in different species have been found to be orthologous. Most have been found to be definitely not orthologous (remember, if two sequences are heterologous, they’re by definition not orthologous!), and the rest probably aren’t either, although they’re more ambiguous.

    As shrunk (I think) suggested, I believe you may be misreading this sentence: “Consistent with our findings, the results of a previously published Southern hybridization survey indicated that sequences orthologous to CERV 1/PTERV1 elements are present in the African great apes and old world monkeys but not in Asian apes or humans.” “Orthologous” is being used here to refer purely to the viral sequences themselves, and not to their insertion sites.

    In other words, the sequences in question come from a common ancestral virus, but apes and monkeys haven’t all inherited them from an insertion in a common ancestral primate.

    Do you confidently attribute all this loss to lineage sorting/genetic drift?

    As the others (and the authors themselves) have said, I think it can be attributed primarily to errors in the age estimates they used. If the ERV subfamilies are not as old as the estimates suggest, then nothing ever had to be lost.

  2. #2 William Wallace
    April 23, 2009

    Abbie,

    I have completed a first pass simulation, based on random ERV insertions*, and wish to build another under a common descent model for comparison purposes.

    So I have some questions:

    1. Assume that ERV insertions are random events.
    2. Assume the number of successful ERV insertions into an individual from a population during any given time period is proportional to the population of the species. (This makes sense if you think about it. The ERV insertions in question occur at or near the time of fertilization. It seems reasonable that the number of opportunities for a successful ERV integration is the number of procreation events during a given year, times the probability of the ERV integration.)
    3. Assume that the population of each species grows geometrically.

    Under these assumptions, you would expect the vast majority of ERVs to be late.

    Assuming that the overwhelming majority of ERVs are not in fact late, as evidenced by the calibrated genetic clocks rumored to be as accurate as NIST’s caesium based atomic clocks, I assume it is because of near extinction events that cause founder effects.

    So, how should such near extinction events be simulated for a fair comparison? At a regular rate, one near extinction per X years, or should these be picked randomly as well?

    *It doesn’t look good for your side.

  3. #3 William Wallace
    April 23, 2009

    Clarification: It seems reasonable that the number of successful ERV integrations is the number of procreation events during a given year, times the probability of the ERV integration. The number of procreation events during a given year is probably roughly proportional to the number of individuals in that population.

  4. #4 windy
    April 23, 2009

    3. Assume that the population of each species grows geometrically.

    For goodness sakes Willy, we were having an interesting discussion here.

    *It doesn’t look good for your side.

    Would that be the side of people who are not clueless about biology? Riddle me this: what’s the average time for an ERV to reach fixation in your simulation?

  5. #5 Shrunk
    April 23, 2009

    It also seems that WW is completely ignoring the nested hierarchy issue, and focussing on the completely irrelevant issue of whether ERV’s are “late” or “early” (whatever the hell that’s supposed to mean). This doesn’t surprise me. He probably realized, too late, the large number of possible cladograms with only seven species, as his example uses. That number is 10,395. Of course, limiting to just seven species is arbitrary as the ERV pattern is not limited to just those 7 primates. You could even include the Bonobos since, under WW’s non-random insertion model there’s no reason that chimps and Bonobos should have common ERV’s that indicate common descent. So if he’s really being rigorous he should include an 8th species. That brings the number of possible cladograms to 135,135.

    IOW, not only is it necessary for WW to explain how a nested hierarchical pattern of any sort can occur without common descent. He also needs to explain how ERV’s produce the specific cladogram that is consistent with all other independent lines of evidence.

    At the very least, doing all that complicated math should hopefully keep him occupied long enough to allow the grown ups to have an intelligent conversation in peace.
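
The cladogram counts quoted in the comment above (10,395 for seven species, 135,135 for eight) match the standard count of rooted, fully bifurcating trees on n labelled species, (2n - 3)!!. A short sketch that reproduces them; the function name is just an illustrative choice:

```python
def rooted_binary_trees(n_species):
    """Count rooted, fully bifurcating labelled trees: (2n - 3)!! for n >= 2."""
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 3.
    for odd in range(3, 2 * n_species - 2, 2):
        count *= odd
    return count


for n in (7, 8):
    print(n, rooted_binary_trees(n))
```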

  6. #6 Shrunk
    April 23, 2009

    I’m also not sure that WW is correct that ERV’s are more likely to occur during fertilization. As I understand it, it only requires that a germ cell infection occur in one of the parents, which could happen at any time. But I could be wrong.

  7. #7 William Wallace
    April 23, 2009

    Shrunk,

    I think you’re misunderstanding something. It turns out that even under a random model of ERV insertions, it is possible to construct a simulation that generates data consistent with the nested hierarchies that support common descent.*

    As a comparison simulation, however, I’d like to write a simulation for the common descent model. This will allow a comparison against the non common descent model, to see how significant a difference common descent would make on the outcomes of the simulations.

    Late means recent, early means a long time ago.

    The argument I am making is a mathematical argument.

    If an ERV exists, it is more likely to come into existence when there are more individuals in a population.

    Under the assumptions, you would expect most ERVs in any given individual to be late (recent) using evolutionist time scales. Consequently, without some other assumption, you would not necessarily expect nested hierarchies to only support CD. In other words, depending on the probabilities you assign, in a monotonically increasing population, if common ERVs were found across species, you might also find NEs that do not support common descent as well as NEs that do.

    Unless near extinction or founder effects cause ERVs to be fixed in the population more quickly. Consequently, under the common descent simulation, I need to somehow account for most ERVs being early, and not late, and also differentiate between ERVs in current individuals versus ERVs established in a population. E.g., maybe ERVs only come into existence during pandemics that decimate the population.

    I do not think that scientists believe most ERVs are recent in origin.

    Plausible explanations would be founder events and/or near extinctions and/or population boom/bust cycles.

    Consequently, I am trying to ascertain what other assumptions I need to make to get a common descent simulation that generates data roughly consistent with actual observations.

    Do you understand, yet?

    I’m also not sure that WW is correct that ERV’s are more likely to occur during fertilization. As I understand it, it only requires that a germ cell infection occur in one of the parents, which could happen at any time. But I could be wrong.

    I can go along with that, but the point is that the probability of an RV successfully integrating into a genome times the number of opportunities to do so will give you an expected value. And the number of opportunities is at least in part proportional to the number of hosts in the population. (It is also proportional to the number of RVs in existence at the time, which is itself probably proportional at least in part to the population of the RV’s host).

  8. #8 Pete
    April 23, 2009

    It turns out that even under a random model of ERV insertions, it is possible to construct a simulation that generates data consistent with the nested hierarchies that support common descent

    Forget the rest of what you are doing and just focus on this claim. This is the very claim the rest of us are saying is not true, so since you think it is true please demonstrate it.

  9. #9 William Wallace
    April 23, 2009

    This is the very claim the rest of us are saying is not true, so since you think it is true please demonstrate it.

    In due time. But I think your disbelief is unfounded, as my claim isn’t that hard to believe (unless you’re reading much more than what I am writing).

  10. #10 Shrunk
    April 24, 2009

    In due time. But I think your disbelief is unfounded, as my claim isn’t that hard to believe (unless you’re reading much more than what I am writing).

    Taken literally, I agree what you wrote is not difficult to believe: “…even under a random model of ERV insertions it is possible to construct a simulation that generates data consistent with the nested hierarchies that support common descent.”

    All you would have to do is create a simulation that models evolution and common descent. Such a simulation can only create a nested hierarchy pattern. Your task, in part, is to show why such a model would not produce a nested hierarchy. Or how a model using non-random insertions and separate creation of species can produce such a pattern.

  11. #11 Stephen Wells
    April 24, 2009

    The “assume geometric growth” line is already enough to trash the whole “model”. And Willy, what assumptions are you making about _where ERVs integrate in the genome_?

  12. #12 William Wallace
    April 24, 2009

    Stephen Wells,

    I haven’t even started the common descent model. I am just pointing out that if you assume geometric growth, even 1.0000001, you would expect most of the ERVs to be recent.

    Do you disagree with this conclusion?

    And, since this conclusion contradicts my understanding of the current observations, this would be a flaw with a model that simply assumed geometric growth. So I am asking about other effects, such as near extinction events, founder events (e.g., due to migration), or other effects that decimate and concentrate ERVs in a population.

    Please do try to be honest, Mr. Wells.

  13. #13 W. Kevin Vicklund
    April 24, 2009

    But why assume geometric growth? I agree that there will be geometric reproduction rates, but various selective pressures usually relegate population sizes to a quasi-equilibrium. Homo sapiens is currently experiencing geometric growth, true, but that is because we haven’t hit the new equilibrium. Historically, even human populations stop geometric growth when they reach the new equilibrium point. What you should be modeling is long periods of no growth with short bursts of geometric growth.
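
For anyone who wants to check the geometric-growth arithmetic from this exchange, here is a rough sketch. It assumes only what the thread states: insertion opportunities in a generation are proportional to population size, and the population multiplies by a fixed factor every generation. It counts opportunities, not fixed insertions, and the function name and the generation counts below are my own illustrative choices. Under those assumptions, the share of all opportunities that fall in the most recent half of the history is x / (1 + x), where x is the growth factor raised to half the number of generations.

```python
def fraction_in_recent_half(growth_per_generation, generations):
    """Share of insertion opportunities in the most recent half of the history.

    Opportunities in generation t are taken to be proportional to
    growth_per_generation ** t. Requires an even, positive generation count.
    """
    if generations <= 0 or generations % 2:
        raise ValueError("generations must be a positive even number")
    x = growth_per_generation ** (generations // 2)
    return x / (1.0 + x)


# Try a few growth factors over an arbitrary 1,000,000 generations.
for growth in (1.0, 1.0000001, 1.00001):
    print(growth, fraction_in_recent_half(growth, 1_000_000))
```

How lopsided the result is depends entirely on the growth factor and the number of generations you plug in, neither of which is pinned down in the comments.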

  14. #14 cprs
    April 24, 2009

    Anton “But by those same rigorous criteria, the CERV 1 and 2 inserts aren’t found to be orthologous between any of the relevant primate species, and the age estimates are found to be unreliable.”

    That’s the opposite of what the authors claim, read the paper again.
    “Consistent with our findings, the results of a previously published Southern hybridization survey indicated that sequences orthologous to CERV 1/PTERV1 elements are present in the African great apes and old world monkeys but not in Asian apes or humans.”
    “As was the case for the CERV 1/PTERV1 family, these age estimates are inconsistent with the fact that no CERV 2 orthologues were detected in the sequenced human genome. Again, we were able to detect pre-integration sites at those regions in the human genome orthologous to the CERV 2 insertion sites in chimpanzees, effectively eliminating the possibility that the elements were once present in humans but subsequently excised.” The problem is properly acknowledged not brushed away, and they seek to provide some explanations.

    There are multiple, rigorously (as rigorous as Windy’s well demonstrated findings above) determined orthologues for “chimpanzee, bonobo and gorilla” but “absent in human, orangutan, old world monkeys” for CERV1 and CERV2.

  15. #15 Shrunk
    April 24, 2009

    cprs,

    You are still misunderstanding the use of the term “orthologous” in this context (for which I think the authors might partially be to blame). It does not mean the ERV’s are located at identical sites in African great apes and old world monkeys.

    It’s the viruses that share the common ancestor. The ERV’s are the result of independent infections, much like the ones in the lemurs that started this thread, and as such are not relevant to determining phylogenetic relationships.

    You’ve already been told this. In fact, it’s explained in Anton’s post that you are responding to.

  16. #16 Anton Mates
    April 24, 2009

    cprs,

    That’s the opposite of what the authors claim, read the paper again.

    “Consistent with our findings, the results of a previously published Southern hybridization survey indicated that sequences orthologous to CERV 1/PTERV1 elements are present in the African great apes and old world monkeys but not in Asian apes or humans.”

    Reread that sentence. They’re talking about orthologous sequences, not orthologous insertion sites. That’s why this was established through Southern blots, which can only tell you whether a DNA sequence is present in the sample, not its actual location. And that’s why, in the very next sentence, the authors explicitly state that the CERV1 insertions in chimps are not orthologous to those in any other primate:

    “These results suggest that some members of the CERV 1/PTERV1 subfamily entered the chimpanzee genome after the split from humans through exogenous infections from closely related species and subsequently increased in copy number by retrotransposition.” (emphasis mine)

    Again: when you’re looking for orthologous sequences, you’re talking about the family tree of the viruses. When you’re looking for orthologous insertion sites, you’re talking about the family tree of the primates. Does that make sense?

    The authors say that there is only one line of evidence that these insertions “ought” to be orthologous with respect to primates: namely, the results of their age estimator, which is based on sequence divergence. That’s the only problem. All other evidence indicates that the insertions are not orthologous, and the authors provide reasons why their age estimator would be in error here. They’ve resolved the problem, as far as I can see.

    But why don’t you email them and ask about it?

  17. #17 cprs
    April 24, 2009

    I see your distinction, Anton, and its validity (it reflects the same misperception I had earlier on with Abbie, but in reverse, if that makes sense, and highlights the plasticity of the term orthologous). Thanks for pointing this out. I am now not sure what the authors actually intended, but you may well be right, and will contact them, since the distinction is important. I hope to post back.

  18. #18 Anton Mates
    April 24, 2009

    Great, I look forward to their response.

    BTW, I don’t think “orthologous” is a particularly imprecise term in general. It’s just that, in the particular case of horizontal transmission, one suddenly has multiple types of “ancestry” to contend with. I see that the term “xenologous” is sometimes used to denote a common origin through horizontal transmission; using that terminology, the authors might say that the CERV1 family is xenologous across various primate species, but not orthologous across the same. (Unfortunately, “xenologous” doesn’t seem to be a very common term.)

  19. #19 William Wallace
    April 24, 2009

    W. Kevin Vicklund wrote:

    What you should be modeling is long periods of no growth with short bursts of geometric growth.

    Thanks. This is exactly the type of feedback I am looking for.

  20. #20 Stephen Wells
    April 25, 2009

    That’s Dr Wells to you, little Willy. Have fun with your toy models.

  21. #21 windy
    April 26, 2009

    If an ERV exists, it is more likely to come into existence when there are more individuals in a population.

    But the fixation rate of neutral mutations is equal to the mutation rate (for ERVs, that would be the probability of integration per individual). So this “most ERVs are late” is just a red herring regardless of whether you are modeling common descent or not.

    There are multiple, rigorously (as rigorous as Windy’s well demonstrated findings above) determined orthologues for “chimpanzee, bonobo and gorilla” but “absent in human, orangutan, old world monkeys” for CERV1 and CERV2.

    If they had been “as rigorous” I would have downloaded an example of one of them as well. But the information simply isn’t there. As Anton says, the paper seems to use “orthologous” for the virus sequences as well as for their integration sites.
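
The neutral-fixation point in the comment above is the standard population-genetics result: in a diploid population of size N, about 2N × u new neutral copies arise per generation (u being the per-genome probability of a new integration), each fixes with probability 1/(2N), and the population size cancels. A minimal sketch, assuming a diploid population and strictly neutral insertions; the names and the example values are illustrative only:

```python
from fractions import Fraction


def neutral_fixation_rate(pop_size, integration_prob):
    """Expected neutral insertions fixed per generation in a diploid population."""
    new_copies_per_generation = 2 * pop_size * integration_prob
    fixation_probability = Fraction(1, 2 * pop_size)
    return new_copies_per_generation * fixation_probability


u = Fraction(1, 10**6)  # made-up integration probability per genome per generation
for n in (1_000, 1_000_000):
    # The rate comes out equal to u whatever the population size.
    print(n, neutral_fixation_rate(n, u))
```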

  22. #22 cprs
    April 29, 2009

    Comments noted Windy, although it may be best to avoid this ambiguity in sense of ‘orthologue’. Still no response from the authors so far – none has even read the email yet!

  23. #23 cprs
    May 2, 2009

    Reply imminent, I’m told.

  24. #24 cprs
    May 4, 2009

    So here’s the question I asked and the answer received this afternoon in full:

    cprs: My primary question is about the meaning of the phrase ‘orthologous sequences’.

    CERV 1/PTERV1
    ‘Consistent with our findings, the results of a previously published Southern hybridization survey indicated that sequences orthologous to CERV 1/PTERV1 elements are present in the African great apes and old world monkeys but not in Asian apes or humans ‘
    The allusion is to another paper on CERV1 which refers to a number of ‘ambiguous’ orthologues (the vast majority of course being non-orthologous), for which imprecision prevented exact site identification. If the sites were truly orthologous the authors claim at least six of them would have been lost in man and orangutan.

    CERV2
    ‘As was the case for the CERV 1/PTERV1 family, these age estimates are inconsistent with the fact that no CERV 2 orthologues were detected in the sequenced human genome. Again, we were able to detect pre-integration sites at those regions in the human genome orthologous to the CERV 2 insertion sites in chimpanzees, effectively eliminating the possibility that the elements were once present in humans but subsequently excised.’

    My question is, does the term orthologous sequences refer to identical chromosomal sites presumably inherited from a common ancestor in the primates in whom the sequences were shared (the sense used in the other paper), or does it refer simply to sequences of the same viral origin amongst these primates irrespective of their chromosomal site of insertion? The text seems to point to the former meaning, whilst it’s clear that humans lack the virus at the same site as the chimp, after some discussion with others I am now less certain about the other primates who do share the virus (are their insertion sites identical or different from Pan trog.?). Is this perhaps the result of unexpectedly late cross infection?

    Author response
    In our paper we were referring to both.

    In our statement (‘Consistent with our findings, the results of a previously published Southern hybridization survey indicated that sequences orthologous to CERV 1/PTERV1 elements are present in the African great apes and old world monkeys but not in Asian apes or humans ‘) we were referring to sequences of the same viral origin amongst these primates

    while in the latter statement (‘we were able to detect pre-integration sites at those regions in the human genome orthologous to the CERV 2 insertion sites in chimpanzees, effectively eliminating the possibility that the elements were once present in humans but subsequently excised.’) we were referring to chromosomal sites inherited from a common ancestor in the primates.

    The exact origin of these viruses is under investigation.

    cprs:
    In conclusion, Anton is partly right, but the problematic nature of the lack of CERV2 and possibly 1 orthologues (chromosomal insertion sites) for monophyletic inheritance remains as I last outlined above. The issue is compounded since excision can be excluded and the apparent phylogenetic tree does not match convention. This is not just a question of ERV sequences of common viral origin at least for CERV2, and very possibly in some of the ‘ambiguous’ CERV1 sites too.

    There are many interesting ancillary questions that need to be addressed: Where exactly are these orthologous sites (the authors clearly do know)? How many are shared by other primates, and which? Does this phenomenon also occur for other families of CERVs even if not all are missing, and if so what is the pattern? What about the dating problems – why are apparently old viruses still actively transcribing and also completely absent in two branches of primates? The authors have not yet addressed these directly apart from the paper, although some of them had been raised in my mail.

  25. #25 Shrunk
    May 5, 2009

    cprs,

    I didn’t get a response at all this time. Lucky you!

    I’m not sure why you think there remains anything “problematic” here. Anton wasn’t “partly right”. He was completely right. These ERV’s are the result of independent insertions, and the authors’ use of the term “orthologous” does not mean the insertions are at identical sites. Humans don’t possess CERV 1 or 2 because we were never infected with the corresponding retrovirus. In the case of CERV 1, the molecular basis for humans’ resistance to the virus is even known. End of story, as far as I can see.

  26. #26 cprs
    May 7, 2009

    ‘We were referring to chromosomal sites inherited from a common ancestor in the primates’, but not for man and orangutan. http://genomebiology.com/2006/7/6/R51/figure/F5

  27. #27 Anton Mates
    May 7, 2009

    cprs,

    That figure has nothing to do with chromosomal sites. It’s referring to “CERV 2 elements”–that is, sequences of the same viral origin.

    I have to agree with, er, Shrunk’s agreement with me–the authors have cleared up everything. CERV 1 and 2 sequences across various primate species are orthologous with respect to viral origin, but not with respect to a common primate ancestor.

  28. #28 Shrunk
    May 7, 2009

    It’s frustrating enough to have to correct creationists’ scientific ignorance. It really drives me up the wall when we have to give them remedial lessons in reading comprehension.

    cprs:

    ‘We were referring to chromosomal sites inherited from a common ancestor in the primates’, but not for man and orangutan.

    So you quote from the response you received, then add your own superfluous phrase that has nothing to do with what the authors were writing about? Nice.

    Read what the authors said again:

    “while in the latter statement (‘we were able to detect pre-integration sites at those regions in the human genome orthologous to the CERV 2 insertion sites in chimpanzees, effectively eliminating the possibility that the elements were once present in humans but subsequently excised.’) we were referring to chromosomal sites inherited from a common ancestor in the primates.”

    They are referring to the intact sites in the human genome that correspond to the sites in the chimp that contained the ERV. That these sites remain intact rules out the possibility that the ERV once existed in the human genome and was subsequently excised. These sites are present in all the primates (humans and orangutans included), which already blows a pretty big hole in the doctrine of denial of common descent.

  29. #29 cprs
    May 8, 2009

    Anton, fig. 5 refers to CERV 2 and so does the quote. You say ‘not with respect to a common primate ancestor’, argue with the authors not with me, they say ‘inherited from a common ancestor’.

    Shrunk, the ‘sites’ may be present, the ERVs are not, that’s the point.

  30. #30 W. Kevin Vicklund
    May 8, 2009

    ‘We were referring to chromosomal sites inherited from a common ancestor in the primates’, but not for man and orangutan. http://genomebiology.com/2006/7/6/R51/figure/F5

    —facepalm—

    Let’s break down what they are saying there into digestible chunks. They looked at chromosomal sites that are orthologous, that is, they are inherited from a common ancestor in the primates. But they only looked at those sites in humans and chimps. They didn’t look at those sites in gorillas or orangutans or old world monkeys or anything else, just humans and chimps. Therefore, while you can conclude that chimps and gorillas have orthologous CERV 2 sequences, you can’t conclude that they have orthologous insertion sites. The authors of the study simply didn’t investigate that (possibly because the gorilla data isn’t yet available).

    There are several possible explanations (and likely the actual explanation is a combination of several) for how this minor discrepancy arose that don’t in any way falsify evolutionary theory.

  31. #31 Shrunk
    May 8, 2009

    You’re still not understanding, cprs.

    When they say “inherited from a common ancestor” they are referring to parts of the genome all the primates share in common, because these parts of the genome can be traced back to the common ancestor of all the primates. The quote has nothing to do with CERV 2, except to say that, at the site in the human genome where one would have expected to find the human version of CERV 2 (if CERV 2 arose from a common ancestor of humans and chimps), no ERV is found. This is evidence that CERV 2 was inserted into the chimp lineage after divergence from the human line.

    The diagram only shows that the distribution of CERV 2 among the primate lines is consistent with separate, independent insertions, and not with inheritance from a common ancestor.

    Shrunk, the ‘sites’ may be present, the ERVs are not, that’s the point.

    I have no idea what you think the point is here. Again, this simply indicates that CERV 2 inserted into the chimp line after divergence.

    You’ll find it easier to understand the article if you don’t insist on misinterpreting it to conform to creationist ideology.

  32. #32 Anton Mates
    May 8, 2009

    cprs,

    Anton, fig. 5 refers to CERV 2 and so does the quote.

    Sure, but that’s irrelevant to the meaning of “orthologous”. The word doesn’t change its definition for each viral family, you know!

    One last time, let’s look at what the authors told you.

    In our statement (‘Consistent with our findings, the results of a previously published Southern hybridization survey indicated that sequences orthologous to CERV 1/PTERV1 elements are present in the African great apes and old world monkeys but not in Asian apes or humans ‘) we were referring to sequences of the same viral origin amongst these primates

    while in the latter statement (‘we were able to detect pre-integration sites at those regions in the human genome orthologous to the CERV 2 insertion sites in chimpanzees, effectively eliminating the possibility that the elements were once present in humans but subsequently excised.’) we were referring to chromosomal sites inherited from a common ancestor in the primates.

    (my bolds)

    For ERVs, there are two meanings of “orthologous.” One refers to sequences or elements–the actual genetic sequence of the ERV itself–and this indicates common viral origin. The other refers to sites or regions–the location in the genome of the host primate where the ERV was inserted–and this indicates a common ancestor in the primates.

    Both meanings could apply in a discussion about CERV 1, or about CERV 2, or about any viral family. So you can’t just look at which family is discussed–you have to see whether the authors are talking about sites or about sequences.

    In Figure 5, what are the authors talking about? “Distribution of CERV 2 elements among primates,” using PCR and Southern hybridization. This is clearly a discussion of sequences; an element is a DNA sequence, and hybridization can only tell you whether a sequence is present, not where it’s located–IOW, it can’t tell you the site or region of insertion. So Figure 5 has implications for common viral origin, not for common ancestry among the primate species in question.

    Which really shouldn’t be surprising. Figure 5 comes from a CERV 2 discussion which is exactly parallel to the CERV 1 section the authors quoted to you–and they did that as an example of the “common viral origin” meaning of “orthologous”!

  33. #33 cprs
    May 8, 2009

    Kevin, wouldn’t you be interested in knowing whether orthologous gorilla sequences have orthologous sites to chimps? There is tentative evidence for some orthologous non-human primate sites for CERV1 from the earlier study http://biology.plosjournals.org/archive/1545-7885/3/4/supinfo/10.1371_journal.pbio.0030110.st003.pdf, though the vast majority are not http://biology.plosjournals.org/perlserv/?request=slideshow&type=table&doi=10.1371/journal.pbio.0030110&id=2858 . Some apparent orthologues on further examination were clearly excluded (macaque to chimp) http://biology.plosjournals.org/archive/1545-7885/3/4/supinfo/10.1371_journal.pbio.0030110.sg005.pdf

    It will be interesting to see the full data for CERV 1 & 2 when full sequencing is available (no indication from the authors of where the chimp sites were except the index example).

    Shrunk, the question is how CERV2 could have extensively populated some primates but not man. If cross infection or autocopying is your only explanation that alone is revealing. Especially if some sites do turn out to be orthologous, since on your current assumptions none of them should be.

    Anton, I agree with your points. My intention in linking the figure to the comment was to underscore the complete absence of CERV 2 from orangutans and man, despite its significant presence in other primates ostensibly from the same roots.

  34. #34 cprs
    May 9, 2009

    I agree it’s time to bury the thread (and me) in peace, but I do want to express my appreciation to all for the comments, I’ve learned a lot from them. I know I’m a bull in a china shop (a medic dealing with the delicate concepts and perhaps sensitivities of molecular biologists). You have a highly enviable field to pursue. Maybe it will be a little grit to the pearl of your thoughts.

    Best wishes,
    Charles Soper

  35. #35 Anton Mates
    May 10, 2009

    cprs,

    Kevin, wouldn’t you be interested in knowing whether orthologous gorilla sequences have orthologous sites to chimps? There is tentative evidence for some orthologous non human primate sites for CERV1 from the earlier study

    That’s a rather misleading way of putting it. The authors found strong evidence that various sites (the ones of chimps vs. macaques and gorillas) weren’t orthologous, and for the remainder (chimps vs. baboons and macaques vs. baboons), the evidence was ambiguous. Your claim is analogous to, “We don’t know exactly who stabbed Julius Caesar and what weapons were used, which is tentative evidence that he was actually killed with a laser gun by a time-traveling Hitler.”

    Shrunk, the question is how CERV2 could have extensively populated some primates but not man. If cross infection or autocopying is your only explanation that alone is revealing.

    Revealing of what? That’s what the currently-accepted phylogeny would imply, certainly.

    Especially if some sites do turn out to be orthologous, since on your current assumptions none of them should be.

    Well, sure–if they turned out to be orthologous, it would suggest that chimps (and bonobos) and gorillas were sister species, which would be pretty remarkable. Not much evidence for that so far, though.

    Anton, I agree with your points. My intention in linking the figure to the comment was to underscore the complete absence of CERV 2 from orangutans and man, despite its significant presence in other primates ostensibly from the same roots.

    Fair enough, but you incorrectly referred to the insertion site as shared by various primates, but not by orangutans and man–

    ‘We were referring to chromosomal sites inherited from a common ancestor in the primates’, but not for man and orangutan

    –which simply isn’t true. In fact, we know that man does have a site orthologous to the chimpanzee site; that’s why the authors were able to examine it and conclude that we’d never had an insertion there!

  36. #36 charles s
    February 22, 2010

    I have found evidence for a chimp-macaque orthologous site for a non-human CERV, though not as tidy as AS’s example above (193). If it is published, I will bring it back here for critical scrutiny.

  37. #37 charles s
    June 23, 2010

    A few shards and some coordinates, but I have not been able to search with the efficiency I had hoped for. Sorry not to do better – time is too short. Evan Eichler promised his team will look again at candidate CERV1 orthologs.

  38. #38 charles s
    June 29, 2010

    Dear Abigail, will you please post my message?

  39. #39 charles s
    July 23, 2010

    Here it is again (I guess a server problem bounced me first time round), orthologues, but so much more needs to be done on this than a paltry and fragmentary search by an outsider, I heartily welcome Prof Eichler’s personal commitment to reexamine his data with updated tools.

  40. #40 W. Kevin Vicklund
    July 26, 2010

    Charles s: (I presume you are cprs from a year ago)

    Several observations:

    1) This thread is over a year old, and it’s only chance that I saw you post here. If you want to engage in discussion on this topic again, it may behoove you to point out on a recent thread that you have made a comment on an old thread, with a link to that thread.

    2) The example @193 you talk about here and on your blog was given by windy. Proper attribution is important in academia.

    3) Word usage: on your blog you used “putative” and “apparent” to describe the anomalous sites. “Putative” is appropriate, but “apparent” is not. “Possible” or “alleged” would be good substitutes.

    4) On a somewhat more recent thread, I calculated the odds that at least one anomalous site per species pair would arise. It is rough, of course, but it came out as about 36%. Follow the link for a more detailed discussion. Bottom line is that the reported results are not at all surprising for so coarse a sample of the genome.

    5) Your discussion of what you have found is rather brief. Frankly, I’m not sure what evidence you have found. Could you expand a bit? I’ll try to follow this thread for the next while, but be aware that I have some vacation coming up and this thread is not easy to find.

  41. #41 charles soper
    July 28, 2010

    Dr Vicklund, thanks for your comments.
    1. I apologise for being slow, although recent comments always show on the main page, whichever thread they appear in.
    2. The attribution appears correct given ‘Windy’s’ identity elsewhere.
    3. I have changed the description of apparent to potential.
    5. I will post the coordinates for the details of the matches, and crosspost to your own blog when it’s done. I readily acknowledge they fall far short of the long homologous sequences found for CERV5.
    4. I am not fully persuaded of the applicability of the birthday paradox, not least given evidence of the non-random nature of some genetic change (e.g. 1G5 gene mutations in Drosophila and evidence of homoplasy between less ‘related’ species, giving rise to instances of illusory common descent). Best wishes.

  42. #42 W. Kevin Vicklund
    July 28, 2010

    Whoa, full stop!

    I am not a doctor, a Ph.D, or anything else like that. The highest degree I hold is a BS in electrical engineering (power systems), unless you want to include the joke degree “Master of Pepology.” The closest I’ve come to a doctorate is that a class assignment of mine has been cited by several doctoral dissertations after it was made part of the curriculum by my professor.

    Back to the discussion.

    1. I just wanted you to be aware that I and others might miss you posting.

    2. If I’ve parsed what you said correctly, you are accusing Abbie of sock-puppeting on her own blog (ie, that windy is in fact Abbie). This is a pretty serious accusation, at least for the internet. Can you back it up?

    3. Thanks!

    5. I look forward to it. I have not played with BLAST yet, so while I wait for your results, I’ll take a look around so I can better understand your data. I appreciate your willingness to present your evidence with greater detail and explanation. I prefer that over having to guess what someone else is arguing.

    4. I think we can both agree that the ‘birthday paradox’ is, at best, a very rough approximation. In fact, I mentioned some of the reservations I had about its applicability. Its main utility is that it provides a base point of reference for the minimum homoplasy we can expect at a given granularity of the genome. Basically, it establishes that, even with purely random genetic change, evolutionary theory predicts a certain amount of homoplasy. Obviously, a greater non-random contribution will increase the amount of homoplasy. Now, the important part. If we can quantify the non-random contribution, we should be able to predict the amount of homoplasy (keeping in mind, of course, that these are statistical predictions). If the predictions match reality, that is strong support for the theory. If they are off, then we need to re-evaluate the theory to see where it is lacking – possibly even abandoning the theory altogether.

    Does this make sense to you?
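
    For readers who want to play with the idea, here is a minimal sketch of the kind of birthday-style estimate being discussed. It is not the calculation from the linked thread (those parameters are not given here). The function name, the binning of the genome, and the example numbers are all hypothetical and only illustrate how coarser granularity raises the chance of a coincidental shared site under purely random insertion.

```python
def p_shared_site(n_a: int, n_b: int, n_bins: int) -> float:
    """Chance that at least one of species A's n_a insertions lands in a
    bin already occupied by one of species B's n_b insertions.

    Assumptions (illustrative only): the genome is cut into n_bins equally
    likely bins, B's insertions sit in n_b distinct bins, and A's
    insertions are placed uniformly and independently of each other.
    """
    if n_bins <= 0 or not 0 <= n_b <= n_bins or n_a < 0:
        raise ValueError("need n_bins > 0, 0 <= n_b <= n_bins, n_a >= 0")
    # Probability that a single insertion in A misses every occupied bin.
    p_miss_once = 1.0 - n_b / n_bins
    # All n_a insertions must miss for there to be no shared site.
    return 1.0 - p_miss_once ** n_a


if __name__ == "__main__":
    # Made-up insertion counts; only the trend across bin counts matters.
    for n_bins in (10_000, 100_000, 1_000_000):
        print(n_bins, round(p_shared_site(100, 100, n_bins), 4))
```

    Any non-random preference for particular regions would push the real figure above this baseline, which is why the random case is only a lower reference point.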

  43. #43 Kemanorel
    July 29, 2010

    4) On a somewhat more recent thread, I calculated the odds that at least one anomalous site per species pair would arise. It is rough, of course, but it came out as about 36%. Follow the link for a more detailed discussion. Bottom line is that the reported results are not at all surprising for so coarse a sample of the genome.

    Just as an FYI, there’s an easier way to determine probabilities to find sequences by chance.

    I know you said you haven’t played with BLAST, but when you do BLAST the sequence, look at the e-value associated. It tells you a probability that the match happened by chance.
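
    Strictly speaking, the E-value is the number of hits of at least that score expected by chance in a database of that size, not a probability. Under the usual Poisson assumption the two are related by P = 1 - exp(-E), so for small E they are practically the same number. A minimal sketch of the conversion (the function name is mine, not part of BLAST):

```python
import math


def evalue_to_pvalue(e_value: float) -> float:
    """Convert an E-value to the probability of seeing at least one chance
    hit with that score or better, assuming chance hits are Poisson."""
    if e_value < 0:
        raise ValueError("E-value cannot be negative")
    # 1 - exp(-E), written with expm1 so tiny E-values are not rounded to 0.
    return -math.expm1(-e_value)


if __name__ == "__main__":
    for e in (10.0, 1.0, 0.01, 3e-107):
        print(e, evalue_to_pvalue(e))
```

    Either way, the E-value describes how likely the sequence similarity itself is to be a chance match, which is a different question from how likely two lineages are to share an insertion site.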

    1G5 gene mutations in Drosophila and evidence of homoplasy between less ‘related’ species, giving rise to instances of illusory common descent).

    Do you mean the 1G5 gene mutations as evidenced by a paper by Peter Borger, who has yet to be accepted via peer review by any reputable journal for the subject of 1G5 gene mutations? The one who’s a member of ISCID, a group whose tagline is “retraining the scientific imagination to see purpose in nature”? The one who’s claiming persecution at the hands of Darwinists?

    “This is what Kreitman, editor of JME argued as well (Of course he rejected the paper, as did the other five or so Darwin-dominated journals)” and that it’s some unnamed mechanism that must be doing it rather than evolution?

    The paper that shows he’s obviously looking at it backwards, that it wasn’t the human, chimpanzee, orangutan, and guinea pig genomes that all had the same mutation, but rather that it was just the rat that had one mutation at each of the sites noted?

    Please tell me you didn’t really take this “evidence” seriously…

  44. #44 W. Kevin Vicklund
    July 30, 2010

    Kemanorel:

    Thanks for the heads up. However, I think I was calculating a different probability than the e-value given by BLAST. Also, thanks for the pointer on the paper Charles appears to be referring to. I’ve now read it.

    Charles:

    I just read the Borger paper. I have a number of comments about it.

    Drosophila spp.

    Comparing the introns of the two species reveals ten polymorphic sites immediately adjacent to each other (figure 1, position 153-162). The chance that ten point mutations occur at random in the intron equals 1.4 x 10^-18. By way of contrast, the chance that ten adjacent mutations occur in the intron equals 2.2 x 10^-14. Because the authors demonstrate no significant deviation from the assumption of a neutral evolution in this region, natural selection cannot explain this cluster of mutations. It is therefore reasonable to assume that the cluster of mutations observed in the introns was not the result of a random accumulation of (point) mutations.

    What Borger fails to consider is that point mutations are not the only source of mutation. My guess is that this is the result of the DNA repair mechanism, i.e., that a break in the DNA occurred and was repaired with a short, random sequence. That is not the only explanation, of course, but it is the most parsimonious.

    the sequences were obtained from species inhabiting separated continents and are therefore reproductively isolated.

    Borger is wrong. There is no indication that these populations have been reproductively isolated for a sufficiently long period of time for these mutations to arise in isolation. Both of these species are native to Africa. In most of the locations supplying the genetic samples, Drosophila is an invasive species. Any isolation based on current geographic location is only a few hundred years old, at best, significantly less time than the original paper Borger refers to calculates. This means that a single isolated population might have colonized several different continents, while another population might have populated other areas of those same continents.

    If there was no additional information for the 1G5 genes, we would most probably be inclined to argue that D. mel-1, -4, -5, -6, -11 and -13 are very closely related, since they all have exactly the same gene and thus have a very recent common ancestor. Likewise, D. mel-7, -9 and -12 must have a recent common ancestor as they have several shared point mutation; for the same reason D. mel-3 and -10 must share a recent common ancestor.

    The information he was lacking (I hope for his sake) puts the random mutation hypothesis back on solid ground. Also note that the authors of the original paper did not claim these populations were isolated. I will note that the authors of the original paper treated the ten contiguous changes as separate point mutations, so they may have overestimated the amount of intronic variation.

    Conclusion: Borger’s failure to allow for other types of mutation and lack of knowledge about the chrono-geographic location of the population of Drosophila renders his argument worthless.

    Vitamin C

    As Kemanorel noted, Borger is assuming that the rat gene has not had any mutations, instead treating it as if it were the ancestral gene. Once you recognize that the rat gene can also experience mutation, the phylogeny puts the great apes as closely related to humans, and rats and guinea pigs as distantly related on a different branch.

    Conclusion: Not enough data to support an inference of non-random mutation (wrt location in genome) – you really need a more diverse sample (breadth and depth). That said, the concept of genomic hotspots is not inconsistent with evolutionary theory. There is almost certainly a non-random component to the location of mutations which has long been recognized.

  45. #45 windy
    August 1, 2010

    If I’ve parsed what you said correctly, you are accusing Abbie of sock-puppeting on her own blog (ie, that windy is in fact Abbie)

    LOL!

  46. #46 ERV
    August 1, 2010

    **shrug**

  47. #47 charles soper
    August 11, 2010

    Sorry for the honorary doctorate, Kevin, I confused you with Stephen Wells.
    I note the criticisms of Borger’s paper with interest – well sleuthed.
    As to the sock puppeting, I don’t at all agree it’s such a serious matter – even Christ once concealed his identity – and why should Abbie be seen to demean herself to deal personally with every tedious objector?
    Apologies too for my snail-like pace – I have been busy of late, will post (and cross post) details soon, it’s simply a matter of fishing through my messy data storage – today if possible, the findings are modest – sifting through the genome and observing its bizarre patterns of repetition has been fascinating to me as a novice, but I appreciate much better some of the difficulties.

  48. #48 charles soper
    August 11, 2010

    I searched for the pol gene of CERV1 (4544..5611 from NCBI’s AY692036.1) in the macaque genome and then took contiguous sequences of the hits from this with adjacent non CERV containing sequence and blasted it against Pan Trog.
    In the macaque contig NW_001112574.1, a portion of the CERV1 pol gene is found at 7352782-7351802(-), (though there’s a gap in pol sequence from its 487th to 664th nucleotide).
    Two non CERV containing adjacent fragments (I called a and b) from the macaque contig are also found with the pol gene in the chimp contig for chromosome 2, Contig33.125 (AACZ03012828.1).
    The fragments a and b are found at 7347266-7347577 and 7349461-7349794 in NW_001112574.1. None of 94 other chimp contigs shared either of these two non-CERV fragments from the macaque genome and CERV1 pol sequence.
    A third non CERV fragment from the same macaque contig (which I called c) 7346538-7350059, near the pol sequence, was not in the chimp contig, but showed widespread distribution throughout the genome of several primates (nearly 3 k hits in man and 3,048 in orangutan). It was inside the regions of 15 different macaque genes.

    The second interesting site was shared between the gorilla contig CABD02426596.1 and chimp AADA01328632.1 . Most of the shared sequence is from CERV1, but there is a non CERV1 region from 6129 to 6395 in the gorilla sequence also found in the chimp sequence at 800 to 1065 (Blast scores: 375 375 100% 3e-107 92%).
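
    For anyone trying to follow the coordinates above, a small sketch like the one below can tabulate how long each macaque span is and how far the flanking fragments sit from the pol hit. It assumes the quoted positions are 1-based and inclusive, as BLAST normally reports them. The class and function names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    name: str
    start: int  # 1-based, inclusive (assumed)
    end: int    # 1-based, inclusive (assumed)

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def gap_between(left: Span, right: Span) -> int:
    """Number of bases strictly between two non-overlapping spans,
    where `left` ends before `right` starts."""
    return right.start - left.end - 1


# Macaque contig NW_001112574.1, coordinates as quoted in the comment.
# The pol hit is on the minus strand, so its ends are stored low-to-high.
POL = Span("CERV1 pol hit", 7351802, 7352782)
FRAG_A = Span("fragment a", 7347266, 7347577)
FRAG_B = Span("fragment b", 7349461, 7349794)

if __name__ == "__main__":
    for span in (FRAG_A, FRAG_B, POL):
        print(f"{span.name}: {span.length} bp")
    print("a to pol:", gap_between(FRAG_A, POL), "bp")
    print("b to pol:", gap_between(FRAG_B, POL), "bp")
```

    This only checks the arithmetic of the reported positions. Whether the flanking fragments are genuinely orthologous, or are repeats found in many places, needs the alignments themselves.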

  49. #49 charles soper
    September 20, 2010

    Comments?

  50. #50 W. Kevin Vicklund
    September 20, 2010

    Sorry… just got back last night from New Orleans and a cruise in the Caribbean, prior to which I had my usual vacation and was working massive overtime. In short, I haven’t had much of a chance yet to look at your results in detail, plus my computer doesn’t seem to play well with the BLAST website, so it’s been a slog. And it always takes several days to recover from a long trip. So, probably no comment until next week.

    Hmm… maybe if you told me the parameters you used it would go faster?

  51. #51 charles soper
    September 25, 2010

    No hurry, I’ve been much slower, and as I say I fully agree they’re hardly earth-shattering findings. If you need help with the parameters please email me cpsoper at gmail. I have saved some but not all the search strategies and can send them on.

  52. #52 charles soper
    December 28, 2010

    Still no thoughts?

  53. #53 W. Kevin Vicklund
    January 5, 2011

    I’ve pretty much had to give up on this until I get a new computer.

  54. #54 charles soper
    February 22, 2011

    Kevin, if you (or others) get a chance to look at this or other apparent discrepancies, I’d be grateful for a mail.
