Thursday, April 20, 2017

The last molecular evolution exam: Question #3

The Three Domain Hypothesis has eukaryotes and archaea branching off from eubacteria. It shows eukaryotes more closely related to archaea than to eubacteria. However, many scientific studies indicate that a majority of our genes are more similar to eubacterial genes than to archaeal genes. How do you explain this apparent conflict?



  1. I think the question ought to be rephrased in order to get the answers you presumably want. As it stands, a high rate of evolution in the archaeal lineage would explain gross similarity.

    1. I'm stuck figuring out what answer Larry would want. My initial response would be that lacking an outgroup you can't root the 3 domains and thus end up with a polytomy anyway. There's too much additional information for that to be what Larry is going for. So then I would guess it's about endosymbiosis and in that case it's worth noting that there are 3 basic hypotheses on the origin of the nucleus currently debated in the literature (one is an autochthonous formation of the nucleus, one is an archaeal endosymbiont and one a bacterial endosymbiont. I'm not sure why nobody is considering the option of a stem-line eukaryote as the endosymbiont, given the alternatives are already pretty dissimilar). If one goes with the bacterial endosymbiont nucleus then that's it. In the other cases we're talking about mitochondria (where it's pretty settled that we're looking at bacterial endosymbionts) and subsequent transfer of mitochondrial genes to the nucleus. And of course you are right John, the way it's presented allows for the alternative of "these similarities are plesiomorphic". It's not likely in this case, but it's worth mentioning, because there's quite a bit of talk about HGT that ignores the possibility of something simply being the ancestral state.

    2. Hey, you might be able to root on paralogous genes, even without an outgroup.

    3. Presumably Larry has been regaling his students with tales of how the Tree of Life has "fallen" and that phylogeny is a hopeless endeavor due to horizontal gene transfer and is looking for an answer along those lines.

      Of course I'm surprised that Larry brought up "similarity" -- mere sequence similarity is not a good evolutionary measure, as is often shown in practice. The top BLAST match of a sequence is often not its closest relative on a phylogenetic tree.
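The point about similarity versus relatedness can be illustrated with a toy calculation. This is only a sketch under made-up assumptions: the three taxa, the tree shape, and the branch lengths are hypothetical and do not come from any real dataset. Taxa A and B are sisters, but B sits on a long (fast-evolving) branch, so the sequence most similar to A turns out to be the outgroup C.

```python
# Hypothetical rooted tree: A and B are sister taxa, C is the outgroup.
# Branch lengths are invented for illustration; B evolves quickly.
BRANCH_TO_PARENT = {
    "A": ("AB", 1.0),
    "B": ("AB", 10.0),
    "AB": ("root", 1.0),
    "C": ("root", 1.0),
}


def path_to_root(node):
    """Map the node and each of its ancestors to its distance from the node."""
    distances = {node: 0.0}
    total = 0.0
    while node in BRANCH_TO_PARENT:
        parent, length = BRANCH_TO_PARENT[node]
        total += length
        distances[parent] = total
        node = parent
    return distances


def patristic_distance(x, y):
    """Sum of branch lengths on the path joining x and y through their MRCA."""
    px, py = path_to_root(x), path_to_root(y)
    # The shortest route through any shared ancestor is the one through the MRCA.
    return min(px[n] + py[n] for n in px if n in py)


def most_similar(taxon, others):
    """Return the taxon at the smallest distance, as a 'top hit' search would."""
    return min(others, key=lambda other: patristic_distance(taxon, other))


top_hit = most_similar("A", ["B", "C"])
print("Sister of A on the tree: B")
print("Most similar to A by distance:", top_hit)
```

With these invented branch lengths, the A to B distance is 11 and the A to C distance is 3, so a similarity ranking picks C even though B is the true sister. A real analysis would use an explicit substitution model and tree inference rather than raw distances.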

    4. John Harshman insults me by suggesting, "I think the question ought to be rephrased in order to get the answers you presumably want."

      The course is about critical thinking focused around controversies in molecular evolution. Students are encouraged to think on their own. They know that trying to guess the answer that I want is probably not going to get them a good grade.

      Feel free to answer the question (you too, Jonathan Badger). I'll let you know what grade you deserve.

      P.S. If your answer is that a high rate of evolution in archaebacteria explains the result then you would fail unless you have evidence that I'm unaware of. A large part of critical thinking involves evidence.

    5. The problem is twofold: First of all, not all genes are equal in terms of phylogenetic content. You don't even need to invoke HGT for this. As John alluded to, unequal rates of evolution really mess up phylogeny, as does duplication and loss of copies leading to "orthologous" genes that aren't really. Taking the average phylogeny of all genes in a genome isn't the way to get the best organismal phylogeny, and nobody really does this anyway.

      Second of all, the notion of what an organismal phylogeny *is* needs to be defined. Even the most skeptical of people in regard to HGT has to accept that eukaryotes received quite a few bacterial genes from their endosymbionts, which seem to be losing their genomes but not so much their genes themselves, which need to go somewhere, and the nuclear genome is as good a place as any. But at least what *I* mean by an ancestor of eukaryotes is what the organisms were prior to endosymbiosis. It's pretty clear that this was an archaeal lineage like TACK. Yes, you can argue that coming within an archaeal lineage is technically different than being a sister lineage as per the three domains, but the important thing is that there is a closer evolutionary relationship between eukaryotes and archaea than between archaea and bacteria.

    6. Jonathan Badger says,

      "But at least what *I* mean by an ancestor of eukaryotes is what the organisms were prior to endosymbiosis. It's pretty clear that this was an archaeal lineage like TACK."

      You wouldn't get a very high grade from me for saying something like that. The evidence suggests strongly that the first true eukaryotes formed as the result of a fusion between two distinct prokaryotes. If you were to examine that cell when it first arose, you would be hard pressed to predict which genome would become reduced in size by transferring most of its genes to the other one.

      "Endosymbiosis" evolved after the original fusion when one of the genomes underwent a reduction in size.

      The fact that you give priority to the archaeal half of the original fusion reflects a non-scientific bias that makes no logical sense—especially if the majority of eukaryotic genes seem to have come from the α-proteobacterial half of the fusion event.

      Given that a substantial fraction of eukaryotic ancestors are proteobacterial and not archaeal, it makes no sense to promote the old Three Domain Hypothesis as the best representation of the origin of eukaryotes.

      "... but the important thing is that there is a closer evolutionary relationship between eukaryotes and archaea than between archaea and bacteria."

      I don't believe this is correct. You don't have any evidence to support such a claim and refute the evidence that says otherwise.

    7. Counting numbers of genes and their individual origins is not the way to figure out the ancestry of an organism. That's basically like saying English is a Romance language because thanks to a "fusion" event in 1066 more current English words are of Romance origin than of Germanic. And yet all linguists would consider the ancestor of modern English to be Old English, not Norman French. Why is that? Because the *rules* of modern English are more derived from those of Old English than from those of French. The information processing systems of English simply picked up French vocabulary.

      That's what seems to be the case with archaea and modern eukaryotes. Whether you say it was due to a single fusion (I'm dubious) or other mechanisms, it's true that there are a lot (perhaps even a majority) of eubacterial genes in modern eukaryotic nuclear genomes. But how are these genes expressed? By archaeal rules. Our nuclear genes aren't transcribed by means of eubacterial sigma factors but by TATA binding proteins originating from archaea. As with the history of English, the meaningful lineage is the lineage of rules.

    8. Jonathan Badger says,

      "But how are these genes expressed? By archaeal rules. Our nuclear genes aren't transcribed by means of eubacterial sigma factors but by TATA binding proteins originating from archaea."

      But how are nucleotides, amino acids, and glucose made? Basic metabolism isn't performed by archaeal enzymes but by eubacterial enzymes. Therefore we descend from eubacteria, not archaea.

      Note, I'm merely pointing out the flawed logic of your argument. You are making the unsubstantiated claim that some genes are more important than others in determining who your ancestors are. It's like saying that because you have your father's eyes, your mother doesn't count.

      The reality is that both eubacteria and archaea are our ancestors and the simplistic Three Domain Hypothesis is refuted.

    9. Organisms can eliminate nearly all of their metabolic genes (take a look at Mycoplasmas for example) and live just fine by getting amino acids, nucleotides, lipids, sugars, etc. from their environment. It isn't *me* saying metabolism is less important than information processing -- life *itself* proves it. We ourselves are pretty auxotrophic, for that matter.

  2. I disagree. If you are designing exams to simply get students to regurgitate what you want, give multiple choice rote memorization exams and send all the biologists to med school. I prefer open ended questions where even if the student gives an incorrect answer but backs it up with some logic and evidence, they can earn some if not most of the points.
    I want to teach the students how to think critically. I have thought about problems and issues and often there isn't a correct answer. We serve our students better if they learn to evaluate evidence, are able to argue their position, and reevaluate their position in the face of new information/counter-arguments.

    1. That the question should be rephrased in order to get the answer you want.

    2. The question should be rephrased in order to make the issue clear. The issue is that some eukaryote genes are more closely related to eubacterial genes than to archaean genes. Similarity is a poor approximation and the wrong word to use.

  3. "However, many scientific studies indicate that a majority of our genes are more similar to eubacterial genes than to archaeal genes. How do you explain this apparent conflict?"

    I'm going to need clarification here. By "a majority of our genes" are we talking about Homo sapiens, or eukaryotes, when we say "our"? I'm guessing just eukaryotes in general, since it would be news to me if the majority of human genes could be traced to some particular prokaryotic class.

    1. We talked in class about eukaryotes in general but they did read the following paper from Bill Martin ...

      Dagan, T., and Martin, W. (2006) The tree of one percent. Genome Biol, 7:118. [doi: 10.1186/gb-2006-7-10-118]

      That paper looked at human genes. They say,

      Of the 5,833 human proteins that have homologs in these prokaryotes at the specified thresholds, 2,811 (48%) have homologs in eubacteria only, while 828 (14%) have homologs in archaebacteria only, and 4,788 (80%) have greater sequence identity with eubacterial homologs, whereas 877 (15%) are more similar to archaebacterial homologs (196 are ties).
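For readers who want to check the arithmetic, here is a minimal sketch that recomputes shares from the counts in the quote above. The variable and function names are my own, and the choice of 5,833 as the denominator for every category is an assumption; the quote does not state which denominator each percentage uses, so recomputed values can differ by a point or two from the rounded figures quoted.

```python
# Counts copied from the quoted passage (Dagan and Martin 2006).
TOTAL_WITH_PROKARYOTE_HOMOLOGS = 5833

COUNTS = {
    "homologs in eubacteria only": 2811,
    "homologs in archaebacteria only": 828,
    "more similar to eubacterial homolog": 4788,
    "more similar to archaebacterial homolog": 877,
    "ties": 196,
}


def percent(count, total=TOTAL_WITH_PROKARYOTE_HOMOLOGS):
    """Share of the total, in percent (assumes `total` is the right denominator)."""
    return 100.0 * count / total


for label, count in COUNTS.items():
    print(f"{label}: {count} ({percent(count):.1f}%)")

# Ratio of eubacterial-like to archaebacterial-like proteins, ignoring ties.
ratio = (COUNTS["more similar to eubacterial homolog"]
         / COUNTS["more similar to archaebacterial homolog"])
print(f"eubacterial : archaebacterial = {ratio:.2f} : 1")
```

Whatever denominator is used, the ratio of eubacterial-like to archaebacterial-like proteins is between 5 and 6 to 1, which is the point at issue in the exam question.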

  4. Last time I checked there were a lot of conflicting results about what "branched" from what. Maybe I should update my readings in this regard.

    I think that the problem is with the idea of figuring out "branching" from some phenomena that happened so long ago, when so many organisms get so much horizontal gene transfer. A work by Gogarten, I think, showed that the task would become pretty messy if only 2% of genes were horizontally transferred. (I think that the branching model might be the wrong way of thinking about those early stages. I think that Woese said so too, back in the 90's.)

    Maybe we're too much in love with the branching models.

    Maybe I would get zero marks for this answer.

    1. That's a perfectly valid criticism of a universal tree of life and of the Three Domain Hypothesis. However, it's only (a small) part of the answer. You also have to explain why so many eukaryotic genes seem to be most closely (and consistently) related to certain branches of archaebacteria and certain branches of proteobacteria.

  5. Larry, it seems that you know the answer. That is very interesting, because I did not believe that present theories could explain this. Or the answer has to involve speculations that there has been a lot of horizontal transfer of genes. I do not believe that. Instead, I believe that the answer is contained in my new theory, the Organelle Escape Theory. It is a possibility that has not been investigated, but which solves a lot of problems with the endosymbiosis hypothesis.

    When Woese showed that eukaryotes and archaebacteria seem to be branching off from eubacteria, then what he actually showed was that archaebacteria and eukaryotes once had a similar translation apparatus and that eubacteria got their translation apparatus at a different stage of evolution, probably an earlier stage. It does not say anything about how and at which time the different eubacteria and archaebacteria got their proteins, or more exactly: got their most updated version of these proteins.

    According to my theory there were initially only eukaryotes. The evolution of the atmosphere from slightly reductive to very oxidative created new demands on the organisms, and these demands were tackled by creating organelles. When double membraneous organelles were first created, they contained cytoplasm, and they therefore contained the full machinery for translation, but no genes, just mRNA. They evolved by creating channels for transport of proteins etc. and they also created a genome by reverse transcribing and linking mRNA. Most organelles can regenerate from scratch, but some of these organelles reproduced so well by fission that they did not need this ability. Thereby they got a genetic start for their most basic features, such as ribosomes, that are not imported from the cell cytosol. These organelles could commute to the environments, and they could thereby extend the reach of their host by importing useful molecules and exporting waste. Some of them became so autonomous that they became the bacteria, i.e. they survived as autonomous organisms when their host became extinct.

    The first ones to be created, but maybe not the first ones to commute became the eubacteria, and those that were created at later stages became the archaebacteria. The first one of these may have been the "methanosome" organelle, that thrived together with the present state of evolution of the eubacterial organelles, that were at the hydrogenosome stage. Due to the oxidation, the latter had much longer and in many ways more advanced evolution than the newer archaebacterial organelles, and in the later stages of oxidation, when oxygen could be utilized, the need for commuting ceased. Thereby first mitochondria were created and later chloroplasts.

    All the proteins that were used in the organelles, and therefore all proteins used in bacteria were originally invented by the host organelle and came originally from the nucleus. Due to the longer evolution it is natural that eubacteria have gotten more proteins than the archaebacteria. I am writing about this in my own blog. I have just started. There is much more to come.

  6. Isn't that explained simply because Euk = A+B?

    1. I don't know what you mean by that. When you say that Euk = A+B, I would interpret that as saying that eukaryotes were created from archaebacteria and eubacteria. But what I say is almost the opposite. In my theory eukaryotes created all types of organelles and thereby also all types of bacteria, first eubacterial organelles like the hydrogenosome, later archaebacterial organelles like the one that I would call "methanosome", which cooperate with hydrogenosomes to create methane. Most of the organelles became commuting organelles, and many of these became free-living bacteria. Is that what you mean by Euk = A+B?

    2. No, I meant eukaryotes originating from the combination of an archaeon and a eubacterium.

  7. I am following this thread with some interest as I am in the process of constructing a worksheet at the AP Bio high school level (equivalent to freshman university) to pique my students’ interest and convince them that science is not dogmatic, but rather fraught with controversy. I am posting with ulterior motives: I am hoping that anyone present will correct any naïveté on my part and be able to offer helpful suggestions.

    By way of explanatory overture: I am dismayed with current Biology curricula in Canada as Prokaryotes are given very short shrift. I am also dismayed that everyone’s rush to vertebrates and flowering plants inadvertently reinforces much misconception!

    I spend quite some time in class attempting to correct these failures.

    Here is one exam question I pose to my grade 11 students on the understanding that reality contains far far more branches and the relationships are far far more complex. For example, primary vs secondary symbiosis is not addressed in this diagram and students are impressed that this is a first approximation and we have a lot more to learn later on in higher level university classes.

    So, as a first approximation, the Tree of Life must give way to the Circle of Life, given the two endosymbiotic events close the circle that originated with LUCA. In other words, the question at first becomes trivial.

    The AP Bio worksheet I am currently constructing continues to address Woese’s interpretation.

    Before doing so, I provide the following table:

    1. Here is a “snippet” from the rough draft of my proposed worksheet that owes a huge debt of gratitude to Jonathan Badger and Larry Moran.

      There are two ways to explain these data remembering that some scientists today are disagreeing with Carl Woese’s original explanation. According to Woese’s ribosomal RNA tree, eukaryotes and Archaea are sister groups that are distantly related to Eubacteria (see "a" below).

      Other scientists claim the data doesn't support such a simple interpretation. We now know that only one-third of the ancient genes in eukaryotes are more closely related to Archaea than to Eubacteria. Most Eucarya genes have closer homologues in Eubacteria. That's because eukaryotes arose from a fusion of a primitive archaebacterium and a primitive eubacterium—the Endosymbiotic Hypothesis.

      It’s fair to say that the dominant ancestor of eukaryotes, in terms of genetic contribution, is bacterial, not archaeal.

      The primitive eubacterium became mitochondria and transferred most of its genes to the archaebacterial genome, which became the nuclear genome. In the beginning, you couldn't tell which genome was going to become the biggest (see "b" above). (Chloroplast endosymbiosis occurred several times later on.)

      Can Woese’s version of events still be salvaged? Here is the problem: Phylogenomics provides a great diversity of tree topologies when looking at other genes and gene families. In other words, sceptics claim there is no unique Tree of Life (remember Horizontal Gene Transfer) and Woese’s Tree is one of many, in his case only specific to rRNA genes.

      However, rRNA may deserve privileged status. As it turns out, rRNA is indeed a good phylogenetic marker because it is not subject to HGT.

      Other scientists wonder out loud if we are getting caught up in semantics. Read the following argument:

      “English is a Germanic language not because of its Germanic vocabulary (only 26% of words in modern English are of Germanic origin), but because of its shared grammar. Likewise, eukaryotes share a common grammar with archaea in that their key molecular biological processes of transcription and translation are clearly related to the exclusion of the bacteria.”

      Does this argument support or contradict Woese’s original Hypothesis?

      I provide a space for students to catch their breath and answer the question, before proceeding. I expect students to understand that even if Eukaryotic ancestors acquired the bulk of their genome from Eubacteria, the Eubacterial genetic contribution needed to be converted into an Archaeal format (for lack of better words).

      Worksheet snippet, cont’d

      Is there really a difference between the two views? The major difference between these hypotheses centers on exactly when and how the first eukaryotes “evolved”. According to Woese and his supporters, the eukaryotes emerged relatively late, after the ancient archaeal and bacterial lineages had diversified and had already been around for a long time. Woese’s critics argue that the ancestors of the early eukaryotes are more ancient than Woese suggests. They suggest endosymbiosis event was important but maybe not as important as Lynn Margulis originally suggested. There you have it, as the controversy stands today.

      The bio-existential question of whether the host "stole" the genes from the symbiont, or whether the symbiont “co-opted” the host becomes one of relativity and viewpoint. So, maybe we are really quibbling about semantics?

    2. That was the first snippet. My proposed worksheet then delves deeper:

      Just recently, an exciting microbe was discovered that was called the “missing link” between Archaea and Eucarya. Read this link.

      Questions are provided:

      What is the name of the new organism discovered and where was it found?

      Why was its location significant (from an evolutionary perspective)?

      This organism has genes for many traits only found before in eukaryotes. List some of them.

      As mentioned above, a number of recent studies have indicated that eukaryotes are not actually a third separate branch. Does the discovery of this new organism support this theory or does the discovery support Woese’s original theory or does it support his critics?

    3. Next snippet:

      Without going into too much technical detail, scientists have suggested it may be possible to identify genes common to Eubacteria and Archaea that did not undergo Lateral Gene Transfer. If so, they may be able to deduce the identity of LUCA according to a unique phylogenetic tree. Read this Link.


      By definition, LUCA (check diagram on page 2) no longer exists. Explain.

      Read the following quote:

      “Dr. Sutherland and others have no quarrel with Luca’s being traced back to deep sea vents. But that does not mean life originated there, they say.”

      Explain why this is NOT a self-contradiction.

      What do I expect students to garner? It may be possible to rescue Woese’s interpretation after all.

      Final snippet – involves some thinking. The phylogeny tree question asks students to deduce the evolutionary relationships of rRNA gene sequences isolated from the nuclear genomes of humans, yeast, and corn; from an archaeon (Halobacterium), a proteobacterium (E. coli), and a cyanobacterium (Chlorobium); and from the mitochondrial and chloroplast genomes of corn.

      Any corrections or suggestions for improvement are gratefully appreciated.

  8. "They suggest endosymbiosis event was important but maybe not as important as Lynn Margulis originally suggested."
    As evident from the reference, there are a lot of different theories, and you may say that endosymbiosis is more or less important. I suppose what you mean is that the number of them differs. But with my theory, that I believe will make all of them obsolete, there is no endosymbiosis event involved at all. With many of the mentioned theories there is reduction of complexity. In my theory the only reduction is when the eukaryote creates organelles. I am in an early phase of presenting my theory, OET, but it will eventually be well explained in my blog at

    Interesting to see the exam questions that you will use. Your students are probably also glad to see the questions in advance.

    1. @ Jarle

      Hmmm... Thank you for your assistance. This worksheet is still a rough copy and I see I need to parse my words more carefully. I now see some other changes I also need to make.

      First of all - I forgot to include one link towards the end, where I am prompting students to make connections (albeit tentative and speculative) without my direct spoon-feeding

      BTW endosymbiosis is more or less important wrt HGT... is all

      Endosymbiosis is no longer a hypothesis but clearly scientific theory sharing the same status of rectitude as the Earth is Round and revolves about the Sun as a reference point (and not flat and the other way around)

      BTW - I do not use the same exam questions from year to year.

    2. @Tom

      "Endosymbiosis is no longer a hypothesis but clearly scientific theory"
      It has been treated as a truth, yes. And it has even been "proven" several times. I have discussed the ultimate proof, that criticized earlier proofs, and I have shown that they have disregarded an important possibility:

    3. @Tom

      You say: "endosymbiosis is more or less important wrt HGT". I would say it like this: HGT is needed due to the belief in endosymbiosis. With my OET theory there is no need for HGT.

  9. Mitochondrial genes are ours, and are eubacterial genes. And many mitochondrial genes have transferred to the nucleus. So I would expect many of our genes to be eubacterial genes. We would also get some due to horizontal gene transfer.

    1. "So I would expect many of our genes to be eubacterial genes. We would also get some due to horizontal gene transfer."

      If we don't, it won't be for lack of opportunity.

      "Some use “microbiome” to mean all the microbes in a community. We and others use it to mean the full collection of genes of all the microbes in a community. The human microbiome (all of our microbes’ genes) can be considered a counterpart to the human genome (all of our genes). The genes in our microbiome outnumber the genes in our genome by about 100 to 1."