7 The Origin and Radiation of Eucaryotes

Hervé Philippe

The inference of the universal Tree of Life has been a major

quest in biology since the publication of the theory of evolution

by Charles Darwin in 1859 (Darwin 1859). The first attempt

was made by Haeckel seven years later (Haeckel 1866).

Yet, although this early phylogeny still appears reasonable,

progress toward the resolution of the universal tree remained

elusive for decades. This was in part because of the lack of a rigorous

method (the famous “art” of taxonomy), a problem largely

resolved by the German entomologist Willi Hennig through

the development of the so-called cladistic method (Hennig

1966). The main difficulty, however, was the scarcity of morphological

characters (sensu lato, e.g., including ultrastructural or

biochemical). The best example of this difficulty is provided

by the study of prokaryotes. After many years of trials, Stanier

and Van Niel were forced to conclude that “any systematic

attempt to construct a detailed scheme of natural relationships

becomes the purest speculation . . . the ultimate scientific goal

of biological classification cannot be achieved in the case of

bacteria” (Van Niel 1955:5). Similar difficulties, albeit to a lesser

extent, were encountered for the phylogeny of unicellular

eucaryotes (protists; Taylor 1978).

The discovery that molecular data (protein, and later,

DNA sequences) contained information about the history of

the organisms harboring them has revolutionized the field

of phylogeny (Zuckerkandl and Pauling 1965). Until the

1980s, sequencing remained a limiting factor and reduced

the impact of molecular phylogeny. Only the study of ribosomal

RNA (rRNA), first through oligonucleotide catalogs

and then through sequencing, allowed the construction of

the universal Tree of Life (Woese 1987, Woese and Fox

1977). The main achievement was the proposal that prokaryotes

should be divided into two groups, called domains, the

Bacteria (Eubacteria) and the Archaea (Archaebacteria). A

short time later, following the suggestion of Schwartz and

Dayhoff (1978), two groups located the root of the universal

Tree of Life through the use of anciently duplicated genes

[i.e., elongation factors (Iwabe et al. 1989) and ATPases

(Gogarten et al. 1989)]. The root fell within the bacterial

branch, making Archaea and Eucarya sister groups, rendering

the prokaryotes paraphyletic. Quite surprisingly, the quest

for the universal Tree of Life, which had been very elusive

for more than a century, was considered generally solved

thanks to the molecular phylogenetic studies of the 1980s.

In 1990, a rooted universal tree was published (Woese et al.

1990), and since then it has generally been used as the reference

tree in textbooks and review papers.

The fact that scientists consider this question

largely solved is very peculiar. Indeed, microbiologists have shown

that the majority of biochemical, physiological, or morphological

characters each tell a different story about the relationships

among prokaryotes (Van Niel 1955). This is to be

expected for organisms that evolved over billions of years,

given it is also true for organisms that diversified much more

recently (e.g., mammals, birds, or angiosperms). The use of

molecular data clearly allowed systematists to increase the

number of informative characters, but not to avoid the inherent

difficulty of inferring ancient events. The first molecular

phylogenies, which are often quoted for showing the efficiency

of the method, contain serious and indisputable

errors. I will discuss only the most famous example: the

phylogeny of eucaryotes based on cytochrome c (Fitch and

Margoliash 1967). In this tree, primates emerge at the base

of the mammals, well before the marsupials, and snakes at

the base of amniotes, far from their generally accepted position

(diapsids, represented by turtle and birds). Thus,

despite the known theoretical and practical difficulties of

inferring the universal Tree of Life, a phylogeny based on very

few data (mainly 1000 positions for rRNA) was perceived as

an accurate estimate.

At least three major problems have recently challenged

this universal tree. First, the discovery of many uncultured

organisms through molecular ecology techniques has generated

many new phyla, especially in prokaryotes (see Pace,

ch. 5 in this vol.). Second, lateral gene transfer (LGT) between

distantly related organisms has been revealed as a much more

common phenomenon than previously thought (Koonin

et al. 2001). Even if one can demonstrate that tens of genes

share the same historical pattern within Bacteria (Brochier

et al. 2002) and Archaea (Matte-Tailliez et al. 2002), LGT

raises serious questions about our view of prokaryotic evolution

(see Doolittle, ch. 6 in this vol.). Third, the impact of

tree reconstruction artifacts is not negligible, and in this chapter

I focus on this problem. After a brief overview of the Tree

of Life based on rRNA (Woese et al. 1990), I discuss the most

frequent artifacts and provide a brief explanation of their

causes. Then, I will detail the case of the bacterial phylogeny

based on rRNA. This will allow pinpointing the sections of

the current universal Tree of Life that are likely incorrect.

After summarizing recent progress toward their resolution,

I present my personal view of the universal Tree of Life and

its implication for the origin of eucaryotes.

The rRNA Tree

The rRNA tree (fig. 7.1) is so well known that I will only

discuss a few points. The advantages of rRNA as a universal

marker are enormous (Woese 1987): (1) universality, (2)

large size (a few thousand nucleotides), (3) high degree of

conservation, and (4) extremely low probability of being affected

by LGT. These advantages were empirically confirmed

because clades well established through morphological analysis

(e.g., spirochaetes, cyanobacteria, animals, red algae, ciliates)

were recovered with rRNA. Moreover, rRNA phylogenies also

disclosed a number of assemblages that are not expected, based

on previous morphological analysis. For example, an ensemble

containing the morphologically very diverse ciliates,

dinoflagellates, and apicomplexans emerged (Gajadhar et al.

1991). Indeed, when looking for a derived morphological

character that may be shared by these three phyla, the only

one that emerged was the presence of submembranar vesicles,

closely apposed to the plasma membrane and known as alveoli

in ciliates. Some very curious eucaryotic organisms

were unambiguously located within well-known clades

[e.g., Pneumocystis within Fungi (Edman et al. 1988), Dientamoeba

within trichomonads (Silberman et al. 1996a),

Blastocystis within stramenopiles (Silberman et al. 1996)]. Let

me now discuss the phylogenetic pattern related to the early

evolution of eucaryotes.

The location of the root between Bacteria and a clade

containing Archaea and Eucarya, which is based on the analysis

of a few anciently duplicated genes (Brown and Doolittle

1997), has profound implications about the nature of the

“last universal common ancestor” (LUCA). The most parsimonious

interpretation is that LUCA was a prokaryote-like

organism, because a eucaryote-like LUCA implies two major

transitions from eucaryotes to prokaryotes, one to Bacteria,

the other to Archaea. It should nevertheless be noted that,

because of the RNA-world hypothesis, this possibility has

been envisioned (Poole et al. 1999). The RNA-world hypothesis

predicts a biota antecedent to our own that used an RNA-like

molecule for a variety of tasks today performed by RNA,

DNA, and proteins together (Yarus 2002). This hypothesis

is widely accepted as a probable stage in the early evolution

of life. Accordingly, proteins have gradually replaced RNA

as the main biological catalysts. Therefore, the numerous

RNA-based mechanisms of eucaryotes would be remnants

of the RNA world, suggesting that prokaryotes derived from

a eucaryotic-like organism (Poole et al. 1999). According to

the tree in figure 7.1, LUCA was a prokaryote-like organism

and had a circular chromosome with a single origin of replication,

and many genes organized with operons. Yet, contrary

to a frequent belief (e.g., Gupta and Singh 1994, Martin

and Müller 1998, Slesarev et al. 1998), nothing can be said

about the machinery of replication, transcription, and translation.

It is clear that this machinery is more similar between

Archaea and Eucarya. However, even with a root in the bacterial

branch, the ancestral state can be equally parsimoniously

similar to the bacterial one or to the eucaryotic one. In

both cases, a transition from one type to another is required.

Thus, the similarity between Archaea and Eucarya for the

informational genes cannot be considered as a synapomorphy

supporting the monophyly of this clade.

A second point is that, in the bacterial portion of the tree

(fig. 7.1), the first two lineages to emerge are the Aquificales

and the Thermotogales (Burggraf et al. 1992, Woese 1987).

Because these two phyla mainly contain hyperthermophilic

organisms (e.g., Aquifex and Thermotoga), and because most

of the basal lineages within Archaea are also hyperthermophilic,

the most parsimonious explanation is that LUCA was

a hyperthermophilic organism (Stetter 1996). This implies

that adaptation to life at low temperatures (below 60°C)

occurred many times independently. In particular, in classical

scenarios of eucaryotic origin, the archaeal lineage at the

origin of eucaryotic cells must have become mesophilic.

Moreover, the hyperthermophilic nature of LUCA led to the

hypothesis of a hyperthermophilic origin of life, most likely

in hydrothermal ecosystems (Nisbet and Sleep 2001, Pace

1991, Reysenbach and Shock 2002, Russell and Hall 1997,

Stetter 1996, Woese 1987). Although elongation of oligopeptides

(Imai et al. 1999) and synthesis of amino acids (Amend

and Shock 1998) are favored at high temperature, the degradation

of RNA at such temperature argues against a hot

origin of life if one accepts the RNA-world hypothesis (Levy

and Miller 1998, Moulton et al. 2000).

Finally, within eucaryotes, the first three lineages to

emerge (diplomonads, microsporidia, and trichomonads)

are all devoid of mitochondria (Sogin 1991). This seems to

strongly confirm the Archezoa hypothesis (Cavalier-Smith

1987) that these three lineages are primitively devoid of

mitochondria and that the mitochondrial endosymbiosis

from an α-proteobacterium occurred relatively late during

eucaryotic evolution, after the emergence of these three

groups. However, the discovery of genes of mitochondrial

origin (e.g., those encoding cpn60, HSP70, and Val-tRNA

synthetase) in all the amitochondriate organisms in which

they have been looked for (e.g., Entamoeba, Trichomonas,

Nosema, Encephalitozoon, Giardia, Neocallimastix) suggests a

secondary loss of mitochondria (for a review, see Embley and

Hirt 1998). In Entamoeba, trichomonads, and microsporidia,

several such genes have been found, and their products have

been shown to be located in a double-membrane-bounded organelle

(hydrogenosome and mitosome/crypton; Bui et al. 1996, Mai

et al. 1999, Tovar et al. 1999, Williams et al. 2002). Similarly,

the diplomonad Giardia intestinalis has specialized membranes

with electron transport and membrane-potential-generating

functions (Lloyd et al. 2002). This further indicates

that these organisms have lost their mitochondria. Yet,

at least one gene, Val-tRNA synthetase, which was first believed

to be of mitochondrial origin (Hashimoto et al. 1998),

has probably been acquired by LGT from γ-proteobacteria

(Gribaldo and Philippe 2002). This is not unexpected because

LGTs are frequent, especially for amitochondriate eucaryotes

(Andersson et al. 2003). Because only a few genes of

mitochondrial origin were found in the genome of a microsporidian

(Encephalitozoon cuniculi; Katinka et al. 2001) and of

a diplomonad (Giardia lamblia; McArthur et al. 2000), it is not

impossible that these genes have also been acquired by LGT

from other eucaryotes (Sogin 1997), and therefore it is not

possible on these grounds to completely reject the hypothesis

that at least some of the amitochondriate eucaryotes

never did harbor a mitochondrion.

Figure 7.1. Universal Tree of Life based on rRNA and rooted with anciently duplicated genes, modified from Stetter (1996). The thick branches with boldface names are likely misplaced by LBA artifact. [Tree not reproduced. Labeled taxa: Bacteria (Aquificales, Thermotogales, Green non-sulfur, Low G+C Gram-positives, High G+C Gram-positives, Proteobacteria, Cyanobacteria, Planctomycetales); Archaea (Sulfolobales, Thermoproteales, Desulfurococcales, Methanococcales, Archaeoglobales, Methanosarcinales, Halobacteriales); Eucarya (Diplomonads, Microsporidia, Trichomonads, Flagellates, Entamoeba, Slime molds, Ciliates, Green plants, Fungi, Animals).]

Tree Reconstruction Artifacts

The information that is used to infer molecular phylogeny

consists of the mutations that have been fixed in an ancestral

species, which are called substitutions. If, for a given

position, a substitution occurred only once over the phylogenetic

tree under study, then an unambiguous signal would

be provided: a partition of the species into the ones possessing

a given new character state (e.g., a change to A) and the

ones possessing the alternative primitive state (e.g., G) would

provide support for one node on the phylogeny. If many

characters of this type are available, they will define many

different compatible partitions that will allow inferring the

correct phylogenetic tree. Unfortunately, in real sequences,

such perfect characters with a single substitution are extremely

rare, and almost all base positions have undergone

many more than one substitution. If, for example, a base

position has undergone 25 substitutions across a tree connecting

50 species, the taxon partitions suggested by the

sharing of the various nucleotides will almost certainly be at

odds with the correct phylogeny. This base position, therefore,

has evolved too fast for the phylogeny under study and

will contribute more noise than signal (such a position is said

to be saturated).
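The difference between a position with a single substitution and a saturated one can be made concrete by counting the minimum number of changes a site requires on a given tree. The following is only a minimal sketch in Python, assuming Fitch-style parsimony counting on a rooted binary tree; the six-taxon tree and the two site patterns are hypothetical toy data, not taken from the analyses discussed in this chapter.

```python
def fitch_steps(tree, states):
    """Minimum number of substitutions needed for one site on a binary tree.

    tree   -- nested 2-tuples of leaf names, e.g. (("A", "B"), "C")
    states -- dict mapping each leaf name to its character state
    """
    def visit(node):
        if isinstance(node, str):              # leaf: its own state, no step
            return {states[node]}, 0
        left_set, left_steps = visit(node[0])
        right_set, right_steps = visit(node[1])
        shared = left_set & right_set
        if shared:                             # children can agree: no new step
            return shared, left_steps + right_steps
        # children disagree: one substitution is required at this node
        return left_set | right_set, left_steps + right_steps + 1

    return visit(tree)[1]


# Hypothetical tree and site patterns (assumptions for illustration only).
toy_tree = ((("t1", "t2"), "t3"), (("t4", "t5"), "t6"))
clean_site = {"t1": "A", "t2": "A", "t3": "A", "t4": "G", "t5": "G", "t6": "G"}
noisy_site = {"t1": "A", "t2": "G", "t3": "A", "t4": "G", "t5": "A", "t6": "G"}

clean_steps = fitch_steps(toy_tree, clean_site)
noisy_steps = fitch_steps(toy_tree, noisy_site)
```

In this toy case the first pattern needs one change and splits the taxa exactly along an internal branch of the tree, whereas the second pattern needs several changes and its shared states cut across the tree.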

In practice, an alignment of homologous sequences contains

a mixture of slow- and fast-evolving positions (the situation

is indeed more complicated because of heterotachy; see

below). If there were no bias, fast-evolving positions would

contribute random noise that would not favor any specific

phylogeny, and the correct phylogeny would be inferred primarily

on the basis of the slow-evolving positions. Unfortunately,

several biases exist that can confound phylogenetic

inference. The easiest biases to understand are those of nucleotide

or amino acid composition. Assume that two lineages

increased the G+C (guanosine + cytosine) content of their

sequences independently. In that case, the noise contributed

by fast-evolving positions will not be random but will favor

the grouping of two G+C-rich lineages (Hasegawa and

Hashimoto 1993, Lockhart et al. 1992). Another very important

bias is the existence of unequal evolutionary rate among

lineages. In the case of four species in which two are slowly

evolving and two are fast evolving, the noise will favor the

grouping of the two slowly evolving lineages because they

share many ancestral characters. As a result, the two fast-evolving

species will be grouped together, a phenomenon

called the long-branch attraction (LBA) artifact (Felsenstein

1978).
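The four-species case just described can be checked numerically. The sketch below is an illustration under stated assumptions, not a reproduction of any published analysis: it assumes a two-state symmetric model, a true tree ((A,B),(C,D)) in which A and C sit on long branches, and hypothetical per-branch change probabilities. It computes the expected frequency of the three parsimony-informative site patterns.

```python
from itertools import product


def pattern_probability(pattern, p_long, q_short):
    """Probability of one two-state site pattern (A, B, C, D).

    True tree: ((A,B),(C,D)). A and C have long terminal branches (change
    probability p_long); B, D and the internal branch are short (q_short).
    """
    change = {"A": p_long, "B": q_short, "C": p_long, "D": q_short}
    a, b, c, d = pattern
    total = 0.0
    # Sum over the unobserved states of the two internal nodes u and v.
    for u, v in product((0, 1), repeat=2):
        prob = 0.5                                   # uniform state at node u
        prob *= q_short if u != v else 1 - q_short   # internal branch u-v
        for leaf, parent, name in ((a, u, "A"), (b, u, "B"),
                                   (c, v, "C"), (d, v, "D")):
            prob *= change[name] if leaf != parent else 1 - change[name]
        total += prob
    return total


def split_support(p_long, q_short):
    """Expected frequency of sites supporting each of the three splits."""
    patterns = {"AB|CD": (0, 0, 1, 1), "AC|BD": (0, 1, 0, 1), "AD|BC": (0, 1, 1, 0)}
    # Each split is produced by a pattern and its complement, which are
    # equally likely under a symmetric model, hence the factor 2.
    return {name: 2 * pattern_probability(pat, p_long, q_short)
            for name, pat in patterns.items()}


unequal_rates = split_support(p_long=0.4, q_short=0.05)   # hypothetical values
equal_rates = split_support(p_long=0.05, q_short=0.05)
```

With these assumed values, sites supporting the incorrect split AC|BD (the two long branches together) are expected to outnumber those supporting the true split AB|CD, so adding more data only reinforces the wrong tree under parsimony. When all branches are short, the true split is favored.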

These problems have been known since the beginning of molecular

phylogeny, and many attempts have been made to

develop methods of inference less sensitive to nonrandom

bias (for a review, see Swofford et al. 1996). To deal with the

noise created by fast-evolving positions, it is necessary to have

a model of sequence evolution as realistic as possible in order

to infer the existence of multiple substitutions. Starting

from the very simple model of Jukes and Cantor (1969),

researchers have developed very complex models such as the

general time-reversible model (Waddell and Steel 1997) or

the Γ model that deals with among-site rate variation (Yang

1996). Other models that are not reversible have been implemented,

particularly to avoid the bias due to nucleotide composition

(Galtier and Gouy 1998). Nevertheless, even the

most complex model is far from biological reality. One of the

most important phenomena that is just beginning to be considered

(Galtier 2001, Huelsenbeck 2002, Penny et al. 2001)

is heterotachy, the variation of evolutionary rate of a given

position over time (i.e., fast in one part of the tree and slow

in another one). Many studies have shown that this phenomenon

is quite common (Galtier 2001, Huelsenbeck

2002, Lockhart et al. 2000, Lopez et al. 1999, Miyamoto

and Fitch 1995, Penny et al. 2001); for example, up to 95%

of the variable positions of cytochrome b are heterotachous

for a sample of ~2000 vertebrate sequences (Lopez et al.

2002). Heterotachy can increase the impact of LBA artifacts

when two fast-evolving lineages display a higher number

of variable positions (Germot and Philippe 1999). In fact,

when a distant outgroup is used, the fast-evolving species

and the outgroup have long branches that often attract each

other. This leads to a very simple principle: early-emerging

lineages are often fast-evolving ones misplaced by the LBA

artifact. On the universal tree based on rRNA, all the basal

branches (indicated in bold in fig. 7.1) are thus potentially

erroneous.
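The role of a model in inferring multiple substitutions, mentioned above, can be illustrated with the two simplest distance corrections. This is a hedged sketch only: it uses the standard Jukes-Cantor formula and its gamma-corrected variant, with the shape parameter and the observed proportion of differences chosen arbitrarily for illustration. It does not handle heterotachy, which none of these models addresses.

```python
import math


def jc_distance(p):
    """Jukes-Cantor estimate of substitutions per site from the observed
    proportion p of differing sites (equal rates at all positions)."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to exist")
    return -0.75 * math.log(1 - 4 * p / 3)


def jc_gamma_distance(p, alpha):
    """Jukes-Cantor distance when site rates follow a gamma distribution
    with shape parameter alpha (small alpha = strong rate heterogeneity)."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to exist")
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return 0.75 * alpha * ((1 - 4 * p / 3) ** (-1 / alpha) - 1)


observed = 0.30                                   # hypothetical 30% differences
equal_rate_estimate = jc_distance(observed)
gamma_estimate = jc_gamma_distance(observed, alpha=0.4)
```

Both corrected distances exceed the raw proportion of differences, and the gamma-corrected one is larger still: when a few positions absorb most of the changes, many more hidden substitutions must be inferred for the same observed divergence.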

The Case of the Bacterial Phylogeny Based on rRNA

The first two lineages to emerge in eubacterial phylogeny

(Aquificales and Thermotogales) display rather short

branches and for this reason are generally assumed to not be

misplaced because of LBA (Burggraf et al. 1992, Stetter 1996).

We recently reanalyzed the rRNA-based phylogeny of Bacteria

using a large data set, 95 species and 1147 positions

(Brochier and Philippe 2002). If one examines the distribution

of the number of substitutions per site (solid bars in

fig. 7.2), it appears that most of the changes are contributed

by fast-evolving positions. More precisely, there are many

slowly evolving positions (e.g., 373 without changes, 154

with a single substitution) and relatively few fast-evolving

positions (e.g., only 154 positions with more than 16 substitutions).

This distribution of the observed substitutions

is expected when the substitution rate is distributed according

to a Γ law with a low α parameter (0.4 here). However,

the point that is rarely discussed is the relative contributions

of the slowly and fast-evolving positions to tree selection.

Within a parsimony framework, the criterion to select the

best phylogeny is the minimum total number of steps. Yet,

as shown by the shaded bars in figure 7.2, the importance

of slow- and fast-evolving sites is completely the reverse of

the distribution of these sites. In fact, the slowly evolving sites

(fewer than five changes) contribute very few of the total

number of changes (~900 steps), whereas the fast-evolving

ones are the major contributors (~3800 steps). As a result,

the fast-evolving sites are the most influential in the selection

of the tree topology, whereas the slowly evolving ones contain

the most reliable signal.
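The contrast between the number of positions in each rate class and the number of steps they contribute can be tabulated directly from per-position change counts. The sketch below uses hypothetical toy counts (not the values of figure 7.2) and a threshold of five changes, as in the text, simply to show how a minority of fast positions can dominate the total tree length.

```python
def step_contributions(changes_per_site, threshold=5):
    """Split positions into slow (< threshold changes) and fast classes and
    report how many positions and how many steps each class contributes."""
    slow = [c for c in changes_per_site if c < threshold]
    fast = [c for c in changes_per_site if c >= threshold]
    return {
        "slow_sites": len(slow),
        "fast_sites": len(fast),
        "slow_steps": sum(slow),
        "fast_steps": sum(fast),
    }


# Hypothetical toy data: many invariant or slow positions, a few fast ones.
toy_counts = [0] * 40 + [1] * 20 + [3] * 10 + [12] * 8 + [30] * 4
summary = step_contributions(toy_counts)
```

In this toy example the slow positions are far more numerous, yet the fast positions account for most of the steps that a parsimony criterion would try to minimize.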

To investigate this fundamental issue of molecular phylogeny,

we used the Slow-Fast (SF) method (Brinkmann and

Philippe 1999), which evaluates the evolutionary rate of

positions in terms of the sum of the number of steps in predefined

monophyletic groups (here, the bacterial phyla) and

thus allows study of the phylogenetic relationships among

these groups. Interestingly, the first bacteria to emerge in the

tree based on the most reliable positions (fewer than five substitutions)

are, with a reasonable statistical support, Planctomycetes

(Brochier and Philippe 2002). This phylum is a major

division of Bacteria, whose members share several original features

such as the lack of peptidoglycan in their cell walls or a

budding mode of reproduction (Fuerst 1995). The most intriguing

feature is the existence of a single or double membrane

around the bacterial chromosome in Gemmata and Pirellula

species, which has been compared with the eucaryotic nucleus

(Fuerst 1995). Yet, evolutionary homology with the eucaryotic

nucleus has not been proved. Despite these unique characteristics,

this group remains little studied, although it was

recently implicated in anaerobic ammonia oxidation (Strous et al.

1999). If the early emergence of Planctomycetales were confirmed

by genomic data (Jenkins et al. 2002), the early emergence

of the most “eucaryote-like” bacteria at the base of the

tree would challenge the current view on the nature of LUCA.

In contrast, the hyperthermophilic bacteria robustly emerged

late in the tree based on slowly evolving positions (Brochier

and Philippe 2002). This is in agreement with the growing

evidence that they secondarily adapted to high temperature

(Aravind et al. 1998, Forterre et al. 2000, Galtier et al. 1999,

Nelson et al. 1999), which seriously weakened the hypothesis

that LUCA was a hyperthermophile. Finally, in this tree,

hyperthermophilic bacteria show a very high evolutionary

rate, which was masked in standard analysis by the fast-evolving

positions (Bromham et al. 2000, Philippe and Laurent

1998, Philippe et al. 1994). Therefore, contrary to recent

claims (Dawson and Pace 2002), apparently slowly evolving

lineages (e.g., Aquificales and Thermotogales; fig. 7.1) can

be misplaced by the LBA artifact.
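The idea behind selecting positions by their within-group variability can be sketched as follows. This is not the published SF implementation: as a loudly stated simplification, the number of parsimony steps within each predefined group is replaced here by its lower bound (the number of distinct states minus one), which requires no within-group tree. The group names and sequences are hypothetical.

```python
def within_group_changes(alignment_by_group):
    """For each alignment position, sum over groups a lower bound on the
    number of changes inside the group (distinct states minus one)."""
    groups = list(alignment_by_group.values())
    length = len(groups[0][0])
    scores = []
    for pos in range(length):
        score = 0
        for sequences in groups:
            score += len({seq[pos] for seq in sequences}) - 1
        scores.append(score)
    return scores


def slow_positions(alignment_by_group, max_changes):
    """Indices of positions whose summed within-group score is at most
    max_changes; these would be kept to study relationships among groups."""
    scores = within_group_changes(alignment_by_group)
    return [i for i, s in enumerate(scores) if s <= max_changes]


# Hypothetical toy alignment, grouped by predefined monophyletic group.
toy_alignment = {
    "group1": ["ACGT", "ACGA", "ACTC"],
    "group2": ["ACGT", "ATGG", "ACGC"],
}
scores = within_group_changes(toy_alignment)
kept = slow_positions(toy_alignment, max_changes=1)
```

Because the rate of each position is evaluated only inside the groups, the selection does not depend on any assumption about how the groups are related to one another, which is the relationship under study.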

Recent Advances into the Eucaryotic Phylogeny

The impact of LBA artifact is not limited to the bacterial

phylogeny but applies to all the branches indicated in bold

in figure 7.1 (Brinkmann and Philippe 1999, Philippe et al.

2000b). This is especially dramatic in the case of eucaryotes,

for which more than 10 early-branching lineages could be

artificially located (Philippe and Adoutte 1998). Indeed, the

eucaryotic tree was previously divided into two parts: (1) the

so-called crown, in which the branching order between phyla

was very poorly resolved, which is interpreted as the result

of an adaptive radiation (Knoll 1992); and (2) the base, which

contains “primitive” eucaryotes, especially the amitochondriate

ones. We have proposed that all the lineages of the classical

base are very likely misplaced and in fact belong to the

crown, what we called the “big bang” hypothesis (Philippe

and Adoutte 1998).

As recently reviewed (Philippe et al. 2000a), many lines

of evidence are in agreement with the hypothesis that the

eucaryotes branching early in the rRNA tree are misplaced because

of LBA. First, the evolutionary rates of different eucaryotic

phyla have been estimated for several genes, and it has

been shown that the faster a phylum evolves, the earlier it

emerges (e.g., euglenozoans for rRNA and ciliates for actin;

Moreira et al. 2002, Philippe and Adoutte 1998). Second, the

addition of new sequences in phylogenetic analyses, which

is known to reduce the impact of the LBA artifact (Hendy

and Penny 1989), results in an upward movement of the

early-branching species in the tree (Moreira et al. 1999).

Third, the use of more realistic models of sequence evolution,

also known to attenuate the impact of LBA (Huelsenbeck

1998), leads, in rRNA trees, to a later emergence of

euglenozoans (Peyretaillade et al. 1998, Tourasse and Gouy

1998), microsporidia (Peyretaillade et al. 1998, Van de Peer

et al. 2000), Physarum (Peyretaillade et al. 1998), and trichomonads

and heteroloboseans (Silberman et al. 1999). In

fact, the most recent analyses that used a Γ law to model the

rate heterogeneity among sequence sites showed that the

“classical” tree cannot be statistically differentiated from

the ones that locate all the lineages within the crown (Philippe

and Germot 2000, Simpson et al. 2002). Fourth, several

characteristics [highly heterogeneous rRNA length, large

number of unique substitutions, attraction by artificial random

sequences, and high Relative Apparent Synapomorphy

Analysis (RASA) taxon variance] suggested that the basal lineages

of the rRNA tree are fast evolving (Stiller and Hall

1999). Fifth, if a basal emergence in the rRNA tree is correct,

one expects that the slowly evolving positions, which

contain most of the ancient phylogenetic information, will

provide strong support for the basal branching. Yet, as for

Bacteria, when using the SF method, the basal taxa in the

standard rRNA tree do not emerge early when only slow-evolving

positions are used, but display very long branches

(Philippe et al. 2000b). Sixth, phylogenies based on protein

sequences generally suggest a late emergence for the taxa

emerging early in the rRNA tree. A clear example is provided

by microsporidia, which are located very close to the base of

eucaryotes in the rRNA tree (fig. 7.1) but are in fact highly derived

fungi (for review, see Keeling and Fast 2002).

Figure 7.2. Distribution of the number of substitutions per position for the rRNA of 95 prokaryotic species (solid bars). The number of substitutions brought by each class of positions is indicated by the shaded bars. [Histogram not reproduced. Horizontal axis: #changes per position, in classes 0, 1, 2, 3, 4, 5, [6,10], [11,15], [16,20], [21,25], [26,50]; vertical axes: #positions (0 to 400) and #changes (0 to 1600).]

The phylogenetic relationships within the crown of the

eucaryotic rRNA tree are known to be quite difficult to resolve,

possibly because of a rapid diversification (Knoll 1992,

Sogin 1991). Indeed, eucaryotic rRNA phylogenies inferred

with a comprehensive taxonomic sampling and a Γ law model

are very poorly resolved, the bootstrap values for the nodes

connecting the major phyla being almost all below 50%

(Brugerolle et al. 2002, Cavalier-Smith 2002, Simpson et al.

2002). Because many more lineages than first acknowledged

(the artifactually early-branching phyla and the newly discovered,

uncultured groups; Dawson and Pace 2002, Lopez-Garcia

et al. 2001) belong to the already poorly resolved

crown, the complete resolution of the eucaryotic phylogeny

constitutes a great challenge.

Two quite different approaches can be used, which we

have called statistician and Hennigian (Philippe and Laurent

1998). The statistician approach consists in the analysis of

very large data sets, with tree reconstruction methods as refined

as possible. The underlying idea is that the resolving

power will increase and that the biases brought by different

genes will be different and thus will be minimized. The

Hennigian approach consists in the use of very slowly evolving

characters, such as insertion/deletion or gene fusion

events [also called rare genomic events (Rokas and Holland

2000)]. The assumption is that these characters are less homoplastic,

and therefore the most simple tree reconstruction

method (i.e., maximum parsimony) will provide a good estimate

of the true phylogeny. These two approaches have

been applied to the case of eucaryotes, with varying degrees

of success.

In the statistician approach, because of the limited

amount of available sequences, one has to choose between

many genes/few species (13/12; Moreira et al. 2000) and few

genes/many species (4/60; Baldauf et al. 2000). As expected

(Graybeal 1998, Lecointre et al. 1993, 1994), the first approach

provided a fully resolved tree (Moreira et al. 2000)

but is very sensitive to LBA, whereas the second is not severely

affected by LBA but is very poorly resolved. For example,

the Euglenozoa and the Apicomplexa emerge strongly

but artificially at the base when few species are used (Moreira

et al. 2000). On the contrary, they belong to a large group of

protists (including also stramenopiles and heteroloboseans)

when many species are used (Baldauf et al. 2000), but with

a weak support (bootstrap value around 50%). In contrast,

red algae and green plants strongly group together in the

clade Plantae in the first analysis but very weakly (bootstrap

value below 50%) in the second one. The monophyly of

Plantae found with nuclear genes strongly supports the hypothesis

of a unique primary endosymbiosis of a cyanobacterium

at the origin of the chloroplast, as already suggested by

plastid and mitochondrial data (Palmer 2000).

We recently tried to make a compromise between these

two extremes in order to increase simultaneously both accuracy

and resolving power (Bapteste et al. 2002). We used

123 genes for 30 species, representing about 25,000 unambiguously

aligned positions. The corresponding phylogeny

is shown in figure 7.3. Not surprisingly, the results are in

between the previous ones (with 12 and 60 species, respectively;

Baldauf et al. 2000, Moreira et al. 2000), which is illustrated

by three examples. (1) One fast-evolving species, a

parasitic amitochondriate amoeba Entamoeba histolytica, is

strongly grouped with a free-living amitochondriate amoeba

(Mastigamoeba), this clade being a sister group of Mycetozoa,

represented here by Dictyostelium. The monophyly of

this large clade of amoeboid organisms contrasts with their

pronounced polyphyly on classical rRNA trees (Sogin 1991).

The statistician approach has provided convincing evidence

for a difficult phylogenetic question. (2) The early emergence

of diplomonads and Euglenozoa (fig. 7.3) is very likely due

to LBA. In fact, when we added microsporidia to our data

set, we found very strong support for their early emergence

(H. Brinkmann, M. van der Giezen, T. M. Embley, and H.

Philippe, unpubl. obs.). However, the evidence for considering

microsporidia as derived fungi is very strong (Keeling

and Fast 2002), but many of the genes used evolved very fast

in this group, thus generating LBA. The number of species

used in our study is thus insufficient to eliminate LBA, all

the more so because a very distant outgroup (Archaea) is

used. It is likely that the use of genes of mitochondrial origin,

with a very close α-proteobacterial outgroup, will be a

good way to avoid this problem (Philippe 2000). (3) Several

nodes (e.g., the grouping of stramenopiles and alveolates) are

weakly supported. This indicates that the number of genes

used is still insufficient, and/or, as proposed by the big bang

hypothesis, the time between speciation events is too short

to discriminate branching orders. In summary, the statistician

approach has allowed, and will allow, progress in the

resolution of the phylogeny of eucaryotes. However, because

it is very sensitive to the inconsistency of the methods, it is

of prime importance to improve the tree reconstruction

methods, especially by taking into account heterotachy

(Galtier 2001, Huelsenbeck 2002, Penny et al. 2001).

In the Hennigian approach, very few characters useful for

resolving the phylogeny of eucaryotes have been discovered.

First, a few insertion/deletions have been proposed. In particular,

an insertion of about 12 amino acids in the elongation

factor EF-1α is shared only by animals and fungi (Baldauf

and Palmer 1993), and also by microsporidia (Van de Peer

et al. 2000), suggesting the monophyly of this clade, called

Opisthokonta. However, the same insertion is also present

in some green algae but not in land plants (H. Philippe,

unpubl. obs.). Similarly, two small indels of one amino acid

in enolase are shared by trichomonads and prokaryotes,

suggesting that trichomonads constitute the first lineage to

emerge within eucaryotes (Keeling and Palmer 2000). However,

the same indels are also present in several independent

lineages (e.g., in several members of Archaea and in a few of

Bacteria; Bapteste and Philippe 2002, Hannaert et al. 2000),

casting doubts on the use of this character as a phylogenetic

marker. In fact we have found in enolase, IMPDH, and Val

tRNA synthetase several large indels that contradict each

other and also the phylogeny inferred from the very same

gene containing the indel (Bapteste and Philippe 2002,

Gribaldo and Philippe 2002). This indicates that indels are

not always very good characters, because they are prone to

convergence and are very sensitive to LGT (with or

without recombination; see Bapteste and Philippe 2002). It

is thus very hazardous to base phylogenetic inference on a

single indel. Finally, an insertion in a very highly conserved

gene (ubiquitin) for which a comprehensive taxonomic sampling

is available provides convincing evidence for the sister-group

relationship of Cercozoa and Foraminifera (Archibald

et al. 2003).
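
The conflict between indels described above can be made concrete with a pairwise compatibility check. The sketch below is a minimal illustration and not the analysis of Bapteste and Philippe (2002): it treats each indel as a binary character (1 = present, 0 = absent) and applies the standard four-state test, under which two binary characters can both be explained on a single tree without convergence or reversal only if at most three of the four possible state combinations are observed. The taxon names, the character states, and the function names are hypothetical.

```python
from itertools import combinations


def compatible(char_a, char_b):
    """Return True if two binary characters fit one tree without homoplasy.

    Each character maps taxon -> 0 or 1. Taxa missing from either character
    are ignored. The characters conflict only when all four state pairs
    (0,0), (0,1), (1,0), (1,1) are observed among the shared taxa.
    """
    shared = set(char_a) & set(char_b)
    observed = {(char_a[t], char_b[t]) for t in shared}
    return len(observed) < 4


def conflicting_pairs(characters):
    """List the names of every pair of mutually incompatible characters."""
    return [
        (name_a, name_b)
        for (name_a, a), (name_b, b) in combinations(characters.items(), 2)
        if not compatible(a, b)
    ]


# Hypothetical presence/absence data for three indels in five taxa.
indels = {
    "indel_1": {"T1": 1, "T2": 1, "T3": 0, "T4": 0, "T5": 0},
    "indel_2": {"T1": 1, "T2": 0, "T3": 1, "T4": 0, "T5": 0},
    "indel_3": {"T1": 1, "T2": 1, "T3": 1, "T4": 0, "T5": 0},
}

print(conflicting_pairs(indels))  # [('indel_1', 'indel_2')]
```

In this toy data set, indel_1 and indel_2 cannot both be free of convergence or loss, whereas indel_3 is compatible with each of them. A check of this kind only detects conflict; it does not say which of the two conflicting indels is the misleading one.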

Other rare genomic events are more promising. The first

case is the nonhomologous replacement of the mitochondrial

RNA polymerase by the T3/T7-like one. In all the mitochondriate

eucaryotes, except the jakobids (e.g., Reclinomonas

americana), the original bacterial polymerase encoded in the

mitochondria has been replaced by T3/T7 polymerase

(Cermakian et al. 1996, Lang et al. 1997). This replacement

suggests that jakobids are the first eucaryotic lineage to

emerge. However, in the plastid of land plants, the bacterial

and the T3/T7-like RNA polymerases are known to have

coexisted for several hundred million years (Gray and

Lang 1998), and the bacterial form has been lost in one parasitic

nonphotosynthetic plant (Wolfe et al. 1992). It is therefore

quite possible that differential lineage sorting has affected

the RNA polymerase of mitochondria. Nevertheless, jakobids

are good candidates for being the first emerging eucaryotes.

A second case of a rare genomic event is the fusion of the

dihydrofolate reductase and thymidylate synthase genes.

These two genes are separated in all the bacteria and all the

opisthokonts, but are fused, when present, in the other eucaryotes

(Philippe et al. 2000b, Stechmann and Cavalier-Smith

2002). This is a strong argument for locating the root of

the eucaryotic tree between opisthokonts and all the other

eucaryotes. Yet, it should be noted that these genes have been

lost in several lineages (e.g., Entamoeba and Giardia) and replaced

by nonhomologous genes in some others (e.g.,

Dictyostelium; Dynes and Firtel 1989). This gene fusion suggests

that opisthokonts are also very good candidates for being

the first emerging eucaryotes. In summary, the use of rare

genomic events has provided some interesting hypotheses

for rooting the eucaryotic tree. If such a root is reliably inferred,

it will be possible to construct eucaryotic phylogenies

without the need for non-eucaryotic outgroups, thus seriously

reducing the importance of LBA.
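
The logic of using the gene fusion as a rooting character can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not a published procedure: the state assignments are a simplified reading of the text (separate genes in bacteria and opisthokonts, fused genes in other eucaryotes, and no data where the genes were lost or replaced), the choice of representative lineages is ours, and the function name `root_split` is hypothetical.

```python
def root_split(states, outgroup):
    """Polarize a binary character with an outgroup and partition the ingroup.

    states   -- dict mapping taxon -> "separate", "fused", or None (no data:
                the genes were lost or replaced in that lineage)
    outgroup -- taxon whose state is assumed to be ancestral

    Returns three lists of ingroup taxa: those with the derived state,
    those with the ancestral state, and those that are uninformative.
    """
    ancestral_state = states[outgroup]
    derived, ancestral, uninformative = [], [], []
    for taxon, state in states.items():
        if taxon == outgroup:
            continue
        if state is None:
            uninformative.append(taxon)
        elif state == ancestral_state:
            ancestral.append(taxon)
        else:
            derived.append(taxon)
    return derived, ancestral, uninformative


# Simplified, illustrative coding of the DHFR-TS character (an assumption).
dhfr_ts = {
    "Bacteria": "separate",
    "Animals": "separate",
    "Fungi": "separate",
    "Green plants": "fused",
    "Alveolates": "fused",
    "Euglenozoa": "fused",
    "Entamoeba": None,      # genes lost
    "Giardia": None,        # genes lost
    "Dictyostelium": None,  # replaced by nonhomologous genes
}

derived, ancestral, uninformative = root_split(dhfr_ts, outgroup="Bacteria")
print("Derived (fused):", derived)
print("Ancestral (separate):", ancestral)
print("Uninformative:", uninformative)
```

If the fusion arose once and was never reversed, the taxa in the first list form a clade and the root cannot lie inside it. The taxa in the third list cannot be placed by this character, which is the caveat raised above for Entamoeba, Giardia, and Dictyostelium.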

As expected from the results based on rRNA, the eucaryotic

phylogeny has turned out to be a very difficult question.

The very large amount of new molecular data has recently

allowed resolving several nodes (fig. 7.4). The resolution

will continue to be improved thanks to the sequencing of

complete genomes and of a large sample of cDNAs

(http://megasun.bch.umontreal.ca/pepdb/pep_main.html) for

many protists.

Figure 7.3. Phylogenetic tree based on 123 genes, redrawn from Bapteste et al. (2002). The tree was inferred by a separate maximum likelihood analysis, taking into account among-sites rate variation (JTT + Γ model). To reduce computational time, several nodes, which were recovered through preliminary analyses, were constrained (indicated by asterisks). The bootstrap values were obtained by bootstrapping the 123 genes, a modification of the RELL method (Kishino et al. 1990). [Tree not reproduced. Scale bar: 0.1. Bootstrap values printed on the tree: 62, 98, 97, 93, 68, 96, 100. Taxon labels: Archaea, nucleomorph of Guillardia theta, Diplomonads, Trypanosoma, Leishmania, Stramenopiles, Ciliates, Sarcocystidae, Plasmodium falciparum, Red algae, Green algae, Arabidopsis thaliana, Monocots, Basidiomycetes, Schizosaccharomyces pombe, Neurospora crassa, Candida albicans, Saccharomyces cerevisiae, Mammals, Caenorhabditis elegans, Drosophila melanogaster, Dictyostelium discoideum, Entamoeba histolytica, Mastigamoeba balamuthi. Clade labels: Conosa, Animals, Fungi, Alveolates, Kinetoplastids, Plants.]
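
The gene-level bootstrap mentioned in the caption of figure 7.3 can be sketched as follows. This is a minimal illustration of the idea of resampling genes rather than sites, in the spirit of the RELL method (Kishino et al. 1990); it is not the authors' implementation. It assumes that the log-likelihood of each candidate topology has already been computed separately for each gene, and the numbers and names used here are invented for the example.

```python
import random


def gene_bootstrap_support(loglik, n_replicates=1000, seed=0):
    """Estimate topology support by resampling genes with replacement.

    loglik -- dict mapping topology name -> list of per-gene log-likelihoods,
              with genes in the same order for every topology
    Returns a dict mapping topology name -> fraction of replicates in which
    that topology has the highest summed log-likelihood.
    """
    topologies = list(loglik)
    n_genes = len(loglik[topologies[0]])
    if any(len(values) != n_genes for values in loglik.values()):
        raise ValueError("every topology needs one log-likelihood per gene")

    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    wins = dict.fromkeys(topologies, 0)
    for _ in range(n_replicates):
        # Draw n_genes gene indices with replacement: one bootstrap replicate.
        sample = [rng.randrange(n_genes) for _ in range(n_genes)]
        # No tree search is repeated; stored likelihoods are simply re-summed.
        best = max(topologies, key=lambda t: sum(loglik[t][g] for g in sample))
        wins[best] += 1
    return {t: wins[t] / n_replicates for t in topologies}


# Invented per-gene log-likelihoods for three topologies and six genes.
example_loglik = {
    "topology_1": [-1200.4, -980.1, -1510.7, -760.2, -1105.9, -890.3],
    "topology_2": [-1201.9, -979.5, -1512.0, -761.8, -1104.7, -891.0],
    "topology_3": [-1203.3, -981.6, -1511.2, -760.9, -1107.4, -892.6],
}

print(gene_bootstrap_support(example_loglik))
```

Because whole genes are the resampling unit, a node supported by only a few strongly deviating genes (for example, genes affected by LGT or by an unusually high evolutionary rate) receives lower support than it would if sites were resampled.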

A Personal Point of View on the Universal

Tree of Life

In conclusion, several basal branches of the universal Tree of

Life based on rRNA (indicated in bold in fig. 7.1), which may

be misplaced because of LBA artifact, have been relocated

higher in the tree (e.g., hyperthermophilic bacteria and

microsporidia). For some others (e.g., diplomonads and the

root of the Tree of Life), it appeared that their high evolutionary

rates for numerous genes prevented their reliable placement,

because current tree reconstruction methods are still

sensitive to LBA. The support in favor of their early emergence

has thus been weakened. Nevertheless, the global picture provided

by rRNA remains correct, and one can still consider

rRNA as one of the best phylogenetic markers, despite some

weaknesses. The progress made in fixing the potential errors highlighted

in figure 7.1 is summarized in figure 7.4. It should

be noted that several nodes have little support

(e.g., a single gene) and reflect my working hypothesis rather

than a robust and widely accepted consensus.

I would like to emphasize two general issues that are especially

relevant to the origin and evolution of eucaryotes. The

first is that we are strongly influenced by the Aristotelian view

that simple organisms are primitive organisms (the famous

scala naturae). It is for this reason that we easily believe that

prokaryotes precede eucaryotes and that amitochondriate

eucaryotes predate the mitochondrial endosymbiosis. Yet, the

study of eucaryotic phylogeny (Embley and Hirt 1998, Philippe

et al. 2000a) has shown that simplification is a major evolutionary

trend. As brilliantly argued more than 50 years ago

(Lwoff 1943), we have a major psychological reluctance to

accept the importance of simplification, because we associate

evolution, progress, and complexity (Gould 1996). The second

is that molecular phylogeneticists, because of the constraint

of having to study extant organisms, often forget extinct

organisms. In fact, extinction is a very common phenomenon,

and one should take extinct organisms into account for every

evolutionary scenario. Even if much speculation is required

to infer the characteristics of past microorganisms, the numerous

extinct organisms quite different from extant eucaryotes

and prokaryotes should not be ignored (e.g., the organisms

thriving during the hypothetical RNA world). As a result, the

absence of early-branching eucaryotes proposed by the “big

bang” hypothesis does not imply that complex eucaryotes

suddenly evolved from scratch. As shown in figure 7.4, this

can just be due to the extinction of all the intermediary forms,

as is well known for mammals and birds.

Finally, as explained in detail elsewhere (Forterre and

Philippe 1999), we favor the hypothesis that LUCA was a

eucaryote-like organism that would have evolved through

simplification into a prokaryote-like form. The main argument

is that many RNA-based mechanisms inherited from

the RNA world have been replaced by protein-based mechanisms

in prokaryotes (Poole et al. 1999). Nevertheless, this

argument is not decisive, because RNA-based mechanisms

can appear in prokaryotes (e.g., transfer-messenger RNA in

Bacteria).

Acknowledgments

This chapter is dedicated to the memory of André Adoutte

(1947–2002), who was my Ph.D. supervisor and, as early as

1987, was concerned about the limitations of tree reconstruction

methods. Most of the work discussed in this chapter was due to

his brilliant intuitions. I also dedicate this chapter to the

memory of Stephen J. Gould (1941–2002), whose books

motivated me to move from mathematics to evolutionary

biology. I thank Simonetta Gribaldo for careful reading of the

manuscript and Joel Cracraft for many helpful suggestions.

Figure 7.4. Simplified universal Tree of Life. The root was

located in the eukaryotic branch based on the analysis of slowly

evolving positions (Brinkmann and Philippe 1999). The

mitochondrial endosymbiosis (arrow) is supposed to have

occurred before the diversification of extant eukaryotes and

before the major increase of atmospheric oxygen (Canfield and

Teske 1996). A few extinct lineages are indicated in broken

lines, to show that the diversity of extant organisms provides a

very sparse sampling of ancient diversity.

[Tree not reproduced. Branch labels: other Bacteria, Planctomycetales, Archaea, Fungi, Animals, Microsporidia, Jakobids, Diplomonads, Trichomonads, Euglenozoa, Alveolates, Stramenopiles, Red algae, Green plants, Foraminiferans, Cercozoans, Amoebozoa. Other annotations: RNA world, DNA world, (mass?) extinctions, X marks, and a question mark.]

Literature Cited

Amend, J. P., and E. L. Shock. 1998. Energetics of amino acid

synthesis in hydrothermal ecosystems. Science 281:1659–

1662.

Andersson, J. O., A. M. Sjogren, L. A. Davis, T. M. Embley, and

A. J. Roger. 2003. Phylogenetic analyses of diplomonad

genes reveal frequent lateral gene transfers affecting

eukaryotes. Curr. Biol. 13:94–104.

Aravind, L., R. L. Tatusov, Y. I. Wolf, D. R. Walker, and E. V.

Koonin. 1998. Evidence for massive gene exchange between

archaeal and bacterial hyperthermophiles. Trends Genet.

14:442–444.

Archibald, J. M., D. Longet, J. Pawlowski, and P. J. Keeling.

2003. A novel polyubiquitin structure in cercozoa and

foraminifera: evidence for a new eukaryotic supergroup.

Mol. Biol. Evol. 20:62–66.

Baldauf, S. L., and J. D. Palmer. 1993. Animals and fungi are

each other’s closest relatives: congruent evidence from

multiple proteins. Proc. Natl. Acad. Sci. USA 90:11558–

11562.

Baldauf, S. L., A. J. Roger, I. Wenk-Siefert, and W. F. Doolittle.

2000. A kingdom-level phylogeny of eukaryotes based on

combined protein data. Science 290:972–977.

Bapteste, E., H. Brinkmann, J. A. Lee, D. V. Moore, C. W.

Sensen, P. Gordon, L. Durufle, T. Gaasterland, P. Lopez,

M. Müller, and H. Philippe. 2002. The analysis of 100 genes

supports the grouping of three highly divergent amoebae:

Dictyostelium, Entamoeba, and Mastigamoeba. Proc. Natl.

Acad. Sci. USA 99:1414–1419.

Bapteste, E., and H. Philippe. 2002. The potential value of

indels as phylogenetic markers: position of trichomonads as

a case study. Mol. Biol. Evol. 19:972–977.

Brinkmann, H., and H. Philippe. 1999. Archaea sister group of

Bacteria? Indications from tree reconstruction artifacts in

ancient phylogenies. Mol. Biol. Evol. 16:817–825.

Brochier, C., E. Bapteste, D. Moreira, and H. Philippe. 2002.

Eubacterial phylogeny based on translational apparatus

proteins. Trends Genet. 18:1–5.

Brochier, C., and H. Philippe. 2002. Phylogeny: a nonhyperthermophilic

ancestor for bacteria. Nature 417:244.

Bromham, L., D. Penny, A. Rambaut, and M. D. Hendy. 2000.

The power of relative rates tests depends on the data. J. Mol.

Evol. 50:296–301.

Brown, J. R., and W. F. Doolittle. 1997. Archaea and the

prokaryote-to-eukaryote transition. Microbiol. Mol. Biol.

Rev. 61:456–502.

Brugerolle, G., G. Bricheux, H. Philippe, and G. Coffe. 2002.

Collodictyon triciliatum and Diphylleia rotans (= Aulacomonas

submarina) form a new family of flagellates (Collodictyonidae)

with tubular mitochondrial cristae that is phylogenetically

distant from other flagellate groups. Protist 153:59–70.

Bui, E. T., P. J. Bradley, and P. J. Johnson. 1996. A common

evolutionary origin for mitochondria and hydrogenosomes.

Proc. Natl. Acad. Sci. USA 93:9651–9656.

Burggraf, S., G. J. Olsen, K. O. Stetter, and C. R. Woese. 1992.

A phylogenetic analysis of Aquifex pyrophilus. Syst. Appl.

Microbiol. 15:352–356.

Canfield, D. E., and A. Teske. 1996. Late proterozoic rise in

atmospheric oxygen concentration inferred from phylogenetic

and sulphur-isotope studies. Nature 382:127–132.

Cavalier-Smith, T. 1987. Eukaryotes with no mitochondria.

Nature 326:332–333.

Cavalier-Smith, T. 2002. The phagotrophic origin of eukaryotes

and phylogenetic classification of Protozoa. Int. J. Syst.

Evol. Microbiol. 52:297–354.

Cermakian, N., T. M. Ikeda, R. Cedergren, and M. W. Gray.

1996. Sequences homologous to yeast mitochondrial and

bacteriophage T3 and T7 RNA polymerases are widespread

throughout the eukaryotic lineage. Nucleic Acids Res.

24:648–654.

Darwin, C. 1859. The origin of species by means of natural

selection. Murray, London.

Dawson, S. C., and N. R. Pace. 2002. Novel kingdom-level

eukaryotic diversity in anoxic environments. Proc. Natl.

Acad. Sci. USA 99:8324–8329.

Dynes, J. L., and R. A. Firtel. 1989. Molecular complementation

of a genetic marker in Dictyostelium using a genomic DNA

library. Proc. Natl. Acad. Sci. USA 86:7966–7970.

Edman, J. C., J. A. Kovacs, H. Masur, D. V. Santi, H. J. Elwood,

and M. L. Sogin. 1988. Ribosomal RNA sequence shows

Pneumocystis carinii to be a member of the fungi. Nature

334:519–522.

Embley, T. M., and R. P. Hirt. 1998. Early branching eukaryotes?

Curr. Opin. Genet. Dev. 8:624–629.

Felsenstein, J. 1978. Cases in which parsimony or compatibility

methods will be positively misleading. Syst. Zool. 27:401–

410.

Fitch, W. M., and E. Margoliash. 1967. Construction of

phylogenetic trees. Science 155:279–284.

Forterre, P., C. Bouthier De La Tour, H. Philippe, and

M. Duguet. 2000. Reverse gyrase from hyperthermophiles:

probable transfer of a thermoadaptation trait from archaea

to bacteria. Trends Genet. 16:152–154.

Forterre, P., and H. Philippe. 1999. Where is the root of the

universal tree of life? Bioessays 21:871–879.

Fuerst, J. A. 1995. The planctomycetes: emerging models for

microbial ecology, evolution and cell biology. Microbiology

141:1493–1506.

Gajadhar, A. A., W. C. Marquardt, R. Hall, J. Gunderson, E. V.

Ariztia-Carmona, and M. L. Sogin. 1991. Ribosomal RNA

sequences of Sarcocystis muris, Theileria annulata and

Crypthecodinium cohnii reveal evolutionary relationships

among apicomplexans, dinoflagellates, and ciliates. Mol.

Biochem. Parasitol. 45:147–154.

Galtier, N. 2001. Maximum-likelihood phylogenetic analysis

under a covarion-like model. Mol. Biol. Evol. 18:866–873.

Galtier, N., and M. Gouy. 1998. Inferring pattern and process:

maximum-likelihood implementation of a nonhomogeneous

model of DNA sequence evolution for phylogenetic analysis.

Mol. Biol. Evol. 15:871–879.

Galtier, N., N. Tourasse, and M. Gouy. 1999. A nonhyperthermophilic

common ancestor to extant life forms.

Science 283:220–221.

Germot, A., and H. Philippe. 1999. Critical analysis of eukaryotic

phylogeny: a case study based on the HSP70 family.

J. Eukaryot. Microbiol. 46:116–124.

Gogarten, J. P., H. Kibak, P. Dittrich, L. Taiz, E. J. Bowman,

B. J. Bowman, M. F. Manolson, R. J. Poole, T. Date, T.

Oshima, et al. 1989. Evolution of the vacuolar H+-ATPase:

implications for the origin of eukaryotes. Proc. Natl. Acad.

Sci. USA 86:6661–6665.

Gould, S. J. 1996. Full house: the spread of excellence from

Plato to Darwin. Harmony Books, New York.

Gray, M. W., and B. F. Lang. 1998. Transcription in chloroplasts

and mitochondria: a tale of two polymerases. Trends

Microbiol. 6:1–3.

Graybeal, A. 1998. Is it better to add taxa or characters to a

difficult phylogenetic problem? Syst. Biol. 47:9–17.

Gribaldo, S., and H. Philippe. 2002. Ancient phylogenetic

relationships. Theor. Pop. Biol. 61:391–408.

Gupta, R. S., and B. Singh. 1994. Phylogenetic analysis of 70 kD

heat shock protein sequences suggests a chimeric origin for

the eukaryotic cell nucleus. Curr. Biol. 4:1104–1114.

Haeckel, E. 1866. Generelle Morphologie der Organismen:

Allgemeine Grundzüge der organischen Formen-Wissenschaft,

mechanisch begründet durch die von Charles

Darwin reformirte Descendenz-Theorie. 2 vols. Georg

Reimer, Berlin.

Hannaert, V., H. Brinkmann, U. Nowitzki, J. A. Lee, M.-A.

Albert, C. W. Sensen, T. Gaasterland, M. Müller, P. Michels,

and W. Martin. 2000. Enolase from Trypanosoma brucei,

from the amitochondriate protist Mastigamoeba balamuthi,

and from the chloroplast and cytosol of Euglena gracilis:

pieces in the evolutionary puzzle of the eukaryotic glycolytic

pathway. Mol. Biol. Evol. 17:989–1000.

Hasegawa, M., and T. Hashimoto. 1993. Ribosomal RNA trees

misleading? Nature 361:23.

Hashimoto, T., L. B. Sanchez, T. Shirakura, M. Müller, and

M. Hasegawa. 1998. Secondary absence of mitochondria in

Giardia lamblia and Trichomonas vaginalis revealed by valyl-tRNA

synthetase phylogeny. Proc. Natl. Acad. Sci. USA

95:6860–6865.

Hendy, M., and D. Penny. 1989. A framework for the quantitative

study of evolutionary trees. Syst. Zool. 38:297–309.

Hennig, W. 1966. Phylogenetic systematics. University of

Illinois Press, Urbana.

Huelsenbeck, J. P. 1998. Systematic bias in phylogenetic

analysis: is the Strepsiptera problem solved? Syst. Biol.

47:519–537.

Huelsenbeck, J. P. 2002. Testing a covariotide model of DNA

substitution. Mol. Biol. Evol. 19:698–707.

Imai, E., H. Honda, K. Hatori, A. Brack, and K. Matsuno. 1999.

Elongation of oligopeptides in a simulated submarine

hydrothermal system. Science 283:831–833.

Iwabe, N., K. Kuma, M. Hasegawa, S. Osawa, and T. Miyata.

1989. Evolutionary relationship of archaebacteria, eubacteria,

and eukaryotes inferred from phylogenetic trees of

duplicated genes. Proc. Natl. Acad. Sci. USA 86:9355–9359.

Jenkins, C., V. Kedar, and J. A. Fuerst. 2002. Gene discovery

within the planctomycete division of the domain Bacteria

using sequence tags from genomic DNA libraries. Genome

Biol. 3: research0031.1-0031.11.

Jukes, T. H., and C. R. Cantor. 1969. Evolution of protein

molecules. Pp. 21–132 in Mammalian protein metabolism

(H. N. Munro, ed.). Academic Press, New York.

Katinka, M. D., S. Duprat, E. Cornillot, G. Metenier,

F. Thomarat, G. Prensier, V. Barbe, E. Peyretaillade,

P. Brottier, P. Wincker, et al. 2001. Genome sequence and

gene compaction of the eukaryote parasite Encephalitozoon

cuniculi. Nature 414:450–453.

Keeling, P. J., and N. M. Fast. 2002. Microsporidia: biology and

evolution of highly reduced intracellular parasites. Annu.

Rev. Microbiol. 56:93–116.

Keeling, P. J., and J. D. Palmer. 2000. Parabasalian flagellates

are ancient eukaryotes. Nature 405:635–637.

Kishino, H., T. Miyata, and M. Hasegawa. 1990. Maximum

likelihood inference of protein phylogeny, and the origin of

chloroplasts. J. Mol. Evol. 31:151–160.

Knoll, A. H. 1992. The early evolution of eukaryotes: a

geological perspective. Science 256:622–627.

Koonin, E. V., K. S. Makarova, and L. Aravind. 2001. Horizontal

gene transfer in prokaryotes: quantification and classification.

Annu. Rev. Microbiol. 55:709–742.

Lang, B. F., G. Burger, C. J. O’Kelly, R. Cedergren, G. B.

Golding, C. Lemieux, D. Sankoff, M. Turmel, and M. W.

Gray. 1997. An ancestral mitochondrial DNA resembling a

eubacterial genome in miniature. Nature 387:493–497.

Lecointre, G., H. Philippe, H. L. V. Le, and H. Le Guyader.

1993. Species sampling has a major impact on phylogenetic

inference. Mol. Phylogenet. Evol. 2:205–224.

Lecointre, G., H. Philippe, H. L. V. Le, and H. Le Guyader.

1994. How many nucleotides are required to resolve a

phylogenetic problem? The use of a new statistical method

applicable to available sequences. Mol. Phylogenet. Evol.

3:292–309.

Levy, M., and S. L. Miller. 1998. The stability of the RNA bases:

implications for the origin of life. Proc. Natl. Acad. Sci. USA

95:7933–7938.

Lloyd, D., J. C. Harris, S. Maroulis, R. Wadley, J. R. Ralphs,

A. C. Hann, M. P. Turner, and M. R. Edwards. 2002. The

“primitive” microaerophile Giardia intestinalis (syn. lamblia,

duodenalis) has specialized membranes with electron

transport and membrane-potential-generating functions.

Microbiology 148:1349–1354.

Lockhart, P. J., C. J. Howe, D. A. Bryant, T. J. Beanland, and

A. W. Larkum. 1992. Substitutional bias confounds

inference of cyanelle origins from sequence data. J. Mol.

Evol. 34:153–162.

Lockhart, P. J., D. Huson, U. Maier, M. J. Fraunholz, Y. Van De

Peer, A. C. Barbrook, C. J. Howe, and M. A. Steel. 2000.

How molecules evolve in Eubacteria. Mol. Biol. Evol.

17:835–838.

Lopez, P., D. Casane, and H. Philippe. 2002. Heterotachy, an

important process of protein evolution. Mol. Biol. Evol.

19:1–7.

Lopez, P., P. Forterre, and H. Philippe. 1999. The root of the

tree of life in the light of the covarion model. J. Mol. Evol.

49:496–508.

Lopez-Garcia, P., F. Rodriguez-Valera, C. Pedros-Alio, and

D. Moreira. 2001. Unexpected diversity of small eukaryotes

in deep-sea Antarctic plankton. Nature 409:603–607.

Lwoff, A. 1943. L’évolution physiologique. Etude des pertes de

fonctions chez les microorganismes. Hermann et Cie, Paris.

Mai, Z., S. Ghosh, M. Frisardi, B. Rosenthal, R. Rogers, and J.

Samuelson. 1999. Hsp60 is targeted to a cryptic mitochondrion-

derived organelle (“crypton”) in the microaerophilic

protozoan parasite Entamoeba histolytica. Mol. Cell. Biol.

19:2198–2205.

Martin, W., and M. Müller. 1998. The hydrogen hypothesis for

the first eukaryote. Nature 392:37–41.

Matte-Tailliez, O., C. Brochier, P. Forterre, and H. Philippe.

2002. Archaeal phylogeny based on ribosomal proteins.

Mol. Biol. Evol. 19:631–639.

McArthur, A. G., H. G. Morrison, J. E. Nixon, N. Q. Passamaneck,

U. Kim, G. Hinkle, M. K. Crocker, M. E. Holder, R. Farr, C. I.

Reich, et al. 2000. The Giardia genome project database.

FEMS Microbiol. Lett. 189:271–273.

Miyamoto, M. M., and W. M. Fitch. 1995. Testing the covarion

hypothesis of molecular evolution. Mol. Biol. Evol. 12:503–

513.

Moreira, D., S. Kervestin, O. Jean-Jean, and H. Philippe. 2002.

Evolution of eukaryotic translation elongation and termination

factors: variations of evolutionary rate and genetic code

deviations. Mol. Biol. Evol. 19:189–200.

Moreira, D., H. Le Guyader, and H. Philippe. 1999. Unusually

high evolutionary rate of the elongation factor 1 alpha genes

from the Ciliophora and its impact on the phylogeny of

eukaryotes. Mol. Biol. Evol. 16:234–245.

Moreira, D., H. Le Guyader, and H. Philippe. 2000. The origin

of red algae: implications for the evolution of chloroplasts.

Nature 405:69–72.

Moulton, V., P. P. Gardner, R. F. Pointon, L. K. Creamer, G. B.

Jameson, and D. Penny. 2000. RNA folding argues against a

hot-start origin of life. J. Mol. Evol. 51:416–421.

Nelson, K. E., R. A. Clayton, S. R. Gill, M. L. Gwinn, R. J.

Dodson, D. H. Haft, E. K. Hickey, J. D. Peterson, W. C.

Nelson, K. A. Ketchum, et al. 1999. Evidence for lateral

gene transfer between archaea and bacteria from genome

sequence of Thermotoga maritima. Nature 399:323–329.

Nisbet, E. G., and N. H. Sleep. 2001. The habitat and nature of

early life. Nature 409:1083–1091.

Pace, N. R. 1991. Origin of life—facing up to the physical

setting. Cell 65:531–533.

Palmer, J. D. 2000. A single birth of all plastids? Nature

405:32–33.

Penny, D., B. J. McComish, M. A. Charleston, and M. D. Hendy.

2001. Mathematical elegance with biochemical realism: the

covarion model of molecular evolution. J. Mol. Evol.

53:711–723.

Peyretaillade, E., C. Biderre, P. Peyret, F. Duffieux, G. Metenier,

M. Gouy, B. Michot, and C. P. Vivares. 1998. Microsporidian

Encephalitozoon cuniculi, a unicellular eukaryote

with an unusual chromosomal dispersion of ribosomal

genes and a LSU rRNA reduced to the universal core.

Nucleic Acids Res. 26:3513–3520.

Philippe, H. 2000. Long branch attraction and protist phylogeny.

Protist 151:307–316.

Philippe, H., and A. Adoutte. 1998. The molecular phylogeny of

Eukaryota: solid facts and uncertainties. Pp. 25–56 in

Evolutionary relationships among protozoa (G. Coombs,

K. Vickerman, M. Sleigh and A. Warren, eds.). Kluwer,

Dordrecht.

Philippe, H., and A. Germot. 2000. Phylogeny of eukaryotes

based on ribosomal RNA: long-branch attraction and

models of sequence evolution. Mol. Biol. Evol. 17:830–834.

Philippe, H., A. Germot, and D. Moreira. 2000a. The new

phylogeny of eukaryotes. Curr. Opin. Genet. Dev. 10:596–

601.

Philippe, H., and J. Laurent. 1998. How good are deep

phylogenetic trees? Curr. Opin. Genet. Dev. 8:616–623.

Philippe, H., P. Lopez, H. Brinkmann, K. Budin, A. Germot,

J. Laurent, D. Moreira, M. Müller, and H. Le Guyader.

2000b. Early branching or fast evolving eukaryotes? An

answer based on slowly evolving positions. Proc.

R. Soc. Lond. B 267:1213–1221.

Philippe, H., U. Sörhannus, A. Baroin, R. Perasso, F. Gasse, and

A. Adoutte. 1994. Comparison of molecular and paleontological

data in diatoms suggests a major gap in the fossil

record. J. Evol. Biol. 7:247–265.

Poole, A., D. Jeffares, and D. Penny. 1999. Early evolution:

prokaryotes, the new kids on the block. Bioessays 21:880–

889.

Reysenbach, A. L., and E. Shock. 2002. Merging genomes with

geochemistry in hydrothermal ecosystems. Science

296:1077–1082.

Rokas, A., and P. W. H. Holland. 2000. Rare genomic changes

as a tool for phylogenetics. Trends Ecol. Evol. 15:454–

459.

Russell, M. J., and A. J. Hall. 1997. The emergence of life from

iron monosulphide bubbles at a submarine hydrothermal

redox and pH front. J. Geol. Soc. Lond. 154:377–402.

Schwartz, R. M., and M. O. Dayhoff. 1978. Origins of prokaryotes,

eukaryotes, mitochondria, and chloroplasts. Science

199:395–403.

Silberman, J. D., C. G. Clark, L. S. Diamond, and M. L. Sogin.

1999. Phylogeny of the genera Entamoeba and Endolimax as

deduced from small-subunit ribosomal RNA sequences.

Mol. Biol. Evol. 16:1740–1751.

Silberman, J. D., C. G. Clark, and M. L. Sogin. 1996a. Dientamoeba

fragilis shares a recent common evolutionary history

with the trichomonads. Mol. Biochem. Parasitol. 76:311–

314.

Silberman, J. D., M. L. Sogin, D. D. Leipe, and C. G. Clark.

1996b. Human parasite finds taxonomic home. Nature

380:398.

Simpson, A. G., A. J. Roger, J. D. Silberman, D. D. Leipe, V. P.

Edgcomb, L. S. Jermiin, D. J. Patterson, and M. L. Sogin.

2002. Evolutionary history of “early-diverging” eukaryotes:

the excavate taxon Carpediemonas is a close relative of

Giardia. Mol. Biol. Evol. 19:1782–1791.

Slesarev, A. I., G. I. Belova, S. A. Kozyavkin, and J. A. Lake.

1998. Evidence for an early prokaryotic origin of histones

H2A and H4 prior to the emergence of eukaryotes. Nucleic

Acids Res. 26:427–430.

Sogin, M. 1997. History assignment: when was the mitochondrion

founded? Curr. Opin. Genet. Dev. 7:792–799.

Sogin, M. L. 1991. Early evolution and the origin of eukaryotes.

Curr. Opin. Genet. Dev. 1:457–463.

Stechmann, A., and T. Cavalier-Smith. 2002. Rooting the

eukaryote tree by using a derived gene fusion. Science

297:89–91.

Stetter, K. O. 1996. Hyperthermophiles in the history of life.

Ciba Found. Symp. 202:1–10.

Stiller, J., and B. Hall. 1999. Long-branch attraction and the

rDNA model of early eukaryotic evolution. Mol. Biol. Evol.

16:1270–1279.

Strous, M., J. A. Fuerst, E. H. Kramer, S. Logemann, G. Muyzer,

K. T. van de Pas-Schoonen, R. Webb, J. G. Kuenen, and

M. S. Jetten. 1999. Missing lithotroph identified as new

planctomycete. Nature 400:446–449.

Swofford, D. L., G. J. Olsen, P. J. Waddell, and D. M. Hillis.

1996. Phylogenetic inference. Pp. 407–514 in Molecular

systematics (D. M. Hillis, C. Moritz and B. K. Mable, eds.).

Sinauer Associates, Sunderland, MA.

Taylor, F. J. R. 1978. Problems in the development of an explicit

hypothetical phylogeny of the Lower Eukaryotes. Biosystems

10:67–89.

Tourasse, N. J., and M. Gouy. 1998. Evolutionary relationships

between protist phyla constructed from LSU rRNAs

accounting for unequal rates of substitution among sites.

Pp. 57–75 in Evolutionary relationships among protozoa

(G. Coombs, K. Vickerman, M. Sleigh and A. Warren, eds.).

Chapman and Hall, London.

Tovar, J., A. Fischer, and C. G. Clark. 1999. The mitosome, a

novel organelle related to mitochondria in the amitochondrial

parasite Entamoeba histolytica. Mol. Microbiol. 32:1013–1021.

Van de Peer, Y., A. Ben Ali, and A. Meyer. 2000. Microsporidia:

accumulating molecular evidence that a group of amitochondriate

and suspectedly primitive eukaryotes are just

curious fungi. Gene 246:1–8.

Van Niel, C. B. 1955. The microbe as a whole. Pp. 3–12 in

Perspectives and horizons in microbiology (S. A. Waksman,

ed.). Rutgers University Press, New Brunswick, NJ.

Waddell, P. J., and M. A. Steel. 1997. General time-reversible

distances with unequal rates across sites: mixing gamma

and inverse Gaussian distributions with invariant sites. Mol.

Phylogenet. Evol. 8:398–414.

Williams, B. A., R. P. Hirt, J. M. Lucocq, and T. M. Embley.

2002. A mitochondrial remnant in the microsporidian

Trachipleistophora hominis. Nature 418:865–869.

Woese, C. R. 1987. Bacterial evolution. Microbiol. Rev. 51:221–

271.

Woese, C. R., and G. E. Fox. 1977. Phylogenetic structure of the

prokaryotic domain: the primary kingdoms. Proc. Natl.

Acad. Sci. USA 74:5088–5090.

Woese, C. R., O. Kandler, and M. L. Wheelis. 1990. Towards a

natural system of organisms: proposal for the domains

Archaea, Bacteria, and Eucarya. Proc. Natl. Acad. Sci. USA

87:4576–4579.

Wolfe, K. H., C. W. Morden, and J. D. Palmer. 1992. Function

and evolution of a minimal plastid genome from a nonphotosynthetic

parasitic plant. Proc. Natl. Acad. Sci. USA

89:10648–10652.

Yang, Z. 1996. Among-site rate variation and its impact on

phylogenetic analyses. Trends Ecol. Evol. 11:367–370.

Yarus, M. 2002. Primordial genetics: phenotype of the ribocyte.

Annu. Rev. Genet. 36:125–151.

Zuckerkandl, E., and L. Pauling. 1965. Evolutionary divergence

and convergence in proteins. Pp. 97–166 in Evolving genes

and proteins (V. Bryson and H. J. Vogel, eds.). Academic

Press, New York.