8 Viruses and the Tree of Life

David P. Mindell

Joshua S. Rest

Luis P. Villarreal

Viruses, Taxa, and Life

Viruses are rarely included in syntheses regarding the common

origin and history for all life forms. There are many

reasons for this, including our ignorance of their deep history,

an earlier reluctance to consider them as living organisms,

and their extreme changeability. However, increasing

amounts of molecular sequence data enable more comparisons

among viruses and between viruses and other organisms,

and we attempt here a brief perspective on the

integration of viruses and the Tree of Life. At the outset,

we wish to emphasize that viruses have arisen on multiple,

independent occasions, being a grade rather than a single

clade, and to alert readers to the limitations of the “tree of

life” metaphor when applied to virus histories.

Viruses are obligate intracellular parasites averaging 30 nm

long, or 1/100th the size of many bacteria. They are the last

major kind of organisms to be described, and may represent

the last and broadest organismal frontier. Many viruses, when

in reproductive mode, can produce thousands of offspring per

hour in each of the hundreds or thousands of cells infected in

a single host individual. This provides copious grist for the evolutionary

mill, in producing a multitude of “winning” virus

forms and lifestyles that have ultimately succeeded in colonizing

all other organisms, from bacteria to algae, fungi, plants,

and animals, and moving with them to all regions and habitats

on Earth. The associations between viruses and their hosts

range from ephemeral one-time visits without consequence to

chronic, fatal associations. In a longer time frame, the associations

range from a possibly crucial, transformational role for

life’s earliest forms, to extinctions of host populations, to an

ongoing and deeply integrated role in the evolution of host

organisms and their genomes. Success in being small requires

great economy in structure and content. Whereas the human

nuclear genome includes roughly three billion bases of DNA

and about 35,000 genes, many common viruses, such as HIV,

carry a mere 10,000 bases of RNA or DNA and nine or so

genes. Therefore, an HIV genome is only 0.0003% the size of

a human’s genome.
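The size comparison above can be checked with a few lines of arithmetic. The following is a minimal sketch that assumes only the rounded figures quoted in the text (about three billion bases for the human nuclear genome and about 10,000 bases for HIV); the constant and function names are illustrative and do not come from the chapter.

```python
# Rounded figures quoted in the text; illustrative values, not measurements.
HUMAN_GENOME_BASES = 3_000_000_000  # roughly three billion bases of DNA
HIV_GENOME_BASES = 10_000           # roughly ten thousand bases of RNA


def size_ratio_percent(small: int, large: int) -> float:
    """Return `small` expressed as a percentage of `large`."""
    return 100.0 * small / large


hiv_percent = size_ratio_percent(HIV_GENOME_BASES, HUMAN_GENOME_BASES)
print(f"HIV genome is about {hiv_percent:.4f}% the size of the human genome")
```

With these rounded inputs the ratio is 1 in 300,000, which agrees with the percentage given in the text to one significant figure.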

The International Committee on Taxonomy of Viruses

has published a series of reports seeking to bring order to

the expanding catalog of known virus diversity using the

familiar nested taxonomic categories of species, genus, and

family. The most recent report (Van Regenmortel et al. 2000)

names roughly 3600 species and estimates that at least

30,000 viruses, strains, and subtypes are being actively studied

in research labs around the world. There is a sense that a

“significant fraction” of the primary kinds of viruses are now

known, based on the low frequency for discovery of viruses

that do not fit into existing families. However, the lower level

viral taxa described represent just the tip of the iceberg, because

little survey work has been done for viruses outside of

those infecting humans and our domestic animals and plants.

We have no idea how many different viruses with unique

capabilities infect archaebacteria, whales, slime molds, or

other of the myriad forms of life.

108 The Origin and Radiation of Life on Earth

Early classification for viruses centered on the similarity

in diseases or symptoms caused, the means of transmission,

or the kinds of organisms or even body organs infected. For

example, viruses able to induce swelling of the liver with

accompanying fever and yellowing of the skin (jaundice)

caused by buildup of a bile pigment were classified together

as the “hepatitis viruses.” This included what are now seen

as distantly related groups such as hepatitis A virus, hepatitis

B virus, yellow fever virus, and Rift Valley fever virus. Biochemical

and molecular studies in the 1960s and early 1970s

facilitated classification of viruses based on the nature of their

genetic material, whether RNA or DNA, and whether the

genome was double or single stranded and, if single stranded,

whether that strand was identical to the messenger RNA

(mRNA) transcript (positive-stranded) or complementary to

it (negative-stranded; Baltimore 1971). About this time, an

approach to classification of viruses was widely adopted in

which as many characteristics as possible were considered,

and weighted as criteria for classifying viruses into families,

genera, and species. The relative weight accorded to different

characteristics was arbitrary and potentially biased toward

maintenance of groupings that fit preconceived notions of

relationships. Beginning in the 1980s and 1990s, biologists

sought to develop a taxonomy for viruses based on phylogenetic

analyses of shared traits, primarily DNA sequences,

although this is a work very much in progress with no guarantee

of advance after the most obvious relationships are

determined. Based on similarity in the nature of the viral

genome (RNA or DNA), its strandedness (single or double),

its capacity (or not) for reverse transcription, and its polarity

[(+)sense or (–)antisense], six primary groups are generally

recognized, composed of at present 62 families and 233 genera

(Van Regenmortel et al. 2000; table 8.1).
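The genome-based grouping described above amounts to a small decision rule. The sketch below is one hypothetical way to express it, assuming that each virus is described by its nucleic acid, number of strands, polarity, and whether it reverse transcribes; the function name `genome_class` and its parameters are illustrative, and the returned labels are modeled on the class headings of table 8.1.

```python
from typing import Optional


def genome_class(nucleic_acid: str, strands: int,
                 sense: Optional[str] = None,
                 reverse_transcribing: bool = False) -> str:
    """Map basic genome properties to one of six genome-based classes."""
    # Reverse-transcribing viruses are grouped together whether the
    # packaged genome is RNA or DNA.
    if reverse_transcribing:
        return "DNA-RNA reverse-transcribing viruses"
    if nucleic_acid == "DNA":
        if strands == 2:
            return "Double-strand DNA viruses"
        if strands == 1:
            return "Single-strand DNA viruses"
    elif nucleic_acid == "RNA":
        if strands == 2:
            return "Double-strand RNA viruses"
        if strands == 1 and sense == "+":
            return "(+) sense single-strand RNA viruses"
        if strands == 1 and sense == "-":
            return "(-) sense single-strand RNA viruses"
    raise ValueError("unrecognized combination of genome properties")


# A few representatives named in table 8.1.
print(genome_class("DNA", 2))                             # Phage T4
print(genome_class("RNA", 1, sense="-"))                  # Influenza A virus
print(genome_class("DNA", 2, reverse_transcribing=True))  # Hepatitis B virus
```

A rule of this kind is phenetic: it sorts viruses by shared properties and says nothing about whether the members of a class share a common ancestor.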

Because viruses reproduce asexually, the “biological”

species definition, with species recognized on the basis of

reproductive isolation among sexually reproducing individuals,

is not relevant. This is also the case with the vast majority

of other life forms, including bacteria and many

eukaryotes, where species and higher level taxa are recognized

on the basis of common descent and either relative age

of divergence or degree of differentiation. The concept of

“quasi-species” was initially developed to describe a wild-type

genome of RNA molecules accompanied by a distribution of

its mutants in studies of the origin of life (Eigen 1971), and

has been extended to RNA viruses. However, the term “quasi-species”

is derived from chemistry, in which “species” refers

to an assembly of identical molecules, rather than being derived

from evolutionary biology in which “species” generally

refers to gene flow among individuals or diagnosable evolutionary

units. Although the quasi-species concept has been

useful as a population genetic model, it has no direct application

to systematics and taxonomy. As an indication of this,

any particular RNA sequence may belong to more than one

quasi-species, depending on which traits the wild-type selected

for study is intended to model. A recent definition

explicitly for virus species is as “a polythetic class of viruses

that constitute a replicating lineage and occupy a particular

ecological niche” (Van Regenmortel 2000). A polythetic class

is one in which no single feature is essential for membership.

Viruses have traditionally been excluded from considerations

of the Tree of Life. Initially, some biologists balked at

recognizing them as life forms and did not consider them to

be taxa (a term used loosely to designate any evolutionary

lineage), because they depend on their hosts for replication

of their own DNA or RNA. In retrospect, this view appears

arbitrary and unnecessarily restrictive. Viruses exhibit many

features common to other life forms, including structural

organization based on heritable nucleic acid sequences, reproduction,

use of material resources from their environment,

internal homeostatic controls within individuals

(virions) to promote survival in changing environments, diversity

in form and function of parts, and the capacity to

adjust to changing conditions over time and to evolve. There

are many obligate parasites that we do not hesitate to call

“alive” or recognize as taxa, including the specialized and

entirely dependent Escherichia coli in our digestive tracts and

the many forms of mycorrhizal fungi dependent on and restricted

to life on plant roots, as just two examples. Although

viruses closely resemble mobile genetic elements, including

plasmids, episomes, transposons, and retrotransposons, viruses

differ in having individuals mature within proteinaceous

capsid and envelope structures that permit efficient

target cell receptor specificity and transmission among cells

and among host individuals.

Many biological terms, units and concepts defy exact

definition and application, due in part to the dynamic processes

involved in evolution and the existence of variable

intermediates between so many of the recognized units.

Consider the difficulty in defining some of the most frequently

invoked biological units such as “species” and “gene.”

“Life” may be seen as similarly difficult to define, and ultimately,

its definition is a matter of human convention.

T. Dobzhansky famously remarked that nothing in biology

makes sense except in the light of evolution, and by extension,

it is now widely recognized that nothing in evolution

makes sense except in the light of phylogeny. Thus, understanding

virus evolution, which is often distinct from that of

their hosts, requires a phylogenetic perspective and, ultimately,

inclusion in the phylogenetic Tree of Life. Evolution

of viruses is increasingly seen as a key component in the history

of life.

The more substantive, empirical reason that viruses have

been excluded from Tree of Life discussions in the past involves

the difficulty and frequent impossibility of finding

homologous traits suitable for phylogenetic analyses relating

diverse viruses and relating viruses to other organisms

(Holland and Domingo 1998), as well as widespread recombination

among lineages (see Worobey and Holmes 1999).

The shortage of homologous traits will be a lasting impediment

to direct comparisons and phylogenetic analyses for

Table 8.1

Six Classes and 62 Recognized Families of Viruses.

Virus family (a)    Representative common name(s)    Known hosts (b)

Double-strand DNA viruses

Myoviridae (6) Phage T4 Arc, Eub

Siphoviridae (6) Phage λ Arc, Eub

Podoviridae (3) Phage T7 Eub

Tectiviridae (1) Phage PRD1 Eub

Corticoviridae (1) Phage PM2 Eub

Plasmaviridae (1) Phage L2 Eub

Lipothrixviridae (1) Thermoproteus virus 1 Arc

Rudiviridae (1) Sulfolobus virus SIRV-1 Arc

Fuselloviridae (1) Sulfolobus virus SSV-1 Arc

Poxviridae (13) Vaccinia virus, cowpox Inv, Ver

Asfarviridae (1) African swine fever virus Ver

Iridoviridae (4) Lymphocystis disease virus 1 Inv, Ver

Phycodnaviridae (3) Paramecium bursaria Chlorella virus 1 Alg

Baculoviridae (2) Cydia pomonella granulovirus (CpGV) Inv

Herpesviridae (9) Human herpesvirus 1, bald eagle herpesvirus Ver

Adenoviridae (2) Human adenovirus A, snake adenovirus Ver

Polyomaviridae (1) Simian virus 40 (SV-40), bovine polyomavirus Ver

Papillomaviridae (1) Human papillomavirus, canine oral papillomavirus Ver

Polydnaviridae (2) Campoletis aprilis ichnovirus Inv

Ascoviridae (1) Diadromus pulchellus ascovirus Inv

Single-strand DNA viruses

Inoviridae (2) Phage M13, Vibrio phage v6 Eub

Microviridae (4) Phage φX174, Chlamydia phage 1 (Ch-1) Eub

Geminiviridae (3) Maize streak virus (MSV), beet curly top virus Pla

Circoviridae (1) Chicken anemia virus, porcine circovirus Ver

Parvoviridae (6) Canine parvovirus, Aedes aegypti densovirus Inv, Ver

DNA–RNA reverse-transcribing viruses

Hepadnaviridae (2) Hepatitis B virus Ver

Caulimoviridae (6) Petunia vein clearing-like virus Pla

Pseudoviridae (2) Saccharomyces cerevisiae Ty-1 virus Fun, Inv, Pla

Metaviridae (2) Drosophila melanogaster gypsy virus Fun, Inv, Pla

Retroviridae (7) HIV-1, avian leukosis virus Ver

Double-strand RNA viruses

Cystoviridae (1) Phage φ6 Eub

Reoviridae (9) Mammalian orthoreovirus, rice dwarf virus Inv, Pla, Ver

Birnaviridae (3) Infectious pancreatic necrosis virus Inv, Ver

Totiviridae (3) Giardia lamblia virus Fun, Pro

Partitiviridae (4) Penicillium chrysogenum virus Fun, Pla

Hypoviridae (1) Cryphonectria hypovirus 1–EP713 Fun

(–) Sense single-strand RNA viruses

Bornaviridae (1) Borna disease virus Ver

Filoviridae (2) Marburg virus, Zaire ebola virus Ver

Paramyxoviridae (11) Mumps virus, measles virus Pla, Ver

Rhabdoviridae (6) Rabies virus, potato yellow dwarf virus Ver, Pla

Orthomyxoviridae (4) Influenza A virus Ver

Bunyaviridae (5) Hantaan virus, tomato spotted wilt virus Pla, Ver

Arenaviridae (1) Hepatitis delta virus Ver

(+) Sense single-strand RNA viruses

Leviviridae (2) Phage MS2 Eub

Narnaviridae (2) Saccharomyces cerevisiae narnavirus 20S Fun

Picornaviridae (6) Poliovirus, hepatitis A virus Ver

Sequiviridae (2) Parsnip yellow fleck virus Pla

Comoviridae (3) Tobacco ringspot virus Pla

Potyviridae (6) Ryegrass mosaic virus Pla

Caliciviridae (4) Rabbit hemorrhagic disease virus Ver

Astroviridae (1) Human astrovirus 1 Ver

Nodaviridae (2) Striped jack nervous necrosis virus Inv, Ver

(continued)

many virus groups, particularly higher level taxa. However,

as new molecular data for both viruses and their hosts are

collected, and as comparative evolutionary analyses proceed,

an increasing number of explicit hypotheses regarding virus

relationships, especially among close relatives, are being developed.

Minimally, these provide hypotheses for further

testing. In the following sections we provide a brief overview

of existing hypotheses regarding virus evolutionary history,

recognizing them to be speculative and in many cases only

weakly supported.

Virus Origins

Our understanding of ancient virus origins is extremely limited,

because of their fast pace of evolutionary change, recombination

among lineages, and the very small number of

homologous characters available, if any, for comparison between

viruses and other organisms. Despite these severe limitations,

three general hypotheses for the mechanism of viral

origins have been identified and can be referred to as (1) the

primordial, (2) the escaped transcript, and (3) the regressive

hypotheses (reviewed in Strauss et al. 1996, DeFilippis and

Villarreal 2001; fig. 8.1). These rely on the same evolutionary

mechanisms, including mutation, recombination, and

natural selection, known to operate in more recent times and

throughout the history of life. These three hypotheses are not

mutually exclusive, and more than one may apply in any

particular case. These hypotheses of virus origins are distinct

from hypotheses of phylogenetic relationship showing patterns

of common ancestry among virus lineages subsequent

to their origins from nonviruses.

The primordial hypothesis holds that some RNA viruses

have been present since the beginnings of life on Earth about

3.8 billion years ago. In this primordial hypothesis, simple

RNA molecules, with strings of concatenated nucleotides,

arose from pools of free nucleotides as a result of the chemical

and physical attractions among singleton nucleotides.

Simple RNA molecules have now been shown to be capable

of copying themselves by serving as a polymerase enzyme.

They are also able to cut other nucleotide strings and successfully

integrate themselves into the cut site. Discovery of

these abilities, together with the observation that RNA sequences

self-assemble more readily but are less stable over

time compared with DNA sequences, has fueled the view

of early life being encoded by RNA. Eventually, information

storage by reactive RNA molecules was replaced, via reverse

transcriptase (RT) activity, by information storage in more

stable DNA molecules (reviewed in Joyce 2002). Although

some of these early self-replicating molecules eventually collected

and organized into duplicating units that we can call

“host cells,” other molecules were packaged into virus particles

that coevolved with host cells and parasitized them. The

fact that viruses and their related genetic elements are ubiquitous

within the cells or genomes of all life forms also suggests

an early origin (fig. 8.1, upper panel). Evidence and

scenarios for evolution at the RNA level that may have taken

place in simple viral or previral systems are reviewed in

Robertson (1992) and Robertson and Neel (1999).

The escaped transcript hypothesis posits that viruses

arose from mRNAs or other host-cell RNA or DNA molecules

that acquired the ability to be replicated and packaged in a

proteinaceous coat, enabling an escape from their cellular

confines. mRNAs routinely pass through the membrane of

the nucleus, on their way to the ribosomes in the cellular

cytoplasm, where they are translated into proteins. Successful

passage through the nuclear membrane makes navigation

of the cell wall seem feasible as well, although the

mechanisms differ significantly. In this scenario, viruses

evolved through a series of intermediate forms, from an obligate

intracellular progenitor. Figure 8.1’s lower panel illustrates

the escaped transcript hypothesis, with the dashed line

indicating viral origin from a set of characters that eventually

obtained features (additional genes) enabling survival and

Table 8.1

(continued)

Virus family (a)    Representative common name(s)    Known hosts (b)

Tetraviridae (2) Southern bean mosaic virus Inv

Luteoviridae (3) Barley yellow dwarf virus-PAV Pla

Tombusviridae (8) Oat chlorotic stunt virus Pla

Coronaviridae (2) Equine torovirus Ver

Arteriviridae (1) Equine arteritis virus Ver

Flaviviridae (3) Hepatitis C virus, dengue virus Ver

Togaviridae (2) Rubella virus, tobacco mosaic virus Ver, Pla

Bromoviridae (5) Cucumber mosaic virus Pla

Closteroviridae (2) Grapevine virus A Pla

Barnaviridae (1) Mushroom bacilliform virus Fun

(a) Numbers in parentheses denote number of recognized genera. Some genera are currently unassigned to a family and are not included here.

(b) Arc, Archaea; Eub, Eubacteria; Fun, fungi; Inv, invertebrates; Pro, protists; Ver, vertebrates; Pla, plants; Alg, algae.

Follows Van Regenmortel et al. (2000).

evolution as a distinct biological entity. Initially developed

by Lwoff (1957) and Temin (1980), this hypothesis is widely

held for DNA viruses and retroviruses. Despite its appeal, no

virus family can be firmly linked to an origin of this kind at

present.

The regressive hypothesis supposes that viruses are descended

from formerly free-living bacteria that have lost functions

and the DNA and structures associated with them. This

seemed plausible in the past, given the existence of parasitic,

intracellular bacteria that are entirely dependent on their hosts

for energy and synthesis of proteins. However, with the advent

of molecular data, this model now appears untenable,

given the many structural, functional, and molecular sequence

traits known to be shared between viruses and various nonbacterial

genetic elements, and as the many disparities between

viruses and bacteria become better known.

Virus Phylogenies

Sixty-two different virus families have been recognized (table

8.1), and support for them as monophyletic groups varies

from strong to limited. However, only a small number of

these families have been related to each other in higher level

taxonomic groupings based on phylogenetic considerations,

and these include the only three currently recognized orders:

Caudovirales, Mononegavirales, and Nidovirales. Other hypothesized

relationships among families exist, although the

hypothesized clades have not been named. In the following

section, we briefly review some of the phylogenetic hypotheses

among as well as within virus families.

The earliest classification encompassing all viruses is

phenetic, being based on the nature of their genetic material,

as mentioned above. These fundamental differences in

the genomes, and the associated differences in their molecular

biology, suggest the hypothesis that these groups stem from

independent and mechanistically different origins. In addition,

sets of viruses within the three primary groups (DNA

viruses, RNA viruses, reverse-transcribing viruses) mentioned

have basic differences from each other [e.g., (+)strand = sense

strand vs. (–)strand = antisense strand, segmented vs. nonsegmented

genomes] that might also be the result of independent

origins. Based on these differences in form and

function and the apparent feasibility of repeated, independent

origins, most researchers would agree that the viral

lifestyle has arisen on multiple occasions. If this is the case,

viruses as a group comprise a grade, rather than a clade.

Grades share a particular lifestyle or form of organization,

rather than common ancestry, and that makes them a group

sharing convergent similarity, as opposed to a clade, which

denotes a monophyletic group representing all and only the

descendants of a particular common ancestor. Recognizing

viruses as a grade underscores their potential for future independent

origins.
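The distinction between a grade and a clade can be made concrete with a small test for monophyly. The sketch below assumes a hypothetical toy tree in which a viral lifestyle appears in two separate places; the tree, its tip names, and the helper functions are invented for illustration and do not represent any published phylogeny.

```python
from typing import Dict, List, Set

# Hypothetical toy tree (parent -> children); not a real phylogeny.
TREE: Dict[str, List[str]] = {
    "root": ["n1", "n2"],
    "n1": ["cell_A", "virus_X"],
    "n2": ["cell_B", "n3"],
    "n3": ["virus_Y", "virus_Z"],
}


def leaves_under(node: str) -> Set[str]:
    """Return the set of tips descended from `node` (a tip returns itself)."""
    children = TREE.get(node, [])
    if not children:
        return {node}
    tips: Set[str] = set()
    for child in children:
        tips |= leaves_under(child)
    return tips


def is_clade(group: Set[str]) -> bool:
    """True if some node has exactly `group` as its descendant tips,
    that is, the group holds all and only the descendants of one ancestor."""
    nodes = set(TREE) | {c for kids in TREE.values() for c in kids}
    return any(leaves_under(node) == group for node in nodes)


print(is_clade({"virus_Y", "virus_Z"}))             # one origin: a clade
print(is_clade({"virus_X", "virus_Y", "virus_Z"}))  # two origins: a grade
```

In this toy example the three virus tips share a lifestyle but not an exclusive common ancestor, so grouping them together describes a grade rather than a clade.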

RNA Viruses

RNA viruses have RNA genomes and do not replicate via a DNA

intermediate as in the reverse-transcribing viruses. The taxonomic

majority have single-strand positive [(+)strand = sense

strand] genomes, others have single-strand negative (or antisense)

genomes, and the rest have double-stranded genomes.

Phylogenetic analyses using conserved RNA-dependent RNA

polymerase (RdRp) amino acid sequences for representatives

[Figure 8.1 panel labels. Upper panel, primordial hypothesis: Eukarya, RNA viruses, Bacteria, Origin of life. Lower panel, escaped transcript hypothesis: Bacteria or Eukarya or Archaea, virus, Origin of life.]

Figure 8.1. Hypotheses for virus origins. (Upper panel)

Primordial hypothesis: RNA viruses arise early in the history of

life, concomitant with evolution of first cells; dark shading for

Eukarya lineage denotes viral genetic contribution to early

evolution of Eukarya. (Lower panel) Escaped transcript

hypothesis: RNA viruses arise from mRNAs or other host-cell

RNA or DNA molecules that acquired the ability to be replicated

and packaged in a proteinaceous coat. The polygon base of the

diagram denotes early history of life before and including

evolution of first cells and horizontal transfer of genetic

material.

of all three RNA virus groups mentioned have been controversial.

Zanotto et al. (1996) found that RdRp sequences cannot

be used for simultaneous phylogenetic analysis of all RNA viruses

based on a lack of sequence similarity and reliable phylogenetic

signal, with alternative alignments and phylogenetic

methods yielding incongruent topologies and none of the hypothesized

multifamily supergroups (described below) receiving

significant support. More recently, Gibbs et al. (2000)

present analyses supporting monophyly of RdRp sequences

from the postulated alpha-like virus supergroup of single-strand

positive RNA (ss+RNA) viruses (including alfamoviruses and

closteroviruses, among others), although their analyses also do

not support simultaneous analysis of all RdRp sequences.

Previously, a single, common origin for this RdRp in all

RNA viruses had been postulated (Gorbalenya 1995), consistent

with the notion of a single origin for RNA viruses

(Strauss et al. 1996; fig. 8.2, upper panel). Analyses of RdRp

together with helicase and chymotrypsin-like proteases had

suggested that each of the three primary RNA virus genomic

classes [ss+, single-strand negative (ss–), double-strand (ds)]

represents a monophyletic group (Gorbalenya 1995). Some

researchers had suggested that dsRNA viruses originated

multiple times independently from ss+RNA viruses (Koonin

and Dolja 1993, Ward 1993), which comprise about 80%

of known RNA viruses. Others interpreted phylogenetic evidence

to suggest that dsRNA viruses gave rise to ss+RNA

viruses, which gave rise, in turn, to ss–RNA viruses (Bruenn

1991, Goldbach and De Haan 1994). There is no consensus

on this, and utility of RdRp at this level is problematic. Further,

RNA viruses had been classified into six “supergroups”

(Carmo-like, Sobemo-like, Picorna-like, Flavi-like, Alpha-like,

and Corona-like viruses), each including multiple families,

based on morphologic and genomic characteristics as

well as phylogenetic analysis of conserved protein sequences

(Gorbalenya and Koonin 1989, Gorbalenya 1995). Among the

ss+RNA viruses, the families Coronaviridae and Arteriviridae

were placed together as the only two members of the order

Nidovirales. An explicit hypothesis for phylogeny among

ss+RNA Picorna-like viruses is presented in figure 8.3, upper

panel, and among Tombusviridae taxa, in figure 8.3, lower left

panel. Among the ss–RNA viruses, four families of enveloped,

linear, nonsegmented viruses (Bornaviridae, Filoviridae, Paramyxoviridae,

and Rhabdoviridae) were placed together in

the order Mononegavirales (fig. 8.3, lower right panel).

Bornaviridae differs from the others in having a unique pattern

of mRNA processing. These high-level groupings remain

speculative.

Although both the RNA viruses and the reverse-transcribing

viruses have RNA genomes, their use of different virally

encoded polymerases (RdRp and RT, respectively) suggests

separate origins for them. However, an alternative view,

which assumes a common ancestor for RNA viruses and the

reverse-transcribing viruses, or at least their polymerases, has

been used in rooting phylogeny for RT sequences with RdRp

(e.g., Eickbush 1997). The structures of two RTs and three

RdRps have been determined, and the similarity between these

structures, in configuration and order of domains, is consistent

with the view that RNA-dependent polymerases of picornaviruses,

flaviviruses, and retroviruses share a common

ancestor (e.g., Bressanelli et al. 1999, Ago et al. 1999). However,

alignments for RdRp and RT must still be viewed cautiously

because of relatively low similarity between RT and

RdRp sequences, and the possibility that their similarity might

be due to similar functions and convergent evolution.

Reverse-Transcribing Viruses

The five families in this group (table 8.1, fig. 8.2, lower panel)

all replicate by reverse transcription and encode the enzyme

RT. All five families are thought to share common ancestry,

possibly via descent from host genomic elements with RT

known as long-terminal-repeat (LTR) retrotransposons, and

[Figure 8.2 labels. Upper panel: Origin of life, Eukarya, Bacteria, RNA viruses (ss+ RNA viruses, ds RNA viruses, ss- RNA viruses). Lower panel: Origin of life, Bacteria, Eukarya, Hepadnaviridae, Caulimoviridae, Retroviridae, Pseudoviridae, Metaviridae.]

Figure 8.2. Hypotheses for phylogeny and origins among RNA

viruses showing potential monophyly after a primordial origin

(upper panel) and reverse-transcribing viruses showing

potential monophyly and an escaped transcript origin (lower

panel). ds, double strand; ss, single strand.

to comprise a monophyletic group. Position of the root is

not known, and correspondingly, relationships among families

remain uncertain. It is also possible, however, that two

or more of the five families denote independent origins (see

Temin 1980, Xiong and Eickbush 1990, Eickbush 1997,

McClure 1999, Boeke et al. 2000). Retroviridae, Metaviridae,

and Pseudoviridae have RNA genomes, whereas

Caulimoviridae and Hepadnaviridae (including hepatitis B

virus) have DNA genomes, transcribed by host RNA polymerase,

and then reverse transcribed by the virus’s own RT.

A phylogenetic hypothesis for seven genera within the best-known

family, Retroviridae, is presented in figure 8.4, left

panel.

Phylogenetic analyses of conserved RT domains unite

an impressive array of elements, including RT from reverse-transcribing

virus families, numerous cellular and organellar

retroelements, and the cellular gene telomerase, which performs

elongation of telomeres (repeated DNA sequences

capping chromosome ends) in eukaryotes. RT analyses

rooted with RdRp indicate monophyly for a set of RT sequences

from prokaryotic and mitochondrial genomes, including

group II introns and retrons as sister groups, with

successively basal divergences for non-LTR retrotransposons,

telomerases, and LTR retrotransposons, which include

retroviruses (Eickbush 1997). Analyses excluding RdRps and

using the prokaryotic retroelements as the outgroup yield a

Figure 8.3. (Upper panel) Phylogenetic hypothesis for ss+RNA viruses of the Picorna-like supergroup based on RNA polymerase 3Dpol (Gromeier et al. 1999). Two provisional groups are unassigned to a family. (Lower left panel) Phylogenetic hypothesis for select Tombusviridae genera based on RNA polymerase. (Lower right panel) Phylogenetic hypothesis for order Mononegavirales based on RNA polymerase (Pringle and Easton 1997). Note the non-monophyly for Paramyxoviridae. Common names are given for family representatives.

[Figure 8.3 family labels. Upper panel: Comoviridae (cowpea mosaic virus), Sequiviridae (parsnip yellow fleck virus), Picornaviridae (poliovirus), Potyviridae (potato virus Y), Caliciviridae (rabbit hemorrhagic disease virus). Lower left panel: Tombusviridae (oat chlorotic stunt virus, tobacco necrosis virus). Lower right panel: Paramyxoviridae (measles virus, Sendai virus), Filoviridae (Marburg virus), Rhabdoviridae (rabies virus, vesicular stomatitis virus).]

different topology, with LTR retrotransposons and telomerases

as sister taxa and non-LTR retrotransposons as sister

to them. This difference in topology implies different scenarios

for the relative timing of origin for telomerase, retrotransposons,

and reverse-transcribing viruses. Telomerases

and non-LTR retrotransposons have similar catalytic mechanisms,

in which the 3' hydroxyl group of a DNA end is used

to prime reverse transcription. Their functional similarity is

demonstrated even more dramatically by the finding that

non-LTR retrotransposons (TART and HeT-A) appear to have

replaced telomerase for telomere replication in Drosophila

melanogaster (Levis et al. 1993). Regardless of which topology

for the vast array of RT sequences is correct, gene trees

like those described above indicate the dynamic nature of RT

and reverse-transcribing virus evolution, and the important

role of RT in evolutionary history.

DNA Viruses

The DNA viruses are a heterogeneous group. Some have

double-stranded genomes, and others have single-stranded

genomes. Some are enveloped, and others are not; some

encode polymerase, and some others do not. They vary in

size from <2 to >670 kilobases. There is no evidence indicating

monophyly for DNA viruses overall, and it appears

likely that DNA viruses have had multiple origins, possibly

via the hypothesized escaped transcript mechanism outlined

above. Like RT, all DNA-dependent DNA polymerases

(DdDps), whether from DNA viruses or from the genomes

of eukaryotes and prokaryotes, appear to have evolved from

a single common ancestor (Knopf 1998, Wang 1991). The

ordering of functional domains for these proteins appears well

conserved. However, DNA viruses with DdDp (including

phycodnaviruses, poxviruses, baculoviruses, and mycobacteriophages,

among others) are highly divergent and cannot

be linked by evidence to form a monophyletic group. Filée

et al. (2002) present phylogenetic analyses for five different

DNA polymerase families, also indicating a complex history

of lateral gene transfer among viruses, plasmids, and their

diverse hosts. Among the dsDNA viruses, three diverse

families of tailed viruses infecting bacteria (Myoviridae,

Siphoviridae, and Podoviridae) are placed together in the

order Caudovirales. The ssDNA viruses all use a protein-primed

DNA replication mechanism that is distinct from that

of other viruses. Poxviridae is an example of a large and well-known

DNA virus family with well-supported phylogenetic

structure (fig. 8.4, right panel).

Why Try to Integrate Viruses in the Tree of Life?

Efforts to determine the phylogenetic origins and subsequent

pattern of evolution for viruses, obscured as they are, can be

justified on the same basis as all Tree of Life research: we

desire a comprehensive understanding of life’s history. This

comprehensive understanding entails inclusion of all taxa,

to whatever extent possible, for two reasons: first, so all major

groups are accounted for (i.e., so the vastness of our ignorance

is appropriately exposed, and not hidden for convenience),

and second, so the record of character and

organismal change can be recovered as accurately as possible.

One of the lessons of phylogenetics is that our understanding

of the record of evolutionary change generally improves

as we integrate more taxa and more characters into our analyses.

Although most events in the long and varied evolutionary

histories for the grade we call “viruses” are unrecoverable,

viruses are not unique in this regard. As one example, paleontologists

also work with small amounts of fragmentary data

to reconstruct history based on one or a few representatives

of diverse (and in their case often extinct) clades. The unique

and significant role of viruses (see below) in the evolution of

life makes the effort of placing them in the context of the Tree

of Life particularly compelling.

Figure 8.4. (Left panel) Phylogenetic hypothesis for the seven Retroviridae genera based on RT sequences (Hunter et al. 2000,

Dimmic et al. 2002). Common names are given for genus representatives. (Right panel) Phylogenetic hypothesis for select Poxviridae

genera based on thymidine kinase DNA sequences (Moyer et al. 2000).

Reverse Transcriptase and Transition from an RNA to a DNA World

An early difficulty in studies of the origin and evolution of

life was how to explain DNA synthesis. DNAs are synthesized

with the help of enzymes, which are themselves

encoded by DNA. This leaves one wondering how those early

DNA-synthesizing proteins came into being. Beginning in the

late 1960s, a series of hypotheses and, later, discoveries

led to our current view of an early RNA world as

a precursor to our current DNA world, where all organisms

other than viruses have DNA genomes. The ribonucleotides

in RNA were found to be more readily synthesized than the

deoxyribonucleotides in DNA, and most important, some

RNAs (ribozymes) were indeed capable of self-replication.

The finding that RNAs are less stable over time than DNAs

provided the underlying pressure for natural selection to

effect a change from RNA to DNA as the heritable material

for storing information that encodes organisms. RT is the only

known enzyme capable of synthesizing DNA from RNA

templates and has apparently played a pivotal role in the

transition between RNA and DNA worlds. This enzyme is

the defining feature of the reverse-transcribing viruses

(table 8.1) and for a larger, encompassing group of genetic

elements (retroids, e.g., retrons, retrotransposons, retroplasmids).

As a consequence, understanding the history of RT

evolution, in the reverse-transcribing families of viruses (table

8.1) and other retroids, gives us a fuller picture of the capabilities

and past activities of this apparently seminal agent.

The extent to which retroids have been involved in ancient

and recent events of genome evolution is just beginning to

be assessed (e.g., McClure 1999, Moran et al. 1999, Kidwell

and Lisch 2000).

Viruses and Eukaryotic Genomes

Phylogeneticists are silent regarding diversification among

RNA world entities, because none survive as such, with the

possible exception of some RNA viruses, as mentioned above.

The three extant, primary lineages of DNA-based organisms

are recognized as Bacteria, Archaea, and Eucarya (Woese

1987). Hypotheses regarding the origin of eukaryotic cells

generally invoke symbioses between eubacterial and

methanogenic archaeal taxa (e.g., Lake and Rivera 1994,

Martin and Muller 1998, Moreira and Lopez-Garcia 1998),

although this view has been questioned recently, with emphasis

given to “communal” genomic evolution and horizontal

gene transfer as a primary force (Woese 2002). There is

limited evidence suggesting a possible role for horizontal gene

transfer from some dsDNA viruses, in the early evolution of

Eucarya. Phylogenetic evidence suggesting a viral contribution

to eukaryotic cellular evolution entails finding of sister

relationships for orthologous viral and eukaryotic (nuclear)

genes, which are preceded by divergences among virus

orthologs. Such interpretations are, of course, critically

dependent on assumptions regarding position of the phylogenetic

root. For example, combined analyses of guanyltransferases

and related ATP-dependent ligases from diverse

Poxviridae and Asfarviridae taxa (e.g., African swine fever

virus) and diverse cellular organisms (including the eukaryotes Homo and

Saccharomyces and the archaeon Methanococcus) support earlier divergence among

virus orthologs relative to divergence among eukaryotic

orthologs (Bell 2001). Similar phylogenetic patterns have

been found for various DNA polymerases (Knopf 1998,

Villarreal 1999), DNA topoisomerase (Garcia-Beato et al.

1992), and possibly RNA polymerase large subunit (Sonntag

and Darai 1996). Similar phylogenetic patterns relating these

viral and eukaryotic sets of orthologs are consistent with a

common evolutionary history for each set, and their presence

in an ancestral virus, possibly residing within an archaeal

host, before the emergence of eukaryotes. Horizontal transfer

can be multidirectional, and phylogenetic analyses are

revealing instances of eukaryotic gene capture by viruses as

well (e.g., Hughes 2002). As more eukaryotic genes and genomes

are sequenced, more evidence for past colonization

events by viruses is coming to light (especially for retroviruses;

e.g., Dimcheff et al. 2000).
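
The gene-tree pattern described above can be stated as a simple check on a rooted tree: the eukaryotic orthologs form a clade, while the viral orthologs do not, because some viral lineages diverge before the eukaryotic clade. The Python sketch below is illustrative only. The nested-tuple tree encoding, the function names, and the tip labels (virusA, euk1, and so on) are assumptions made for this example and are not taken from Bell (2001) or any other cited analysis. The two trees share one unrooted topology and differ only in root position, which shows why the interpretation depends on where the root is placed.

```python
def tips(tree):
    """Return the set of tip labels below a node (a tip is a plain string)."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(tips(child) for child in tree))


def clades(tree):
    """Yield the tip set of every internal node of a rooted tree."""
    if isinstance(tree, str):
        return
    yield frozenset(tips(tree))
    for child in tree:
        yield from clades(child)


def is_clade(tree, group):
    """True if `group` is exactly the tip set of some internal node."""
    return frozenset(group) in set(clades(tree))


def eukaryotes_nested_in_viruses(tree, viral, eukaryotic):
    """True if eukaryotic tips form a clade while viral tips do not."""
    return is_clade(tree, eukaryotic) and not is_clade(tree, viral)


# Hypothetical ortholog labels, invented for illustration only.
viral = {"virusA", "virusB", "virusC"}
eukaryotic = {"euk1", "euk2"}

# Root placed among the viral orthologs: viral lineages diverge first.
rooted_among_viruses = ("virusA", ("virusB", ("virusC", ("euk1", "euk2"))))

# Same unrooted topology, root placed between viral and eukaryotic genes.
rooted_between_groups = (("euk1", "euk2"), ("virusC", ("virusA", "virusB")))

nested_a = eukaryotes_nested_in_viruses(rooted_among_viruses, viral, eukaryotic)
nested_b = eukaryotes_nested_in_viruses(rooted_between_groups, viral, eukaryotic)
print(nested_a, nested_b)
```

Under the first rooting the check succeeds, and under the second it fails, although the branching relationships among the five genes are otherwise identical.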

Applications to Individual and Public Health

Traditionally, viral pathogens are identified on the basis of

disease symptoms and in the context of epidemiological

(population) analyses. However, as molecular sequencing

becomes routine and databases grow, rapid identification of

viral isolates can often be done based on explicit sequence

comparisons of unknown isolates with known sequences.

Quick characterizations based on presence or absence of

particular sequences often suffice for basic diagnosis, but

phylogenetic analyses allow much greater detail. For some

viruses, phylogenetic identification is particularly important

for identifying particular strains or subtypes (as for HIV-1)

having a small number of unique changes that can underlie

significant differences in virulence, transmissibility, drug

resistance, or other traits of interest. Further, phylogenetic

analyses ensure that identification is based on evolutionary

relatedness rather than just similarity, which can reflect convergence.

Thus, having virus phylogenies available, in as

much detail as possible, helps in rapid, accurate identification

of unknown viral isolates and in understanding of the

health risks and preventative measures that might be taken.
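
The quick, similarity-based characterization mentioned above can be sketched in a few lines of Python. The example computes the proportion of differing sites (p-distance) between an unknown isolate and a set of aligned reference sequences and reports the closest reference. The sequences, the subtype labels, and the function names are hypothetical and chosen only for illustration. This is a similarity match, not a phylogenetic analysis, so it is subject to the caveat about convergence noted in the text.

```python
def p_distance(a, b):
    """Proportion of aligned sites that differ, skipping sites with a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def closest_reference(query, references):
    """Return (name, distance) for the reference nearest to the query."""
    name = min(references, key=lambda n: p_distance(query, references[n]))
    return name, p_distance(query, references[name])


# Hypothetical aligned fragments, invented for illustration only.
references = {
    "subtype_A": "ATGGCTAGCTAGGATCCA",
    "subtype_B": "ATGGCTTGCTAGCATCCA",
    "subtype_C": "ATGACTAGGTAGGATTCA",
}
unknown = "ATGGCTTGCTAGCATCGA"

best, dist = closest_reference(unknown, references)
print(best, round(dist, 3))
```

A phylogenetic identification would instead place the unknown sequence on a tree estimated from the same alignment, so that the assignment reflects inferred relatedness rather than raw similarity.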

We can better understand a virus epidemic’s origin and

work more effectively to reduce future epidemics, if we understand

the pathogen’s phylogenetic history, host species

range, and the geographic ranges of both host and pathogen.

For example, understanding the phylogeny of Lentivirus

taxa, including HIV and other primate immunodeficiency

viruses (e.g., Sharp et al. 2001), informs us about the importance

of avoiding direct contact with blood or other infected

tissues from other primates, particularly chimpanzees harboring

a closely related SIV (simian immunodeficiency virus).

Detailed phylogeny for HIV-1 taxa helps in tracking the

spread of the most virulent lineages and understanding

which sequence-level changes are associated with enhanced

transmissibility and virulence, and which particular sequence

sites are subject to accelerated rates of change due to selection

pressure imposed by hosts’ immune systems. Similarly,

understanding the phylogenetic position for West Nile virus

(Flaviviridae) can potentially help in determining the

source and the cause for its recent spread to the Western

Hemisphere as well as its history of change (e.g., Anderson

et al. 2001). Accurate phylogeny for pathogens is important

in understanding any zoonosis (disease transmitted from

nonhuman to human hosts). If we can determine phylogeny

for the viral lineages we can potentially infer the molecular

changes that are associated with cross-species transmission and

increased virulence and can potentially enhance remediation

efforts, including, in some cases, development of antiviral

medications.

Phylogeny can contribute to improved vaccine development,

because identification of viruses best suited for development

of host immunity generally entails choice of the same

lineage as, or one closely related to, that in circulation. Information

on relatedness is also relevant in constructing

chimeric (recombinant) virus vaccines. Attenuated (weakened)

chimeric viruses used as vaccines may include the genes

whose products elicit development of the desired antibodies,

as well as including other sequence regions bearing

mutations that keep the virus benign. Further, consensus

sequences or even phylogenetically inferred ancestral sequences

could be used in vaccine design to minimize the

differences between engineered vaccine strains and diverse

strains in circulation (e.g., Gaschen et al. 2002).
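
The rationale for a consensus sequence can be illustrated with a small Python sketch. It builds a majority-rule consensus from aligned sequences and compares its mean distance to the circulating strains with that of each individual strain. The peptide fragments and function names are hypothetical, invented for illustration, and do not come from Gaschen et al. (2002). Inferring an ancestral sequence would additionally require a phylogeny and a substitution model, which this sketch does not attempt.

```python
from collections import Counter


def consensus(aligned):
    """Majority-rule consensus: the most common residue in each column."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


def hamming(a, b):
    """Number of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def mean_distance(candidate, strains):
    """Average number of differences between a candidate and all strains."""
    return sum(hamming(candidate, s) for s in strains) / len(strains)


# Hypothetical aligned peptide fragments, each with one private change.
strains = [
    "MRVLAAGIT",
    "MKVLSAGIT",
    "MKVLAAGIS",
    "MKILAAGIT",
    "MKVLAAGVT",
]

cons = consensus(strains)
cons_mean = mean_distance(cons, strains)
best_strain_mean = min(mean_distance(s, strains) for s in strains)
print(cons, cons_mean, best_strain_mean)
```

In this toy data set the consensus differs from every strain at one site on average, whereas the best single strain differs from the set at 1.6 sites on average.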

Recent work on wildlife infectious diseases indicates that

the majority are viral in origin and that their spread into new

wildlife species is often mediated by human disturbances

(Dobson and Foufopoulos 2001). Understanding the virus

phylogeny can help inform enlightened management practices.

This may include reducing human disturbances that

foster cross-species transmission for viruses related to the

known pathogen, restricting introductions of species associated

with viruses closely related to those known to cross

host-species boundaries, and restricting the handling of live

individuals or of tissues harboring similarly related viruses.

Gene therapy is a novel form of molecular medicine attempting

to correct genetic disorders and inhibit disease

progression. Functional copies of human genes are inserted

into viral expression vectors and carried by them into cells,

where they are integrated into the host’s genome or maintained

as autonomous units (Pfeifer and Verma 2001). The

potential exists to influence the outcome of many diseases,

ranging from birth defects, to cancer, to neurological disorders.

Most work to date has focused on a small set of animal

viruses, including SV40 (Polyomaviridae), murine leukemia

virus, HIV (Retroviridae), adenovirus (Adenoviridae), and

adeno-associated virus (Parvoviridae). As suitable viruses and

viral components are identified, knowledge of their phylogenetic

relationships may crucially inform the search for

additional candidates, given that the desired traits are more

likely to be shared with closely related groups than with distantly

or unrelated groups.

Outlook

The problems faced by biologists working on the origins and

phylogeny of viruses are severe and quantitatively, although

not qualitatively, different from those faced by systematists

working on other taxa. The two primary challenges may be

summarized as (1) identifying as many homologous traits

(Mindell and Meyer 2001) as possible for comparisons

among viruses and between viruses and other organisms, and

(2) identifying recombination among lineages and its role in

diversification of taxa. Shortages of homologous characters

are inherent in the study of viruses, because of small genome

sizes, apparent independent origins for multiple groups,

rapid rates of sequence evolution (for RNA viruses in particular)

confounding alignments, and high levels of viral lineage

extinction. Frequent recombination is also inherent

among and within viral lineages, stemming from the ability

of multiple viruses to coinfect individual host cells and their

general capacity for dramatic change. Although problematic

for systematists, the capability for recombination is a key

feature in the evolutionary success of viruses. One form of

recombination (reassortment) is particularly well known as

a successful strategy for influenza A viruses (Orthomyxoviridae),

mixing genome segments from different parental

lineages in progeny, yielding novel genotypes not recognized

by hosts’ immune systems. Recombination among viral lineages,

due to template switching, is also common in the proliferation

and spread of HIV-1 among human populations

(Robertson et al. 1995) and dengue fever viruses (Flaviviridae)

as well (Worobey et al. 1999).
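
One way to see how recombination by template switching leaves a detectable signal is a sliding-window comparison: a recombinant sequence resembles one parental lineage over part of its length and another parental lineage over the rest. The Python sketch below is a simplified illustration under stated assumptions. The parental sequences are invented, the recombinant is constructed by joining them at a known position, and the function names and window settings are hypothetical. It is not the method used in the studies cited above.

```python
def window_similarity(query, parent, start, size):
    """Fraction of identical sites between query and parent in one window."""
    q = query[start:start + size]
    p = parent[start:start + size]
    return sum(x == y for x, y in zip(q, p)) / len(q)


def closest_parent_by_window(query, parents, size, step):
    """For each window, record (start, name of the most similar parent)."""
    calls = []
    for start in range(0, len(query) - size + 1, step):
        best = max(
            parents,
            key=lambda n: window_similarity(query, parents[n], start, size),
        )
        calls.append((start, best))
    return calls


def breakpoints(calls):
    """Window starts at which the most similar parent changes."""
    return [
        calls[i][0]
        for i in range(1, len(calls))
        if calls[i][1] != calls[i - 1][1]
    ]


# Hypothetical aligned parental sequences, invented for illustration only.
parent_x = "ATGGCATTAGCCGATACGTTAGCA"
parent_y = "ATGACGTTGGCTGACACCTTGGCG"

# Simulated template switch after site 12: first half from X, rest from Y.
recombinant = parent_x[:12] + parent_y[12:]

parents = {"parent_X": parent_x, "parent_Y": parent_y}
calls = closest_parent_by_window(recombinant, parents, size=6, step=6)
switches = breakpoints(calls)
print(calls, switches)
```

The change in the most similar parent between adjacent windows marks the approximate breakpoint. With real data, windows are rarely identical to either parent, and the signal must be assessed statistically.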

As a consequence of these inherent difficulties, much of

the complex evolutionary history for viruses is unrecoverable.

However, in assembling the Tree of Life, we seek a

maximally comprehensive understanding of life’s history,

which means that all life forms, including viruses, must still

be considered. Continued study of virus evolution has important

applied uses as well, for individual health, public

health, and environmental health. Despite limitations, increasingly

sophisticated methods for sequence alignments

and phylogenetic analyses, combined with an expanding

molecular sequence database for diverse viral taxa, will allow

systematists to improve resolution of some, although by

no means all, ancient relationships. Secondary and tertiary

structures of proteins are a promising source of conserved

characters, and additional phylogenetic insights for ancient

events are likely to be found as structural databases grow and

are used in comparative analyses. Increased understanding

of viral history, for both virus lineages and virus genes, has

begun and will continue to transform our view of the shape,

the shaping, and the interconnectedness of the Tree of Life.

Finally, we can ask how well the “tree of life” metaphor,

coined by Darwin, describes complex virus histories that

include recombination among lineages, occasional horizontal

transfer of genes with hosts, and possible origination from

sets of escaped genetic characters (rather than the usual mode

of whole organismal population divergence and lineage splitting).

Trees as phylogenetic diagrams give the impression of

organismal diversification resulting from a series of nearly

instantaneous lineage bifurcations, with single lines dividing

neatly into two, and continuing in splendid genetic isolation

from each other. Although there are many well-defined

monophyletic viral groups, one can only conclude that the

overall fit of the metaphor is poor. Nonetheless, the metaphor

of the Tree of Life is useful and deeply entrenched in

biological discourse, even if simplistic or misleading in some

ways. Interestingly, before settling on the phrase “tree of life,”

Darwin wrote of a “coral of life” (Barrett et al. 1987; see Gould

2002). With occasional connections among branches for

some forms, corals may provide a better depiction of viral

origins and diversification.

Acknowledgments

We thank Eddie Holmes for valuable comments on an earlier

draft of the manuscript, and we thank the editors of this book

for their willingness to try something new. D.P.M. was supported

by National Science Foundation grant DBI 9974525.

Literature Cited

Ago, H., T. Adachi, A. Yoshida, M. Yamamoto, N. Habuka,

K. Yatsunami, and M. Miyano. 1999. Crystal structure of

the RNA-dependent RNA polymerase of hepatitis C virus.

Structure 7:1417–1426.

Anderson, J. F., C. R. Vossbrinck, T. G. Andreadis, A. Iton,

W. H. Beckwith, and D. R. Mayo. 2001. A phylogenetic

approach to following West Nile virus in Connecticut. Proc.

Natl. Acad. Sci. USA 98:12885–12889.

Baltimore, D. 1971. Expression of animal virus genomes.

Bacteriol. Rev. 35:235–241.

Barrett, P. H., P. J. Gautrey, S. Herbert, D. Kohn, and S. Smith.

1987. Charles Darwin’s notebooks, 1836–1844. Cambridge

University Press, Cambridge.

Bell, P. J. L. 2001. Viral eukaryogenesis: was the ancestor of the

nucleus a complex DNA virus? J. Mol. Evol. 53:251–256.

Boeke, J. D., T. H. Eickbush, S. B. Sandmeyer, and D. F. Voytas.

2000. Pp. 349–357 in Virus taxonomy (M. H. V. Van

Regenmortel, C. M. Fauquet, D. H. L. Bishop, E. B.

Carstens, M. D. Estes, S. M. Lemon, J. Maniloff, M. A.

Mayo, D. J. McGeoch, C. R. Pringle, and R. B. Wickner,

eds.). Academic Press, San Diego.

Bressanelli, S., L. Tomei, A. Roussel, I. Incitti, R. L. Vitale,

M. Mathieu, R. De Francesco, and F. A. Rey. 1999. Crystal

structure of the RNA-dependent RNA polymerase of

hepatitis C virus. Proc. Natl. Acad. Sci. USA 96:13034–13039.

Bruenn, J. A. 1991. Relationships among the positive strand and

double-strand RNA viruses as viewed through their RNA-dependent

RNA polymerases. Nucleic Acids Res. 19:217–

226.

DeFilippis, V. R., and L. P. Villarreal. 2001. Virus evolution.

Pp. 353–370 in Fields virology, 4th ed. (D. M. Knipe and

P. M. Howley, eds.), vol. 1. Lippincott, Williams and

Wilkins, New York.

Dimcheff, D. E., S. V. Drovetski, M. Krishnan, and D. P.

Mindell. 2000. Cospeciation and horizontal transmission of

avian sarcoma and leukosis virus gag genes in galliform

birds. J. Virol. 74:3984–3995.

Dimmic, M. W., J. S. Rest, D. P. Mindell, and R. A. Goldstein.

2002. rtREV: a substitution matrix for inference of retrovirus

and reverse transcriptase phylogeny. J. Mol. Evol.

55:65–73.

Dobson, A., and J. Foufopoulos. 2001. Emerging infectious

pathogens of wildlife. Philos. Trans. R. Soc. Lond. B

356:1001–1012.

Eickbush, T. H. 1997. Telomerase and retrotransposons: which

came first? Science 277:911–912.

Eigen, M. 1971. Self-organization of matter and the evolution of

biological macromolecules. Naturwissenschaften 58:465–

523.

Filée, J., P. Forterre, T. Sen-Lin, and J. Laurent. 2002. Evolution

of DNA polymerase families: evidences for multiple gene

exchange between cellular and viral proteins. J. Mol. Evol.

54:763–773.

Garcia-Beato, R., J. M. P. Freije, C. Lopez-Otin, R. Blasco,

E. Vinuela, and M. L. Salas. 1992. A gene homologous to

topoisomerase II in African swine fever virus. Virology

188:938–947.

Gaschen, B., J. Taylor, D. Yusim, B. Foley, F. Gao, D. Lang,

V. Novitsky, B. Haynes, B. Hahn, T. Bhattacharya, and

B. Korber. 2002. Diversity considerations in HIV-1 vaccine

selection. Science 296:2354–2360.

Gibbs, M. J., R. Koga, H. Moriyama, P. Pfeiffer, and T. Fukuhara.

2000. Phylogenetic analysis of some large double-stranded

RNA replicons from plants suggests they evolved from a

defective single-stranded RNA virus. J. Gen. Virol. 81:227–

233.

Goldbach, R., and P. De Haan. 1994. RNA viral supergroups

and the evolution of RNA viruses. Pp. 105–119 in The

evolutionary biology of viruses (S. S. Morse, ed.). Raven

Press, New York.

Gorbalenya, A. E. 1995. Origin of RNA viral genomes: approaching

the problem by comparative sequence analysis.

Pp. 49–66 in Molecular basis of virus evolution (A. J. Gibbs,

C. H. Calisher, and F. Garcia-Arenal, eds.). Cambridge

University Press, Cambridge.

Gorbalenya, A. E., and E. V. Koonin. 1989. Viral proteins

containing the purine NTP-binding sequence pattern.

Nucleic Acids Res. 17:8413–8440.

Gould, S. J. 2002. The structure of evolutionary theory. Harvard

University Press, Cambridge, MA.

Gromeier, M., E. Wimmer, and A. E. Gorbalenya. 1999.

Genetics, pathogenesis and evolution of picornaviruses.

Pp. 287–343 in Origin and evolution of viruses (E. Domingo,

R. G. Webster, and J. J. Holland, eds.). Academic Press,

San Diego.

Holland, J. J., and E. Domingo. 1998. Origin and evolution of

viruses. Virus Genes 16:13–21.

Hughes, A. L. 2002. Origin and evolution of viral interleukin-10

and other DNA virus genes with vertebrate homologues. J.

Mol. Evol. 54:90–101.

Hunter, E., J. Casey, B. Hahn, M. Hayami, B. Korber, R. Kurth, J.

Neil, A. Rethwilm, P. Sonigo, and J. Stoye. 2000.

Pp. 369–387 in Virus taxonomy (M. H. V. Van Regenmortel,

C. M. Fauquet, D. H. L. Bishop, E. B. Carstens, M. D. Estes,

S. M. Lemon, J. Maniloff, M. A. Mayo, D. J. McGeoch, C. R.

Pringle, and R. B. Wickner, eds.). Academic Press, San Diego.

Joyce, G. F. 2002. The antiquity of RNA-based evolution.

Nature 418:214–221.

Kidwell, M. G., and D. R. Lisch. 2000. Transposable elements

and host genome evolution. Trends Ecol. Evol. 15:95–99.

Knopf, C. W. 1998. Evolution of viral DNA-dependent

polymerases. Virus Genes 16:47–58.

Koonin, E. V., and V. V. Dolja. 1993. Evolution and taxonomy

of positive-strand RNA viruses: implications of comparative

analysis of amino acid sequences. Crit. Rev. Biochem.

Mol. Biol. 28:375–430.

Lake, J. A., and M. C. Rivera. 1994. Was the nucleus the 1st

endosymbiont? Proc. Natl. Acad. Sci. USA 91:2880–2881.

Levis, R. W., R. Ganesan, K. Houtchens, L. A. Tolar, and

F. Sheen. 1993. Transposons in place of telomeric repeats at

a Drosophila telomere. Cell 75:1083–1093.

Lwoff, A. 1957. The concept of virus. J. Gen. Microbiol.

17:239–253.

Martin, W., and M. Muller. 1998. The hydrogen hypothesis for

the first eukaryote. Nature 392:37–41.

McClure, M. A. 1999. The retroid agents: disease, function and

evolution. Pp. 163–195 in Origin and evolution of viruses

(E. Domingo, R. G. Webster, and J. J. Holland, eds.).

Academic Press, San Diego.

Mindell, D. P., and A. Meyer. 2001. Homology evolving. Trends

Ecol. Evol. 16:434–440.

Moran, J. V., R. J. DeBerardinis, and H. H. Kazazian. 1999. Exon

shuffling by L1 retrotransposition. Science 283:1530–1534.

Moreira, D., and P. Lopez-Garcia. 1998. Symbiosis between

methanogenic archaea and delta-proteobacteria as the origin

of eukaryotes: the syntrophic hypothesis. J. Mol. Evol.

47:517–530.

Moyer, R. W., B. M. Arif, D. N. Black, D. B. Boyle, R. M. Buller,

K. R. Dumbell, J. J. Esposito, G. McFadden, B. Moss, A. A.

Mercer, et al. 2000. Poxviridae. Pp. 137–157 in Virus

taxonomy (M. H. V. Van Regenmortel, C. M. Fauquet,

D. H. L. Bishop, E. B. Carstens, M. D. Estes, S. M. Lemon,

J. Maniloff, M. A. Mayo, D. J. McGeoch, C. R. Pringle, and

R. B. Wickner, eds.). Academic Press, San Diego.

Pfeifer, A., and I. M. Verma. 2001. Virus vectors and their

applications. Pp. 469–491 in Fields virology, 4th ed. (D. M.

Knipe and P. M. Howley, eds.), vol. 1. Lippincott, Williams

and Wilkins, New York.

Pringle, C. R., and A. J. Easton. 1997. Monopartite negative

strand RNA genomes. Semin. Virol. 8:49–57.

Robertson, D. L., P. M. Sharp, F. E. McCutchan, and B. H. Hahn.

1995. Recombination in HIV-1. Nature 374:124–126.

Robertson, H. D. 1992. Replication and evolution of viroid-like

pathogens. Curr. Top. Microbiol. Immunol. 176:214–219.

Robertson, H. D., and O. D. Neel. 1999. Virus origins: conjoined

RNA genomes as precursors to DNA genomes. Pp. 25–35 in

Origin and evolution of viruses (E. Domingo, R. Webster,

and J. Holland, eds.). Academic Press, San Diego.

Sharp, P. M., E. Bailes, R. R. Chaudhuri, C. M. Rodenburg,

M. O. Santiago, and B. H. Hahn. 2001. The origins of

acquired immune deficiency syndrome viruses: where and

when? Philos. Trans. R. Soc. Lond. B 356:867–876.

Sonntag, K-C., and G. Darai. 1996. Evolution of viral DNA-dependent

RNA polymerases. Virus Genes 11:271–284.

Strauss, E. G., J. H. Strauss, and A. J. Levine. 1996. Virus

evolution. Pp. 153–171 in Fields virology, 3rd ed. (B. N.

Fields, D. M. Knipe, P. M. Howley, et al., eds.). Lippincott-Raven,

Philadelphia.

Temin, H. M. 1980. Origin of retroviruses from cellular

moveable genetic elements. Cell 21:599–600.

Van Regenmortel, M. H. V. 2000. Introduction to the species

concept in virus taxonomy. Pp. 3–16 in Virus taxonomy

(M. H. V. Van Regenmortel, C. M. Fauquet, D. H. L. Bishop,

E. B. Carstens, M. D. Estes, S. M. Lemon, J. Maniloff, M. A.

Mayo, D. J. McGeoch, C. R. Pringle, and R. B. Wickner,

eds.). Academic Press, San Diego.

Van Regenmortel, M. H. V., C. M. Fauquet, D. H. L. Bishop,

E. B. Carstens, M. D. Estes, S. M. Lemon, J. Maniloff, M. A.

Mayo, D. J. McGeoch, C. R. Pringle, and R. B. Wickner,

(eds.). 2000. Virus taxonomy. Academic Press, San Diego.

Villarreal, L. P. 1999. DNA virus contribution to host evolution.

Pp. 391–420 in Origin and evolution of viruses (E. Domingo,

R. Webster, and J. Holland, eds.). Academic Press, New York.

Wang, T. S. 1991. Eukaryotic DNA polymerases. Annu. Rev.

Biochem. 60:513–552.

Ward, C. W. 1993. Progress towards a higher taxonomy of

viruses. Res. Virol. 144:419–453.

Woese, C. R. 1987. Bacterial evolution. Microbiol. Rev. 51:221–

271.

Woese, C. R. 2002. On the evolution of cells. Proc. Natl. Acad.

Sci. USA 99:8742–8747.

Worobey, M., and E. C. Holmes. 1999. Evolutionary aspects of

recombination in RNA viruses. J. Gen. Virol. 80:2535–

2543.

Worobey, M., A. Rambaut, and E. C. Holmes. 1999. Widespread

intra-serotype recombination in natural populations

of dengue virus. Proc. Natl. Acad. Sci. USA 96:7352–7357.

Xiong, Y., and T. H. Eickbush. 1990. Origin and evolution of

retroelements based upon their reverse transcriptase

sequences. EMBO J. 9:3353–3362.

Zanotto, P. M. de A., M. J. Gibbs, E. A. Gould, and E. C.

Holmes. 1996. A reevaluation of the higher taxonomy of

viruses based on RNA polymerases. J. Virol. 70:6083–6096.