Section III Genetics
Chapter 9 Mendelian and Non-Mendelian Characters
Introduction
Alleles
The Basics of Mendelian Inheritance
Codominance, Incomplete Dominance, Overdominance, and Underdominance
Epistasis
Quantitative Trait Loci
Recombination and Linkage
Non-Mendelian Traits
Key Points
Additional Readings

Chapter 10 Population Genetics
Introduction
Hardy–Weinberg Equilibrium
Population Size
Life Histories
Modes of Reproduction
Key Points
Additional Readings

Chapter 11 Alleles through Time
Introduction
Natural Selection
Levels of Selection
Random Genetic Drift
Mating and Dispersal
Gene Flow
Other Factors Affecting Allelic Proportions
Key Points
Additional Readings

Chapter 12 Changes to DNA
Introduction
Classes of Mutations
Causes of Mutations
Mutation during Replication
DNA Repair
Genetic Recombination
Key Points
Additional Readings
Chapter 13 Infectious Changes to DNA: Viruses, Plasmids, Transposons, and Introns
Introduction
Integration into Chromosomes
Viruses
Introns
Transposable Elements
Plasmids
Key Points
Additional Readings

Section IV Multicellularity
Chapter 14 Multigene Families
Introduction
Ribosomal RNA Gene Family
Globin Gene Family
Bacterial Flagella Gene Family
Laccase Gene Family
Histone Gene Family
Orthologs and Paralogs
Polyploidization and Multigene Family Evolution
Key Points
Additional Readings

Chapter 15 Horizontal Gene Transfer
Introduction
Plasmids
Viruses
Symbionts and Organelles
Parasites and Pathogens
Origin of Gram-Negative Bacteria
Signs of HGT
Introns
Key Points
Additional Readings

Chapter 16 Development: Part I—Cooperation among Cells
Introduction
Quorum Sensing
Development in Animals
Nematode Development
Homeotic Genes and Proteins
Arthropod Development
Development in Vertebrates
Hierarchy and Evolution of Homeotic Genes
Key Points
Additional Readings
Chapter 17 Development: Part II—Plants
Introduction
Plant Morphology
Development in Plants
Gene Expression during Development
Formation of Leaves and Floral Organs
Plants versus Animals
Key Points
Additional Readings

Chapter 18 Cancer
Introduction
Progression of Cancer
Genes Involved in Cancer
Types of Cancer
Causes of Mutations in Carcinogenesis
  Point Mutations
  Recombination
  Amplification
  Viruses
  DNA Viruses
Hormones
Key Points
Additional Readings

Section V Molecular Biology and Bioinformatic Methods

Chapter 19 Extraction and Quantification of Biological Molecules
Introduction
Extraction of Nucleic Acids Using CTAB
Purification of Organellar DNA
Extraction of RNA
Quantification of Nucleic Acids
Agarose Gel Electrophoresis
Extraction of Proteins
Quantification of Proteins
Polyacrylamide Gel Electrophoresis
Key Points
Additional Readings

Chapter 20 Recombinant DNA and Characterization of Biological Molecules
Introduction
Polymerase Chain Reaction
Recombinant DNA Methods
Southern Hybridization
Determination of Gene Copy Number
Microscopy
Protein Analysis
Key Points
Additional Readings

Chapter 21 Sequencing and Alignment Methods
Introduction
Development of DNA Sequencing Methods
High-Throughput Technologies
Next-Generation Sequencing
Protein Sequencing
Sequence Homology Searches
Aligning Sequences
Key Points
Additional Readings

Chapter 22 Omics: Part I
Introduction
Genomics
Transcriptomics
Metagenomics/Metatranscriptomics
Microbiomics
Key Points
Additional Readings

Chapter 23 Omics: Part II
Introduction
Proteomics
Structural Genomics
RNAomics
Epigenomics
Metabolomics
Functional Genomics
Key Points
Additional Readings

Chapter 24 Species Concepts and Phylogenetics
Introduction
What Is a Species?
Classification of Life
Reconstruction of Evolutionary History
Phylogenetics
Tree Terminology
Choosing a Genomic Region for Phylogenetics
Other Considerations When Performing Phylogenetic Analyses
Models of Mutation
Analyzing Aligned Sequences
Unweighted Pair Group Method with Arithmetic Mean
Neighbor Joining
Maximum Parsimony
Maximum Likelihood
Bayesian Phylogenetic Analysis
Bootstrapping
Vertical versus Horizontal Evolutionary Events
Key Points
Additional Readings

Chapter 25 Phylogenetic Networks and Reticulate Evolution
Introduction
Phylogenetic Analyses of Reticulate Events
Advantages of Phylogenetic Networks
Horizontal Gene Transfers
Species Hybridization
Recombination
Transposition
Reassortment
Examples of Reticulate Evolution Events
Key Points
Additional Readings

Chapter 26 Phylogenomics and Comparative Genomics
Introduction
Improvements in Sequencing and Phylogenomics
What to Compare
Single-Nucleotide Polymorphisms
Microsatellites and Minisatellites
How to Compare
Testing for Selection
Incongruent Trees
Comparative Genomics
Synteny
Key Points
Additional Readings
Section VI Genomes

Chapter 27 RNA Viruses
Introduction
C-Value Paradox
Genomes and Genomics
RNA Virus Genomes
Human Immunodeficiency Virus
Influenza A Virus
Ebola Virus
Key Points
Additional Readings

Chapter 28 DNA Viruses
Introduction
Bacteriophage ϕX174
Bacteriophage Lambda (λ)
Bacteriophage T4
Mimivirus
Key Points
Additional Readings

Chapter 29 Bacteria and Archaea
Introduction
Escherichia coli
Photosynthetic Bacteria
Aquifex
Euryarchaeota
Crenarchaeota
Key Points
Additional Readings

Chapter 30 Mutualists and Pathogens
Introduction
Termite Gut Microbes
Smallest Bacterial Genome
Coresident Symbionts
Animal Parasite
Genome Mixing and Sorting
Key Points
Additional Readings

Chapter 31 Endosymbionts and Organelles
Introduction
Intracellular Endosymbionts
Mitochondria
How Many Genes Make a Functional Mitochondrion?
Chloroplasts
How Many Genes Make a Functional Chloroplast?
Differential Development and Function
Chimeric Pathways
Endosymbioses Leading to Other Organelles
Key Points
Additional Readings
Chapter 32 Protein Trafficking
Introduction
Signal Peptides in Bacteria
Signal Peptide Systems in Eukarya
Protein Trafficking in Mitochondria
Protein Trafficking in Chloroplasts
Evolution of Protein Trafficking Systems
Key Points
Additional Readings

Chapter 33 Eukaryotic Genomes
Introduction
Origin of the Nucleus and Mitochondrion
Multicellularity
Chromalveolata
Opisthokonta
  Saccharomyces cerevisiae
  Caenorhabditis elegans
  Drosophila melanogaster
Archaeplastida
  Arabidopsis thaliana
  Oryza sativa
Key Points
Additional Readings

Chapter 34 Human Genome
Introduction
The Human Genome
Medical Genetics
Single-Nucleotide Polymorphisms
Forensics
Human Migration
Key Points
Additional Readings
Preface
Like the first edition, this book was written with students in mind. It is meant to introduce the major topics of molecular evolution in a way that will encourage students to delve deeper into each of them. The book started as a series of notes, overheads, and digital slides that made up a course in molecular evolution. The course was designed as an integrated approach to this field, for which there was no single textbook available. It draws from concepts in evolution, geology, chemistry, biochemistry, molecular biology, genetics, taxonomy, bioinformatics, various OMICS fields, and, of course, molecular evolution. Because it discusses aspects of each of these disciplines, students with broad backgrounds (as well as those with very focused backgrounds) should be able to grasp the concepts, principles, and details of this book. It presents some of the usual information regarding various aspects of cell function, but also details the variety of mechanisms that have evolved. This has been done to present a broader view of evolution, one meant to show that the diversity of species has approached some processes in very different ways during evolution on Earth.
Although the first edition was organized into 18 chapters, including 197 figures, the second edition has been substantially expanded into 34 chapters, with 413 figures, essentially doubling the size of the first edition. It has been divided into six sections. Section I, Life and Evolution, covers the topics of evolution on Earth, prebiotic production of organic molecules, and definitions of life. Section II, Biomolecules, details the structures and functions of biological molecules, as well as the basic mechanisms that produce the molecules. Section III, Genetics, presents the basic genetic mechanisms that lead to the evolution of genomes and organisms. Section IV, Multicellularity, outlines the basic mechanisms of cell-to-cell communication and other processes that have led to the evolution of developmental processes in multicellular organisms. Section V, Molecular Biology and Bioinformatic Methods, consists of overviews of some of the methods used in molecular biological and bioinformatics research. Section VI, Genomes, surveys a set of genomes that together illustrate some of the important aspects of genome evolution in general.
Chapters 1 through 5 (included in Sections I and II) are nearly identical to the first five chapters in the first edition. Chapter 1 is an overview of life on Earth, as well as the possible origins of life. The definitions of life are discussed in detail. At the end of the chapter, there is an exercise designed to help the reader imagine and visualize the components of a cell in their true dimensions. Chapter 2 details the evolution of organisms on Earth. It also presents a history of the study of evolution. Chapter 3 covers the basic structures of DNA, RNA, proteins, and other biological molecules, and the syntheses of each. Chapter 4 begins with the central dogma of molecular biology, but then goes into detail about the complexities of the processes that make up this basic principle. Chapter 5 discusses the largest ribozyme in the cell, the ribosome. This includes details about the structure and function of ribosomes, as well as a discussion of their evolution. Also discussed are the mechanisms for ensuring that enough rRNA is produced to supply each cell with all of the ribosomes it needs. Ribosomes have been one of the key structures in cells that have led to the success of life on Earth.
The remainder of Section II includes Chapters 6 through 8. Chapter 6 is new, although it was partly covered in Chapter 5 of the first edition. The new chapter details some of the recent research into the origin of ribosomes, translation, and the genetic code. Conceptually, this might be one of the more difficult chapters to understand, because it presents an alternative organization of the universal genetic code table, arranged according to the possible evolutionary events that led to translation and the current genetic code. Although the genetic code table that has been used for several decades is informative and useful, it may not reflect the evolution of the genetic code itself. Alternative tables may reflect that evolution more accurately. Chapter 7 discusses the various forms of DNA replication, and how they have been important in evolution. Chapter 8 presents the common, as well as many of the uncommon, modes of separating chromosomes, from separation of chromosomes in bacteria to mitosis and meiosis in a variety of eukaryotes.
Section III includes three new chapters on genetic mechanisms. Chapter 9 is focused on Mendelian genetic mechanisms, as well as non-Mendelian mechanisms of inheritance. Chapter 10 discusses the basic concepts of population genetics, including Hardy–Weinberg equilibrium, population size, life histories, and modes of reproduction, all of which affect allele proportions in populations. Chapter 11 further details some of the phenomena that affect allelic proportions in populations through time, including natural selection, random genetic drift, mating and dispersal, and gene flow. Chapters 12 and 13 present the major causes and mechanisms of mutation, including repair of mutations. These were Chapters 8 and 9 in the first edition.
Section IV includes one of the first edition chapters (Chapter 11 on multigene families), which is now Chapter 14 in this second edition. The other four chapters in this section are new. Chapter 15 details horizontal gene transfers (HGTs), which occur frequently and have been occurring for billions of years. Chapter 16 begins a discussion of development, which depends on coordinated cell-to-cell communication and precise programming of gene expression in each cell. This chapter includes details regarding quorum sensing in bacteria and development in animals. Chapter 17 continues with the details of development in higher plants. Chapter 18 is focused on the genetic changes and mechanisms in carcinogenesis. Some of the mechanisms that cause evolutionary changes also cause cancer.
Section V outlines some of the basic methods used to study molecular evolutionary processes. Chapter 19 explains some of the methods for purifying and quantifying nucleic acids and proteins. Chapter 20 presents some of the basic recombinant and characterization methods used in molecular evolutionary studies. Chapter 21 details the various methods of sequencing DNA (and cDNA from RNA), as well as proteins. Chapters 22 and 23 are surveys of various OMIC methods used to analyze DNA, RNA, protein, and other molecular data. Chapter 24, Species Concepts and Phylogenetics, is an amalgamation of two of the first edition chapters (Chapters 10 and 12). This seemed to be a logical combination. Chapter 25, Phylogenetic Networks and Reticulate Evolution, is new in this second edition. It outlines evolutionary processes that go beyond simple bifurcating trees and events (e.g., HGTs, described in Chapter 15), and explains how they are determined. Chapter 26, Phylogenomics and Comparative Genomics, explains some of the processes and challenges in using genomic data to study evolutionary processes.
The final section, Section VI, describes specific genomes. Each has been chosen either as a representative of a specific taxonomic group, or to illustrate one or more principles of the processes that occur during the evolution of the species and their genomes. Chapters 27 (RNA Viruses), 28 (DNA Viruses), 29 (Bacteria and Archaea), 30 (Mutualists and Pathogens), and 31 (Endosymbionts and Organelles) parallel Chapters 13 through 17 of the first edition. Chapter 32 is new. Its focus is on protein trafficking in cells. It begins with trafficking in bacteria, and proceeds to the more complex trafficking in Eukarya. Chapter 33, Eukaryotic Genomes, has been edited, including the removal of the human genome material. Discussion of the human genome has been expanded in Chapter 34. The additions to this chapter include details about how knowledge of the human genome has led to some practical applications.
Scott Orland Rogers
Bowling Green State University, Ohio
1 Definitions of Life
INTRODUCTION
Before one can understand evolution, one must first decide on the boundaries of life, and what is at and beyond those boundaries. Complex cells, such as bacteria, probably did not spontaneously assemble from a set of chemical compounds. Fossil evidence of organisms resembling bacteria first appears in 3.5-billion-year-old rocks. Therefore, simpler organisms must have existed prior to this time. These organisms probably assembled and evolved long before one of their members evolved into the bacteria that were present 3.5 billion years ago. But what were these first organisms? The initial answer is that no one knows, but we can make some educated guesses about the characteristics of the first organisms, as well as the processes that led to them. The first life on Earth may have begun from sets of chemicals and reactions that arose in different ways and eventually mixed in chemical pools or near undersea volcanic vents. Stanley Miller published the first studies that produced amino acids and some other compounds from the simple molecules thought to be present on early Earth (Figure 1.1). He mixed water, ammonia, methane, and hydrogen in a sealed container, added heat and electrical discharges (to simulate lightning), and withdrew the products from time to time. He found that at least four of the amino acids found in modern cells were formed, and several precursors of nucleic acids were also produced. Since then, other experiments have been performed, including those that used different sets of gases to more accurately reflect the early Earth, some performed under high pressure, and some done under cold conditions. In total, at least 16 of the biologically relevant amino acids, as well as fatty acids and rudimentary nucleic acids, were formed. More recently, in vitro experiments have been performed that have produced nucleic acids under conditions thought to have existed on the Earth early in its history. 
Also, amino acids and other biological compounds have been found in meteorites and comets, and peptides (linked chains of amino acids) can be formed under warm to hot conditions under high pressure, similar to the conditions in a meteorite when it passes through the atmosphere, or near deep-sea volcanic vents.
In order to begin on the pathway to an organism, amino acids, nucleic acids, fatty acids, ions, water, energy (e.g., heat or lightning), and time all are needed (Figure 1.2). The Earth is very large, and even from the beginning, it was not uniform. Some parts were hot, while other areas were cooler. The mix of chemicals and water varied. The atmospheric gasses consisted primarily of H2, H2O (vapor), CO2, N2, CH4, and NH3, as well as smaller amounts of CO, S2, SO2, and Cl2. Pressures in the oceans varied with depth. Thus, given the nearly infinite set of conditions on early Earth, the presence of some simple chemicals (at the surface, as well as those emitted by volcanic activity), water, and energy, as well as the arrival of some chemicals on comets and asteroids, it is likely that all of the compounds necessary to produce the first truly biological compartments (which we now call cells) were present very early on the Earth. Water became more and more abundant, originating from comets and volcanoes. By 4.3 billion years ago, significant amounts of water were present in the form of worldwide oceans (which probably were not nearly as salty as they are today). The atmosphere contained huge amounts of water vapor because of the high surface temperatures. This caused convection currents that led to lightning. Vapor blown out of volcanoes, and the convection currents that were formed, also resulted in lightning. The atmosphere contained a mix of other gasses, which created reducing conditions. Sometime during the first billion years, all of the components combined to create the first biological reactions and organisms. Having the chemicals around to interact probably was not sufficient by itself to produce a biological organism. Organisms carry out
FIGURE 1.1 Diagram of the Miller–Urey apparatus used in 1953. The entire apparatus was made from glass, which is inert, except for the tungsten electrodes. Initially, the reservoir at the lower left was half-filled with distilled water, and the remainder of the apparatus was filled with hydrogen (H2), methane (CH4), and ammonia (NH3) gasses. Heat was applied to the water-containing vessel, and simulated lightning was produced by sending an electrical current through the electrodes. Gasses were condensed into liquid using a cold-water condenser. As the U part of the tube was filled, its contents spilled into the reservoir. Samples were then removed from the reservoir and were tested for the presence of various chemicals, including amino acids. Many different amino acids were found, including several present in biological systems.
many chemical reactions, but they also reproduce with high fidelity. It is thought that RNA probably was one of the first molecules central to the origin of the first organisms. This is because RNA can be replicated to produce copies of itself, many RNAs are catalytic (called ribozymes) and can perform many different chemical reactions, and RNAs are central to all past and present organisms. They perform a myriad of functions, including encoding proteins, translation of proteins, control of gene expression, assembly of ribosomes, control of RNA concentrations, specificity of translation, initiation of DNA synthesis, addition of chromosome telomeres, and many others. It is very possible that the first self-replicating biological molecules were combinations of RNAs and peptides. These may have been concentrated and preserved by being enveloped by lipid membranes. Thus, there would be a mechanical separation between the inside and the outside, so that desirable components could be concentrated and protected inside, and waste or damaging chemicals could be kept outside of the rudimentary cell.
Because of the varied conditions on the Earth, the time involved (0.5 to 1.0 billion years), and the presence of chemicals that could become catalytic and self-replicating (e.g., some RNAs), life was initially assembled on the Earth spontaneously. While this probably was a rare event, given at least 500 million years, the 510 million square kilometers of surface area of the Earth,
4 Integrated Molecular Evolution
[Figure 1.1 diagram labels: heat; condenser; cold water; outflow; electrodes; spark chamber (CH4, NH3, H2); stopcock; tube for sample removal]
[Figure 1.2 diagram labels: volcanoes (H2O, CO2, CH4, N2); heat; comets and asteroids; Moon (gravitational energy); other metals (catalysts); pressure (deep oceans); wide variety of conditions on Earth; chemical evolution (amino acids, peptides, fatty acids, carbohydrates, RNA); biological evolution (RNA-based cells, DNA-based cells); three intervals of 250 million years]
FIGURE 1.2 Summary of the components that led to chemical evolution and eventually biological evolution on the Earth. The Earth provided many of the basic components for these processes, primarily in the form of volcanism. Volcanoes released hydrogen (H2), water (H2O), ammonia (NH3), methane (CH4), and carbon dioxide (CO2) in large quantities, and much of the ammonia was converted into nitrogen (N2) gas. They also released carbon monoxide (CO), sulfur (S2), sulfur dioxide (SO2), and chlorine (Cl2) gasses. Lightning was produced by the convection currents generated during volcanic eruptions into the atmosphere, as well as by heating from the Sun. Comets and asteroids brought water and amino acids. As they entered the atmosphere, they heated up due to friction with the air, and they were exposed to high pressure. This can produce short polypeptides. Additionally, volcanoes in deep oceans under high pressures can produce peptides. The UV irradiation arriving from the Sun may have also caused some additional reactions to occur. There was no ozone layer at that time, and UV irradiation reached the surface of the Earth at high intensity. Not only could the UV stimulate some reactions, but it could destroy some biological molecules, and therefore may have been more detrimental than beneficial. This probably means that early reactions occurred, and early organisms lived, mainly in deeper water, which would block the UV irradiation. Metals that existed on clay particles also could catalyze certain reactions, and many of them still are necessary for biological systems today. For example, iron is used extensively in enzymes, electron transport systems, and photosynthesis. Magnesium is required for many proteins that interact with nucleic acids and is required in some ribozymes as well. Finally, gravitational energy from the Moon was vital to the Earth. It produced tides in the oceans, but also distorted the Earth as both rotated.
This helped to keep the iron core of the Earth liquid, which led to the magnetic field around the Earth. This channeled damaging high energy particles toward the poles and deflected the solar wind, which preserved the atmosphere, and protected life then, and still protects life on the Earth now.
the multitude of conditions, the huge mixture of chemicals, and energy (e.g., heat, lightning, and solar), it was inevitable that some sort of self-replicating, catalytic, self-contained life form would arise. Other such rudimentary life forms probably got started during this time, but many probably went extinct. Because extinction is almost as frequent as speciation, many of the original genetic lineages of these protocell species probably disappeared completely. But at least one lineage did survive and led to the diversity of life that exists today. It is fortunate for you and me that it led to us, but if it had not, it would have eventually led to other life forms that would have been very different from the ones alive today, as well as from the ones that existed in the past.
5 Definitions of Life
RNA AND LIFE
All of the cellular organisms that exist today have ribosomes. These are complex ribozymes (enzymatic RNAs) that rely on sets of proteins to hold them in their catalytic conformations. Since they are found in all cellular life on the Earth and they all are genetically related to each other, they must have existed (albeit in a simpler form) in the progenote cells that were the ancestors of all subsequent cells on the Earth. The first indication of cellular life on the Earth is from 3.5-billion-year-old stromatolites (mushroom-shaped structures formed by sets of bacteria that live in shallow areas of oceans and seas). Therefore, by this time, there existed bacteria-like cells that contained ribosomes (or, at least, protoribosomes), which produced proteins using amino acids attached to tRNAs, reading a code on mRNAs. This is a complex process involving three different classes of RNA that must have taken tens or hundreds of millions of years to evolve. The cells by that time must have had a cell membrane that was composed of a lipid bilayer and was capable of concentrating needed chemicals and processes inside the cell and moving waste and toxic chemicals to the outside. At that time, the hereditary material may have been RNA. RNA mutates faster than DNA, so evolution could have been much more rapid in these organisms. However, because of the higher mutation rates, more lethal mutations are produced with each replication cycle, thus limiting the number of viable progeny and the sizes of the genomes.
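The link between mutation rate and genome size can be illustrated with a simple calculation. The sketch below assumes that each nucleotide is miscopied independently with the same probability, so the chance of copying a genome of length L with no errors is (1 − μ)^L. The function name and the two per-base mutation rates are illustrative assumptions, not values from the text, and the calculation counts all errors rather than only lethal ones.

```python
def error_free_fraction(mutation_rate_per_base, genome_length):
    """Probability that one replication copies every base correctly.

    Assumes independent errors with an identical per-base rate
    (a simplification; real mutation rates vary along a genome).
    """
    return (1.0 - mutation_rate_per_base) ** genome_length


if __name__ == "__main__":
    # Hypothetical per-base error rates chosen only for illustration.
    rna_like_rate = 1e-4
    dna_like_rate = 1e-9
    for length in (1_000, 10_000, 100_000):
        rna = error_free_fraction(rna_like_rate, length)
        dna = error_free_fraction(dna_like_rate, length)
        print(f"{length:>7} bases: RNA-like {rna:.4f}, DNA-like {dna:.6f}")
```

Under these assumed rates, the error-free fraction for the higher mutation rate falls quickly as the genome lengthens, which is one way to see why a high mutation rate constrains genome size.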
Eukaryotes are not hugely more complex than bacteria and archaea. In fact, eukaryotes essentially are combinations of bacterial and archaeal cells. Genomic studies (studies based on the determination of the entire nucleotide sequence for the organisms) have shown that the bread and beer yeast, Saccharomyces cerevisiae (a single-celled eukaryote), has about 30% more genes than the bacterium Escherichia coli. The fruit fly (Drosophila melanogaster) genome has about 3.5 times as many genes as E. coli, and the 2-mm-long worm Caenorhabditis elegans has about 4.4 times as many genes as the bacterium. Even humans have only about 5–6 times as many genes as E. coli. Additionally, the majority of genes in the larger genomes are traceable back to bacterial and archaeal genomes.
The central pathway of information transfer in all membrane-bound organisms is from DNA to RNA to proteins (Figure 1.3). This is the so-called central dogma of molecular biology.
This characteristic ties all membrane-bound organisms on the Earth together and indicates a common origin for all. Through the process of transcription, an RNA copy of one of the two DNA strands is made. The messenger RNA (mRNA) is then translated on ribosomes into polymers of amino acids, known as polypeptides or proteins. This is the usual flow of information transfer in the cell. However, while these mRNAs (from 450 to over 50,000 different mRNAs in membrane-bound organisms) make up the complexity of a cell, they only make up roughly 5% (by mass) of the total RNA per cell. The remainder of the RNA in a cell is either transfer RNAs (tRNAs) and other small RNAs (e.g., small interfering RNAs and micro RNAs), comprising about 15% of the total, or ribosomal RNAs (rRNAs), making up about 80% of the total. These RNAs are never translated into proteins, but instead are molecules that aid in synthesizing, processing, or controlling DNA, RNA, and/or proteins. Transfer RNAs are short molecules (typically about 70–90 nucleotides) that are
FIGURE 1.3 The central dogma of molecular biology. After the structure of DNA, transcription of mRNA, and translation of proteins using a triplet code had been discovered, Francis Crick proposed the central dogma of molecular biology. It relates the information flow from DNA to RNA via transcription and the translation of mRNA by ribosomes to produce proteins. Additionally, DNA can be replicated to make additional copies. There is much more complexity to these processes, which will be discussed in Chapters 4 through 6.
[Figure 1.3 diagram labels: DNA (replication); transcription; RNA; translation; protein]
involved in translation, in that they bind specifically to the codon (made up of 3 nucleotides in a row) on the mRNA. Each carries a specific amino acid that will be added to the growing polypeptide chain.
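The flow of information from DNA to RNA to protein can be sketched in a few lines of code. The example below is a minimal illustration, not a model of the cellular machinery: the function names are hypothetical, the DNA template is an invented sequence, and the codon table lists only the handful of standard-code codons needed for this example.

```python
# Base pairing used when an RNA copy is made from a DNA template strand.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

# Partial standard genetic code: only the codons used in this example.
CODON_TABLE = {
    "AUG": "Met",
    "UUU": "Phe",
    "GGC": "Gly",
    "AAA": "Lys",
    "UAA": "Stop",
}


def transcribe(template_3_to_5):
    """Return the mRNA (5'->3') complementary to a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def translate(mrna):
    """Read the mRNA three nucleotides (one codon) at a time until a stop codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "Stop":
            break
        peptide.append(amino_acid)
    return peptide


if __name__ == "__main__":
    template = "TACAAACCGTTTATT"  # invented template strand, written 3'->5'
    mrna = transcribe(template)
    print(mrna)
    print("-".join(translate(mrna)))
```

Each lookup in `CODON_TABLE` stands in for a tRNA recognizing its codon and delivering the matching amino acid; in the cell, the ribosome performs the joining step that the list `append` represents here.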
Ribosomal RNAs, the central molecules of ribosomes, catalyze all of the reactions of translation, while the ribosomal proteins are all structural, in that they hold the rRNAs in their enzymatic conformations. This is true for all ribosomes, be they bacterial, archaeal, eukaryotic, or organellar. Ribosomes are complex, containing at least 3 rRNAs and 50 proteins in bacteria, and at least 4 rRNAs and 80 proteins in eukaryotes. All bacteria, archaea, mitochondria, plastids, and eukaryotes have ribosomes. Even viruses and viroids require ribosomes to make their proteins, although they take over host cells to accomplish this. Thus, all free-living organisms, as well as obligate parasites, mutualists, endosymbionts, and some organelles, have, or otherwise use, ribosomes. Since they are so complex, it is thought that they evolved only once on the Earth. Additionally, since they are universal to all cellular organisms on the Earth, the progenote cells that are the ancestors of these organisms also must have possessed and used ribosomes (or protoribosomes) for protein (and peptide) synthesis. Because rRNAs are so ancient, they have been used extensively in molecular evolutionary studies to answer questions of genealogy for the major groups of organisms on the Earth.
What are the signs of life? Of course, for large animals, you can look for signs of movement, breathing, growth, and a pulse. But, the absence of these does not necessarily mean that the organisms are dead or that they were never alive. There are many methods for identifying life employing culturing, microscopy, molecular biology, specific chemicals, ultraviolet light, X-rays, computers, and others. With culturing and microscopy, the results can be fairly clear. In the case of culturing, an organism either grows or fails to grow. If it grows, one can examine it in light and electron microscopes, study its growth requirements, and extract some DNA for molecular studies. Samples can also be examined directly, prior to culturing, by microscopy to look for cells, parts of cells, or groups of cells. The results are not always so clear, because some groups of cells and cell pieces look like nondescript debris, and it is often difficult to determine whether or not the cells are alive. Also, some mineral formations resemble cells.
The most common molecular method used currently is DNA sequence analysis. The sample is subjected to polymerase chain reaction (PCR) amplification to produce millions of copies of a DNA or RNA template molecule (Figure 1.4), and then, those molecules are analyzed by sequencing. Sequences can also be determined by other methods that amplify small pieces of DNA immobilized on small plates and computer chips and analyzed by detectors and computers. The DNAs can be from entire genomes, or from various samples, including those from environmental sources. The sequences can then be compared to others to determine whether the sequences match any others that have been analyzed by other researchers around the world. Sequencing of entire bacterial genomes (whole chromosome sequences) can be accomplished in a few days. For environmental samples, total RNA or DNA can be extracted from the sample and sequenced to produce millions of base pairs of information. These metagenomic studies provide an overall view of the diversity of organisms and metabolic pathways that exist in the particular environment (Figure 1.5). Most often, rRNA genes are used for broad comparisons, since these are conserved regions of DNA that are in common among bacteria, archaea, and eukaryotes, and sequences from a very large number of organisms have been determined. However, other regions of DNA can be examined. Mitochondrial genomic sequences have been determined for a large number of eukaryotes, and a large number of plastid genome sequences have been determined. Entire genomes have been determined for thousands of viruses, thousands of bacteria, hundreds of archaea, and hundreds of eukaryotes. Sequences from these can be chosen dependent on the study to be performed.
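Comparing a new sequence with previously determined ones can be illustrated with a simple percent-identity calculation. The sketch below assumes that the sequences have already been aligned and have equal length; real comparisons rely on alignment tools and large databases. The function names and the short sequences are hypothetical and were invented for illustration.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions that match between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def best_match(query, references):
    """Return the (name, identity) pair of the most similar reference sequence."""
    scored = ((name, percent_identity(query, seq)) for name, seq in references.items())
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Invented sequence fragments; they do not come from real organisms.
    references = {
        "reference_1": "ACGTACGTAC",
        "reference_2": "ACGTTCGTAA",
        "reference_3": "TTGTACGAAC",
    }
    query = "ACGTACGTAA"
    name, identity = best_match(query, references)
    print(f"closest reference: {name} ({identity:.1f}% identical)")
```

A conserved region such as an rRNA gene suits this kind of comparison because the sequences remain similar enough to line up position by position across distantly related organisms.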
Some chemical reactions are indicative of life, although some of these have to be carefully evaluated. For example, on one of the unmanned Viking probes that landed on Mars in 1976, a chemical reaction was measured by dropping some Martian soil into a specific solution. The gas that evolved appeared to indicate the presence of life. However, as was later surmised, it could also be indicative of the presence of a particular mineral that is common on Mars. Metabolic processes
[Figure 1.4 diagram labels: template (target) DNA; primers; polymerase; (1) mix template DNA, primers, nucleotides, and Taq DNA polymerase; (2) heat to 95°C to denature the template DNA; (3) cool to the temperature at which the primers anneal; incubate at 72°C to polymerize new DNA strands; repeat the cycle 30–100 times; long products; short products; sequence and characterize the short products (defined by primer sequences at the ends)]
FIGURE 1.4 Diagram of the polymerase chain reaction (PCR). This procedure was first developed by Kary Mullis in 1985 and is now used extensively worldwide to rapidly produce large quantities of DNA fragments of specific types. The reaction is started by mixing the sample DNA (that will include the template sequences to be copied/amplified) with short DNA primers (chosen or synthesized for each region to be copied/amplified), deoxynucleoside triphosphates (dNTPs), and a thermostable DNA polymerase (e.g., Taq DNA polymerase, isolated from the thermophilic bacterium Thermus aquaticus). Next, the mixture is heated to 95°C to denature the sample DNA. It is then cooled to a temperature that will allow the primers to anneal by hydrogen bonding in places where their sequences are complementary to the template strand. The temperature is increased to 72°C, which is the optimal temperature for the polymerization activity of Taq DNA polymerase. The temperature change steps are then repeated from 30 to over 100 times to amplify the desired fragments of DNA. During the first few such cycles, two different classes of amplification products are produced, long products and short products. The long products are DNAs of varying lengths that are synthesized from the original template (sample) strands. The short products are synthesized from the already copied pieces, which have ends defined by the primers that were added to the reaction. The number of long products increases linearly with each cycle, while the number of short products increases exponentially. Therefore, after 30 cycles, the reaction mixture contains primarily short products, which can be visualized by agarose gel electrophoresis (not shown).
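The difference between linear and exponential accumulation can be made concrete with a small counting model. The sketch below assumes perfect efficiency (every strand is copied once per cycle) and starts from a single double-stranded template, that is, two template strands; both are simplifying assumptions, and the function name is hypothetical. In each cycle, every original strand yields one new long product, while every copy of a long or short product is a short product.

```python
def pcr_strand_counts(cycles, template_strands=2):
    """Count long and short product strands after each PCR cycle.

    Idealized model: every strand present is copied exactly once per cycle.
    Returns a list of (long_products, short_products) tuples, one per cycle.
    """
    long_products = 0
    short_products = 0
    history = []
    for _ in range(cycles):
        # Copies of the original template strands have one primer-defined end.
        new_long = template_strands
        # Copies of existing long or short products have both ends defined.
        new_short = long_products + short_products
        long_products += new_long
        short_products += new_short
        history.append((long_products, short_products))
    return history


if __name__ == "__main__":
    for cycle, (long_n, short_n) in enumerate(pcr_strand_counts(30), start=1):
        if cycle in (1, 2, 3, 10, 30):
            print(f"cycle {cycle:2d}: long = {long_n:3d}, short = {short_n}")
```

In this model, the long products grow by the same fixed amount each cycle, whereas the short products roughly double, so the short products dominate the mixture after a few dozen cycles. Real reactions fall short of perfect efficiency and eventually plateau as reagents are consumed.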
can be measured by the addition of a radioactive substrate into an environmental sample. The products of the reaction are collected and the radioactive compounds are examined and measured. Other than the inherent problems in dealing with radioactive compounds, this method appears to be generally reliable for indicating life processes.
It is likely that organisms have been blasted off of the Earth by meteor impacts and volcanism. Therefore, there may be organisms that originated on the Earth that have fallen on other bodies in the solar system. In an analogous way, if organisms evolved first elsewhere in the solar system or beyond, this could have been the inoculum for the origination of life on the Earth. When extraterrestrial life is investigated, the first tests will likely be for ribosomal RNA or rRNA genes, since all free-living organisms are based on ribosomes. Bacteria, archaea, and eukaryotes all replicate their DNA in roughly the same way, transcribe RNA from similar DNA templates, and translate those mRNA messages into polypeptides using macromolecular assemblages of structural proteins,