Whole-genome resequencing of Escherichia coli K-12 MG1655 undergoing short-term laboratory evolution in lactate minimal media reveals flexible selection of adaptive mutations

Genome Biology 2009, 10: R118. Volume 10, Issue 10, Article R118. Open Access. Research.

Tom M Conrad*, Andrew R Joyce†, M Kenyon Applebee*, Christian L Barrett†, Bin Xie‡, Yuan Gao‡§ and Bernhard Ø Palsson‡

Addresses: *Department of Chemistry and Biochemistry, University of California San Diego, 9500 Gilman Drive, La Jolla, California, 92093-0332, USA. †Department of Bioengineering, University of California San Diego, 9500 Gilman Drive, La Jolla, California, 92093-0412, USA. ‡Department of Computer Science, Virginia Commonwealth University, 401 West Main Street, Richmond, Virginia, 23284-3019, USA. §Center for the Study of Biological Complexity, Virginia Commonwealth University, 1000 W. Cary St., Richmond, Virginia, 23284-3068, USA.

Correspondence: Bernhard Ø Palsson.

© 2009 Conrad et al.; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Laboratory evolution: Escherichia coli strains that have evolved in the laboratory in response to lactate minimal media show a wide range of different genetic adaptations.

Abstract

Background: Short-term laboratory evolution of bacteria followed by genomic sequencing provides insight into the mechanism of adaptive evolution, such as the number of mutations needed for adaptation, genotype-phenotype relationships, and the reproducibility of adaptive outcomes.

Results: In the present study, we describe the genome sequencing of 11 endpoints of Escherichia coli that underwent 60-day laboratory adaptive evolution under growth rate selection pressure in lactate minimal media. Two to eight mutations were identified per endpoint.
Generally, each endpoint acquired mutations in different genes. The most notable exception was an 82 base-pair deletion in the rph-pyrE operon that appeared in 7 of the 11 adapted strains. This mutation conferred an approximately 15% increase in growth rate when experimentally introduced into the wild-type background and an approximately 30% increase in growth rate when introduced into a background already harboring two adaptive mutations. Additionally, most endpoints had a mutation in a regulatory gene (crp or relA, for example) or the RNA polymerase.

Conclusions: The 82 base-pair deletion found in the rph-pyrE operon of many endpoints may function to relieve a pyrimidine biosynthesis defect present in MG1655. In contrast, a variety of regulators acquire mutations in the different endpoints, suggesting flexibility in overcoming regulatory challenges in the adaptation.

Published: 22 October 2009. Genome Biology 2009, 10: R118 (doi:10.1186/gb-2009-10-10-r118). Received: 20 February 2009. Revised: 18 September 2009. Accepted: 22 October 2009. The electronic version of this article is the complete one.

Background

One hundred and fifty years after the publication of The Origin of Species, evolution is still a topic of great interest for researchers today, due in large part to advances in DNA sequencing technology. De novo genomic sequencing is being carried out on a massive scale, and large databases of biological sequence data, such as the NCBI Entrez Genome Project [1] and Genomes OnLine Database (GOLD) [2], are constantly expanding. This genomic information has been interrogated using comparative genomics to infer evolutionary histories and basic principles of evolution in bacteria (see [3] for a review).
While much has been learned from these studies, they are usually coarse-grained, focusing on gene loss, horizontal gene transfer, and general statistics of sequence changes. The importance of individual single nucleotide polymorphisms (SNPs) and small insertions/deletions (indels) when comparing divergent strains is difficult to determine using comparative genomics because these changes occur with high frequency and are often selectively neutral, necessitating intensive use of population genetics to distinguish selective mutations [4].

More recently, platforms allowing a base-by-base comparison between highly similar genomes have been developed [5,6]. Such technology can now be utilized to perform before-and-after experiments, in which the genetic changes occurring in a population are measured in real time. This advance provides an unprecedented ability to observe the genetic basis of adaptive evolution directly, rather than through inference of evolutionary histories. Additionally, these studies allow the contribution of mutations to adaptation to be observed clearly.

Owing to short generation times, large population sizes, repeatability, and the ability to preserve ancestor strains by freezing for later direct comparison of distant generations, microorganisms have been used to study adaptive evolution [7]. Whole-genome resequencing of microorganisms following adaptive evolution has the potential to discover fundamental parameters of adaptive evolution in bacteria, including the number of mutations acquired during adaptation, functions of the mutated genes, and repeatability of the genetic changes in replicate experiments. However, presently only a small number of studies of adaptive evolution in bacteria have included resequencing of the genome [8-10]. One such study included the resequencing of yeast evolved to glucose, phosphate, or sulfate limitation in a chemostat [11].
While yeast was constrained in which genes mutated in the sulfate-limited condition due to a single optimal adaptive solution to the condition, glucose- and phosphate-limited conditions had a number of equivalent solutions, and so more variability in mutations was observed. Their work suggests that the parameters of adaptive evolution vary with condition.

We previously reported the sequencing of E. coli following short-term (approximately 40 days) adaptive evolution in glycerol minimal media to obtain its computationally predicted phenotype [10]. The number and location of mutated genes were highly similar among replicates, with mutations in the glycerol kinase and RNA polymerase genes present in most evolved strains. Experiments showed that a single mutation in glycerol kinase or RNA polymerase genes could account for up to 60% of the adaptive improvement in growth phenotype. However, because adaptive evolution in only a single condition was studied, it is not clear whether findings such as the number, consistency, and impact of mutations are typical for short-term adaptive evolution of E. coli in minimal media.

E. coli K-12 MG1655 that has undergone adaptation in lactate M9 minimal media shows fitness gains of a magnitude similar to those observed in glycerol M9 minimal media [12]. Herein we describe analogous experiments detailing the sequencing of E. coli adaptively evolved in lactate minimal media, and the fitness benefits of the discovered mutations. We found that changing the carbon source affects adaptive parameters, including the number of mutations needed for adaptation and the diversity of genotypic outcomes.

Results and discussion

Comparative genome sequencing

Five parallel adaptive evolutions of E. coli MG1655 (LactA, LactB, LactC, LactD, and LactE) over 60 days (approximately 1,100 generations) [12], and later six additional adaptive evolutions (LactF, LactG, LactH, LactI, LactJ, and LactK) over 50 days (approximately 750 generations), were carried out using continuous exponential growth in 2 g/L L-lactate M9 minimal media at 30°C, resulting in an average 90% increase in the growth rate versus the starting strain. To determine the genetic mechanism of adaptation in these strains, the genomes of single colonies from each endpoint culture were sequenced using Nimblegen Comparative Genome Sequencing (CGS) [5] and later 1G Solexa or 2G Solexa sequencing. Comprehensive lists of mutations reported using Nimblegen and Solexa sequencing are included as Additional data files 1 and 2. Regardless of the sequencing method, reported mutations were tested for actual presence in the endpoint colony using Sanger sequencing. The confirmed mutations are shown in Table 1.

Nimblegen CGS has been used previously to identify the SNPs, deletions, and duplications acquired by bacteria during adaptive evolution [10]. This approach is based on the decreased hybridization of mutated DNA to corresponding probes in genomic tiling arrays relative to hybridization of non-mutated DNA. In this study, CGS identified a total of 93 mutations in five evolved strains (LactA to LactE). Of these, we found 14 confirmed SNPs and 67 false positives. Twenty-two reported SNPs were actually discrepancies between the sequence of MG1655 used to create the tiling arrays and the MG1655 strain used to begin the adaptive evolutions. The observed false positive rate (1 per 340,000 bp) is highly similar to the rate previously observed [10] for CGS.

We later attempted sequencing of the endpoint strains using G1 Solexa (LactA, LactB, LactC, and LactE), and then G2 Solexa (LactB, LactD, LactF to LactK).
Instead of measuring DNA hybridization, Solexa relies on the generation of short sequence reads through reversible-terminator synthesis. The reads are mapped onto a reference genome, and consistent non-exact matches are reported as mutations. G1 Solexa succeeded in detecting several mutations in LactA and LactE missed by analysis of CGS data for these strains. However, depending on the mapping technique and stringency used for reporting mutations, analysis of G1 Solexa data resulted in either many false negatives or many false positives. When sequencing by G2 Solexa became available, the average coverage of sequenced strains greatly improved, from 10× coverage using G1 Solexa to more than 40×. The high coverage of reads generated by G2 Solexa resulted in a false positive rate of only one false positive per 9,200,000 bp.

Analysis of G2 Solexa data from 8 endpoint strains resulted in the confirmation of 30 SNPs, 14 deletions, and 3 insertions in total. Based on a low calculated false negative rate (1 to 2%) for SNPs and deletions (Additional data file 3; see Materials and methods for details), it is very unlikely that more than a few of these types of mutations were not identified in strains sequenced using G2 Solexa. However, detection of small insertions (1 to 4 bp) was less consistent (13% false negative rate) than detection of SNPs and deletions, and larger insertions were not generally detectable by our methods. Therefore, it remains a possibility that several insertions are currently left undetected in these strains.

Additionally, while Solexa sequencing is an excellent tool for determining SNPs and deletions on the genome scale in bacteria, it has the disadvantage that locations of duplicated genome segments and chromosomal rearrangements cannot be determined due to short read length. Pulsed-field gel electrophoresis [13], sequencing using longer read lengths, such as 454 [14], or paired reads can provide information on these mutation events. Because these methods are not included in our study, it must be kept in mind that genomic rearrangements may have occurred but cannot be observed. Despite these shortcomings, approximately five mutations were detected per endpoint strain, and we believe these are informative for the process of adaptive evolution occurring in these cultures.

Table 1. Confirmed mutations discovered in eleven endpoint strains of MG1655 adapted to growth in lactate minimal media

| Endpoint | Gene | Product/duplication | Class | Nucleotide | Codon | Protein change |
|---|---|---|---|---|---|---|
| LactA | crp | cAMP response protein | Regulator | t452a | CTG->CAG | L151Q |
| LactA | hfq | RNA binding protein | Regulator | c28t | CCG->TCG | P10S |
| LactA | ydjO | Predicted protein | - | t138g | GGT->GGG | G46G |
| LactA | | ~87 kb duplication (3946000-4033000) | | | | |
| LactB | gcvT | Glycine cleavage system | Metabolic | Δ 1 bp (971) | Frameshift | |
| LactB | | ~44 kb duplication (1248300-1292200) | | | | |
| LactC | rph-pyrE | RNase PH/orotate phosphoribosyltransferase | Metabolic | Δ 82 bp | Frameshift | |
| LactC | cya | Adenylate cyclase | Regulator | c547t | CTT->TTT | L183F |
| LactC | infC | IF-3 | Translation | g283a | GAA->AAA | E95K |
| LactD | rph-pyrE | RNase PH/orotate phosphoribosyltransferase | Metabolic | Δ 82 bp | Frameshift | |
| LactD | ppsA | Phosphoenolpyruvate synthase | Metabolic | c288a | ATC->ATA | I96I |
| LactD | atoS | AtoS/AtoC two component regulatory system | Regulator | a1367c | CAA->CCA | Q456P |
| LactD | relA | ppGpp synthetase | Regulator | a956c | TAT->TCT | Y319S |
| LactD | rho | Transcription termination factor | Regulator | c304t | CGC->TGC | R102C |
| LactD | hepA | RNAP recycling factor | Regulator | c2665t | CAA->TAA | Q889(stop) |
| LactD | kdtA | KDO transferase | Cell envlp. | t701a | GTA->GAA | V234E |
| LactE | ppsA | Phosphoenolpyruvate synthase | Metabolic | c17t | TCG->TTG | S6L |
| LactE | acpP | Acyl carrier protein | Metabolic | g50t | GGC->GTC | G17V |
| LactE | hfq | RNA binding protein | Regulator | c28t | CCG->TCG | P10S |
| LactE | crp | cAMP response protein | Regulator | t497c | ATC->ACC | I166T |
| LactE | ydcI | Putative transcriptional regulator | - | g41a | CGC->CAC | R14H |
| LactE | yjbM | Predicted protein | - | g141a | ATG->ATA | M47I |
| LactE | | ~140 kb duplication (3620000-3760000), ~87 kb duplication (3946000-4033000) | | | | |
| LactF | rph-pyrE | RNase PH/orotate phosphoribosyltransferase | Metabolic | Δ 82 bp | Frameshift | |
| LactF | kdtA | KDO transferase | Cell envlp. | g292a | GGG->AGG | G98R |
| LactF | rpoC | RNA polymerase | Regulator | c2524t | CGT->TGT | R842C |
| LactF | argS | Arginyl-tRNA synthetase | Translation | g110c | GGC->GCC | G37A |
| LactF | | ~12 kb duplication (1774000-1786000) | | | | |
| LactG | rph-pyrE | RNase PH/orotate phosphoribosyltransferase | Metabolic | Δ 82 bp | Frameshift | |
| LactG | trpB | Tryptophan synthase | Metabolic | g462t | GCG->GCT | A154A |
| LactG | nadB | NAD biosynthesis | Metabolic | c405t | GCC->GCT | A135A |
| LactG | rpoB | RNA polymerase | Regulator | a1664c | TAC->TCC | Y555S |
| LactG | rpoS | σS | Regulator | Δ 1 bp (609) | Frameshift | |
| LactG | kdtA | KDO transferase | Cell envlp. | g292a | GGG->AGG | G98R |
| LactG | osmF | ABC transporter involved in osmoprotection | Cell envlp. | ins T after 873 | AAA->TAA | K292(stop) |
| LactG | proQ | Predicted structural transport element | Cell envlp. | g(-8)t | Promoter | |
| LactH | rph-pyrE | RNase PH/orotate phosphoribosyltransferase | Metabolic | Δ 82 bp | Frameshift | |
| LactH | pdxB | Erythronate-4-phosphate dehydrogenase | Metabolic | g286t | GTG->TTG | V96L |
| LactH | ilvG_1 | Acetolactate synthase II (pseudogene) | Metabolic | Δ 1 bp (977) | Frameshift | |
| LactH | rpoB | RNA polymerase | Regulator | Δ 1 bp (4006) | Frameshift | |
| LactH | kdtA | KDO transferase | Cell envlp. | g292a | GGG->AGG | G98R |
| LactH | wcaA | Glycosyl transferase | Cell envlp. | Δ 4 bp (506-509) | Frameshift | |
| LactI | rph-pyrE | RNase PH/orotate phosphoribosyltransferase | Metabolic | Δ 82 bp | Frameshift | |
| LactI | relA | ppGpp synthetase | Regulator | g4c | GTT->CTT | V2L |
| LactI | proQ | Predicted structural transport element | Cell envlp. | ins T after 15 | Frameshift, AAG->TAA | K6(stop) |
| LactJ | rph-pyrE | RNase PH/orotate phosphoribosyltransferase | Metabolic | Δ 82 bp | Frameshift | |
| LactJ | mrdA | Peptidoglycan synthetase, PBP2 | Cell envlp. | c157a | CGC->AGC | R53S |
| LactJ | rpsA | 30S ribosomal subunit | Translation | a490t | AAC->TAC | N164Y |
| LactJ | kgtP | α-ketoglutarate MFS transporter | Cell envlp. | g1083a | AAG->AAA | K361K |
| LactJ | kgtP | | | Δ 1 bp (1212) | Frameshift | |
| LactJ | Intergenic | | | g3630812t | | |
| LactK | ppsA | Phosphoenolpyruvate synthase | Metabolic | g61a | GTA->ATA | V21I |
| LactK | rpoC | RNA polymerase | Regulator | Δ 9 bp (3611-3619) | In frame | V1204G |
| LactK | ryhA | Small RNA that interacts with Hfq | Regulator | c(-9)t | Promoter | |
| LactK | treA | Trehalase | Osmotic | g676a | GCG->ACG | A226T |
| LactK | secE | Sec protein secretion complex | Cell envlp. | g350a | CGC->CAC | R117H |
| LactK | secF | Sec protein secretion complex | Cell envlp. | g109a | GCT->ACT | A37T |
| LactK | | ~40 kb duplication (1253000-1294000) | | | | |

DNA from single colonies isolated from the endpoints of the 11 strains adapted to growth on lactate M9 minimal media was screened for mutations using Nimblegen CGS and Solexa technologies. Mutations (except for large duplications) were confirmed by Sanger sequencing of the DNA isolated from the single colonies using primers flanking the mutated site. Nucleotide changes refer to position within the respective gene, deletions are indicated by the Δ symbol, and insertions are marked by 'ins'. The rph-pyrE Δ 82 bp mutation is described in Figure 3. Genomic coordinates of large duplications are shown in parentheses. Cell envlp., cell envelope.
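The nucleotide, codon, and protein-change columns of Table 1 follow a regular pattern that can be checked mechanically. The Python sketch below is illustrative only and is not part of the authors' pipeline: it assumes the standard genetic code (which E. coli shares for codon-to-amino-acid assignments), and the function names `codon_number`, `classify_snp`, and `protein_change` are hypothetical helpers introduced here.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first base T
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_number(nucleotide_position):
    """Map a 1-based nucleotide position within a gene to its 1-based codon number."""
    return (nucleotide_position - 1) // 3 + 1


def classify_snp(ref_codon, alt_codon):
    """Return 'synonymous', 'nonsense', or 'missense' for a single-codon change."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "missense"


def protein_change(ref_codon, alt_codon, codon_index):
    """Format a protein change in the style of Table 1, e.g. 'L151Q' or 'Q889(stop)'."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    alt_label = "(stop)" if alt_aa == "*" else alt_aa
    return f"{ref_aa}{codon_index}{alt_label}"


# A few rows taken from Table 1: (gene, nucleotide position, reference codon, mutant codon).
examples = [
    ("crp", 452, "CTG", "CAG"),
    ("ydjO", 138, "GGT", "GGG"),
    ("hepA", 2665, "CAA", "TAA"),
]
for gene, position, ref, alt in examples:
    label = protein_change(ref, alt, codon_number(position))
    print(gene, label, classify_snp(ref, alt))
```

Here "synonymous" corresponds to what the text calls a silent mutation, and "nonsense" to the entries marked "(stop)" in the table.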
Summary of mutations found

Accounting for SNPs, deletions, and insertions, we found a total of 53 mutations across 11 lactate-evolved strains. The number of mutations found in adapted strains was between two and eight. Approximately two-thirds of discovered mutations were SNPs. These were mostly found within the coding region, with only two cases (proQ and ryhA) where SNPs were found in a promoter region and one case where a mutation was found in a non-promoter intergenic region. Although most SNPs resulted in an amino acid substitution, 4 of 36 SNPs in the dataset were so-called silent mutations. The indels identified by resequencing were located in coding regions and, except for a 9-bp deletion in the rpoC gene of LactK, were out of frame.

Sequencing using Solexa suggested the existence of genomic duplications in several endpoint strains. Data for these strains indicated certain genomic regions that had a higher coverage of mapped reads than the rest of the genome (Figure 1). The increased fold coverage in these regions was calculated across all strains as average coverage across the region divided by average coverage across the genome. Some strains had regions with two- to four-fold coverage, and this was considered indicative of duplication when most other strains had 0.9- to 1.1-fold coverage in the same region (if these regions represented experimental or mapping issues, the enriched coverage regions would have been seen in all strains). We found a total of four regions that were duplicated in at least one adaptive endpoint. The duplications are described in Table 1. Notably, the duplication in LactF doubled the copy number of the ppsA gene, which was mutated in three evolved strains (LactD, LactE, LactK). The change in expression levels of genes in these regions due to increased copy number may provide some competitive advantage to the strains, as was observed previously in Salmonella typhimurium adapted to limiting amounts of various carbon sources [15].
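The fold-coverage criterion described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: coverage is assumed to be available as a per-base list of mapped read depths, "most other strains" is interpreted as a simple majority, the 2.0 threshold and 0.9 to 1.1 baseline are taken from the text, and the strain names and depth values are toy data.

```python
from statistics import mean


def fold_coverage(coverage, start, end):
    """Average coverage over [start, end) divided by average coverage over the genome."""
    return mean(coverage[start:end]) / mean(coverage)


def is_duplicated(folds, strain, min_fold=2.0, baseline=(0.9, 1.1)):
    """Flag a region as duplicated in `strain` when its fold coverage is elevated
    and a majority of the other strains sit near 1x in the same region.
    The majority rule is an assumption made for this sketch."""
    others = [fold for name, fold in folds.items() if name != strain]
    near_one = sum(baseline[0] <= fold <= baseline[1] for fold in others)
    return folds[strain] >= min_fold and near_one > len(others) / 2


# Toy data: a 1,000-position "genome" with uniform 40x coverage,
# except that strain S1 has 120x coverage between positions 200 and 300.
genome_length = 1000
region = (200, 300)
coverage_by_strain = {name: [40] * genome_length for name in ("S1", "S2", "S3", "S4")}
for position in range(*region):
    coverage_by_strain["S1"][position] = 120

folds = {
    name: fold_coverage(depths, *region)
    for name, depths in coverage_by_strain.items()
}
for name in folds:
    print(name, round(folds[name], 2), is_duplicated(folds, name))
```

Comparing each strain against the others, rather than against a fixed threshold alone, mirrors the reasoning in the text: a region that is enriched in every strain is more likely a mapping or experimental artifact than a duplication.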
Figure 1. Large genomic duplications. By viewing the coverage of mapped Solexa data graphically across all genomic coordinates, four large duplications were found in the lactate endpoints, two of which are present in two endpoints. The image shows the coverage of mapped Solexa reads from LactK in the region of a large duplication. In total, the following duplications were found: in LactB and LactK, a 4× and 3× duplication of approximately 40 kb from genomic coordinates 1253000 to 1294000; in LactF, a 3× duplication of approximately 12 kb from 1774000 to 1786000; in LactE, a 2× duplication of approximately 140 kb from 3620000 to 3760000; in LactA and LactE, a 2× duplication of approximately 87 kb from 3946000 to 4033000.
[Plot: coverage (y-axis) versus genomic position (x-axis, approximately 1,258,000 to 1,290,000).]

Functions of mutated genes

Mutations affected many different genes with a broad range of cellular functions, but the majority of mutations belong to genes with primary functions relating to metabolism, regulation, or the cell envelope (Figure 2).

The most frequently mutated metabolic genes were ppsA and rph-pyrE. The E. coli MG1655 laboratory strain used for adaptive evolution has a defect in pyrimidine biosynthesis caused by a 1-bp deletion in the rph-pyrE operon that results in low levels of orotate phosphoribosyltransferase encoded by pyrE [16]. The recurring deletion in rph-pyrE extends past the 3' end of the rph gene, to a region of the operon that is close to an attenuator loop (Figure 3). The deletion shifts the stop codon of the rph gene closer to the attenuator loop through a frameshift. Previous experiments suggest that, due to links between translation and the attenuation before transcription of the pyrE gene, proper regulation of pyrE expression by intracellular uracil levels is achieved by moving the MG1655 rph stop codon closer to the attenuator loop [17]. Thus, mutation of the regulatory structure could function to increase orotate phosphoribosyltransferase toward normal levels [16]. However, although the nature of the mutation clearly suggests such a mechanism, previously determined gene expression data did not show significant upregulation of pyrE gene expression in the LactC and LactD strains, which
