A Variable Gene in a Conserved Region of the Helicobacter pylori Genome: Isotopic Gene Replacement or Rapid Evolution?

Short Communication

Armelle Ménard (1,2), Antoine Danchin (3), Sandrine Dupouy (1,2), Francis Mégraud (1,2), and Philippe Lehours (1,2,*)

(1) INSERM U853, Laboratoire de Bactériologie, Université Victor Segalen Bordeaux 2, 146 rue Léo Saignat, F-33076 Bordeaux cedex, France; (2) Université Victor Segalen Bordeaux 2, Laboratoire de Bactériologie, Bordeaux F-33076, France; and (3) Institut Pasteur, Génétique des Génomes Bactériens - CNRS URA2171, Paris F-750015, France

(Received 6 November 2007; accepted on 3 April 2008; published online 27 April 2008)

Abstract

The present study concerns the identification of a novel coding sequence in a region of the Helicobacter pylori genome, located between JHP1069/HP1141 and JHP1071/HP1143 according to the numbering of the J99 and 26695 reference strains, respectively, and spanning three different coding DNA sequences (CDSs). The CDSs located at the centre of this locus were highly polymorphic, as determined by the analysis of 24 European isolates, 3 Asian isolates, and 3 African isolates. Phylogenetic and molecular evolutionary analyses showed that the CDSs were not restricted to the geographical origin of the strains. Despite a very high variability observed in the deduced protein sequences, significant similarity was observed, always with the same protein families, i.e. ATPase and bacteriophage receptor/invasion proteins. Although this variability could be explained by isotopic gene replacement via horizontal transfer of a gene with the same function but coming from a variety of sources, it seems more likely that the very high sequence variation observed at this locus is the result of a strong selection pressure exerted on the corresponding gene product.
The CDSs identified in the present study could be used as strain-specific markers.

Key words: Helicobacter pylori; coding DNA sequence; genetic diversity; diversifying selection

Edited by Masahira Hattori
* To whom correspondence should be addressed. Tel. +33 5-57-57-12-86. Fax. +33 5-56-51-41-82. E-mail:
© The Author 2008. Kazusa DNA Research Institute
The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use, please contact
DNA Research 15, 163–168 (2008), doi:10.1093/dnares/dsn006

Comparative analyses conducted on Helicobacter pylori genome sequences, i.e. from H. pylori strain J99 associated with peptic ulcer [1,2], strain 26695 associated with gastritis [3], and strain HPAG1 associated with atrophic gastritis [4], revealed a significant macrodiversity (presence or absence of genes) and microdiversity (high polymorphism among orthologous genes) [5,6,7]. The plasticity zones and the cag pathogenicity island (cag PAI) are considered to be the main variable genomic areas. The remaining variable genes are distributed throughout the H. pylori genome, and some of them have been grouped into clusters of instability comprising blocks of 5–8 coding DNA sequences (CDSs) [5,8,9].

Subtractive hybridization is a powerful tool for comparative prokaryotic genomics and was validated on H. pylori by several authors [10,11]. In a previous study, we used subtractive hybridization to compare the genetic content of one H. pylori strain associated with gastric MALT lymphoma (strain B34) and one strain associated with chronic gastritis only [12]. One original 1092 bp sequence was identified, with no significant nucleotide similarity to the genomes of the H. pylori reference strains 26695 and J99 that were available. The aim of the present study was to localize this sequence in the H. pylori genome, to determine its prevalence, and to analyze its genetic diversity in H. pylori.

Using an in-house genome walking method as previously described [13], the original region was localized in the H. pylori genome and a new CDS was subsequently identified using the CDS finder website (http: // / gCDS / CDSig.cgi). This new CDS, called CDS2, is located between two CDSs homologous to JHP1069/HP1141 and JHP1071/HP1143 according to the numbering of the J99 and 26695 reference strains, respectively [1,3]. CDS2 occupies the position held by JHP1070/HP1142, called CDS1, in the H. pylori reference strains J99 and 26695.

The percentage of identity between the nucleotide sequences of CDS1 and CDS2 was determined using the LALIGN software [14], which identifies multiple matching subsegments in two sequences (http: // / software / LALIGN_form.html). CDS2 showed 54.9% identity in a 2046-nucleotide overlap with JHP1070 and 55.5% identity in a 2083-nucleotide overlap with HP1142. CDS2 encodes a putative polypeptide of 820 residues (GenBank accession number EF492441, EMBL Nucleotide Sequence AM902682). At the protein level, CDS2 shared 23.6% identity with JHP1070 in a 628-amino acid overlap and 24.4% identity with HP1142 in a 630-amino acid overlap. Finally, a strong nucleotide identity was found with the HPAG1_1080 sequence [4], with 89.3% identity in a 2469-nucleotide overlap.
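To make the "percent identity in an N-nucleotide overlap" figures quoted above concrete, the following minimal sketch counts identical columns in a pairwise alignment. It is not the LALIGN algorithm: it assumes the two sequences have already been aligned (with "-" marking gaps) and that the overlap is the number of aligned columns, gaps included. The function name percent_identity and the toy sequences are hypothetical and are not data from this study.

```python
from typing import Tuple


def percent_identity(aligned_a: str, aligned_b: str) -> Tuple[float, int]:
    """Return (percent identity, overlap length) for two pre-aligned sequences.

    Assumption: the overlap is the total number of alignment columns,
    and a column counts as identical only if both residues match and
    neither is a gap character ('-').
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    overlap = len(aligned_a)
    if overlap == 0:
        raise ValueError("empty alignment")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / overlap, overlap


# Toy alignment (hypothetical sequences, for illustration only).
identity, overlap = percent_identity("ATGC-TAGGA", "ATGCATCGGA")
print(f"{identity:.1f}% identity in a {overlap}-nucleotide overlap")
# prints "80.0% identity in a 10-nucleotide overlap"
```

The same function applies to aligned protein sequences, since it only compares characters column by column.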
The prevalence and the genetic diversity of the identified genomic locus were first determined for 24 H. pylori strains: 13 strains isolated from gastric MALT lymphoma patients enrolled in two multicentre French protocols and 11 strains isolated from French patients with chronic gastritis only, as previously described [12,15]. The locus was amplified by PCR using primers hybridizing to the conserved sequences of the flanking genes (JHP1069/HP1141 and JHP1071/HP1143, according to the numbering of the J99 and 26695 strains, respectively). The primers were designed using the web Primer3 software (http: // / cgi-bin / primer / primer3_www.cgi) [16]. Direct sequencing was carried out on both strands, and nucleotide and deduced protein sequences were compared with the NCBI Blast program (http: // / BLAST / ).

A CDS was always present at this locus: CDS1 was found in 54% of the strains, CDS2 in 29% of the strains, and an additional CDS, called CDS3, in 17% of the strains. In the chronic gastritis only H. pylori strain G2, CDS3 had 53.4% identity in a 2005-nucleotide overlap with CDS1 and 52.9% identity in a 2063-nucleotide overlap with CDS2, and it encodes a putative polypeptide of 861 residues (GenBank accession number EF492442, EMBL Nucleotide Sequence AM902683). CDS3 still has no counterpart in databases. Considering the three CDSs, no significant association was found with a virulence factor or with a pathology (data not shown). The presence or absence of these CDSs was also verified by dot blot hybridization, as previously described [12]. It showed that the three CDSs were mutually exclusive: each strain carried only one of them, with no local duplication (data not shown).

We first focused on the role of the genes present around the polymorphic locus. According to the revised annotation of the H. pylori genome [17], JHP1069/HP1141 encodes a methionyl-tRNA formyltransferase (fmt) and JHP1071/HP1143 a conserved hypothetical protein.
fmt is considered to be an essential gene which links general metabolism with the translation process (protein biosynthesis) [18,19]. As shown in Fig. 1, JHP1069/HP1141 and JHP1071/HP1143 are surrounded by genes of hypothetical function. All of the CDSs in the region had a G+C content similar to that of the rest of the H. pylori genome (~39%), except for the variable CDSs: CDS1, CDS2, and CDS3 had a G+C content of 29, 30, and 31%, respectively. The lower G+C content suggests an external origin of these CDSs or a rapid adaptation [20]. Indeed, Saunders et al. [21], using a tetranucleotide and hexanucleotide signature analysis, identified substantial differences between the JHP1070 and HP1142 genes and hypothesized that they were horizontally transferred.

CDS1 has been annotated as a predicted coding region, JHP1070, with no homolog in the databases. It codes for a putative polypeptide of 759 residues. Using a Blastp search, significant homologies were found with (i) Rlo proteins (R-linked ORF) from Campylobacter (e.g. RloG, E = e-11 in Campylobacter jejuni strain RM1167, or RloC, E = 7e-13 in C. jejuni strain RM11221) [22,23], (ii) an ATP/GTP binding-site (GXXXXGKT), and (iii) a putative phage murein transglycosylase YomI (SPbeta phage protein; lytic transglycosylase, E = 4.52e-05 in Bacillus subtilis) [24]. The same significant homologies with Rlo and YomI were also found in CDS2 and CDS3. Finally, another interesting point to consider is that the three CDSs shared significant homology with the chromosome partition protein SMC: for example, Treponema denticola ATCC 35405 chromosome NC_002967, E = 1.11e-11 for CDS1; Fusobacterium nucleatum subsp. nucleatum ATCC 25586, E = 8.66e-10 for CDS2; and Fusobacterium nucleatum subsp. nucleatum ATCC 25586, E = 3.86e-08 for CDS3.
[25] We did not find any significant motifs indicating that the proteins could be secreted and/or present in the membrane, but this does not preclude an association with the membrane via interaction with an integral membrane protein partner.

How can one explain the apparent variability of the locus identified in the present study? One potential hypothesis is that the region is a hot spot for gene insertion/deletion, with a specific selection pressure maintaining a particular function at that precise location in the genome. Suerbaum and Josenhans [26] recently reviewed the current data on the genetic diversity of H. pylori and argued that this bacterium uses mutation and recombination processes to adapt to its individual host by modifying molecules that interact with the host. Because the three CDSs retain the same similarities, it is likely that (i) these proteins share the same function or (ii) the gene is subjected to a specific selection pressure that makes it evolve at a very rapid rate. We propose that such a protein could be a phage receptor/translocator or that it could allow phage DNA to enter host cells by remodelling the cell wall [27–29]. Indeed, as already described in Escherichia coli, this kind of protein is subjected to a strong positive selection [30].

Helicobacter pylori genotypes vary markedly with their geographical region, and this is particularly the case for genes under positive selection. Therefore, the corresponding genes were looked for in three East Asian strains and three African strains.
All three CDSs were found: CDS1 was found in one Asian strain (strain 8038) and one African strain (strain TALLAN), CDS2 in two Asian strains (strains 12001 and 8033), and CDS3 in one Asian strain (strain 19A) and one African strain (strain BAPOOI) (Fig. 2). A phylogenetic analysis was conducted on the deduced amino acid sequences of the CDSs. Phylogenetic and molecular evolutionary analyses were conducted using MEGA version 4 [31]. Phylogenetic trees were generated by the neighbour-joining method [32]. Molecular distances were determined using the Kimura two-parameter model [33]. The tree showed three independent clusters which were clearly separated and corresponded to CDS1, CDS2, and CDS3, respectively (Fig. 2). However, the exact organization of these different CDSs cannot be determined, since this consensus tree cannot be rooted to other species. Indeed, no CDS with significant homology has ever been found in other species (in databases). Interestingly, even though the testing was performed on a limited number of non-European strains, these results indicate that the presence of one of the three CDSs is not restricted to the geographical origin of the strains.

The type of selection operating at the amino acid level was also evaluated by comparing non-synonymous substitutions (Ka) and synonymous substitutions (Ks) [34]. The overall mean of Ks and Ka substitutions was determined using the Nei–Gojobori method [35]. The codon-based Z-test of selection [36] was used to evaluate the significance of Ka/Ks substitution values. Bootstrap confidence levels were determined by randomly resampling the sequencing data 1000 times. The results are indicated for each CDS in Table 1. Since Ka/Ks was < 1 for the three CDSs analyzed, the purifying selection hypothesis was tested, and the significant P-value obtained supports the hypothesis of conservation at the protein level for each CDS (Z-test P < 0.001).

Figure 1. Representation of the genomic area of interest, according to the genome sequences of the two Helicobacter pylori reference strains J99 and 26695, which contain the variable CDS identified in the present study. Each CDS is represented by an arrow with the direction indicating the translational direction. The numbering under each CDS corresponds to the number of the CDS in H. pylori J99 (top line) and 26695 (bottom line). The function of each CDS is indicated according to the revised annotation of Boneca et al. [17]. Fmt, methionyl-tRNA formyltransferase; BirA, biotin ligase bifunctional protein; ParB, replication/partition-related protein; ParA, chromosome partition protein (Soj).

Finally, we propose that the very high variation observed in the protein sequences reflects the permanent selection pressure exerted by phages or other elements interacting with the organism's cell envelope. If this is the case, this locus could be used as a marker for constraints operating in the environmental niches in which particular H. pylori strains evolve. The presence of phages in H. pylori has rarely been described [37]. For example, Marsich et al. [38] postulated that the H. pylori lysozyme gene (lys) had a prophage origin. Numerous other explanations cannot be excluded, such as bacterial interaction with the mammalian host, protozoan predation, or porin specificity. Indeed, several publications have focused on cases of genes that vary markedly among H. pylori isolates. One example is the replacement of babA by babB as reported by Solnick et al. [39]. Helicobacter pylori BabA is the ABO blood group antigen binding adhesin, which has a closely related paralogue (BabB) whose function is unknown.
An extensive genotypic diversity in babA and babB across different strains, as well as within a strain colonizing an individual patient, has been shown, in line with the hypothesis that diverse profiles of babA and babB reflect selective pressures for adherence, which may differ across different hosts and within an individual over time [40].

In summary, a novel polymorphic locus comprised of a single gene was identified in the H. pylori genome. Although this variation could be explained by isotopic gene replacement via horizontal transfer of a gene with the same function but coming from a variety of sources, it seems more likely that the very high sequence variation observed at this locus is the result of a strong selection pressure exerted on the corresponding gene product. We propose that the evolution of CDS1, CDS2, and CDS3 is due to the occurrence of a specific environmental event, such as interaction with a biological structure, e.g. bacteriophages, which are involved in surface cell secretion. The genes identified in the present study could be used as strain-specific markers for particular niches. The predicted function of the gene products, although highly speculative, should encourage investigators to explore the presence of phages in the H. pylori environment and study their relationship regarding pathogenicity.

Figure 2. Phylogenetic analysis of the CDS1, CDS2, and CDS3 proteins generated with the neighbour-joining method. The phylogeny presented is based on the alignment of the entire deduced protein. The bootstrap values are indicated next to each node. Nucleotide and protein sequences are available in GenBank (EMBL) for the Helicobacter pylori strains B34, G2, 8038, TALLAN, 12001, 8033, 19A, and BAPOO1 under the accession numbers EF492441 (AM902682), EF492442 (AM902683), EU553483 (AM946633), EU553485 (AM946634), EU553505 (AM946635), EU553482 (AM946636), EU553481 (AM946637), and EU556504 (AM946638), respectively. Nucleotide and protein sequences of H. pylori reference strains 26695, J99, and HPAG1 are available in GenBank under the accession numbers AE000511, AE001439, and ABF85147, respectively.

Table 1. Analysis of molecular distances and synonymous and non-synonymous nucleotide substitutions within CDSs, CDS1 (n = 4), CDS2 (n = 4), and CDS3 (n = 3), in different Helicobacter pylori strains

                        CDS1               CDS2               CDS3
Mol. distance (nt)      0.045 ± 0.003 &    0.080 ± 0.005      0.069 ± 0.004
No. differences (nt)    98.167 ± 7.246     182.167 ± 9.753    170.00 ± 9.334
Ks                      0.088 ± 0.012      0.161 ± 0.015      0.123 ± 0.013
Ka                      0.035 ± 0.004      0.061 ± 0.004      0.057 ± 0.004
Ka/Ks                   0.398 ± 0.071 †    0.379 ± 0.043 †    0.463 ± 0.059 †

nt, nucleotides; Ks, synonymous substitutions; Ka, non-synonymous substitutions.
† P (Z-test) < 0.001 for the purifying selection hypothesis (Ka/Ks < 1).
& Value ± standard error.
The GenBank accession numbers of the sequences used in this study are listed in Fig. 2.

Acknowledgments: The authors want to thank Dr Monica Oleastro from the Departamento de Doenças Infecciosas of the Instituto Nacional Saúde Dr Ricardo Jorge (Lisbon, Portugal) for the comparison of non-synonymous substitutions and synonymous substitutions among the three CDSs described in this study, and Dr Jorge M. B. Vítor and Dr Vale Filipa from the Faculty of Pharmacy in Lisbon for strains. The study was financially supported by the Institut de Recherche des Maladies de l'Appareil Digestif (IRMAD), the Association pour la Recherche contre le Cancer (ARC), and the Conseil Régional d'Aquitaine, France.

References

1. Alm, R. A., Ling, L. S. L., Moir, D. T., et al. 1999, Genomic-sequence comparison of two unrelated isolates of the human gastric pathogen Helicobacter pylori, Nature, 397, 176–180.
2. Alm, R. A. and Trust, T. J. 1999, Analysis of the genetic diversity of Helicobacter pylori: the tale of two genomes, J. Mol. Med., 77, 834–846.
3. Tomb, J. F., White, O., Kerlavage, A. R., et al. 1997, The complete genome sequence of the gastric pathogen Helicobacter pylori, Nature, 388, 539–547.
4. Oh, J. D., Kling-Backhed, H., Giannakis, M., et al. 2006, The complete genome sequence of a chronic atrophic gastritis Helicobacter pylori strain: evolution during disease progression, Proc. Natl. Acad. Sci. USA, 103, 9999–10004.
5. Salama, N., Guillemin, K., McDaniel, T. K., Sherlock, G., Tompkins, L. and Falkow, S. 2000, A whole-genome microarray reveals genetic diversity among Helicobacter pylori strains, Proc. Natl. Acad. Sci. USA, 97, 14668–14673.
6. Falush, D., Wirth, T., Linz, B., et al. 2003, Traces of human migrations in Helicobacter pylori populations, Science, 299, 1582–1585.
7. Raymond, J., Thiberge, J. M., Chevalier, C., et al. 2004, Genetic and transmission analysis of Helicobacter pylori strains within a family, Emerg. Infect. Dis., 10, 1816–1821.
8. Gressmann, H., Linz, B., Ghai, R., et al. 2005, Gain and loss of multiple genes during the evolution of Helicobacter pylori, PLoS Genet., 1, e43.
9. Chanto, G., Occhialini, A., Gras, N., Alm, R. A., Megraud, F. and Marais, A. 2002, Identification of strain-specific genes located outside the plasticity zone in nine clinical isolates of Helicobacter pylori, Microbiol. Sgm., 148 (11), 3671–3680.
10. Akopyants, N. S., Fradkov, A., Diatchenko, L., et al. 1998, PCR-based subtractive hybridization and differences in gene content among strains of Helicobacter pylori, Proc. Natl. Acad. Sci. USA, 95, 13108–13113.
11. Kersulyte, D., Mukhopadhyay, A. K., Shirai, M., Nakazawa, T. and Berg, D. E. 2000, Functional organization and insertion specificity of IS607, a chimeric element of Helicobacter pylori, J. Bacteriol., 182, 5300–5308.
12. Lehours, P., Dupouy, S., Bergey, B., et al. 2004, Identification of a genetic marker of Helicobacter pylori strains involved in gastric extranodal marginal zone B cell lymphoma of the MALT-type, Gut, 53, 931–937.
13. Abdelbaqi, K., Menard, A., Prouzet-Mauleon, V., Bringaud, F., Lehours, P. and Megraud, F. 2007, Nucleotide sequence of the gyrA gene of Arcobacter species and characterization of human ciprofloxacin-resistant clinical isolates, FEMS Immunol. Med. Microbiol., 49, 337–345.
14. Huang, X. and Miller, M. 1991, A time-efficient, linear-space local similarity algorithm, Adv. Appl. Math., 12, 337–357.
15. Lehours, P., Menard, A., Dupouy, S., et al. 2004, Evaluation of the association of nine Helicobacter pylori virulence factors with strains involved in low-grade gastric mucosa-associated lymphoid tissue lymphoma, Infect. Immun., 72, 880–888.
16. Rozen, S. and Skaletsky, H. 2000, Primer3 on the WWW for general users and for biologist programmers, Methods Mol. Biol., 132, 365–386.
17. Boneca, I. G., de Reuse, H., Epinat, J.-C., Pupin, M., Labigne, A. and Moszer, I. 2003, A revised annotation and comparative analysis of Helicobacter pylori genomes, Nucl. Acids Res., 31, 1704–1714.
18. Meinnel, T., Guillon, J. M., Mechulam, Y., et al. 1993, The Escherichia coli fmt gene, encoding methionyl-tRNA(fMet) formyltransferase, escapes metabolic control. Disruption of the gene for Met-tRNA(fMet) formyltransferase severely impairs growth of Escherichia coli, J. Bacteriol., 175, 993–1000.
19. Guillon, J. M., Mechulam, Y., Schmitter, J. M., Blanquet, S. and Fayat, G. 1992, Disruption of the gene for Met-tRNA(fMet) formyltransferase severely impairs growth of Escherichia coli, J. Bacteriol., 174, 4294–4301.