Abstract
Strains of Escherichia coli serotype O78 are associated with many diseases, including invasive infections, in humans and farm animals. The clonal relationship between strains from different hosts is therefore important for assessing the risk of zoonotic infections. Here we propose a multilocus sequence typing scheme for E. coli, based on six housekeeping genes. Preliminary, but significant, results indicate that clonal division in E. coli O78 strains is host independent, and closely related clones reside in different hosts. There was a positive correlation between virulence and clonal origin.
Keywords: Multilocus sequence typing, Bacterial virulence, Serotype O78, Sepsis, Avian colisepticemia, Newborn meningitis
1 Introduction
Strains of Escherichia coli serotype O78 are associated with a large variety of diseases, including invasive infections. Human diseases include enteritis, newborn meningitis (NBM) and sepsis [2,6]. O78 strains are also the causative agent of diseases in farm animals, causing sepsis in sheep and poultry [13]. The poultry disease is avian colisepticemia, a severe systemic disease of chickens and turkeys causing high morbidity and mortality. Around the world the disease is commonly associated with serotypes O1, O2 and O78, with the latter two constituting about 80% of the cases. It is therefore important to establish the clonal relationship between strains from different hosts and diseases in order to assess the risk of zoonotic infections.
Multilocus sequence typing (MLST) has become the method of choice for typing epidemiologically important strains [5]. The method is based on determining short nucleotide sequences (450–500 bp) of several (five to seven) housekeeping genes that have undergone some evolutionary diversification, leading to polymorphism [9]. MLST is highly discriminatory, detecting even a few nucleotide substitutions, and lends itself to automation of polymerase chain reaction (PCR) and sequence determination. Furthermore, it offers better standardization and lab-to-lab portability than its predecessor, multilocus enzyme electrophoresis, since it relies on sequence data, which are easily accessible in computer databases.
Here we propose an MLST scheme for E. coli, based on six housekeeping genes with adequate polymorphism. We show its application to a preliminary determination of the clonal relationships between E. coli O78 strains.
2 Materials and methods
2.1 E. coli strains used in this study
Strains from septicemic animals were isolated from spleen or bone marrow of sheep (631, 23, 632, 75, 62) or septicemic chickens (781, 787, 788, 789, 790, 792, 793, 794). Strains from NBM were isolated from spinal fluid (285, 286, 287), and enteritis-associated strains were isolated from human (278, 279) or bovine (B41) stool. E. coli serotype O2 strains (1772, MN) were isolated from the bone marrow of septicemic chickens. Strains were defined as invasive if they were isolated from meningitis or sepsis.
2.2 PCR and sequencing
Single colonies were picked into 100 µl of deionized water. Samples were incubated at 98°C for 10 min and the resulting suspension was used as the DNA template. Amplifications were carried out in a total volume of 50 µl containing 1 µl of template DNA, each deoxynucleoside triphosphate at a concentration of 0.25 mM, 10 pmol of each primer, 5 µl of 10× exTaq PCR buffer (Takara) and 2.5 U of exTaq DNA polymerase (Takara). The reaction conditions were: 5 min denaturation at 94°C; 30 cycles of 40 s denaturation at 94°C, 45 s annealing at the locus-specific temperature (Table 1) and extension at 72°C for the locus-specific time (Table 1); and a final additional 5 min at 72°C. PCR products were purified for sequencing using the ExoSAP-IT PCR cleanup kit (USB). Sequencing was performed using the ABI Prism 3100 automated sequencer.
Table 1. Primers used in this study
Gene | Primer sequence | Primer length (bases) | Product length (bp) | Annealing temperature (°C) | Extension time (s)
adk | F-5′ cgggcgcggggaaagggactc 3′ | 21 | 595 | 59 | 45
adk | R-5′ gcgcgaacttcagcaaccg 3′ | 19 | | |
gdh | F-5′ tcggcgtagggcgtgctgac 3′ | 20 | 796 | 59 | 45
gdh | R-5′ ctgctcttgttcgcgccctcttc 3′ | 23 | | |
mdh | F-5′ cccggtgtggctgtcgatctga 3′ | 22 | 706 | 59 | 45
mdh | R-5′ cgccgtttttacccagcagcagc 3′ | 23 | | |
metA | F-5′ cgcaacacgcccgcagagc 3′ | 19 | 601 | 59 | 45
metA | R-5′ gccagctcgctcgcggtgtatt 3′ | 22 | | |
ppk | F-5′ tgccgcgctttgtgaatttaccg 3′ | 23 | 758 | 58 | 50
ppk | R-5′ ccccggcgcagagaagataacgt 3′ | 23 | | |
gcl | F-5′ gcgttctggtcgtccgggtcc 3′ | 21 | 758 | 58 | 50
gcl | R-5′ gccgcagcgatttgtgacagacc 3′ | 23 | | |
2.3 In vivo virulence assays
Assessment of virulence was performed in 1-day-old chicks as previously described [13]. Briefly, overnight bacterial cultures were diluted in saline and 10⁴ bacteria were injected intraperitoneally into 1-day-old chicks. The chicks had free access to food and water during the experiment. Mortality was monitored over a period of 72 h after inoculation. Groups of at least eight chicks were used for each determination, and the significance of the results was determined by chi-square analysis and by the Wilcoxon non-parametric test.
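As an illustration of the chi-square comparison mentioned above, the following minimal Python sketch computes a Pearson chi-square statistic for a 2×2 table of dead and surviving chicks in two groups. The function name and the counts in the usage line are hypothetical and are not data from this study; no continuity correction is applied, and with groups as small as eight chicks the chi-square approximation is rough.

```python
import math


def chi_square_2x2(dead_a, alive_a, dead_b, alive_b):
    """Pearson chi-square statistic and p-value (1 degree of freedom)
    for a 2x2 table of mortality counts in two groups of chicks."""
    table = [[dead_a, alive_a], [dead_b, alive_b]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column must contain at least one chick")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 7 of 8 chicks died in group A, 1 of 8 in group B.
chi2, p_value = chi_square_2x2(7, 1, 1, 7)
print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}")
```

For tables with very small expected counts an exact test may be preferable; that choice is outside what the text specifies.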
2.4 Sequence analysis
DNA sequences were aligned and phylogenetic trees were reconstructed by the neighbor-joining method, with 1000 bootstrap trials, using Clustal X [15], a windowed interface to Clustal W [7]. Trees were generated applying Kimura's correction [8] and visualized using the program NJplot, included in the Clustal X package.
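The distance correction can be sketched as follows, assuming that "Kimura's correction" refers to the two-parameter model of reference [8], in which transitions and transversions are counted separately. This is an illustrative re-implementation, not the Clustal X code; the function name is hypothetical, and positions containing gaps or ambiguous bases are simply skipped.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    d = -1/2 * ln(1 - 2P - Q) - 1/4 * ln(1 - 2Q), where P and Q are the
    proportions of transitions and transversions among compared sites.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    if 1 - 2 * p - q <= 0 or 1 - 2 * q <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


# One transition in ten sites: the corrected distance slightly exceeds 0.1.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

A matrix of such pairwise distances is the input that a neighbor-joining procedure would then cluster.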
3 Results
3.1 Selection of MLST loci
Housekeeping genes for typing were selected by comparing sequences between E. coli K-12 and O157:H7, and selecting genes which showed between 97 and 99% identity at the nucleotide sequence level. The genes selected are unlinked according to their genomic location on the E. coli K-12 genome map (Fig. 1). Specific primers were designed for each gene, amplifying a 600–800-bp-long fragment from its coding region. PCR products were isolated and sequenced. All products for a specific gene were of the same length for all the strains tested. The genes and the numbers of allele types are described in Table 2.
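The selection criterion can be illustrated with a short Python sketch that computes percent nucleotide identity between two aligned gene sequences and checks whether it falls within the 97–99% window. The function names are hypothetical, the input is assumed to be a pairwise alignment with "-" marking gaps, and whether the authors treated the window bounds as inclusive is not stated.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def is_candidate_locus(identity, low=97.0, high=99.0):
    """True if identity lies in the window used to pick polymorphic
    but still conserved genes (bounds assumed inclusive)."""
    return low <= identity <= high


# Toy example: 2 mismatches in 100 aligned bases gives 98% identity.
k12_gene = "A" * 100
o157_gene = "A" * 98 + "GG"
identity = percent_identity(k12_gene, o157_gene)
print(identity, is_candidate_locus(identity))
```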
Fig. 1. Relative location of the genes used in this study on a schematic map of the E. coli genome (K-12 MG1655). The coordinates are given in bases.
Table 2. Number of alleles obtained among E. coli isolates, for several MLST loci
Gene product | Gene | Accession number | Reference | Number of alleles
Adenylate kinase | adk | X03038 | [3] | 8
Glyoxylate carboligase | gcl | L03845 | [4] | 9
Glucose-6-phosphate dehydrogenase | gdh | M55005 | [14] | 8
Malate dehydrogenase | mdh | AF293111 | [12] | 7
Homoserine transsuccinylase | metA | M10210 | [10] | 8
Polyphosphate kinase | ppk | L03719 | [1] | 8
3.2 Selection of strains and assessment of virulence
Several pathogenic O78 strains were selected from humans, chickens, and cattle. To make the tree more informative we also included a K-12 strain and two avian septicemic strains belonging to the O2 serotype. Since nearly all of the strains tested were associated with a disease, it was important to determine their level of virulence in order to find out whether there is a correlation between a high degree of virulence and specific clones. The results were used to group the isolates according to their level of virulence to 1-day-old chicks, from non-pathogenic to highly lethal (−, +, ++, +++, ++++).
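The grouping into virulence levels can be sketched in Python using the mortality bands given in the legend of Fig. 2 (no mortality; less than 25%; 25–49%; 50–74%; 75–100%). The function name is hypothetical, and the handling of values that fall between the integer bands (for example 49.5%) is an assumption.

```python
def virulence_grade(dead, total):
    """Map chick mortality in one group to the grades -, +, ++, +++, ++++."""
    if total <= 0 or not 0 <= dead <= total:
        raise ValueError("need 0 <= dead <= total and total > 0")
    if dead == 0:
        return "-"  # no mortality
    mortality = 100.0 * dead / total
    if mortality < 25:
        return "+"
    if mortality < 50:
        return "++"
    if mortality < 75:
        return "+++"
    return "++++"


# Groups of eight chicks, as the minimum group size used in the assays.
for dead in (0, 1, 2, 4, 6, 8):
    print(dead, virulence_grade(dead, 8))
```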
3.3 Multiple-alignment-based MLST of serotype O78 strains
DNA sequences of the six loci described above were determined for each strain tested. To obtain a global phylogenetic reconstruction incorporating the different genes, the sequences of the six loci of each strain were artificially joined, in the same order for every strain, to give 2974-nucleotide-long sequences. A dendrogram of the merged sequences was then reconstructed (Fig. 2). The virulence assessment of the strains was superimposed on this phylogenetic tree.
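The joining step can be sketched as follows. The locus order used here is an assumption (the text states only that the same order was used for every strain), the function names are hypothetical, and the toy sequences are placeholders rather than real alleles.

```python
# Assumed concatenation order; the text does not specify it.
LOCUS_ORDER = ("adk", "gcl", "gdh", "mdh", "metA", "ppk")


def concatenate_loci(sequences_by_locus, order=LOCUS_ORDER):
    """Join the per-locus sequences of one strain in a fixed order."""
    missing = [locus for locus in order if locus not in sequences_by_locus]
    if missing:
        raise KeyError(f"missing loci: {missing}")
    return "".join(sequences_by_locus[locus] for locus in order)


def merge_all_strains(strains, order=LOCUS_ORDER):
    """Return {strain: merged sequence}; all merged sequences must have
    equal length so that they can be compared column by column."""
    merged = {name: concatenate_loci(loci, order) for name, loci in strains.items()}
    if len({len(seq) for seq in merged.values()}) > 1:
        raise ValueError("merged sequences differ in length")
    return merged


# Placeholder data for two strains, four bases per locus.
toy = {
    "strain_1": dict(adk="AAAA", gcl="CCCC", gdh="GGGG", mdh="TTTT", metA="ACGT", ppk="TGCA"),
    "strain_2": dict(adk="AAAT", gcl="CCCC", gdh="GGGG", mdh="TTTT", metA="ACGT", ppk="TGCA"),
}
print(merge_all_strains(toy))
```

Fixing the order matters because a column-wise comparison of merged sequences is only meaningful when each column comes from the same locus position in every strain.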
Fig. 2. Neighbor-joining MLST phylogenetic tree of E. coli strains and virulence to 1-day-old chicks. A neighbor-joining tree combining nucleotide sequences from six housekeeping genes was constructed, correcting for multiple substitutions. Bootstrap support values (out of 1000 bootstrap trials) are shown for branches with values >900. The scale bar represents the number of substitutions per site. Serotypes are noted before strain designation. H: human; C: cattle; A: avian pathogen; −: no mortality; +: less than 25% mortality; ++: 25–49% mortality; +++: 50–74% mortality; ++++: 75–100% mortality.
Most E. coli strains tested (the major cluster) share the same ancestry (root) with very high bootstrap support (Fig. 2), while five isolates (top of the figure) group separately in a different branch (the minor cluster). This minor cluster contains strains that do not share close ancestry with the strains of the major cluster. In this branch we find two internal branches, each containing both O2 and O78 avian pathogens, which are rather distant from one another.
The major cluster branches into two subclusters. The top subcluster (dark gray) contains only invasive strains from extraintestinal diseases of diverse hosts, including isolates from humans, cattle and birds. Conversely, the bottom subcluster (light gray) contains strains of lower virulence and comprises enteritis-related strains of humans and cattle, a low-virulence strain isolated from sheep sepsis and the commensal K-12 strain.
As can be seen from the tree, most of the invasive O78 strains cluster together, irrespective of their host, while the non-invasive strains cluster in a different group.
3.4 Allele profile MLST of serotype O78 strains
MLST can also be analyzed by allelic profiling, a method established mainly to determine large clonal complexes [9]. The gene sequences of all the strains were examined for nucleotide variation, and each distinct sequence at a locus was designated as a specific allele. The allele profile of each strain was then used to determine its sequence type (ST). All STs were compared, and identical STs were grouped together. STs that differ from one another in one, two or three alleles were designated single-locus variants, double-locus variants and triple-locus variants, respectively.
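The profiling steps described above can be sketched in Python: distinct sequences at each locus receive allele numbers, the tuple of allele numbers forms a strain's profile, and pairs of profiles are classified by the number of loci at which they differ. All names and the toy data are hypothetical; allele numbers are assigned in order of first appearance, which need not match the numbering the authors used, and the label "unrelated" for more than three differences is an assumption.

```python
def assign_alleles(sequences):
    """Give each distinct sequence at one locus an allele number (1, 2, ...)."""
    numbers = {}
    alleles = {}
    for strain, seq in sequences.items():
        if seq not in numbers:
            numbers[seq] = len(numbers) + 1
        alleles[strain] = numbers[seq]
    return alleles


def allele_profiles(strains, loci):
    """Return {strain: tuple of allele numbers in the order of `loci`}."""
    per_locus = {
        locus: assign_alleles({s: seqs[locus] for s, seqs in strains.items()})
        for locus in loci
    }
    return {s: tuple(per_locus[locus][s] for locus in loci) for s in strains}


VARIANT_LABELS = {0: "same ST", 1: "SLV", 2: "DLV", 3: "TLV"}


def relation(profile_a, profile_b):
    """Classify two profiles by the number of loci at which they differ."""
    differences = sum(a != b for a, b in zip(profile_a, profile_b))
    return VARIANT_LABELS.get(differences, "unrelated")


# Toy data with three loci and placeholder sequences.
toy = {
    "s1": {"x": "AAAA", "y": "CCCC", "z": "GGGG"},
    "s2": {"x": "AAAA", "y": "CCCT", "z": "GGGG"},
    "s3": {"x": "AAAT", "y": "CCCT", "z": "GGGG"},
}
profiles = allele_profiles(toy, ("x", "y", "z"))
print(profiles)
print(relation(profiles["s1"], profiles["s2"]), relation(profiles["s1"], profiles["s3"]))
```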
The allele profile analysis yielded two main clusters. One clonal complex contains the invasive isolates corresponding to the top cluster of the multiple-alignment-based tree (Fig. 3A). The other clonal complex contains the less-virulent strains corresponding to the bottom cluster of the tree discussed above (Fig. 3B). All the other strains share fewer than three alleles with the isolates within the clonal complexes (Fig. 3C). These results are in very good agreement with the results of the multiple-alignment analysis. The ST composed of avian strains 790 and 793 has the largest number of single-locus variants (5), suggesting that it is the ancestor of its clonal complex. This conclusion is also in accordance with the distance of these strains from the root of their respective cluster in the neighbor-joining tree (Fig. 2).
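The reasoning that the ST with the most single-locus variants is the likely ancestor can be illustrated with a small Python sketch. It counts, for each ST, how many other STs differ from it at exactly one locus and reports the ST with the highest count. The function names and the six-locus profiles are hypothetical and are not the profiles obtained in this study; ties are resolved in favor of the ST listed first.

```python
def count_slvs(profiles):
    """Return {ST: number of other STs differing at exactly one locus}."""
    counts = {st: 0 for st in profiles}
    names = list(profiles)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            differences = sum(x != y for x, y in zip(profiles[a], profiles[b]))
            if differences == 1:
                counts[a] += 1
                counts[b] += 1
    return counts


def predicted_founder(profiles):
    """ST with the largest number of single-locus variants."""
    counts = count_slvs(profiles)
    return max(counts, key=counts.get), counts


# Hypothetical allele profiles over six loci.
toy_sts = {
    "ST-A": (1, 1, 1, 1, 1, 1),
    "ST-B": (2, 1, 1, 1, 1, 1),
    "ST-C": (1, 2, 1, 1, 1, 1),
    "ST-D": (1, 1, 3, 1, 1, 1),
    "ST-E": (2, 1, 1, 1, 1, 4),
}
founder, counts = predicted_founder(toy_sts)
print(founder, counts)
```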
Fig. 3. Allele profile MLST phylogenetic tree. The tree was constructed based on the MLST system described at http://www.mlst.net. The lines represent clonal relation; SLV: single-locus variant; DLV: double-locus variant; TLV: triple-locus variant. Serotypes are noted before strain designation. H: human; C: cattle; A: avian; −: no mortality; +: less than 25% mortality; ++: 25–49% mortality; +++: 50–74% mortality; ++++: 75–100% mortality.
The diversity observed is probably due to mutation rather than recombination, since all the allele variants were single-nucleotide polymorphisms, except for one case of a four-nucleotide polymorphism (in the adk gene).
4 Discussion
This work set out to establish an MLST scheme for typing of E. coli strains and is therefore a small-scale pilot study that will lay a foundation for future work. Because it is based on nucleotide sequences, MLST data may be easily stored in computer databases accessible via the internet, and available to the entire scientific and medical community. As more sequences are deposited in the databases, large-scale analysis will be facilitated, enabling the study of population genetics and epidemiology of E. coli.
The genes presented in this study were found in all the strains tested and were sufficiently polymorphic in both sequence and number of different alleles (Table 2). All the loci had seven or more alleles in the 22 strains tested. Our preliminary experiments indicated that several other housekeeping genes showed little diversity. This level of diversity is not trivial, since a recent study conducted on 77 strains of enterohemorrhagic E. coli O157:H7 showed identity in seven housekeeping genes [11]. The difference in polymorphism levels between the two serotypes can be explained by the fact that E. coli O157:H7 is a recently emerged human pathogen causing one disease, whereas E. coli O78 infections of birds and mammals are more varied and include both intestinal and extraintestinal diseases. Furthermore, most of the diversity observed in O157:H7 is due to large insertions and deletions (e.g. following bacteriophage integration or excision), whereas in O78 strains single-nucleotide polymorphisms are common (see above). Thus, it would appear that MLST is useful for epidemiologic studies of O78 strains, while it is of little value for such studies in O157:H7.
Three non-O78 controls were used in our study: two O2 strains and one K-12 strain. We expected all of these to be phylogenetically distant from the O78 strains. Although both O2 strains were very remote from most O78 isolates, they were somewhat related to O78 strains 794 and 75, which was unexpected. Even more surprising was the finding that certain low-virulence O78 strains and E. coli K-12 are close evolutionary relatives. This result was unexpected since operons encoding O-antigen biosynthesis vary greatly between the virulent strain O157:H7 and K-12 [16], and one would not expect the O78 O side-chain to be synthesized by gene products similar to those encoded by K-12 strains. However, one should keep in mind that O-antigen gene clusters of different strains have conserved flanking regions [16] and may therefore undergo homologous recombination, resulting in strains with similar genotypes but very different serotypes.
The analysis of our sequence data included allele profiling and neighbor-joining trees based on multiple alignment. The two methods yielded similar results, which implies that allele profiling, which is more convenient for large datasets, can be applied without compromising sensitivity. The two clonal complexes obtained by allele profiling are separate, whereas the neighbor-joining tree assumes a common ancestor for both. This difference is probably due to the small number of isolates used: our sample did not contain strains closer to the root of both complexes.
Typing of serotype O78 strains is of particular interest because they are causative agents of invasive, and often lethal, diseases of humans and livestock, and present a continuing threat of zoonotic infection. Although we used a small sample in this study, the results are clear-cut. The results of the MLST analysis presented here indicate that most invasive E. coli O78 strains of mammals and birds originate from a single clone (Figs. 2 and 3). These results clearly indicate that the risk of zoonotic infection from sheep and even birds could be significant for E. coli O78 strains.
This study indicates that clonal division in E. coli O78 strains is host independent, and closely related clones can reside in different hosts. It therefore follows that the ability of strains to conquer new niches and infect a new host species depends more on horizontally acquired DNA than on the vertically inherited genotype. However, virulence does seem to be correlated with clonal origin, suggesting that many virulence properties that contribute to the pathogen's ability to cause invasive disease are relatively ancient.
Acknowledgments
The authors wish to thank Diana G. Ideses and Dr. Michael Naveh for help with in vivo studies, Dr. Moshe Wolk and Dr. Dan Elad for help with strains, and Tslil Ophir for help with automated sequencing. Dr. David Aanensen's help with MLST is highly appreciated. This work was supported by the Manja and Morris Leigh Chair for Biophysics and Biotechnology, the Israel Center for Emerging Diseases and European Community project COLIRISK.
References
[1] Akiyama M., Crooke E., Kornberg A. (1992) The polyphosphate kinase gene of Escherichia coli. Isolation and sequence of the ppk gene and membrane location of the protein. J. Biol. Chem. 267, 22556–22561.
[2] Babai R., Blum-Oehler G., Stern B.E., Hacker J., Ron E.Z. (1997) Virulence patterns from septicemic Escherichia coli O78 strains. FEMS Microbiol. Lett. 149, 99–105.
[3] Brune M., Schumann R., Wittinghofer F. (1985) Cloning and sequencing of the adenylate kinase gene (adk) of Escherichia coli. Nucleic Acids Res. 13, 7139–7151.
[4] Chang Y.Y., Wang A.Y., Cronan J.E. Jr. (1993) Molecular cloning, DNA sequencing, and biochemical analyses of Escherichia coli glyoxylate carboligase. An enzyme of the acetohydroxy acid synthase-pyruvate oxidase family. J. Biol. Chem. 268, 3911–3919.
[5] Enright M.C., Spratt B.G. (1999) Multilocus sequence typing. Trends Microbiol. 7, 482–487.
[6] Gophna U., Oelschlaeger T.A., Hacker J., Ron E.Z. (2001) Yersinia HPI in septicemic Escherichia coli strains isolated from diverse hosts. FEMS Microbiol. Lett. 196, 57–60.
[7] Higgins D.G., Thompson J.D., Gibson T.J. (1996) Using CLUSTAL for multiple sequence alignments. Methods Enzymol. 266, 383–402.
[8] Kimura M. (1980) A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16, 111–120.
[9] Maiden M.C., Bygraves J.A., Feil E., Morelli G., Russell J.E., Urwin R., Zhang Q., Zhou J., Zurth K., Caugant D.A., Feavers I.M., Achtman M., Spratt B.G. (1998) Multilocus sequence typing: a portable approach to the identification of clones within populations of pathogenic microorganisms. Proc. Natl. Acad. Sci. USA 95, 3140–3145.
[10] Michaeli S., Mevarech M., Ron E.Z. (1984) Regulatory region of the metA gene of Escherichia coli K-12. J. Bacteriol. 160, 1158–1162.
[11] Noller A.C., McEllistrem M.C., Stine O.C., Morris J.G. Jr., Boxrud D.J., Dixon B., Harrison L.H. (2003) Multilocus sequence typing reveals a lack of diversity among Escherichia coli O157:H7 isolates that are distinct by pulsed-field gel electrophoresis. J. Clin. Microbiol. 41, 675–679.
[12] Pupo G.M., Lan R., Reeves P.R. (2000) Multiple independent origins of Shigella clones of Escherichia coli and convergent evolution of many of their characteristics. Proc. Natl. Acad. Sci. USA 97, 10567–10572.
[13] Ron E.Z., Yerushalmi Z., Naveh M.W. (1991) Adherence pili of E. coli O78 determine host and tissue specificity. In: Microbial Surface Components and Toxins in Relation to Pathogenesis (Ron E.Z., Rottem S., Eds.), pp. 61–68. Plenum Press, New York.
[14] Rowley D.L., Wolf R.E. Jr. (1991) Molecular characterization of the Escherichia coli K-12 zwf gene encoding glucose 6-phosphate dehydrogenase. J. Bacteriol. 173, 968–977.
[15] Thompson J.D., Gibson T.J., Plewniak F., Jeanmougin F., Higgins D.G. (1997) The CLUSTAL-X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25, 4876–4882.
[16] Wang L., Reeves P.R. (1998) Organization of Escherichia coli O157 O antigen gene cluster and identification of its specific genes. Infect. Immun. 66, 3545–3551.
Author notes
1 These authors contributed equally to this work.
© 2003 Federation of European Microbiological Societies