Identification of Escherichia coli and Shigella species from whole-genome sequences

Marie Anne Chattaway, Ulf Schaefer, Rediat Tewolde, Tim Dallman, Claire Jenkins*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

49 Citations (Scopus)


Escherichia coli and Shigella species are closely related and genetically constitute the same species. Differentiating between these two pathogens and accurately identifying the four species of Shigella are therefore challenging. The organism-specific bioinformatics whole-genome sequencing (WGS) typing pipelines at Public Health England are dependent on the initial identification of the bacterial species by use of a kmer-based approach. Of the 1,982 Escherichia coli and Shigella sp. isolates analyzed in this study, 1,957 (98.4%) had concordant results by both traditional biochemistry and serology (TB&S) and the kmer identification (ID) derived from the WGS data. Of the 25 mismatches identified, 10 were enteroinvasive E. coli isolates that were misidentified as Shigella flexneri or S. boydii by the kmer ID, and 8 were S. flexneri isolates misidentified by TB&S as S. boydii due to nonfunctional S. flexneri O antigen biosynthesis genes. Analysis of the population structure based on multilocus sequence typing (MLST) data derived from the WGS data showed that the remaining discrepant results belonged to clonal complex 288 (CC288), comprising both S. boydii and S. dysenteriae strains. Mismatches between the TB&S and kmer ID results were explained by the close phylogenetic relationship between the two species and were resolved with reference to the MLST data. Shigella can be differentiated from E. coli and accurately identified to the species level by use of kmer comparisons and MLST. Analysis of the WGS data provided explanations for the discordant results between TB&S and WGS data, revealed the true phylogenetic relationships between different species of Shigella, and identified emerging pathoadapted lineages.
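
As a rough illustration of what a kmer comparison involves, the sketch below assigns a sample sequence to whichever reference shares the largest fraction of its kmers (Jaccard similarity). This is not the Public Health England pipeline, whose implementation is not described in this record. The function names, the kmer length of 5, and the toy sequences are all assumptions made for the example.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq (uppercased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def identify(sample, references, k=5):
    """Return (name, similarity) of the reference sharing the most kmers."""
    sample_kmers = kmers(sample, k)
    scores = {
        name: jaccard(sample_kmers, kmers(ref, k))
        for name, ref in references.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical toy sequences; real genomes are millions of bases long
# and real tools use much longer kmers.
references = {
    "reference_A": "ATGGCGTACGTTAGCCTAGGCTA",
    "reference_B": "ATGTTTCCCGGGAAATTTCCCGG",
}
sample = "ATGGCGTACGTTAGCCTAGGCTT"

best, score = identify(sample, references)
print(best, round(score, 3))
```

Because closely related organisms share most of their kmers, the top-scoring reference can be ambiguous or wrong for near-identical lineages, which is consistent with the abstract's use of MLST data to resolve the discrepant results.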

Original language: English
Pages (from-to): 616-623
Number of pages: 8
Journal: Journal of Clinical Microbiology
Issue number: 2
Publication status: Published - Feb 2017

Bibliographical note

Publisher Copyright:
© 2017 American Society for Microbiology.


  • DNA sequencing
  • Escherichia coli
  • Identification
  • Kmer
  • MLST
  • Shigella


