Group A Streptococcus (GAS) M protein is an important virulence factor and potential vaccine antigen, and constitutes the basis for strain typing (emm-typing). Although >200 emm-types have been characterized, structural data have been obtained for only a limited number of them. We aimed to evaluate the sequence diversity of near-full-length M proteins from worldwide sources and to analyse their structure, sequence conservation and classification. GAS isolates recovered from throughout the world during the last two decades underwent emm-typing and complete emm gene sequencing. Predicted amino acid sequence analyses, secondary structure predictions and vaccine epitope mapping were performed using MUSCLE and Geneious software. A total of 1086 isolates from 31 countries were analysed, representing 175 emm-types. The emm-type is predictive of the whole protein structure, independent of geographical origin or clinical association. No emm-type was found paired with multiple, highly divergent central regions. M protein sequence length, the presence or absence of sequence repeats and predicted secondary structure were assessed in the context of the latest vaccine developments. Based on these global data, the M6 protein model is updated to a model built on three representative M proteins (M5, M80 and M77), to aid in epidemiological analysis, vaccine development and M protein-related pathogenesis studies.
Bibliographical note
Funding Information:
This work was supported by the European Society for Clinical Microbiology and Infectious Diseases, the European Society for Paediatric Infectious Diseases, the Fonds National de la Recherche Scientifique (Belgium), the Fonds Brachet and Fondation Van Buuren (Belgium), the Australian National Health and Medical Research Council, and the National Institutes of Health (AI-065572). The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
- M protein
- Streptococcus pyogenes