Genomic molecular epidemiology of carbapenemase-producing Escherichia coli ST410 isolates by complete genome analysis

The circulation of carbapenemase-producing Escherichia coli (CPEC) in our society is a serious concern for vulnerable patients in nosocomial environments. However, the genomic epidemiology of CPEC circulating among companion animals remains largely unknown. In this study, an epidemiological analysis was conducted using complete genome sequences of CPEC ST410 isolates obtained from companion animals. To estimate the genomic distance and relatedness of the isolates, a total of 37 whole-genome datasets of E. coli ST410 strains were downloaded and comparatively analysed. The analysis resolved the genomic structure of the chromosomes and plasmids and revealed the genomic positions of multiple resistance and virulence genes. The isolates in this study were grouped into the subclade H24/RxC, characterized by fimH24, substitutions in the quinolone resistance-determining regions (QRDRs) and multiple β-lactamases, including extended-spectrum β-lactamase (ESBL) and carbapenemase. In addition, the in silico comparison of the whole-genome datasets revealed previously unidentified ST410 H24/Rx subgroups carrying either high pathogenicity islands (HPIs) or the H21 serotype. Considering the genetic variations and resistance gene dissemination of the isolates carried by companion animals, future preventive measures must include the “One Health” perspective for public health in our society. Supplementary Information: The online version contains supplementary material available at 10.1186/s13567-023-01205-6.


Introduction
Nosocomial infections caused by carbapenemase-producing Escherichia coli (CPEC) are emerging as a major clinical threat. Dissemination of carbapenemase-producing bacteria among companion animals, which live close to humans in modern society, should be considered an urgent threat to public health [1,2]. Carbapenem usage in animals is prohibited worldwide. However, the unauthorized usage of carbapenems is prevalent and not systematically monitored; thus, the dissemination of CPEC among companion animals is an unaddressed threat.
In South Korea, a nationwide study investigated nosocomial strains isolated between 2011 and 2015 [12]. The investigation reported that Klebsiella pneumoniae carbapenemase-2 (KPC-2) and New Delhi metallo-β-lactamase-1 (NDM-1) were the dominant carbapenemase types in the Korean nosocomial environment. Thus far, ST410 strains have not been discovered in ongoing human investigations. To date, in South Korea, two separate studies have discovered a total of 7 strains of CPEC from companion animals [13,14]. Unlike the results from the human investigation, all companion animal-derived CPEC strains harboured the IncX3 plasmid, which encodes bla NDM-5, and were identified as multilocus sequence type (MLST) ST410. The discrepancy between the findings in humans and animals could be attributed to insufficient data regarding companion animals or unidentified transmission events within our environment and animal community. However, given the growing importance of companion animals in human society, it is crucial to consider the possibility of human-animal transmission of CPEC pathogens.
MDR pathogenic bacteria, which have already become a major public health concern, are no longer confined to the realm of human health. As a natural phenomenon derived from ancient bacterial genomes [15], resistance genes can be shared and transferred through horizontal gene transfer among pathogenic bacteria in humans, animals, and the environment [16-18]. Therefore, it was necessary to conduct a resistance and virulence gene distribution analysis combining potential sources of shared yet undetected resistance genes while considering the comprehensive perspective of the "One Health" approach.
In this study, the genomic distance between companion animal-derived CPEC pathogens and previously identified strains was measured using whole-genome phylogenetic analysis. Since all the strains discovered in our country were identified solely as ST410, worldwide ST410 datasets were selected as the reference. To conduct the phylogenetic analysis, we performed whole-genome sequencing on four CPEC isolates obtained from companion dogs, including three isolates (DMCPEC2, DMCPEC3 and DMCPEC7) that were included in our previous study [13]. The whole-genome datasets were screened and analysed based on the public database to identify undiscovered genes that could be potential threats.

Bacterial strain isolation and minimum inhibitory concentration
A total of 4 isolates of E. coli ST410 strains obtained from companion animals were included in this study. Three strains (DMCPEC2, DMCPEC3 and DMCPEC7), which were identified as carrying bla NDM-5-encoding IncX3 plasmids in a previous study [13], were included among the four isolates. NB7CPEC was isolated from a screening rectal swab at a local veterinary clinic in Seoul, South Korea. Meropenem-impregnated (1 μg/mL) MacConkey (MIM) agar was used to identify the carbapenem-resistant gram-negative phenotype from the rectal swab of a mixed Pomeranian dog. The host dog (6-year-old mixed Pomeranian canine, spayed female) had no specific clinical disease condition and was swabbed for carbapenemase screening with a normal rectal swab collected by professional veterinarians in accordance with the Guide for the Care and Use of Laboratory Animals and the Animal Welfare Act. The antimicrobial resistance minimum inhibitory concentration (MIC) profile was determined using the broth microdilution method. E. coli strain ATCC 25922 was used as a quality control strain for the MIC determination, following the Clinical and Laboratory Standards Institute (CLSI) recommendations for performance and interpretation [19].
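As a minimal sketch of how a broth microdilution readout maps to an MIC value: the MIC is the lowest concentration in a two-fold dilution series at which no visible growth is observed. The function names, concentrations and growth readings below are hypothetical illustrations and are not data or software from this study.

```python
from typing import Optional, Sequence


def twofold_series(highest: float, steps: int) -> list[float]:
    """Two-fold dilution series, from the highest concentration downwards."""
    return [highest / (2 ** i) for i in range(steps)]


def read_mic(concentrations: Sequence[float], growth: Sequence[bool]) -> Optional[float]:
    """Return the lowest concentration that still inhibits visible growth.

    Wells are read from the highest concentration downwards and reading stops
    at the first well showing growth. None means growth at the highest tested
    concentration, i.e. the MIC lies above the tested range.
    """
    if len(concentrations) != len(growth):
        raise ValueError("one growth reading is required per concentration")
    mic = None
    for conc, grew in sorted(zip(concentrations, growth), reverse=True):
        if grew:
            break
        mic = conc
    return mic


# Hypothetical readout (ug/mL): growth (True) in the wells at 8 ug/mL and below.
series = twofold_series(64, 8)  # 64, 32, 16, 8, 4, 2, 1, 0.5
growth = [False, False, False, True, True, True, True, True]
print(read_mic(series, growth))
```

Interpreting the resulting value as susceptible or resistant requires the breakpoints of the applicable CLSI document, which are deliberately not hard-coded here.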

Total DNA isolation followed by in vitro genotyping
The Wizard Genomic DNA purification kit (Promega, Madison, WI) was used for total DNA purification, and carbapenemase gene screening was performed using previously designed multiplex PCR primers and protocols [20]. The amplicons were Sanger sequenced, and the sequences were compared with available GenBank data using the Basic Local Alignment Search Tool (BLAST) network service [21].
Classical MLST was performed using a previously described protocol [22] to evaluate seven housekeeping genes (adk, fumC, gyrB, icd, mdh, purA and recA), and the results were further confirmed on the online database [23].
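The seven-locus typing step can be sketched as a lookup from an allele profile to a sequence type. This is an illustration only: the profile table and its allele numbers are placeholders rather than values from the online MLST database, and `assign_st` is a hypothetical helper name.

```python
# The seven housekeeping loci, in the order listed in the text.
LOCI = ("adk", "fumC", "gyrB", "icd", "mdh", "purA", "recA")

# Placeholder profile table: these allele numbers are invented for the example
# and do NOT correspond to real database entries.
PROFILES = {
    (1, 1, 1, 1, 1, 1, 1): "ST-example-1",
    (1, 1, 2, 1, 1, 3, 1): "ST-example-2",
}


def assign_st(alleles: dict[str, int]) -> str:
    """Map a complete seven-locus allele profile to a sequence type label."""
    missing = [locus for locus in LOCI if locus not in alleles]
    if missing:
        raise ValueError(f"missing loci: {', '.join(missing)}")
    profile = tuple(alleles[locus] for locus in LOCI)
    return PROFILES.get(profile, "novel or unlisted profile")


isolate = {"adk": 1, "fumC": 1, "gyrB": 2, "icd": 1, "mdh": 1, "purA": 3, "recA": 1}
print(assign_st(isolate))
```

A profile that differs at even one locus yields a different sequence type, which is why the in vitro result was confirmed against the online database.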

Combined complete genome sequencing and de novo assembly
The whole-genome DNA samples of 4 isolates of CPEC ST410 were purified with a Wizard Genomic DNA purification kit (Promega, Madison, WI, USA) from overnight cultures. For high-quality sequencing and assembly, both long and short genomic DNA libraries were prepared. Short-read sequencing was performed using an Illumina NovaSeq 6000 (Illumina, San Diego, CA, USA) platform following a paired-end 2 × 150-bp protocol. The Oxford Nanopore platform (Oxford Nanopore Technologies, Oxford, UK) was employed for long-read sequencing. The ONT library was constructed and sequenced by using the Ligation Sequencing Kit (SQK-LSK109), the Flow Cell Priming Kit (EXP-FLP002) and Flowcell (FLO-MIN106).

In silico typing and identification for bioinformatic comparison
For comparative analysis, 37 genomic datasets of E. coli ST410 strains were downloaded from the National Center for Biotechnology Information [35] and compared with the 4 isolates sequenced in this study. The assembled genomes were screened for comparison of the resistance genes, virulence genes, serotypes, SPI sites, MLST, plasmid types and fimH types on the Center for Genomic Epidemiology (CGE) server [36] using ResFinder 4.1, VirulenceFinder 2.0, SerotypeFinder 2.0, SPIFinder 2.0, MLST 2.0, PlasmidFinder 2.1 and FimTyper 1.0. The schematic complete genome structure maps of the chromosomes and plasmids were generated with the CGView server [37]. High-quality whole-chromosome SNPs were called against the chromosomal reference sequence of YD786 (GenBank no. CP013112.1) and concatenated into an alignment using the standard settings of CSI Phylogeny [38]. A maximum likelihood (ML) tree was constructed with 1000 bootstrap replicates in MEGA 11 software [39]. The comparative gene distribution annotations and heatmaps were prepared, and the phylogenetic tree was visualized, using iTOL [40].

Table 1 Basic information of bla NDM-5-encoding CPEC isolates included in this study
The basic profiles of the bacterial strains carrying bla NDM-5-encoding IncX3 plasmids are described. Four isolates were discovered from a veterinary clinical hospital. The basic bacterial genotypes and host information are listed.

Qualification of the sequenced whole-genome datasets
The 4 E. coli ST410 isolates identified from 4 different companion dogs were subjected to whole-genome sequencing, and high-quality nucleotide sequences were generated (Additional file 2). Whole-genome sequencing identified a single chromosome and 2 plasmids from each strain. In NB7CPEC, 2 additional plasmids were identified (Table 2).

Identification of characteristic genes in the chromosomes and plasmids of the ST410 strains
The identified chromosomal length of each strain was between 4.71 and 4.83 Mbp (Figure 1). The ESBL-encoding genes of CMY-2 and CMY-121 were identified. The chromosomes included variant sites of quinolone resistance-determining regions (QRDRs) of the gyrA (S83L and D87N), parC (S80I) and parE (S458A) genes. A total of 5 CRISPR region sites were also discovered, with a type I-E Cas cluster in the chromosome. The virulence genes (csgA, fimH, lpfA, hlyE and yehABCD) were also identified, and their positions were marked on the visualization map. Two heterogeneous plasmids were carried by all isolates, including the bla NDM-5-encoding IncX3 plasmid (Figure 2). The other plasmid (Figure 3) was discovered as an integrated form of three types of plasmids, namely, IncFIA (99.74% identity), IncFIB (AP001918) (98.39% identity) and IncFII (pAMA1167-NDM-5) (100% identity).
The bacterial strain NB7CPEC, which was isolated in 2021 from a local veterinary clinic, carried 2 additional, distinct plasmids (Figures 4 and 5). The IncFII (pHN7A8)-type plasmid (Figure 4) was identified as a 73 618 bp-long plasmid encoding the ESBL genes bla TEM-1B and bla CTX-M-65. The structural positions of the resistance genes and the mobile gene elements were mapped on the schematic map of the whole plasmid. The p0111-type (98.08% identity) plasmid was identified as a 96 439 bp-long plasmid (Figure 5), although it lacked mobile gene cassettes and resistance genes.
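The substitution notation used for the QRDR variants above (for example gyrA S83L: serine at position 83 replaced by leucine) can be checked programmatically. The following is a minimal sketch with hypothetical helper names and a toy protein string; it does not reproduce the CGE tools used in this study, and it only tests the residue at the stated position without validating the reference residue.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    ref: str       # reference amino acid (one-letter code)
    position: int  # 1-based position in the protein
    alt: str       # substituted amino acid (one-letter code)


def parse_substitution(label: str) -> Substitution:
    """Parse labels such as 'S83L' into (ref, position, alt)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None or int(match.group(2)) < 1:
        raise ValueError(f"not a valid substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def has_substitution(protein: str, label: str) -> bool:
    """True if the protein carries the substituted residue at that position."""
    sub = parse_substitution(label)
    if sub.position > len(protein):
        return False
    return protein[sub.position - 1] == sub.alt


# Toy 100-residue sequence (NOT a real GyrA sequence) with L at 83 and N at 87.
residues = ["A"] * 100
residues[82] = "L"
residues[86] = "N"
toy_protein = "".join(residues)

for label in ("S83L", "D87N", "S80I"):
    print(label, has_substitution(toy_protein, label))
```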

Whole-genome phylogeny and bioinformatic comparison of the characteristic genes
The complete genome datasets of 37 E. coli ST410 strains isolated between 2010 and 2020 from 17 different The measured number of valid SNP positions of the whole chromosomes identified by the CSI Phylogeny pipeline was 5271. The pairwise SNP difference, adjusted according to YD786 (GenBank no. CP013112.1) as the reference genome, ranged from 0 (between CP024801 and CP026473) to 2204 (between CP031231 and CP027205) (Additional file 4). Among the isolates from this study, the pairwise SNP differences ranged from 28 (between DMCPEC3 and DMCPEC7) to 533 (between DMCPEC2 and NB7CPEC). The input parameters, analysis quality and identity of each strain against the reference are displayed in Additional file 5. The resistance and virulence genes and plasmid types are denoted in Additional files 6, 7, 8.
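The pairwise SNP differences reported above are counts of differing positions between two isolates in the concatenated SNP alignment. A minimal sketch of that counting step is shown below; the isolate names and sequences are toy values, and skipping positions with `N` or gaps is an assumption of this example rather than a documented detail of the CSI Phylogeny pipeline.

```python
from itertools import combinations


def snp_distance(seq_a: str, seq_b: str, ignore: str = "N-") -> int:
    """Count positions at which two aligned sequences differ.

    Positions where either sequence has an ambiguous base or a gap are skipped
    (an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a not in ignore and b not in ignore
    )


def pairwise_distances(alignment: dict[str, str]) -> dict[tuple[str, str], int]:
    """SNP distance for every unordered pair of isolates in the alignment."""
    return {
        (p, q): snp_distance(alignment[p], alignment[q])
        for p, q in combinations(sorted(alignment), 2)
    }


# Toy alignment of concatenated SNP positions (hypothetical isolates).
toy_alignment = {
    "iso1": "ACGTACGT",
    "iso2": "ACGTACGA",
    "iso3": "TCGAACGA",
    "iso4": "ACGTACGT",
}
distances = pairwise_distances(toy_alignment)
closest = min(distances, key=distances.get)
farthest = max(distances, key=distances.get)
print(closest, distances[closest])
print(farthest, distances[farthest])
```

Reporting the closest and most distant pairs in this way mirrors how the minimum and maximum pairwise differences are quoted in the text.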
A whole-chromosome SNP-based phylogenetic tree was then constructed and displayed along with the epidemiological datasets (Figure 6). All 41 isolates carried amino acid substitutions in the QRDRs together with fimH24 and could therefore be classified as the B/H24R lineage.
The analysis of the subclade ST410 B/H24R identified the following distinguishable groups: (i) Group A, including the H9 antigen and the type 1 high-pathogenicity island (HPI) of the Salmonella pathogenicity islands, (ii) Group B, identified with the H21 antigen, and (iii) Group C (B4/H24RxC), carrying the ESBLs and carbapenemase genes. Three of the strains from this study (DMCPEC3, DMCPEC7 and NB7CPEC) encoding bla CMY-6 and bla NDM-5 could be included in Group C.
The isolates in phylogenetic Group A carried HPIs. The discovered HPI gene dataset matched the type 1 HPI (identity > 98%) identified from Salmonella enterica group VI [41]. The SNP differences ranged from 37 (between CP018965 and CP035325) to 662 (between CP035123 and CP073926). Several virulence genes were identified at high frequency in Group A, namely, iucC (aerobactin synthetase), iutA (ferric aerobactin receptor) and sitA (iron transport protein). The tetracycline resistance gene tet(A) was carried more frequently by the isolates of Group A than by those of the other groups.
Group B featured the H21 antigen type. The isolates of Group B carried the bla CMY-42 gene more frequently than those of the other groups. The differences in SNPs ranged from 5 (between CP035944 and CP042934) to 403 (between CP031653 and CP029369). The isolates in Group C mainly originated from samples from Asia and Europe. The serotypes of the isolates in Group C were identified as O8:H9. The pairwise SNP differences ranged between 15 (CP034958 and CP033401) and 103 (CP048344 and CP027205). Group C isolates were found to have carbapenemase genes (Figure 7), encoding at least one carbapenemase gene of NDM-5 or OXA-1. The ESBL genes of CMY-2, CMY-6 and TEM-1B were discovered more frequently in this group, along with the IncFIA-, IncFII- and IncX3-type plasmids.
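Comparisons such as a gene being carried more often in one group than in the others reduce to a per-group carriage proportion computed from a gene presence table. The sketch below illustrates this under stated assumptions: the isolate names, group labels and gene sets are invented, and `gene_frequency_by_group` is a hypothetical helper, not part of the tools used in this study.

```python
from collections import defaultdict


def gene_frequency_by_group(
    groups: dict[str, str],
    genes: dict[str, set[str]],
    gene: str,
) -> dict[str, float]:
    """Proportion of isolates in each group that carry the given gene."""
    counts = defaultdict(lambda: [0, 0])  # group -> [carriers, total isolates]
    for isolate, group in groups.items():
        counts[group][1] += 1
        if gene in genes.get(isolate, set()):
            counts[group][0] += 1
    return {group: carriers / total for group, (carriers, total) in counts.items()}


# Invented example data (not results from this study).
toy_groups = {"x1": "A", "x2": "A", "x3": "B", "x4": "B"}
toy_genes = {
    "x1": {"tet(A)", "iucC"},
    "x2": {"tet(A)"},
    "x3": {"iucC"},
    "x4": set(),
}
print(gene_frequency_by_group(toy_groups, toy_genes, "tet(A)"))
print(gene_frequency_by_group(toy_groups, toy_genes, "iucC"))
```

With groups as small as those in a 41-genome comparison, such proportions are descriptive only and do not by themselves establish a statistically significant difference.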

Discussion
The extraintestinal pathogenic E. coli (ExPEC) ST410 lineage is known to circulate not only in humans but also in animals (wildlife and companion animals) and the environment [3,10]. Through whole-genome phylogenetic analysis, it was discovered that the companion animal-derived isolates were closely grouped with strains identified from humans, animals and the environment (in subgroup C, Figures 6-8). Isolate DMCPEC7, carried by a companion dog, differed from the human blood isolate KBN10P04869 (GenBank no. CP026473) by a total of 36 whole-genome SNPs. Previous investigations on Korean isolates have indicated differences in genetic characteristics, such as MLST types, between human and animal strains. The MLST types of the human-obtained CPEC isolates investigated by the national laboratory surveillance system were ST131, ST1642 and ST101, whereas ST410 was not reported [12]. In contrast, all CPEC isolates identified from companion animals in South Korea thus far have been typed as ST410. Despite the phylogenetic relatedness, the genetic analyses conducted in this work cannot provide direct evidence of transmission between the isolation sources of those strains, including human-animal transmission. However, considering the genetic evidence revealed by the whole-genome phylogeny, the possibility of circulation among humans, animals and the environment must be taken seriously. Therefore, an integrative "One Health" approach should be applied to control measures for E. coli ST410 strains.
The E. coli ST410 strains, along with ST131, have been proposed as globally circulating strains of extraintestinal pathogenic E. coli (ExPEC) [10]. ExPEC bacteria are known to encode various extraintestinal virulence factors and have been implicated in various infectious diseases, including neonatal meningitis, urinary tract infections, bloodstream infections and pneumonia [42-44]. In these clinical situations, the administration of β-lactam antibiotics, including carbapenems, is an important treatment option. However, ExPEC bacteria may be resistant not only to β-lactam antibiotic agents, due to the production of various β-lactamases such as ESBL and carbapenemase, but also to other antibiotics, including fluoroquinolones, tetracyclines and aminoglycosides [45-47]. In this study, companion animal-derived ST410 strains were found to carry various virulence genes and antimicrobial resistance genes. The MIC results (Additional file 1) revealed that colistin is currently the only viable option for treating these isolates. Therefore, the circulation of these strains in companion animals within our community should be addressed seriously, considering that these strains have the capacity to not only cause extraintestinal infections as ExPEC bacteria but also resist various antimicrobial agents as CPEC pathogens.
E. coli strains are largely classified into phylogroups A, B1, B2, C, D, E or F, followed by further categorization of sequence types, clades and subclades [48].The carbapenemase gene is mainly carried by E. coli strains in phylogroups A and B1 [49], and E. coli ST410 strains are grouped into phylogroup A. ST410 has been reported as a high-risk clone in previous studies [48].
The subclade of ST410 carrying fimH24 has been classified into B2/H24R (gyrA and parC mutations), B3/H24Rx (additional carriage of the ESBL gene) and B4/H24RxC (carbapenemase introduction) [5]. In this study, further specific groups were identified in the B2/H24R subclade by bioinformatic gene identification based on whole-genome data. In particular, virulence factor genes such as HPIs, iucC, iutA and sitA were more widely distributed in Group A strains than in the other strains (Figure 8). HPI and ExPEC bacteria are strongly correlated and have been described as the causative agents of various extraintestinal infections [50,51]. Conversely, Group C isolates were identified as having a relatively larger dissemination of antimicrobial resistance genes. To continue to define the genetic characteristics of the ST410 strains identified in this study, further investigations are needed, such as additional isolate collection and antimicrobial and virulence phenotyping.
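The stepwise description of the fimH24 subclades above can be read as a set of cumulative features. The sketch below encodes that reading as a simple rule set, purely as an illustration: the published subclades are defined phylogenetically, so a feature-based rule is a simplification, and the function name and the "unresolved" outcome for a carbapenemase without an ESBL gene are assumptions of this example.

```python
def classify_h24_subclade(
    fimh24: bool,
    qrdr_mutations: bool,
    esbl: bool,
    carbapenemase: bool,
) -> str:
    """Assign a simplified subclade label from four marker features.

    Assumption: features accumulate in the order described in the text
    (QRDR mutations, then an ESBL gene, then a carbapenemase gene).
    """
    if not (fimh24 and qrdr_mutations):
        return "not H24R"
    if esbl and carbapenemase:
        return "B4/H24RxC"
    if esbl:
        return "B3/H24Rx"
    if carbapenemase:
        # Carbapenemase without an ESBL gene does not fit the stepwise scheme.
        return "unresolved"
    return "B2/H24R"


print(classify_h24_subclade(True, True, False, False))
print(classify_h24_subclade(True, True, True, False))
print(classify_h24_subclade(True, True, True, True))
```

In practice, the position of an isolate in the whole-chromosome SNP tree, not the marker genes alone, determines its subclade.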
Whole-genome analyses revealed the genomic potential of CPEC isolates identified from companion animals, indicating that these strains should be considered a potential threat to public health.Therefore, it is crucial to consider new measures for controlling the dissemination of CPEC ST410 strains in our society, adopting a combined public health approach with a "One Health" perspective.

Figure 1 Circular map of comparative chromosomes of the 4 ST410 isolates identified in this study. The whole chromosomes of the E. coli ST410 strains identified in this study were comparatively mapped. The chromosomal map of DMCPEC2 (4 832 084 bp) was utilized as the backbone for visualization and is depicted as a black circle. The circular map was generated by using CGView.

Figure 2 Schematic visualization map of IncX3 plasmids encoding bla NDM-5 carried by ST410 strains. The approximately 46 kbp-long IncX3-type plasmids encoding the carbapenemase gene bla NDM-5 were carried by all 4 strains. The IncX3 plasmids were consistent with the datasets reported in a previous study [13]. The data for pNB7-NDM5 were included in the visual map. The plasmid map of pEC3-NDM5 (46 749 bp) was used as the backbone for visualization and is depicted as a black circle. The circular map was generated by using CGView.

Figure 3 Integrated map of the 3 plasmid types carried by the ST410 strains. Three distinct plasmid types (namely, FII (pAMA1167-NDM-5), FIA and FIB (AP001918)) were integrated into 77-82 kbp-long plasmids. The plasmid map of pEC3-FIIFIAFIB (82 232 bp) was used as the backbone for visualization and is depicted as a black circle. The circular map was generated by using CGView.

Figure 4 Circular map of the IncFII (pHN7A8)-type plasmid carried by strain NB7CPEC. The plasmid pNB7-pHN7A8 was identified from NB7CPEC and encoded the ESBL genes bla TEM-1B and bla CTX-M-65. The circular map was generated by using CGView.

Figure 5 Map of the p0111-type plasmid carried by the ST410 strain NB7CPEC. The p0111-type plasmid was identified without additional resistance genes. The circular map was generated by using CGView.

Figure 6 Epidemiological comparison of the complete genome datasets of 41 ST410 strains. A maximum likelihood (ML) tree was constructed and visualized by iTOL based on whole-chromosome SNPs. The isolates originating from this study are highlighted with a black background. The coloured ranges covering the strain labels show the identified subgroups: Group A (red); Group B (blue); Group C (green). The coloured and labelled columns indicate the epidemiological information of the ST410 strains. From left to right: geographic locations; O antigen types; H antigen types; quinolone resistance-determining region (QRDR) variation sites of gyrA; QRDR variation sites of parC/parE; levels of identity of Salmonella pathogenicity islands (%). The geographic locations of the strains are denoted in distinguishable colours according to their continents of origin: Africa (grey), Asia (yellow), Europe (cyan), North America (red) and South America (green).

Figure 7 Whole-genome dataset comparison of the antimicrobial resistance genes and plasmid types identified from the ST410 strains. Different types of antimicrobial resistance genes were categorized and are shown in distinguishable colours and rectangular boxes: carbapenemases (black), ESBLs (red), aminoglycosides (magenta), dihydrofolate reductases (green), macrolides (brown), quinolones (purple), sulfonamides (blue) and tetracyclines (grey). A heatmap indicating the identified plasmid types is shown along with the antimicrobial resistance gene distribution map.

Figure 8 Comparison of the virulence gene distribution identified from the whole-genome datasets. Virulence genes were screened from the whole-genome datasets using VirulenceFinder 2.0 via the CGE server and visualized using iTOL. Distinguished phylogenetic subgroups are indicated with coloured ranges: Group A (red); Group B (blue); Group C (green).

Table 2 Assembled and identified genomes in this study
Each strain contained a single chromosome. Two types of plasmids were carried in common by all isolates, including the IncX3 plasmid harbouring NDM-5. NB7CPEC was identified with 2 additional plasmids, which were typed as IncFII (pHN7A8) and p0111.