Multiple introductions of serotype O foot-and-mouth disease viruses into East Asia in 2010–2011

Foot-and-mouth disease virus (FMDV) is a highly contagious and genetically variable virus. Sporadic introductions of this virus into FMD-free countries may cause outbreaks with devastating consequences. In 2010 and 2011, incursions of the FMDV O/SEA/Mya-98 strain, normally restricted to countries in mainland Southeast Asia, caused extensive outbreaks across East Asia. In this study, 12 full genome FMDV sequences for representative samples collected from the People’s Republic of China (PR China) including the Hong Kong Special Administrative Region (SAR), the Republic of Korea, the Democratic People’s Republic of Korea, Japan, Mongolia and The Russian Federation were generated and compared with additional contemporary sequences from viruses within this lineage. These complete genomes were 8119 to 8193 nucleotides in length and differed at 1181 sites, sharing a nucleotide identity ≥ 91.0% and an amino acid identity ≥ 96.6%. An unexpected deletion of 70 nucleotides within the 5′-untranslated region which resulted in a shorter predicted RNA stem-loop for the S-fragment was revealed in two sequences from PR China and Hong Kong SAR and five additional related samples from the region. Statistical parsimony and Bayesian phylogenetic analysis provide evidence that these outbreaks in East Asia were generated by two independent introductions of the O/SEA/Mya-98 lineage sometime between August 2008 and March 2010. The rapid emergence of these viruses from Southeast Asia highlights the importance of adopting approaches to closely monitor the spread of this lineage that now poses a threat to livestock industries in other regions.


Introduction
Foot-and-mouth disease (FMD) is a highly contagious viral disease characterized by rapid onset and high morbidity in a wide range of susceptible host species within the members of the order Artiodactyla (for reviews see [1,2]). The disease is endemic in the Middle East, Central and South Asia, Africa, and some countries in South America. FMD is notifiable to the World Organisation for Animal Health (OIE) and as a consequence FMD-affected countries have restricted trade in livestock and livestock products with FMD-free regions and countries. Therefore, an incursion of FMD into disease-free countries can have a devastating impact as was shown in the UK in 2001 [3] and 2007 [4], or in Taiwan during 1997 [5].
FMD is caused by a non-enveloped picornavirus (FMDV: genus Aphthovirus) with icosahedral symmetry. The virion, approximately 30 nm diameter, contains a single-stranded positive-sense RNA genome of approximately 8500 nucleotides (nt) in length. It contains a single open reading frame which is flanked by 5′ and 3′ untranslated regions (UTRs) and encodes the four structural proteins which form the capsid [1A (also known as VP4); 1B (VP2); 1C (VP3) and 1D (VP1)], and ten non-structural proteins (L, 2A, 2B, 2C, 3A, 3B1-3, 3C, and 3D) (for reviews see [6,7]). FMDV is a rapidly evolving virus classified into seven distinct serotypes, i.e. O, A, C, Asia 1, and Southern African Territories (SAT) 1, SAT 2 and SAT 3, which are supported by genetic classification based on the VP1-coding region. Most of our knowledge about the global distribution and molecular epidemiology of the virus is dependent upon analysis of this region which comprises approximately 8% of the FMDV genome [8]. However, complete genome sequences of FMDV are required to fully understand viral determinants of pathogenicity, virulence, host range and evolution. Moreover, complete genome sequence analysis of FMDV isolates has been successfully used to trace the origin and the transmission pathways of the virus within an outbreak [4,[9][10][11].
During 2010-2011, incursions of the Mya-98 lineage of the Southeast Asia (SEA) topotype of serotype O (O/ SEA/Mya-98) caused a series of high profile FMD outbreaks across five East Asian countries: the People's Republic of China (PR China) including the Hong Kong Special Administrative Region (SAR), Japan, Mongolia, the Russian Federation (Russia) and the Republic of Korea (ROK; South Korea) [12][13][14][15]. A range of host species have been affected by these outbreaks including domesticated pigs, cattle and small ruminants, as well as evidence for infection in gazelles in Mongolia [3,16]. During 2005-2007, serotype Asia 1 also spread throughout many countries in the region (PR China, Mongolia, Russia, North Korea); although it was not possible to determine the precise origin of these outbreaks, Southeast Asia was not implicated [17]. In ROK and PR China the outbreaks due to O/SEA/Mya-98 were preceded by FMD outbreaks due to serotype A (A/ASIA/Sea-97 lineage) [12]. Together, these recent events may be indicative of changing epidemiology of FMD in East Asia which may heighten risk for onward transmission to more distant countries including those that are FMD-free.
The aim of this study was to analyse the complete genomes of representative FMD viruses recovered from outbreaks during the pandemic of O/SEA/Mya-98 in East Asia and to compare them with sequences from viruses from Southeast Asia where this lineage is endemic [18]. Furthermore, previously uncharacterised FMD outbreaks due to serotype O also occurred during 2011 in the Democratic People's Republic of Korea (DRK; North Korea) and analysis of one of these samples are included in this report. This study investigates the origin and evolution of this emerging virus lineage.

Samples
The clinical samples and isolates included in the present study are listed in Table 1. They represent viruses within the O/SEA/Mya-98 lineage which affected eight East Asian countries during 2009-2011. These isolates were selected on basis of their VP1 sequence, previously described [12].

Full genome amplification
The FMDV amplification of 12 isolates was undertaken in two laboratories using the following approaches.
At The Pirbright Institute (PI, UK), the full genome of nine FMD viruses (Table 1) were amplified using a protocol initially developed to sequence related FMDV serotype O viruses from Southeast Asian countries [19]. Briefly, total RNA was extracted from the clinical samples or cell culture using RNeasy Mini Kit (Qiagen Ltd., Crawley, West Sussex, UK) according to the manufacturer's instructions. After reverse-transcription, using the UKFMD Rev 6 primer (5′-GGC GGC CGC TTT TTT TTT TTT TTT-3′) and SuperScript™ III Reverse Transcriptase (Invitrogen, CA, USA), the viral cDNA was purified [Illustra™ GFX PCR DNA and Gel Band Purification Kit (GE Healthcare, Buckinghamshire, UK)] according to the manufacturer's instructions, and amplified using Platinum® High Fidelity Taq (Invitrogen, CA, USA) to generate 22 PCR overlapping fragments that ranged in length from approximately 330 to 700 base pairs.
At the FGI Federal Centre for Animal Health (ARRIAH), full genome sequencing was undertaken as follows. Briefly, total RNA was extracted from three cell culture isolates (Table 1) using Ribo-prep kit (ILS, Moscow, Russian Federation) according to the manufacturer's instructions. Reverse-transcription and PCR (variant "one tubeone buffer") were used to generate 18 overlapping fragments. To determine 5′-and 3′-end sequences, RNA ligation was carried out using T4 RNA Ligase (Fermentas, Lithuania) in a protocol similar to one described previously [20]. Subsequently, a nested RT-PCR using forward primers for 3D region and reverse primers for the S-fragment was used to amplify a PCR-fragment consisting of approximately 155-160 nt (depending on poly (A) length) from the 3′-end and 74 nt from the Sfragment of FMDV RNA.

Amplification of the S-fragment of the 5′-UTR
Nine additional clinical samples from animals infected within the O/SEA/Mya-98 outbreak in Hong Kong SAR (Table 1) were processed at the PI as previously described to generate the S-fragment of the 5′ UTR. The primers described in the full genome protocol to generate and sequence this region of the FMDV genome (i.e. O1F and O1R [19]), were used in parallel with a further reverse primer (i.e. O1F2: 5′-ACC GAC TAG TAC TCT TAA CAC TCC GC-3′), designed to target the deleted region within the S-fragment of the 5′ UTR. This last step was carried out to discard the hypothesis that the deletion in the S-fragment of the FMDV isolate O/HKN/20/2010 was an artefact of the RT-PCR procedure either due to the passage of the virus in cell culture.

DNA visualization and sequencing
The PCR products were visualized on a 1.8% TBE agarose gel stained with ethidium bromide and purified using the same kit used for cDNA purification. Sequencing reactions were performed using the individual respective primers used for the amplification and Big Dye-Terminator v3.1 Cycle Sequencing Reaction Kit on an ABI 3730 DNA Analyser (Applied Biosystems, USA) following the manufacturer's instructions. Sequences were assembled, proof-read and edited with the Lasergene version 10.1 package (DNASTAR Inc, USA). These sequences were submitted to GenBank and were assigned the following accession numbers: HM229661, KF112879-KF112898.

Computational analysis
The sequences were compared with the complete genome sequences or sequences coding for the polyprotein of FMD viruses from the O/SEA/Mya-98 lineage obtained from other countries in East Asia such as the PR China, Malaysia and Vietnam which are available in GenBank ( Table 1). Alignment of the sequences was performed using the ClustalW subroutine in BioEdit, Version 7.0.5.3 [21]. The same program was used to calculate the nucleotide and amino acid (aa) identity matrices. Calculation of dN/dS and detection of potential positive selection was carried out using using SNAP [22]. Potential recombination was checked using SimPlot version 3.5.1 [23]. The RNA structure of the S-fragment was reconstructed for O/HKN/20/2010, O/HKN/15/ 2010 and O/TAI/22/2009 using RNAStructure v 3.5 [24] and RNAdraw version 1.1 [25].
Maximum-likelihood trees (1000 bootstrap replications) optimized using the heuristic nearest-neighbour -interchange method as implemented by the software program MEGA version 5.10 [26] were calculated using the S-fragment and the VP1-coding sequence. Maximum parsimony analyses using full genome sequences were carried out as implemented in TCS freeware, version 1.21 [27] following formatting using DnaSP, Version 4.10.3 [28]. The General Time Reversible (GTR) model (with no invariant sites) of nucleotide substitution was selected using jModelTest [29]. Bayesian evolutionary analysis using Markov chain Monte Carlo (MCMC) sampling bases (100 000 trees from 100 million generations), as implemented using BEAST software, version 1.6.1. [30], was carried out using the sequence coding for the polyprotein to estimate the rate of molecular evolution, to infer phylogenetic relationships and to implement the genetic temporal reconstruction of the outbreaks. Sampling collection dates were used to calibrate the molecular clock. The robustness of the parameters was assessed by substituting different combinations of molecular clocks and demographic models. The resulting output was checked in Tracer, version 1.5 [31] and visualized with FigTree, version 1.3.1. [32].

RT-PCR and sequencing
A total of 12 full genome sequences belonging to the O/SEA/Mya-98 lineage of FMDV collected from eight Eastern Asia countries during 2009-2011 were obtained using an overlapping PCR strategy. All of the component PCR products were of the expected size and no evidence of cross-contamination was detected within the negative control RT-PCR reactions which were performed in parallel. Sequencing coverage for these consensus sequences (total number of sequenced nucleotides divided by the contig length) was > 3.5.
The novel sequences generated in this study were compared with additional FMDV sequences (five full genome sequences and eight more sequences coding for the virus polyprotein) from the same lineage available in GenBank. In total, 17 full genome sequences and 25 sequences coding for the polyprotein of FMDVs belonging to the O/SEA/Mya-98 lineage recovered from outbreaks in the PR China and Hong Kong SAR, ROK, DRK, Japan, Mongolia, Russia, Thailand, Malaysia, Myanmar and Vietnam during 2003-2011 (Table 1) were included in the following analysis.

Analysis of the full genome sequences
The length of the 12 generated sequences ranged from 8123 to 8192 nt which for the nine sequences generated at the PI included 44-55 nt derived from the PCR primers, a 10-nt artificial internal poly(C) tract within the 5′ UTR and a 10-nt artificial poly Nucleotide sequence identity ranged from 91.0% to 99.3% while predicted aa identity ranged from 96.6% to 99.5%. There was no clear evidence of recombination within these sequences or between these sequences and those available in GenBank (using SimPlot version 3.5.1; data not shown).
Within the open reading frame encoding the FMDV polyprotein, which ranged from 2331 to 2333 aa length, there were 173 aa substitutions. The difference in aa length was due to a single codon insertion in the se-  substitution rate [0.17 substitutions per aa sequenced, higher (P < 0.05) than the most variable of the structural proteins] followed by VP1 (0.11) and 3B (0.15) as clearly visualised in the cumulative amino acid substitution plot in Figure 1a. Moreover, analysis of the synonymous versus non-synonymous ratio (dS/dN) also highlighted equivalent regions within L, VP1, 3B as well as 2C, 3A and 3C where there was evidence of positive selection (Figure 1b).  (Figure 2a) was subsequently found in five out of nine closely related field samples from Hong Kong SAR selected by phylogenetic analysis of the VP1 region. RT-PCR results using two forward different primers (O1F and O1F2) yielded negative results when using the forward primer targeting this fragment in those viruses with the deletion (O1F2, Figure 2b

Maximum likelihood analysis (S-fragment and VP1 coding regions)
Phylogenetic analysis using sequences for the Sfragment of these viruses together with viruses from other FMDV serotypes in which a similar deletion is present showed that all East Asian isolates (whether they have or do not have the deletion) clustered together with other viruses within the O/SEA/Mya-98a lineage ( Figure 4a) and were not closely related to other sequences from other serotypes with the deletion. Furthermore, Hong Kong SAR and Chinese viruses with the S-fragment deletion were also clustered closely together within the O/SEA/Mya-98a lineage as shown by analysis of the VP1 coding region (Figure 4b).

Statistical parsimony analysis (17 full genome sequences)
Analysis implemented by TCS allowed differences between these closely related individual sequences to be clearly visualised. Samples from the outbreaks in East Asian countries clustered into two different groups to that generated using statistical parsimony and full genome sequences ( Figure 6). The rate of nucleotide substitutions (per site per year) was 4.94 × 10 -3 (95% highest posterior density -HPD: 3.40 × 10 -3 -6.58 × 10 -3 ) which are comparable to the rates observed during the FMDV outbreaks in UK during 2001 and 2007 [4,9,33] and Bulgaria in 2011 [11]. The most recent common ancestor (MRCA) was estimated to be in the 1990s (21 st

Discussion
Full genome sequencing data can be used to identify the origin, to reconstruct the transmission pathways and to detect undisclosed infection within field outbreaks of FMDV, as demonstrated in previous studies [4,[9][10][11]. These data were analysed using different methods (TCS, Maximum Likelihood and Bayesian methods) providing complementary information that was used to increase the knowledge of the epidemiology of FMD in the region. The results from these different analyses were consistent and supported the occurrence of multiple independent virus introductions into some of the affected countries. This is most evident for sequences recovered from the Russian Federation, which were from two different O/SEA/Mya-98 genetic lineages. Furthermore, two sequences of FMD viruses sampled in ROK did not cluster together and were more closely related to other viruses from different countries in the region ( Figure 5). The on-going risk of FMDV re-introduction influences the strategies used to control FMD. Although ROK previously held the OIE status of FMD-free without vaccination, the characterisation of these recent outbreaks has led to the use of a wide-scale vaccination programme in the country in order to adopt FMD freedom with vaccination [15]. All the viruses sequenced in this study share origins in Southeast Asia where this lineage is endemic. Certainly, the Mya-98a and Mya-98b lineages described in this paper have arisen due to at least two separate introductions into FMD-free countries in East Asia. Based upon our knowledge of previously characterised outbreaks [4,11], it is also clear that a large number of unsampled FMD cases must have occurred in order to explain the observed genetic diversity between these viruses. This data does not categorically define the number of virus introductions into East Asia, and it is now critical that further characterisation of additional samples is undertaken to further understand the epidemiology of FMD in the region to establish and reinforce control measures. A 70 nt deletion in the S-fragment (5′ UTR) was revealed in the sequences O/HKN/20/2010 and O/GSLX/ 2010 (JQ900581) which was located at positions corresponding to nt 148-217 of O/HKN/15/2010. This deletion has not been reported previously for any serotype O sequence or for any FMD virus in Asia. However, previous comparative genomic studies of FMDV [7] have highlighted two instances of similar deletions for serotype A isolates from Argentina in 1959 and 1961 (AY593769 and AY593789) and one additional serotype C isolate from the UK in 1934 (AY593810). Likewise, a 41 nt deletion within the same region was found for serotype A viruses from India in 2009 (HQ832592) [34]. The results obtained in the present study using maximum likelihood analysis of the S-fragment and the VP1-coding region of isolates with and without deletions indicate that this deletion has arisen independently in these different serotypes and have not been transferred via recombination from a single event generated in one common ancestor. The deletion is neither host-species dependent, since they have been observed in viruses recovered from both pigs (East Asia) and cattle (India, South America and Europe  [35,36]. Earlier studies have suggested that the S-fragment plays a role in viral replication, as well as contributing to pathogenesis [37]. However, further studies are required to determine the cause and the consequence of these changes impact upon the viral phenotype.