Origin and Evolution

158 | CovidReference.com

Genomic sequencing of lower respiratory tract samples from index patients in Wuhan, China, identified SARS-CoV-2 as a novel coronavirus. It was thus placed by the Coronaviridae Study Group (CSG) of the International Committee on Taxonomy of Viruses (ICTV) within the Coronaviridae family (Lu 2020). Phylogenetic analysis conducted to determine the relationship of SARS-CoV-2 to other CoV clustered it in the Betacoronavirus genus, Sarbecovirus subgenus (Tan W 2020, Zhu N 2020). Notably, it shares 94.4% sequence identity with SARS-CoV across the seven conserved replicase domains in ORF1ab and forms a distinct clade within the species Severe acute respiratory syndrome-related coronavirus (SARSr-CoV). The SARSr-CoV species comprises hundreds of known viruses, predominantly isolated from humans and diverse bats. Understandably, the reference to “SARS” can be misleading, as SARS-CoV-2, along with other SARSr-CoV, does not cause SARS-like clinical disease. SARS-CoV was the prototype of a new viral species, and its name was therefore assigned to the species as per established viral taxonomic practice. Accordingly, virus nomenclature does not necessarily indicate SARS-like disease but refers to the phylogenetic grouping within the founding virus’s species (CSG ICTV 2020, Wu Y 2020).

There has been considerable discussion regarding the origin of SARS-CoV-2. Numerous articles have appeared in scientific journals and on preprint servers, alongside conspiracy theories on social and popular media. The most controversial theories centre on a laboratory-engineered virus or bioweapon. A major contributor to this theory was a preprint (Pradhan 2020) whose authors reported disconcerting similarities between the SARS-CoV-2 spike glycoprotein (S) and the HIV-1 envelope glycoprotein gp120 and Gag protein. The implication was that SARS-CoV-2 may have been manufactured using gene fragments from the HIV-1 genome. The article received extensive international peer scrutiny and was quickly refuted after sequence analysis demonstrated no evidence that the amino acid sequences within the S glycoprotein were specific to, or obtained from, HIV-1 (Xiao C 2020). Other claims supporting a laboratory-engineered virus were based on a study in which a constructed chimeric mouse/bat CoV was capable of infecting human cells in vitro (Menachery 2015). Investigations into these claims used whole genome sequencing to compare SARS-CoV-2 with several artificial CoV. The significant divergence identified between their genomes makes it improbable that they are related. Additionally, SARS-CoV-2 is not derived from a previously used virus backbone and contains randomly occurring mutations, which favours natural evolution rather than synthetic construction (Andersen 2020, Dalavilla 2020, Liu S 2020). Other concerns involve
theories of a “natural” virus escaping from a laboratory during basic research involving passage of bat-CoV. While this may be plausible, especially considering past incidents of inadvertent laboratory escapes of SARS-CoV, the unique receptor-binding domain (RBD) and the acquisition of O-linked glycans make a natural origin and evolution more realistic. Several early cases of COVID-19 were linked to the Huanan Seafood Wholesale Market in Wuhan, China, where wildlife was sold, suggesting an animal source and zoonotic origin of the outbreak (Lau S 2020). Comparative genomics and phylogenetic analysis provide insight into the origin and evolution of SARS-CoV-2 by identifying its closest CoV relative and, by extension, potential reservoir hosts (Andersen 2020). While multiple publications have investigated the genetic relatedness of SARS-CoV-2 to SARS-CoV (79%), MERS-CoV (50%), and other CoV, it is a previously isolated bat CoV, Bat-CoV-RaTG13 (RaTG13), that has been identified as the closest relative. RaTG13, sampled from Rhinolophus affinis bats, shares >96% sequence identity with SARS-CoV-2 across the entire genome (Xiao K 2020, Zhou P 2020). Notably, another bat-CoV isolated from Rhinolophus malayanus, denoted RmYN02, was shown to share 93% sequence identity with SARS-CoV-2. However, low sequence identity in the spike protein’s RBD makes it an unlikely candidate for the exact CoV variant responsible for the outbreak (Zhou H 2020). Nevertheless, given the similarities and close phylogenetic relationship to bat-CoV, it is reasonable to infer that SARS-CoV-2 originated in bats. Despite the close similarity between SARS-CoV-2 and RaTG13, there are fundamental differences in the RBD region responsible for binding the human ACE2 receptor. The RBD of SARS-CoV-2 binds efficiently to ACE2, whereas that of RaTG13 does not. This divergence in the RBD is significant, as it implies that RaTG13 is unlikely to be a direct progenitor of SARS-CoV-2 and that an intermediate host is required for zoonotic transmission.
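
The percentages quoted above are pairwise identity figures computed over aligned sequences. As a rough illustration of what such a figure means, the following minimal sketch computes percent identity for two pre-aligned sequences. The function name `percent_identity` and the two short sequences are hypothetical and made up for illustration; they are not real viral sequences, and the cited studies may have used different tools and gap-handling conventions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: both inputs come from the same alignment, so they have equal
    length and use '-' for gaps. Gapped columns are skipped here; other
    conventions count them as mismatches and give lower values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy, made-up aligned fragments (NOT real viral sequences).
query = "ATGGCTA-CGTTAGC"
relative = "ATGGCAAGCGTTAGC"
print(f"{percent_identity(query, relative):.1f}% identity")
```

The same calculation can be restricted to a sub-region of the alignment, such as the columns covering the RBD, which is how a virus can be the closest relative genome-wide yet still diverge in the region that matters for receptor binding.
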
Malayan pangolins (Manis javanica) host several SARS-related CoV. Remarkably, the RBD of pangolin-CoV exhibits >97% amino acid identity with that of SARS-CoV-2, including all six key residues required for ACE2 receptor binding (Andersen 2020, Xiao K 2020). Considering that bats are the major reservoir of SARSr-CoV, and that animals mix at food markets and smuggling centres, it is plausible that pangolin-CoV originated from bat-CoV. Subsequent recombination between RaTG13 and pangolin-CoV, together with natural selection, would allow a precursor virus to acquire an RBD capable of binding ACE2 (Andersen 2020, Xiao K 2020). Crossover to humans could have arisen by various means: pangolin meat is considered a delicacy in some cultures, and the scales are reportedly used in traditional Chinese medicine. Lastly, SARS-CoV-2 has a unique polybasic cleavage site that is