Molecular Evolutionary Characteristics of the 2019 Novel Coronavirus (SARS-CoV-2) Contracted by Tunisian Citizens: Comparison and Relationship to Other Human and Animal Coronaviruses Based on Spike Glycoprotein-Coding Gene Sequence Analysis

As a contribution to the effort to address the COVID-19 pandemic, and to improve knowledge of the driving forces shaping the evolution of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) isolated from Tunisian patients, a comparison was conducted with other coronaviruses infecting humans (SARS-CoV-1, MERS-CoV, HCoV/229E, HCoV/NL63, HCoV/OC43, and HCoV/HKU1) as well as animals (SARS-CoVs in tiger, bats, civet, pangolin, and bovine, and MERS-CoV in dromedary/camel). The in-depth analysis involved 115 sequences of the spike glycoprotein-coding gene extracted from international databases. Phylogeny inference allowed the reconstruction of a bifurcating tree in which four distinct groups were delineated; at the same time, three animal accessions (SARS-CoV-2/tiger, MERS-CoV/camel, and SARS-CoV/bovine) shifted from the animal group and joined the human coronavirus clades. However, in the presence of reticulate events such as recombination, networks describe phylogenetic relationships better than the classic dendrogram. Networks were therefore produced, and they identified four clusters containing sharply demarcated subgroups (eight subdivisions). Except for the networked phylogenies of SARS-CoV-1, SARS-CoV-2, and HCoV/HKU1, all the others showed edges and boxes illustrating incompatibilities in the spike glycoprotein-coding gene sequences. To consolidate this result, three methods (the RDP package, GARD, and RECCO) were used to detect breakpoints in the aligned sequences. Except for the SARS-CoV-1 and SARS-CoV-2 clades, all the remaining phylogenetic subdivisions were subject to recombination. Furthermore, screening for selection pressure in all studied sequences with various statistics-based models of the HyPhy package showed that, similarly, the lineages belonging to the SARS-CoV-1 and SARS-CoV-2 clades were not under selection. In contrast, all members of the remaining clades underwent, to different extents, adaptive as well as purifying selection.

Corresponding author: Moncef Boulila, Professor, Université de Sfax, Institut de l’Olivier, B.P. 14, 4061 Sousse Ibn Khaldoun, Tunisia. Email: boulila.moncef@yahoo.fr


Introduction
Seven distinct zoonotic human coronaviruses, described as spillovers because they crossed species boundaries and jumped from animals to humans, are currently known. Four of them (HCoV-229E, HCoV-NL63, HCoV-OC43, and HCoV/HKU1) are responsible for the common cold, whereas the remaining three can cause mild to severe respiratory diseases in humans, i.e., [...] conserved among all human coronaviruses [2]. The S protein retains sufficient affinity for the cellular angiotensin-converting enzyme 2 (ACE2) protein and likely uses it as a receptor for cellular entry. In fact, this oligomeric transmembrane protein mediates coronavirus entry into host cells. It contains two subunits, S1 and S2. While S1 comprises a receptor-binding domain (RBD) that recognizes various host cell surface receptors, S2 includes the basic elements necessary for membrane fusion. The coronavirus begins by binding to a receptor on the host cell using the S1 subunit; afterwards, it fuses the viral and host membranes by means of the S2 subunit. Owing to these essential functions, the S protein is regarded as one of the most important targets for COVID-19 vaccine and therapeutic research [3]. It is worth noting that Nelson et al. [4] discovered a dynamically evolving novel overlapping gene as a factor in the SARS-CoV-2 pandemic: ORF3d, which was able to elicit a strong antibody response in COVID-19 patients.

Gene duplication, a genetic process responsible for the genesis of novelty and redundancy, is considered a major mechanism in the evolutionary history of different organisms, including eukaryotes such as plants [5], where it plays an important role in escaping extinction [6], and bacteria [7]. Conversely, streamlined viral genomes generally possess limited intergenic regions and numerous overlapping open reading frames, giving rise to a reduced genome size that undergoes strong selection. As a result, gene duplication has a low prevalence in viruses, particularly RNA viruses [8]. Gene duplication contributes efficiently to evolution by providing new material for mutation, genetic drift, and selection [9].

(iv) Recombination is an important source of genetic variation for many RNA viruses. Most RNA viruses encode in their genome an RNA-dependent RNA polymerase (RdRp), which is error-prone because it lacks proofreading activity.

Consequently, it is liable to create heterogeneous populations of molecules (quasispecies) through an accumulation of mutations that can be deleterious. Genetic recombination helps repair these errors with the help of other functional genes. Nagy and Simon [10] argued that there are three main types of recombination: similarity-essential, similarity-assisted, and similarity-nonessential.
Furthermore, the recombination process takes place according to a copy-choice model: during replication, the RdRp switches from the donor template to the acceptor template without releasing the nascent strand.
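The copy-choice mechanism described above can be illustrated with a minimal sketch. It assumes two aligned templates of equal length and a single, known switching position; the function name `copy_choice` and the toy sequences are hypothetical, and for readability the nascent strand copies the template bases directly rather than their complements.

```python
def copy_choice(donor: str, acceptor: str, breakpoint: int) -> str:
    """Simulate one template switch under a copy-choice model.

    The polymerase copies `donor` up to (but not including) position
    `breakpoint`, then continues on `acceptor` while keeping the
    nascent strand it has already synthesized.
    """
    if len(donor) != len(acceptor):
        raise ValueError("templates must be aligned to the same length")
    if not 0 <= breakpoint <= len(donor):
        raise ValueError("breakpoint out of range")

    nascent = []
    template = donor
    for pos in range(len(donor)):
        if pos == breakpoint:
            # Template switch: the nascent strand is not released.
            template = acceptor
        nascent.append(template[pos])
    return "".join(nascent)


# Toy templates (illustrative only, not real viral sequences).
recombinant = copy_choice("AAAAAAAA", "CCCCCCCC", breakpoint=3)
print(recombinant)
```

The resulting mosaic sequence, with one parent on each side of the switching position, is the kind of pattern that breakpoint-detection methods look for in aligned sequences.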
Nevertheless, it is worth mentioning that, in contrast, several members of the Nidovirales order (to which the Coronavirus genus belongs) possess a highly active and processive RNA polymerase complex whose 3'-5' exoribonuclease is implicated in RNA proofreading and acts as a strong regulator of the replication fidelity and diversity of coronaviruses [11] [12].
To further our understanding of the molecular evolution of the 2019 novel coronavirus (SARS-CoV-2), an in-depth investigation of the main evolutionary forces that shape its genetic diversity was conducted across the protein S-coding gene sequences. In addition, a comparison was made with other coronaviruses infecting humans as well as animals in order to explore possible relationships, if any, among all analyzed sequences.
[...] (Table 3). Besides, only one lineage of the HCoV/HKU1 species, from the USA, was recombinant:

Selective Pressure Inference
Natural selection, briefly defined as the unequal survival and reproduction of hereditary material due to environmental forces, resulting in the preservation of favorable adaptations, was screened in the aligned sequences of all eight described subdivisions. To find signatures of selection across sequences, site-specific models (SLAC, REL, FEL, IFEL, FUBAR, and MEME) and branch-specific models (aBSREL and GA-branch) were used. As indicated in Table 5 for the first category of models (site [...] (Table 3S).

Symptoms of the disease are fever, abdominal pain, vomiting, bleeding gums, rash, and pain behind the eyes. However, its transmissibility appears to be lower than that of respiratory viruses such as influenza or COVID-19.
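The site-specific selection models named in this section compare synonymous with nonsynonymous substitutions at codon sites. As a rough illustration of that distinction only, the sketch below counts the two kinds of codon differences between a pair of aligned coding sequences. It is not an implementation of SLAC, FEL, MEME, or any other HyPhy method, which rely on a phylogeny and statistical models; the function name `count_codon_changes` and the toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_codon_changes(seq_a: str, seq_b: str) -> tuple[int, int]:
    """Return (synonymous, nonsynonymous) counts of differing codons.

    Both sequences must be aligned, in frame, and of equal length.
    Codons containing gaps or ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in-frame, and equal length")

    syn = nonsyn = 0
    for i in range(0, len(seq_a), 3):
        # Normalize case and treat RNA (U) as DNA (T).
        codon_a = seq_a[i:i + 3].upper().replace("U", "T")
        codon_b = seq_b[i:i + 3].upper().replace("U", "T")
        if codon_a == codon_b:
            continue
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # gap or ambiguity code
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn


# Toy example: CTT -> CTC keeps leucine; GAA -> GAT changes Glu to Asp.
print(count_codon_changes("ATGCTTGAA", "ATGCTCGAT"))
```

Raw counts like these are not normalized by the number of synonymous and nonsynonymous sites, so they should not be read as a dN/dS estimate.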

Discussion
Finally, it is recommended to monitor landscapes dominated by human activities more closely than wild areas. In addition, the protection of natural areas and the restoration of habitats degraded by humans could benefit both the environment and public health.
Moreover, it is necessary to think about global biosecurity, evaluate weaknesses and strengthen health systems in developing countries.

Author's Contribution
The author conceived the study, collected the data, carried out the analyses, and wrote and revised the manuscript.

Funding
No funding was used to conduct this research.

Data Availability
The nucleotide sequences used in this study are available from the GenBank database (https://www.ncbi.nlm.nih.gov/).

Competing Interests
The author declares that he has no competing interests.

Ethics Approval
Not required

Supplementary Material
Supplementary Table 1S