Academic Editor: Qianqian Song, Wake Forest School of Medicine, Wake Forest Baptist Comprehensive Cancer Center, Medical Center Boulevard, Winston-Salem, NC 27157
Checked for plagiarism: Yes
Quantitative Computational Prediction of the Consensus B-cell Epitopes of 2019-nCoV
The goal of this paper is to obtain a numerical consensus of B-cell epitopes from the three-dimensional structure of the prefusion spike glycoprotein of the new betacoronavirus 2019-nCoV, which could support the development of a vaccine. To do so, we first predicted B-cell epitopes using fourteen different algorithms. We then obtained the consensus B-cell epitopes according to the Similarity Index, and finally selected the best candidates according to a function called <F>, which is evaluated for the glycoprotein. The best candidates that we obtained for vaccine design are SSANNCT, PLQSYGFQPT, TESNKKFLP, NNSYEC, AENS, LPDPSK and YDPLQPE.
On March 11, 2020, an epidemic of an unusual pneumonia that appears to have originated in the Huanan Seafood Market in Wuhan (China) was declared a pandemic. By March 10, 2020, more than 150,000 cases of 2019-nCoV infection and almost 6000 deaths had been reported in almost 140 countries. Unfortunately, there is currently no vaccine against this disease.
In January 2020, Chen et al. obtained the sequence of the virus from bronchoalveolar lavage fluid sampled from an infected patient in December 2019 1. They concluded that it is a new human coronavirus, and called it 2019-nCoV (or COVID-19). Later, Wrapp et al. determined by cryo-electron microscopy (resolution 3.46 Å) the three-dimensional structure of one of the key proteins for the development of a vaccine against this disease, the spike (S) glycoprotein 2.
With these results, it is possible to determine the B-cell epitopes that could be the basis for developing a vaccine, that is, the residues on the surface of an antigen that stimulate humoral immune responses 3, 4. In fact, the goal of epitope prediction is to design the minimal immune unit that invokes a strong humoral response in the human body 4. Therefore, the goal of this paper is to determine in silico the consensus of linear and conformational B-cell epitopes obtained from the spike glycoprotein. Since these in silico methods produce a large number of results, the candidates are ranked by structural and energetic properties through a function called <F>, as explained in the next section.
In 2015, a computational methodology was published that allows the quantification of B-cell epitopes through a function called <F>, which is based on structural and energetic factors evaluated on the glycoprotein: the degree of exposure to the solvent, the mobility, and the Gibbs free energy. Only the first of these three factors has previously been used in the scientific literature 5. The second factor, the mobility, identifies the displacement of each amino acid over time, and the last factor, the Gibbs free energy, indicates how likely an amino acid is to mutate.
Therefore, the function <F> will be defined as follows:
<F> = <Q> · δΔG / <R>
where <Q> is the average value obtained from the Similarity Index; δΔG is the sum of all the values obtained from the free energy changes, and <R> is the average value of the mobility of the amino acids. In the next section we will explain how these values are calculated.
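Written out term by term, the definition above can be read as follows. This is only a restatement under assumptions: the index i running over the n residues of one consensus epitope, and the symbol g_i for the free energy contribution of residue i, are notation introduced here for illustration and do not appear in the original formulation.

```latex
\langle F \rangle = \frac{\langle Q \rangle \, \delta\Delta G}{\langle R \rangle},
\qquad
\langle Q \rangle = \frac{1}{n}\sum_{i=1}^{n} Q_i,
\qquad
\delta\Delta G = \sum_{i=1}^{n} g_i,
\qquad
\langle R \rangle = \frac{1}{n}\sum_{i=1}^{n} R_i
```

Note that <Q> and <R> are averages while δΔG is a sum, so δΔG grows with the length of the epitope.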
An important aspect is to verify the degree of solvent exposure of the region from which each epitope comes; to calculate it, we average the values obtained from two different computer programs. In summary, this work calculates the consensus B-cell epitopes derived from the spike glycoprotein of 2019-nCoV, filtered according to the value of the function <F> and the degree of solvent exposure.
From the sequence and three-dimensional structure of the spike (S) glycoprotein obtained from the PDB, we calculated the linear and conformational B-cell epitopes with the following computational programs: BepiPred 6, Emini Surface Accessibility Prediction 7, Kolaskar and Tongaonkar Antigenicity 8, ABCpred 9, BCPred based on flexibility, accessibility, exposed solvents and hydrophobicity 10, ElliPro 11, DiscoTope 12, CBTope 13, SEPPA 14, COBEpro 15, and SVMTriP 16.
The next step was to determine the consensus of the B-cell epitopes according to the procedure described by Isea et al. 17, 18, 19, 20, 21, 22. To do this, we employed a Python script that calculates the overlap of the B-cell epitopes, where only those epitopes with a length equal to or greater than four amino acids were used (cutoff of 5.0) 23.
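The overlap step can be sketched roughly as follows. This is not the authors' script: it assumes that the per-position score Q simply counts how many predictors cover a residue, and that a consensus region is a run of at least four residues with Q of 5 or more. The function names, the predictor labels and the toy intervals are all hypothetical.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # inclusive 1-based (start, end) of a predicted epitope


def coverage_counts(predictions: Dict[str, List[Span]], length: int) -> List[int]:
    """For each 1-based position, count how many predictors cover it.

    Assumption: this count stands in for the Similarity Index Q.
    Index 0 of the returned list is unused.
    """
    counts = [0] * (length + 1)
    for spans in predictions.values():
        covered = set()
        for start, end in spans:
            covered.update(range(start, end + 1))
        # A predictor contributes at most once per position.
        for pos in covered:
            if 1 <= pos <= length:
                counts[pos] += 1
    return counts


def consensus_regions(counts: List[int], min_q: int = 5, min_len: int = 4) -> List[Span]:
    """Return runs of consecutive positions with count >= min_q and length >= min_len."""
    regions: List[Span] = []
    run_start = None
    # Iterate one position past the end so that a trailing run is flushed.
    for pos in range(1, len(counts) + 1):
        inside = pos < len(counts) and counts[pos] >= min_q
        if inside and run_start is None:
            run_start = pos
        elif not inside and run_start is not None:
            if pos - run_start >= min_len:
                regions.append((run_start, pos - 1))
            run_start = None
    return regions


# Hypothetical toy input: six predictors on a 12-residue sequence.
toy_predictions = {
    "A": [(3, 9)], "B": [(2, 8)], "C": [(4, 10)],
    "D": [(3, 8)], "E": [(4, 9)], "F": [(1, 5)],
}
toy_counts = coverage_counts(toy_predictions, 12)
toy_regions = consensus_regions(toy_counts)
print(toy_regions)
```

With integer counts, a threshold of `min_q=5` is equivalent to requiring Q greater than 4; both parameters are exposed so that other cutoffs can be tried.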
In parallel, the Gibbs free energy contributions were calculated with the PoPMuSiC program 24. The value of δΔG is the sum of the values obtained for the amino acids of each epitope, where we have assumed that small values represent a low probability that these amino acids mutate.
On the other hand, the mobility associated with each epitope, abbreviated as <R>, was obtained with the elNémo program 25. This value was calculated according to the normal modes of the protein and indicates the gross displacement of the amino acids of the protein.
It is necessary to verify that these epitopes come from a region exposed to the solvent. For this reason, the solvent exposure was determined as the average of the values obtained from two different computer programs: PoPMuSiC 24 and Polyview 26. These values must be normalized before they can be compared.
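A minimal sketch of this normalization is given below. It assumes the scale maxima of 100 for PoPMuSiC and 9 for Polyview that are stated in the results, and uses the figures reported there for the epitope KYNENGT; the function and constant names are hypothetical. Only the mean Polyview value is used, because the per-residue Polyview values are not listed in the text.

```python
from statistics import mean

# Scale maxima as stated in the results section (assumed to apply generally).
POPMUSIC_MAX = 100.0
POLYVIEW_MAX = 9.0


def mean_exposure(popmusic_mean: float, polyview_mean: float) -> float:
    """Average of the two solvent accessibility values after scaling each to [0, 1]."""
    return (popmusic_mean / POPMUSIC_MAX + polyview_mean / POLYVIEW_MAX) / 2


# PoPMuSiC solvent accessibility per residue of KYNENGT, as quoted in the results.
popmusic_values = [34.98, 2.18, 39.07, 33.60, 48.86, 2.49, 29.27]
popmusic_mean = round(mean(popmusic_values), 2)  # 27.21
polyview_mean = 2.29                             # mean value quoted in the results

exposure = round(mean_exposure(popmusic_mean, polyview_mean), 2)
print(popmusic_mean, exposure)
```

Each program contributes with equal weight once scaled, so the result is a fraction between 0 and 1 that can be read as a percentage of exposure.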
Thus, the best vaccine candidates against this disease should be those with the lowest value of the <F> function and the highest degree of solvent exposure. These two criteria have been set in the present work and must be verified with experimental results.
We obtained 373 B-cell epitopes from the sequence and the three-dimensional structure of the spike glycoprotein (PDB ID 6VSB), as detailed below. BepiPred 2.0 yielded 24 linear B-cell epitopes (threshold 0.35). ABCpred predicted 49 epitopes of 10 amino acids in length with a score equal to or greater than 0.66. The BCPred program based on hydrophilicity, flexibility, accessibility and exposed surface predicted 15, 14, 30 and 8 epitopes, respectively. The Emini procedure generated 22 epitopes with a threshold of 1 and a window size of 6, and the Kolaskar and Tongaonkar antigenicity method yielded 37 (threshold 1.044 and window size 7). DiscoTope 1.1 (threshold = -7.7) yielded 19 and ElliPro yielded 14 (minimum score = 0.5 and maximum distance = 6 Å). CBTope (threshold = -0.3) predicted 31, COBEpro (threshold = 0.69) predicted 59, SEPPA 3.0 predicted 41, and finally SVMTriP predicted 10 (window size = 10 and threshold = 0.325).
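As a quick consistency check, the per-method counts quoted above can be tallied in a few lines. The dictionary keys are informal labels chosen here for readability, not official program identifiers.

```python
# Number of predicted epitopes per method, as quoted in the paragraph above.
epitope_counts = {
    "BepiPred 2.0": 24,
    "ABCpred": 49,
    "BCPred (hydrophilicity)": 15,
    "BCPred (flexibility)": 14,
    "BCPred (accessibility)": 30,
    "BCPred (exposed surface)": 8,
    "Emini": 22,
    "Kolaskar and Tongaonkar": 37,
    "DiscoTope 1.1": 19,
    "ElliPro": 14,
    "CBTope": 31,
    "COBEpro": 59,
    "SEPPA 3.0": 41,
    "SVMTriP": 10,
}

n_methods = len(epitope_counts)
total_epitopes = sum(epitope_counts.values())
print(n_methods, total_epitopes)
```

The tally gives fourteen prediction runs and 373 epitopes in total, in agreement with the figures stated in the text.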
The next step was to calculate the consensus B-cell epitopes and the function <F>. To illustrate this calculation, let us focus on the region between amino acids 276 and 287, as shown in Table 1.
The first and second columns of Table 1 show the position (abbreviated Pos.) and the corresponding amino acid of the spike glycoprotein. The third column gives the value of Q, that is, the Similarity Index calculated as described by Isea et al. A consensus B-cell epitope is a stretch whose values of Q are greater than 4, with a minimum length of four amino acids. As seen in Table 1, the consensus B-cell epitope in this region is KYNENGT (highlighted in bold for easy viewing). The value <Q> is simply the average of these Q values, which is (6+5+5+6+6+6+6)/7 = 5.71.
Table 1. Region selected in the spike glycoprotein to illustrate the procedure for calculating the consensus B-cell epitope and the function <F> (see text for more details).
| Pos. | Amino Acid | Q | R | Mutation | Polyview | PoPMuSiC solvent |
The value δΔG is the sum of the Gibbs energy contributions, that is, 0.04 + 2.51 + 0.63 + 0.51 + 0.81 + 2.24 + 0.88 = 7.62, while the average mobility value (<R>) is equal to (3.22 + 3.75 + 4.19 + 4.94 + 4.95 + 4.49 + 3.99)/7 = 4.22.
Therefore, the value of the function <F> obtained from the glycoprotein for the consensus B-cell epitope KYNENGT is equal to 10.31 (i.e., 5.71 · 7.62 / 4.22).
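The worked example for KYNENGT can be reproduced with a short script. This is a sketch rather than the authors' code: the function name `f_score` is hypothetical, and rounding each intermediate value to two decimals is an assumption made so that the arithmetic matches the figures quoted in the text.

```python
from statistics import mean
from typing import Sequence, Tuple


def f_score(q: Sequence[float], ddg: Sequence[float], r: Sequence[float],
            ndigits: int = 2) -> Tuple[float, float, float, float]:
    """Return (<Q>, deltaDeltaG, <R>, <F>) for one consensus epitope.

    <Q> and <R> are averages over the residues, deltaDeltaG is a sum,
    and <F> = <Q> * deltaDeltaG / <R>. Intermediates are rounded to
    `ndigits` decimals (an assumption, to mirror the reported numbers).
    """
    q_mean = round(mean(q), ndigits)
    ddg_sum = round(sum(ddg), ndigits)
    r_mean = round(mean(r), ndigits)
    return q_mean, ddg_sum, r_mean, round(q_mean * ddg_sum / r_mean, ndigits)


# Per-residue values for KYNENGT as quoted in the text.
q_values = [6, 5, 5, 6, 6, 6, 6]
ddg_values = [0.04, 2.51, 0.63, 0.51, 0.81, 2.24, 0.88]
r_values = [3.22, 3.75, 4.19, 4.94, 4.95, 4.49, 3.99]

q_mean, ddg_sum, r_mean, f_value = f_score(q_values, ddg_values, r_values)
print(q_mean, ddg_sum, r_mean, f_value)
```

If the intermediate values are not rounded, the result shifts slightly in the last digit (roughly 10.32 instead of 10.31), which is worth keeping in mind when comparing against tabulated values.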
Table 2 shows the values obtained according to the procedure described above. However, it is important to verify that the regions are exposed to the solvent. For this reason, we calculated the average of the values obtained from two different programs, using the data shown in Table 1. For the previous example, the degree of solvent exposure obtained with the PoPMuSiC program is simply (34.98 + 2.18 + 39.07 + 33.60 + 48.86 + 2.49 + 29.27)/7 = 27.21 (see the values in Table 1), while the average value with the Polyview program is 2.29. To compare these values, they must be normalized, taking into account that the maximum values are 100 and 9 for PoPMuSiC and Polyview, respectively. The average degree of exposure to the solvent is therefore 0.26 (i.e., (27.21/100 + 2.29/9)/2). This last result indicates that this epitope is (theoretically) 26% exposed to the solvent. The rest of the results are shown directly in Table 2.
Table 2. Values obtained for the consensus B-cell epitopes in the present work, ordered according to their position in the sequence of the spike glycoprotein. See text for more details.
The results in Table 2 are ordered according to the position of each amino acid sequence in the spike glycoprotein. Of the 22 consensus epitopes obtained in this work, 12 have a value of <F> less than 6, which would make them possible candidates for the design of a vaccine. However, of these 12, only 7 have an average solvent exposure greater than 40%. The best candidates for the development of a vaccine therefore appear to be SSANNCT, PLQSYGFQPT, TESNKKFLP, NNSYEC, AENS, LPDPSK and YDPLQPE.
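The two selection criteria (<F> below 6 and average solvent exposure above 40%) amount to a simple filter, sketched below. The class and function names are hypothetical. Only the KYNENGT entry uses values from the worked example in the text; the three `PLACEHOLDER` rows are invented purely to exercise the filter and do not correspond to any entry of Table 2.

```python
from typing import List, NamedTuple


class Epitope(NamedTuple):
    sequence: str
    f_value: float   # value of the <F> function
    exposure: float  # normalized solvent exposure, between 0 and 1


def select_candidates(epitopes: List[Epitope], f_max: float = 6.0,
                      exposure_min: float = 0.40) -> List[Epitope]:
    """Keep epitopes with F < f_max and exposure > exposure_min.

    Results are sorted by ascending F, then by descending exposure.
    """
    picked = [e for e in epitopes
              if e.f_value < f_max and e.exposure > exposure_min]
    return sorted(picked, key=lambda e: (e.f_value, -e.exposure))


epitopes = [
    Epitope("KYNENGT", 10.31, 0.26),      # values from the worked example
    Epitope("PLACEHOLDER1", 3.0, 0.55),   # hypothetical, not from Table 2
    Epitope("PLACEHOLDER2", 4.5, 0.30),   # hypothetical, not from Table 2
    Epitope("PLACEHOLDER3", 5.9, 0.41),   # hypothetical, not from Table 2
]

selected = select_candidates(epitopes)
print([e.sequence for e in selected])
```

Under these thresholds KYNENGT is rejected on both criteria, which is consistent with its absence from the list of best candidates.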
The present work calculated the consensus linear and conformational B-cell epitopes of the spike glycoprotein of the new coronavirus 2019-nCoV. With this information, it may be possible to design a vaccine for this disease. We employed a function called <F>, which considers energy factors and the structure of the glycoprotein, to select the best B-cell epitope candidates. The next step is to validate these results with in vivo experiments.