An assessment of potential epitopes in SARS-CoV-2 structural proteins

The coronavirus disease 2019 (COVID-19) is a highly contagious acute respiratory disease caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Phylogenetic analysis of SARS-CoV-2 whole-genome sequences isolated from infected patients revealed 96.2%, 79.6%, and 50% sequence identity with the genomes of RaTG13, SARS-CoV BJ01, and the Middle East respiratory syndrome coronavirus (MERS-CoV), respectively.

SARS-CoV-2 has a genome that encodes both structural and non-structural proteins (NSPs). Except for gamma coronavirus, which lacks NSP1, the initial open reading frame (ORF1a/b) encodes 16 NSPs (NSP1-16).


The four structural proteins of SARS-CoV-2 are the spike (S), envelope (E), membrane (M), and nucleocapsid (N) proteins. The S protein consists of two functional subunits: the S1 subunit, which mediates cell attachment, and the S2 subunit, which mediates virus-host fusion.

Antibodies that bind to the S protein have been shown to neutralize SARS-CoV-2, and the rapid development of these neutralizing antibodies correlates with the immune response to the virus. Furthermore, individuals who show seroconversion may develop a long-lasting immune response to SARS-CoV-2. In a recent study, the S and N proteins of SARS-CoV-2 were used to develop a rapid diagnostic test for COVID-19 serodiagnosis.

In this study, researchers from various institutions in India utilized multiple online bioinformatics resources and stringent selection criteria to identify potent T- and B-cell epitopes of four SARS-CoV-2 structural proteins. The in silico prediction method used in this study identified potent, common, and species-specific B- and T-cell epitopes that are likely to be recognized in humans. Furthermore, the researchers determined the conservation of the predicted epitopes across coronavirus species (CoVs).

About the study

Drawing on existing immunological knowledge and the genetic similarity of SARS-CoV-2 to SARS-CoV, the authors predicted B- and T-cell epitopes with several prediction servers. Previous research, which relied on a limited number of prediction servers, found epitopes mostly on the S protein of SARS-CoV-2.

In the current study, the researchers identified putative B- and T-cell epitopes in all four SARS-CoV-2 structural proteins using well-established prediction techniques. The authors found 20 linear B-cell epitopes in the structural proteins of SARS-CoV-2 by choosing the top linear B-cell epitopes predicted by the BepiPred, Bcepred, and ABCpred servers. These included 11 linear B-cell epitopes for the S protein, six for the N protein, and two and one for the M and E proteins, respectively.

The researchers sorted the epitopes manually, without bioinformatics tools, to look for common epitopes. They discovered conserved epitopes of the S (aa 407-416, aa 421-427, aa 1028-1049, aa 1254-1273), N (aa 173-189, aa 235-247), M (aa 163-182), and E (aa 58-68) proteins that are shared by both SARS-CoV and SARS-CoV-2. The B-cell epitopes were visualized using BIOVIA Discovery Studio 2017 R2 on the 3D structures of the structural proteins of SARS-CoV and SARS-CoV-2.
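The manual comparison described above amounts to asking how many residues in an epitope window are identical between two aligned proteins. The following is a minimal Python sketch of that check, not the authors' procedure. It assumes the two sequences have already been aligned to equal length, and the function name `window_identity` and the toy sequences are illustrative placeholders rather than real SARS-CoV or SARS-CoV-2 data.

```python
def window_identity(seq_a: str, seq_b: str, start: int, end: int) -> float:
    """Fraction of identical residues in the 1-based, inclusive window [start, end].

    Assumption: seq_a and seq_b are pre-aligned and have the same length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    if not (1 <= start <= end <= len(seq_a)):
        raise ValueError("window is out of range")
    # Convert the 1-based amino-acid positions to Python slice indices.
    window_a = seq_a[start - 1:end]
    window_b = seq_b[start - 1:end]
    matches = sum(1 for x, y in zip(window_a, window_b) if x == y)
    return matches / len(window_a)


# Toy placeholder sequences (hypothetical, not viral proteins).
toy_a = "MKTAYIAKQRQISFVK"
toy_b = "MKTAYLAKQRQISFVR"

# A window is treated as fully conserved when its identity is 1.0.
conserved = window_identity(toy_a, toy_b, 1, 5) == 1.0
overall = window_identity(toy_a, toy_b, 1, len(toy_a))
```

With real data, the inputs would come from a pairwise alignment of the corresponding SARS-CoV and SARS-CoV-2 proteins, and the window bounds would be the amino-acid ranges listed above.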

The images showed the predicted locations of the epitopes on the surface of the three-dimensional structures of the S, E, M, and N proteins. Furthermore, the strong alpha-helical content of most peptides, as predicted by high Agadir scores, indicated that the peptides are stable in solution.

According to a few investigations of SARS-CoV-2-specific T-cell responses and their function in protective immunity, patients who have recovered from COVID-19 had CD4+ and CD8+ memory responses to SARS-CoV-2. Unexposed healthy subjects had SARS-CoV-2-specific CD4+ T-cell responses as well, suggesting the possibility of pre-existing cross-reactive immunological memory to seasonal human coronaviruses. Researchers can use the information on SARS-CoV-2 proteins and epitopes recognized by human T-cells to select prospective epitopes or target proteins for the development of future vaccine candidates.

Based on the prediction of epitopes by at least two servers, the authors picked the top 2% of peptides with the highest affinity. In this way, the researchers discovered 55 non-overlapping peptides that are strong binders of major histocompatibility complex (MHC) I molecules. These included 26 CD8+ T-cell epitopes for the S protein, as well as 16, 10, and 3 for the N, M, and E proteins, respectively.
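The selection rule described above can be read as a consensus filter: take the strongest predicted binders from each server and keep the peptides that at least two servers agree on. The sketch below illustrates that reading in Python. It is an assumption about how such a filter might work, not the authors' pipeline. The server names, peptide labels, scores, and the convention that a higher score means stronger binding are all hypothetical.

```python
import math
from collections import Counter


def top_fraction(scores: dict[str, float], fraction: float) -> set[str]:
    """Peptides in the top `fraction` of one server's ranking (at least one peptide).

    Assumption: a higher score means stronger predicted binding.
    """
    if not scores:
        return set()
    k = max(1, math.ceil(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def consensus_peptides(
    predictions: dict[str, dict[str, float]],
    fraction: float = 0.02,
    min_servers: int = 2,
) -> list[str]:
    """Peptides that fall in the top fraction for at least `min_servers` servers."""
    votes = Counter()
    for scores in predictions.values():
        votes.update(top_fraction(scores, fraction))
    return sorted(p for p, n in votes.items() if n >= min_servers)


# Hypothetical toy predictions; labels stand in for real peptide sequences.
toy_predictions = {
    "server_a": {"PEP1": 0.90, "PEP2": 0.80, "PEP3": 0.10, "PEP4": 0.05},
    "server_b": {"PEP1": 0.70, "PEP3": 0.60, "PEP2": 0.20, "PEP4": 0.10},
    "server_c": {"PEP2": 0.95, "PEP4": 0.90, "PEP1": 0.30, "PEP3": 0.20},
}

# A large fraction is used here only because the toy set is tiny.
selected = consensus_peptides(toy_predictions, fraction=0.5, min_servers=2)
```

In practice, the servers report different kinds of scores (for example, percentile ranks where lower is better), so each ranking would need to be put on a common direction before a filter like this is applied.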

These strong MHC I-binding peptides were evaluated for their predicted ability to trigger interferon (IFN) responses, as well as for their antigenic and non-allergenic features. Notably, the authors of the current study predicted 16 CD8+ T-cell epitopes that are shared by the structural proteins of both SARS-CoV-2 and SARS-CoV. These included five CD8+ T-cell epitopes each for the S and N proteins and three each for the M and E proteins.


The immune epitopes predicted in silico in this study are limited to the structural proteins of SARS-CoV-2. Notably, many of the predicted B- and T-cell epitopes derived from various computational tools have been experimentally validated in recent studies using sera from COVID-19 convalescent patients.

More stringent experimental validation of the predicted epitopes may help researchers better understand the immunological response to SARS-CoV-2 infection.

Journal reference:
  • Devi, Y. D., Goswami, H. B., Konwar, S., et al. (2021). Immunoinformatics mapping of potential epitopes in SARS-CoV-2 structural proteins. PLOS One. doi:10.1371/journal.pone.0258645

Gemma Wilson

Gemma is a journalism graduate with a keen interest in covering business news, specifically startups. She has a keen eye for technologies and has predicted quite a few successful startups over the last couple of years.
