Screening for Deleterious non-synonymous SNPs in Human CCL21 Gene using in-silico analysis

Rheumatoid arthritis (RA) is a chronic, systemic, and progressive inflammatory disorder that causes severe joint damage and thereby increases mortality. Chemokine (C-C motif) ligand 21 (CCL21), a member of the cytokine family, is involved in immuno-inflammatory and regulatory processes. Identifying the important single nucleotide polymorphisms (SNPs) in the CCL21 gene is therefore key to evaluating their structural and functional significance and to discovering novel therapeutic targets for immune-related diseases, including RA. In this study, we used in silico approaches to identify the most damaging non-synonymous SNPs (nsSNPs), i.e., those with a significant structural and functional effect on the CCL21 protein. The primary tools used were PROVEAN, SNPs&GO, SIFT and PolyPhen2. Effects on protein stability, structure and function, as well as the conservation profile, were then assessed using I-Mutant, MutPred, and ConSurf. Post-translational modification sites were also predicted. 3D models of the proteins were built with I-TASSER and visualized in Chimera v1.11. Furthermore, gene-gene interactions were predicted using STRING and GeneMANIA. The nsSNPs D30Y (rs753133670), I62N (rs1170851787), R75C (rs759733358), R75S (rs776954599) and A83V (rs776954599) were found to be the most damaging nsSNPs in the CCL21 gene. These nsSNPs might contribute significantly to malfunction of the CCL21 protein and possibly to autoimmune diseases, including RA. We conclude that these SNPs could be the most important ones for studying the correlation of the CCL21 gene with autoimmune disorders such as Crohn’s disease (CD), RA and other immune-associated diseases. In addition, these SNPs need to be studied in animal models and cell cultures in association with specific diseases to determine whether they could be of use in gene therapy and pharmacogenomics.


Introduction
Rheumatoid arthritis (RA) is an autoimmune disease characterized by inflammation of the joints and articular cartilage coupled with synovial hyperplasia; it causes persistent pain and permanent disability that limits patients' everyday physical activities [1,2]. The pathogenesis of RA is still unclear, but its incidence and prevalence are thought to result from a combination of environmental and genetic risk factors [3]. Based on data from family aggregation and twin concordance studies, RA was found to be about 60% heritable. These findings indicate a role for genetic factors in the pathogenesis of RA [4,5]. SNPs associated with RA have been detected both in non-MHC genes, such as PTPN22, and in MHC genes, such as HLA-DRB1, which are potent drivers of the inflammatory response in RA. Many studies have revealed >150 SNPs in RA located at more than 70 gene loci [6,7,8,9,10].
The CCL21 gene belongs to the chemokine family with a C-C motif and is located on chromosome 9p13.3. CCL21 is a chemokine that binds to CCR7 and plays an important role in modulating the circulation of T cells and dendritic cells through lymphoid and peripheral organs [11]. In addition, defective migration of dendritic cells and lymphocytes into T zones has been demonstrated in CCR7-deficient mice [12]. Previous studies have shown that CCL21 is expressed by endothelial cells of the lymph node and is related to tertiary lymphoid tissue development [13,14].  [15,16].
In RA pathogenesis, the up-regulation of the observed in the sub-lining of endothelial cells than the peripheral ordinary blood cells [17].
were applied to identify the deleterious nsSNPs [27]. Further screening was performed on the nsSNPs that all the tools predicted to be likely deleterious or intolerant.
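The consensus screening step described above (keeping only nsSNPs that every tool flags as deleterious) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the cutoffs are commonly cited defaults for SIFT, PROVEAN, PolyPhen-2 and SNPs&GO and are assumptions here, and the variants and scores in `example` are made-up placeholders, not data from this study.

```python
from dataclasses import dataclass

# Commonly cited default cutoffs. These are assumptions for illustration,
# not thresholds reported in the study.
SIFT_MAX = 0.05      # SIFT score <= 0.05 is usually read as deleterious
PROVEAN_MAX = -2.5   # PROVEAN score <= -2.5 is usually read as deleterious
POLYPHEN_MIN = 0.85  # PolyPhen-2 score > 0.85 is often read as probably damaging
SNPSGO_MIN = 0.5     # SNPs&GO probability > 0.5 is read as disease-associated


@dataclass(frozen=True)
class Prediction:
    """Scores for one variant from the four primary tools (hypothetical container)."""
    variant: str
    sift: float
    provean: float
    polyphen2: float
    snpsgo: float


def is_consensus_deleterious(p: Prediction) -> bool:
    """True only if all four tools call the variant deleterious."""
    return (
        p.sift <= SIFT_MAX
        and p.provean <= PROVEAN_MAX
        and p.polyphen2 > POLYPHEN_MIN
        and p.snpsgo > SNPSGO_MIN
    )


def consensus_filter(preds):
    """Return the names of variants flagged by every tool, in input order."""
    return [p.variant for p in preds if is_consensus_deleterious(p)]


# Placeholder variants with invented scores (NOT results from the paper).
example = [
    Prediction("varA", sift=0.00, provean=-6.1, polyphen2=0.99, snpsgo=0.81),
    Prediction("varB", sift=0.02, provean=-1.2, polyphen2=0.95, snpsgo=0.64),
    Prediction("varC", sift=0.40, provean=0.3, polyphen2=0.10, snpsgo=0.12),
]

print(consensus_filter(example))  # ['varA']
```

Requiring agreement among all tools is a conservative choice: it lowers the number of false positives at the cost of possibly discarding variants that only some predictors flag.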

Effect of nsSNPs in CCL21 protein
The MutPred tool was used to predict the structural and functional effects of the deleterious nsSNPs on the protein product [28].  [29].
Further analysis was carried out on those nsSNPs that were found to reduce CCL21 protein stability. The evolutionary conservation of each amino acid was assessed with the ConSurf tool, which also illustrates the phylogenetic relationships between homologous sequences [30]. Further analysis was performed only on residues that were highly conserved and that coincided with the deleterious nsSNPs.

Protein Modeling
In this study, I-TASSER was used to model the 3D structures of wild-type and mutant CCL21 [31]. We used Chimera v1.11 to visualize the models.

Post-translational Modification (PTM) Sites
Several tools were used to identify possible PTM sites in the CCL21 protein.
GPS-MSP 3.0, an online tool, was used to identify methylation sites in the CCL21 protein [33], whereas possible sites for phosphorylation in the protein were less specific results having a lower phosphorylation potential than GPS 3.1 [34].
In addition, ubiquitylation sites were predicted with the BDM-PUB and UbPred tools [35].

Gene-gene Interaction and effect of regulatory region SNPs
The interaction and association of CCL21 with other proteins, and the effects of its nsSNPs on those proteins, were studied using two in silico tools, i.e., GeneMANIA and STRING [36,37]. MicroSNiPer and the PolymiRTS Database were used to study whether these nsSNPs in the CCL21 gene have a role in gene regulation [38]. MicroSNiPer is a web-based server that shows whether the region containing the target SNPs is affected, i.e., whether deleterious SNPs alter sites in the UTR regions or in miRNA seed regions.  nsSNPs having score 1 (Table 1).

Effect of nsSNPs in CCL21 protein
The damaging effect of the finalized nsSNPs on the structure or function of CCL21 protein was predicted through the MutPred server.
The results are given in Table 2. I-Mutant was used to predict the influence of the 7 selected nsSNPs on CCL21 protein stability. Each nsSNP was submitted separately to calculate the reliability index (RI, ranging from 0 to 10) and to determine whether stability would decrease or increase; the results are given in Table 3. Of the 7 shortlisted nsSNPs, 2 were predicted to increase the protein's stability, i.e., L7P (rs779706400) and R46C (rs1453433779), and were therefore omitted from further study, whereas the remaining 5 nsSNPs were predicted to decrease the stability of the CCL21 protein (Table 3). These 5 nsSNPs were chosen for further investigation.
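The shortlisting step above (7 nsSNPs in, 5 destabilizing nsSNPs out) can be written as a simple filter. This is only a sketch: the dictionary `stability_direction` is a hypothetical structure, and it assumes that the five destabilizing variants are the five named in the abstract. The RI values themselves are in Table 3 and are not reproduced here.

```python
# Predicted direction of the stability change for the 7 shortlisted nsSNPs.
# L7P and R46C are stated in the text to increase stability; the other five
# are ASSUMED to be the variants listed in the abstract.
stability_direction = {
    "L7P": "increase",
    "D30Y": "decrease",
    "R46C": "increase",
    "I62N": "decrease",
    "R75C": "decrease",
    "R75S": "decrease",
    "A83V": "decrease",
}


def keep_destabilizing(directions):
    """Return variants predicted to decrease protein stability, in input order."""
    return [variant for variant, d in directions.items() if d == "decrease"]


retained = keep_destabilizing(stability_direction)
print(len(stability_direction), "shortlisted,", len(retained), "retained")
print(retained)
```

Only the retained variants would then be passed on to the conservation and modeling steps.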
The conservation profile of these nsSNPs was predicted with the ConSurf tool, which identified C57R, C75S and V83A as highly conserved, exposed structural residues. The amino acid D30 was predicted to be a highly conserved, exposed and functional residue, while the amino acid I62 was predicted to be buried. Conservation scores for all the selected nsSNPs are given in Table 7.

SNPs in CCL21
The in-silico tool GeneMANIA depicted that    rheumatoid arthritis (RA) in several studies [39,40]. Although several nsSNPs are probably neutral and have little functional effect, many have been predicted to be deleterious owing to the disruption of functional sites in proteins or effects on protein folding [41]. Therefore,