Abstract
Sickle cell anemia (SCA) is an autosomal recessive disease caused by the HBB:c.20A>T mutation that leads to hemoglobin S synthesis. The disease presents with high clinical heterogeneity characterized by chronic hemolysis, recurrent episodes of vaso-occlusion and infection. This work aimed to characterize, through in silico studies, some genetic modulators of severe hemolysis and stroke risk in children with SCA, and to understand their consequences at the hemorheological level.
Association studies were performed between hemolysis biomarkers as well as the degree of cerebral vasculopathy and the inheritance of several polymorphic regions in genes related with vascular cell adhesion and vascular tonus in pediatric SCA patients. In silico tools (e.g. MatInspector) were applied to investigate the main variant consequences.
Variants in the vascular cell adhesion molecule-1 (VCAM1) gene promoter and in the endothelial nitric oxide synthase (NOS3) gene were significantly associated with a higher degree of hemolysis and with stroke events. They potentially modify transcription factor binding sites (e.g. the VCAM1 rs1409419_T allele may lead to an EVI1 gain) or disturb the corresponding protein structure/function. Our findings emphasize the relevance of genetic variation in modulating disease severity through its effect on gene expression or on protein biological activities related to sickled erythrocyte/endothelial interactions and the consequent hemorheological abnormalities.
Introduction
Sickle cell anemia (SCA; OMIM #603903) is one of the most common autosomal recessive monogenic disorders worldwide. The genetic basis of the disease is an A-to-T transversion in the 6th codon of the β-globin gene (HBB:c.20A>T), located on chromosome 11 (reviewed in [23]), which gives rise to hemoglobin S (HbS) production. The pathogenesis of SCA primarily derives from the polymerization of deoxygenated HbS in the red blood cells which, in turn, leads to distortion of the cell into a sickled shape, reduced deformability, increased vascular adhesion, and ultimately to erythroptosis and hemolysis [8, 21]. The disease presents with high clinical heterogeneity characterized by chronic hemolysis, recurrent painful episodes of severe vaso-occlusion and infection.
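The relation between the coding position c.20 and the "6th codon" can be worked through in a short Python sketch. The helper names (locate, apply_substitution) are hypothetical and introduced only for illustration; the wild-type codon GAG (glutamate) is taken as given, and the codon table is deliberately reduced to the two codons involved.

```python
def locate(cds_position):
    """Map a 1-based coding position (HGVS c. numbering) to
    (codon number counting the initiator Met as 1, 0-based offset in codon)."""
    return (cds_position - 1) // 3 + 1, (cds_position - 1) % 3


def apply_substitution(codon, offset, ref, alt):
    """Replace the base at `offset` after checking it matches the reference base."""
    if codon[offset] != ref:
        raise ValueError("reference base does not match the codon")
    return codon[:offset] + alt + codon[offset + 1:]


# Minimal codon table: only the two codons needed for this example.
CODON_TABLE = {"GAG": "Glu", "GTG": "Val"}

codon_number, offset = locate(20)                         # c.20
mutant = apply_substitution("GAG", offset, "A", "T")      # A>T at that base
print(codon_number, offset, CODON_TABLE["GAG"], "->", CODON_TABLE[mutant])
```

Position c.20 falls on the middle base of codon 7 when the initiator methionine is counted, which corresponds to the 6th codon in the traditional numbering that omits it; the A>T change turns GAG (Glu) into GTG (Val).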
A model of disease pathophysiology has been proposed that includes two clinical/biological sub-phenotypes: the hemolytic-endothelial dysfunction sub-phenotype, which includes stroke, pulmonary hypertension, priapism, leg ulceration and cholelithiasis, and the viscosity-vaso-occlusion sub-phenotype, with relatively high hemoglobin level, microvasculature obstruction by sickled erythrocytes, tissue damage, pain crisis, acute chest syndrome and osteonecrosis [15].
As a result of hemolysis, free hemoglobin is released into plasma. Part of the free hemoglobin quickly reacts with haptoglobin and the complex is cleared from plasma. However, due to the elevated rate of hemolysis in this disease, the excess of free circulating hemoglobin scavenges nitric oxide (NO), leading to reduced NO bioavailability. The decline in blood NO concentration leads to endothelial dysfunction, over-expression of vascular adhesion molecules and impaired vasomotor tone [16]. Since endothelial function and NO metabolism are key elements in vascular homeostasis, it is reasonable to assume that molecules involved in the former, the latter and/or in both may provide clues to the pathophysiology of SCA. Vascular occlusion in SCA decreases organ perfusion, leading to tissue infarction and, together with hemolytic anemia, causes the characteristic SCA clinical complications, including intermittent painful vaso-occlusive episodes, splenic autoinfarctions and consequent increased risk of infections, acute chest syndrome, pulmonary hypertension, stroke, cumulative multi-organ damage and a shortened lifespan. Vaso-occlusion is believed to occur as a multi-step process that involves interactions between sickle erythrocytes, activated leukocytes, endothelial cells (ECs), platelets and plasma proteins [8]. Recurrent vaso-occlusion and ischemia-reperfusion, with consequent vascular endothelial activation and injury, induce a continuous inflammatory response in SCA individuals that is propagated by the release of high levels of inflammatory cytokines, decreased NO bioavailability and oxidative stress [8]. The ECs, activated by cytokines and low NO bioavailability, may provide the basis in specific organs, like the lung, for decreased vasodilation, blood cell adhesion and micro-thrombosis. ECs have different properties in the vascular beds of different organs. Hypothetically, those in the target organs of SCA vasculopathy may be the most dependent on NO bioactivity for normal function [1].
The phenotypic heterogeneity of this monogenic disorder has long been discussed, and its multifactorial-like behaviour has prompted the hypothesis that genetic modulators other than the ones involving the β-globin cluster may come into play. Hence the designation of SCA as a single gene disorder under polygenic control. In explaining altered EC function, for instance, one of the most likely candidates is VCAM1, the gene encoding the vascular cell adhesion molecule-1 (VCAM-1), a cytokine-inducible cell surface glycoprotein present in ECs in inflammatory conditions. It mediates the adhesion of monocytes and leukocytes to the endothelium, especially in small vessels [8]. VCAM-1 is also known to lead to sickled erythrocyte adhesion to the endothelial vessel wall and therefore to vaso-occlusion [1]. Another likely SCA genetic modulator known to be involved in vascular homeostasis is NOS3, the gene encoding the endothelial nitric oxide synthase (eNOS). Contrary to VCAM-1, eNOS is a constitutive enzyme responsible for endothelial NO production, fundamental for the vasoconstriction/vasodilation balance due to its role in smooth muscle cell relaxation [18].
In silico methods may be used to provide preliminary information on the genetic variants identified and their putative structural/functional consequences when compared with the wild-type genome sequences. Several of these research tools are available for analysing variant sequences, fundamentally relying on the type of sequence change to be analysed, and on the algorithm and databases applied by the different tools. For regulatory genomic regions, such as gene promoters, software capable of predicting which changes may lead to modified gene expression levels may provide useful information on gene regulation impairment. Analysis of putative transcription factor binding sites (TFBS) constitutes a way to achieve that end and relies on assessing the degree of similarity between a given sequence and the consensus sequence corresponding to a specific transcription factor (TF).
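The idea of scoring sequence similarity against a TF consensus, and of calling a TFBS gain or loss when an allele crosses a threshold, can be illustrated with a minimal Python sketch. It is a simplified stand-in for this class of analysis, not the algorithm of any particular tool: the matrix (TOY_MATRIX), the 6-bp motif, the two 10-bp sequences and all function names are hypothetical and chosen only for illustration, and the 0.85 cut-off is an assumed threshold.

```python
# Toy position frequency matrix for a hypothetical 6-bp motif "TGATAA":
# the consensus base gets 0.85 at each position, the other bases 0.05.
# This is NOT a real matrix from any TFBS library.
TOY_MATRIX = [{b: (0.85 if b == c else 0.05) for b in "ACGT"} for c in "TGATAA"]


def similarity(site, matrix):
    """Normalised score in [0, 1]; 1.0 means the best base at every position."""
    score = sum(row[base] for row, base in zip(matrix, site))
    best = sum(max(row.values()) for row in matrix)
    worst = sum(min(row.values()) for row in matrix)
    return (score - worst) / (best - worst)


def best_hit(sequence, matrix):
    """Slide the matrix along the sequence; return (best similarity, offset)."""
    width = len(matrix)
    return max(
        (similarity(sequence[i:i + width], matrix), i)
        for i in range(len(sequence) - width + 1)
    )


def classify(ref_seq, alt_seq, matrix, threshold=0.85):
    """Compare reference and variant alleles against one matrix."""
    ref_bound = best_hit(ref_seq, matrix)[0] >= threshold
    alt_bound = best_hit(alt_seq, matrix)[0] >= threshold
    if alt_bound and not ref_bound:
        return "gain"
    if ref_bound and not alt_bound:
        return "loss"
    return "unchanged"


# Hypothetical alleles differing by a single C>T change.
print(classify("CCTGACAAGG", "CCTGATAAGG", TOY_MATRIX))
```

In this toy case the reference allele has one mismatch to the motif and stays below the cut-off, whereas the variant allele matches it fully, so the change is reported as a putative gain. The result depends directly on the chosen threshold, which is why the cut-off needs to be reported alongside any predicted gain or loss.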
In the present study, in silico tools were applied to analyse and characterize the structure and function of VCAM1 variants previously associated with hemolysis severity [6] and stroke risk in pediatric SCA patients. One NOS3 gene variant associated with cardiovascular risk was also analysed [25].
Materials and methods
In order to evaluate possible TFBS changes that might influence the regulation of VCAM1 gene expression, the wild-type sequence and three variants (SNPs rs1409419 and rs1041163, and indel rs3917025) within the gene promoter region were compared in silico. The nucleotide sequences of the core VCAM1 promoter and its variants were obtained from the NCBI (http://www.ncbi.nlm.nih.gov) and ENSEMBL (http://www.ensembl.org) databases, spanning from –2180 to +101 bp relative to the main transcription start site. The MatInspector (www.genomatix.de) [5] in silico tool was used for the TFBS analysis (Table 1), with a threshold of 0.85. The following TFs were considered in particular, due to their role in inflammation, cell proliferation and oxidative stress:
The putative functional consequences of the nonsynonymous SNP of NOS3 (rs1799983) were analysed through the use of the following in silico tools specific for evaluating missense mutations effects on protein function: Sorting Intolerant from Tolerant (SIFT) [20], Polyphen 2 [2], Domain Mapping of Disease Mutations (DMDM) and PredictSNP [4]. The latter combines predictions of eight established prediction tools (MAPP, nsSNPAnalyzer, PANTHER, PhD-SNP, PolyPhen-1, PolyPhen-2, SIFT and SNAP) and transforms the individual confidence scores to one comparable scale of 0–100%, using the values of their observed accuracies [4].
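How several per-tool predictions on a common 0–100% scale might be merged into one call can be sketched as a confidence-weighted vote. This is only an illustration under stated assumptions, not the published PredictSNP procedure: the function name (consensus) and the scoring rule are hypothetical, and the example input assumes that each tool listed in the Results returned a neutral call at the reported confidence.

```python
def consensus(predictions):
    """Confidence-weighted vote.

    predictions: list of (label, confidence_percent) tuples, where label is
    'neutral' or 'deleterious'. Returns (label, score) with score in [-1, 1];
    negative scores favour 'neutral', positive scores favour 'deleterious'.
    """
    total = sum(conf for _, conf in predictions)
    signed = sum(conf if label == "deleterious" else -conf
                 for label, conf in predictions)
    score = signed / total
    return ("deleterious" if score > 0 else "neutral"), score


# Assumed input: five neutral calls with the confidences reported for
# SIFT, PolyPhen 2, PolyPhen 1, SNAP and PhD-SNP.
calls = [("neutral", 71), ("neutral", 74), ("neutral", 67),
         ("neutral", 71), ("neutral", 78)]
print(consensus(calls))
```

With unanimous neutral calls the vote is trivially neutral. The sketch does not attempt to reproduce the combined confidence value reported by the tool itself, which comes from its own accuracy-based transformation.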
OMIM (http://www.ncbi.nlm.nih.gov/omim), ENSEMBL (http://www.ensembl.org), dbSNP (http://www.ncbi.nlm.nih.gov/SNP) and ClinVar (http://www.ncbi.nlm.nih.gov/clinvar) databases were also used to assess SNP frequencies and disease association/pathogenicity references.
Results
In a previous work we have performed an association study between hemolysis biomarkers (serum LDH, total bilirubin and reticulocyte count) and the inheritance of 41 genetic variants of ten candidate genes related with vascular tonus, vascular cell adhesion, inflammation, fetal hemoglobin expression, and alpha-thalassemia, in a series of 99 pediatric SCA patients [6]. Furthermore, we have performed another association study evaluating the role of the same genetic variants in stroke risk, enrolling 66 children with SCA categorised according to their degree of cerebral vasculopathy (unpublished data). In both studies, variants in a gene related with adhesion of sickled erythrocytes to vascular endothelium (VCAM1 rs1409419, rs3917025, and rs1041163) were associated with the SCA hemolysis severity and/or with stroke risk. On the other hand, one NOS3 gene variant (SNP rs1799983) implicated in cardiovascular risk was also analysed [25]. Thus, potential consequences of those genetic variants were evaluated in silico prior to in vitro functional studies.
When the three non-coding variants of VCAM1 were searched in ClinVar and in Variant Effect Predictor (ENSEMBL), two common features emerged: (i) to date, they are considered upstream gene variants, and (ii) their probable functional impact would be as modifiers.
The TFBS analysis performed with MatInspector (Table 2) revealed several potential effects. The input data format used was the SNP database identification number, with the analysis results corresponding to the minor frequency allele. For rs1041163, the change T>C led to a potential substitution of a
The DMDM analysis indicated that the nonsynonymous variant p.Glu298Asp is located in the oxygenase domain-coding region of NOS3. This tool is linked to the OMIM database which associates this genetic variant to susceptibility for coronary heart spasm, late-onset Alzheimer’s disease and hypertension. These results were consistent with the ones obtained when assessing the ClinVar database and arise from taking into account all reports of the presence of the genetic variant in association with a given phenotype.
For the analysis of the putative functional consequences three bioinformatics tools were used: SIFT, PolyPhen 2 and PredictSNP. These tools predict the consequences of mutations that are translated into amino acid changes in the protein structure, based on algorithms and comparison with known disease variation databases. The SIFT results indicated that this variant is tolerated. The PolyPhen 2 results were consistent with the SIFT ones, with a benign classification for this variant whether considering HumDiv or the HumVar databases references. The non-pathogenic category was also attributed by the combined PredictSNP tool results (83%) which are based on analyses performed by five different tools – SIFT (71%), PolyPhen 2 (74%), PolyPhen 1 (67%), SNAP (71%) and PhDSNP (78%). Nevertheless, in this case a possible association with cardiovascular disease susceptibility was considered which is in accordance with the above mentioned databases’ assessment.
Discussion
Although the major genetic modifiers of SCA clinical manifestations are those affecting the fetal hemoglobin expression, VCAM1 and NOS3 variants have also been identified as potential modulators of the disease. In this study, the results of an in silico analysis of VCAM1 rs1041163, rs1409419 and rs3917025 noncoding polymorphisms, as well as NOS3 coding SNP rs1799983, provide some clues about possible functional roles of these genetic variants in the pathophysiology of SCA.
The three VCAM1 promoter variants mentioned above have in common the potential to affect the regulation of this gene's expression. This may occur as a result of differences in TF affinity for the altered sequence as compared with the wild-type sequence. In the present work, TFBS changes were indeed observed for the three polymorphic regions. Concerning rs1041163 T>C, a substitution of RXRF by PRDM1 as well as a loss of an FHXB were indicated. PRDM1 is a transcription repressor that promotes differentiation of hematopoietic B cells and secretion of pro-inflammatory cytokines [9]. Therefore, in a pro-inflammatory environment, such as an activated endothelium, this variant might lead to an increase in VCAM1 inducible expression (Fig. 1). In the case of rs1409419 C>T, a potential gain of EVI1, Oct1 and Barx2 sites was shown. EVI1 is a complex multifunctional protein that modulates multiple processes, including cell migration, motility, adhesion, response to oxidative stress, proliferation and apoptosis/survival [3]. It contains a GATA consensus motif and prevents DNA binding by GATA1, thus limiting red blood cell differentiation and proliferation [24]. EVI1 has been reported to cooperate with the FOS transcription factor to limit cell adhesion while enhancing cell proliferation, one hallmark of oncogenesis [3]. On the other hand, Oct1 is a TF known to promote a transcriptional repression/silencing effect, which would potentially lead to VCAM1 down-regulation. Barx2 has been shown to promote murine muscle cell differentiation by interacting with muscle regulatory factors (MRFs) [26], a gain of which could result in upregulation of gene expression in muscle tissue (Fig. 1). Finally, a gain of FAST1 was identified for rs3917025_delCT (Fig. 1). FAST1 is a TF involved in patterning and development of embryonic structures in vertebrates, in a complex network of activation/repression mechanisms [19].
In summary, all of the TFs affected by the sequence variants are mainly involved in development (including in early embryonic stages, as with FAST1) and act in different tissues, which is in accordance with the proposed role of VCAM-1 in development, with tissue- and time-specific expression patterns [11]. In terms of the endothelial environment, for instance, one might expect that altered expression levels may affect sickled erythrocyte/EC adhesion as well as endothelium inflammation/activation, thus contributing to endothelial dysfunction and ultimately to impaired blood flow/shear rate.
Regarding rs1799983 in NOS3, although this SNP leads to a change in the amino acid sequence of the protein (p.Glu298Asp), all the analyses showed that this variant is most probably non-deleterious. Therefore, this NOS3 gene variant may be considered a tolerated nonsynonymous SNP. The apparently conservative amino acid substitution that results (aspartate for glutamate, both negatively charged) would also be in agreement with that observation. Nevertheless, as the DMDM database results indicate, it occurs in the sequence encoding the oxygenase domain of eNOS, which is critical for the enzyme activity, containing the catalytic site as well as the components of its oxygenase function. Being considered benign, it is reasonable to assume that the variant will probably not affect the main catalytic function. Nonetheless, the oxygenase function may be impaired, thus possibly contributing to higher oxidative stress through decreased heme binding and eNOS uncoupling, and (indirectly) affecting NO bioavailability. Possible alterations in endothelial location in the caveolae have also been proposed [14]. Overall, these potentially altered functions play key roles in endothelial dysfunction and/or vascular tone and may modulate SCA severity in terms of cardiovascular risk [13]. Furthermore, impaired oxygenase activity would be expected to result in higher levels of oxidative stress, which (i) has been demonstrated to damage healthy erythrocytes by decreasing their deformability as well as increasing the strength of erythrocyte aggregates, and (ii) has been hypothesized to induce an exaggerated response in erythrocytes from SCA patients, accompanied by a highly abnormal hemorheological profile (reviewed in [7]).
Besides its hemorheological importance, altered oxygenase function may also have an impact on therapeutic approaches. For instance, it is known that drugs interfering with the renin-angiotensin-aldosterone system, as well as statins, are useful in preventing endothelial dysfunction. However, the mechanisms through which they promote eNOS uncoupling, in the case of elevated oxidative stress, may provide useful clues as to ways of increasing NO beneficial actions in the cardiovascular system [10].
Conclusions
In silico studies require a careful analysis of specific factors. Concerning input data format, sequences already identified in databases leave less margin for error, so using an identification number is in general less error-prone than manually entering a given sequence. The algorithm used for the prediction also determines the tool's accuracy, since it determines sensitivity and specificity and has associated false-positive and false-negative rates. Furthermore, the cut-off values or parameter thresholds are key elements in determining the reliability of results, since they are associated with the similarity between the given sequence and a reference sequence and therefore with the likelihood of a specific TF actually binding to that given sequence. Database information, as well as database size, quality and curation, also impacts a tool's quality and reliability.
Nonetheless, in silico approaches constitute only a preliminary step in evaluating the potential biological and clinical consequences of genetic variants. In vitro (and, whenever possible, in vivo and ex vivo) studies are crucial for confirmation purposes and to unravel the biological link between genetic variants and the sub-phenotypes of SCA. Gene expression studies are also of the utmost importance, in particular in the case of VCAM1 polymorphisms, to assess overall up- or down-regulation of the gene as a consequence of the changes in the regulatory sequence. For NOS3, functional analysis is also mandatory to evaluate enzymatic activity in different cellular environments.
All studies undertaken to identify genetic modifiers of SCA sub-phenotypes are important in order to pinpoint essential pathways and mechanisms of SCA pathophysiology and to evaluate potential molecular targets at which to direct innovative therapeutic strategies.
Acknowledgments
This work was partially supported by FCT: grant PIC/IC/83084/2007. Authors would like to thank the Unidade de Tecnologia e Inovação, DGH, INSA, for technical support.
