Validation of endoplasmic reticulum stress-related gene signature to predict prognosis and immune landscape of patients with non-small cell lung cancer

Abstract

BACKGROUND:

Lung cancer is one of the most common cancers worldwide, and its incidence is increasing each year. Improving the prognosis of patients with lung cancer is therefore crucial. Non-Small Cell Lung Cancer (NSCLC) accounts for the majority of lung cancer cases. Endoplasmic Reticulum (ER) stress has been implicated in malignant tumour behaviour and resistance to treatment, but its prognostic significance in NSCLC has rarely been documented.

OBJECTIVE:

This work aimed to develop an ER stress-related gene signature that could be applied to prognosis prediction and risk assessment in non-small cell lung cancer.

METHODS:

Data from 1014 NSCLC patients were sourced from The Cancer Genome Atlas (TCGA) database, integrating clinical and Ribonucleic Acid (RNA) information. Several analytical techniques were used to identify ER stress (ERS)-related genes associated with patients’ prognoses, including Kaplan-Meier analysis, univariate Cox regression, Least Absolute Shrinkage and Selection Operator (LASSO) regression, and Pearson correlation analysis. A risk score model obtained from multivariate Cox analysis was used to classify patients into high- and low-risk groups, and a nomogram was created and validated. The CIBERSORT algorithm and Single-Sample Gene Set Enrichment Analysis (ssGSEA) were employed to investigate the tumour immune microenvironment. The Genomics of Drug Sensitivity in Cancer (GDSC) database and R tools were used to identify drugs to which patients could be sensitive.

RESULTS:

Four genes – FABP5, C5AR1, CTSL, and LTA4H – were chosen to create the risk model. Overall Survival (OS) was significantly lower ($P < 0.05$) in the high-risk group. In terms of predictive accuracy, the risk model outperformed clinical factors. Several drugs to which the high-risk group was sensitive were identified.

CONCLUSION:

Our study has produced a gene signature associated with ER stress that may be employed to forecast the prognosis and therapeutic response of non-small cell lung cancer patients.

Keywords

Non-small cell lung cancer, single-cell sequencing, endoplasmic reticulum stress, predictive model, immunotherapy

1. Introduction

Lung cancer is one of the most prevalent malignancies and the leading cause of cancer-related mortality worldwide [1]. NSCLC is the most common type of lung cancer, accounting for around 85% of cases [2]. Surgery is considered the mainstay of treatment for early-stage NSCLC, although some patients experience recurrence within a few years [3]; for patients with advanced NSCLC, chemotherapy, radiation, targeted therapies, and immunotherapy are indicated [2]. Personalized and adaptive strategies, such as genetic profiling, patient stratification, adaptive management, combination therapies, proactive side effect management, patient education, multidisciplinary teams, clinical trials, and psychosocial support, are essential for managing non-small cell lung cancer treatment plans. However, each patient’s response to these therapies is individual and difficult to predict. Furthermore, many patients experience recurrence following standard treatment, which burdens patients and society. Although current studies have focused on NSCLC-related biomarkers used in clinical practice for monitoring and treatment [4], their reliability remains controversial.

Despite the rapid emergence of new treatment methods for NSCLC in recent years, the overall 5-year survival rate is still low [5]. Researchers have become increasingly interested in tumour heterogeneity and believe it might correlate with resistance to cancer therapies [6]. Genetic diversity, resistance development mechanisms, phenotypic plasticity, cell state transitions, mutation-driven resistance, epigenetic variability, and the tumour microenvironment all shape tumour heterogeneity in cancer treatment. Drug efflux, drug inactivation, altered drug targets, DNA repair, evasion of apoptosis, immune evasion, and altered cell signalling are some of the processes that can lead to resistance to cancer therapies. Cancer is an evolving disease that gradually becomes more heterogeneous during progression [7]. Genetic mutations, epigenetic modifications, clonal development, and microenvironmental variables all contribute to tumour evolution, which generates a variety of cancer cell populations that affect the course of the illness and the effectiveness of treatment. Improving cancer therapy and patient survival therefore requires tailored and flexible therapeutic approaches. As a result of this heterogeneity, the bulk of a tumour may consist of a diverse collection of cells with different molecular signatures and levels of treatment susceptibility [8]. Heterogeneity lays the foundation for resistance to treatment; however, tumour heterogeneity is difficult to investigate with conventional technologies. Single-cell sequencing, a next-generation sequencing technique, makes it possible to characterize cellular variation and the function of a single cell in its surroundings [9]. It also holds considerable promise for dissecting the intricate clonal structures of cancers. Single-cell sequencing therefore offers a way to understand tumour heterogeneity in depth.

As rapidly developing technologies have enabled investigation of the molecular mechanisms of tumours, endoplasmic reticulum stress (ERS) has emerged as a possible contributor to tumour evolution. The ER is a large, central organelle involved in lipid metabolism, protein synthesis, and calcium storage, among other processes in the cell [10]. ERS is characterized as a disruption of ER homeostasis, encompassing activation of the unfolded protein response and calcium perturbation [11]. Calcium perturbation is a critical component of ER stress that influences cell responses. Toxins and oxidative stress are two causes of ER stress. Upon detection of ER stress, the Unfolded Protein Response (UPR) triggers corrective signalling pathways that improve protein degradation, restore calcium homeostasis, and enhance cell survival. Recent studies suggest that ERS might be necessary for the growth and metastasis of tumours [12].

Additionally, a connection has been shown between drug-induced apoptosis in lung cancer and ERS [13]. Nevertheless, only a few single-cell sequencing-based investigations have examined the features of ERS-related genes and their usefulness for NSCLC patients. Managing data securely in medical cancer rehabilitation centres is vital, given the escalating mortality risks. IoT applications in healthcare, like sensors, present security challenges due to vulnerabilities in real-time data transmission. Deep Federated Collaborative Learning (DFCL) addresses these issues by preserving privacy while managing sensitive data. Recent studies underscore DFCL’s efficacy, demonstrating enhancements in accuracy (up to 19.8%), leading diagnoses (26%), and hospital dictionary analysis rates [14].

The problem statement and the main contribution of the paper are discussed as follows:

Lung cancer, a common disease worldwide, is increasingly prevalent, particularly non-small cell lung cancer (NSCLC). Endoplasmic reticulum stress has been linked to tumour malignancy and treatment resistance in NSCLC, which underscores the need for better prognostic tools. An ER stress-related gene signature was therefore developed that may be used to estimate and forecast risk in non-small cell lung cancer. Clinical and RNA data from 1014 NSCLC patients were collected from the TCGA database. Several methods were employed to identify genes linked to ERS and prognosis, including Pearson correlation analysis, LASSO, Cox regression, and the Kaplan-Meier method. A risk score model derived from multivariate Cox analysis was used to classify the patients, and a nomogram was developed and evaluated. Sensitive drug screening involved R tools and the Genomics of Drug Sensitivity in Cancer database. Using four genes – FABP5, C5AR1, CTSL, and LTA4H – the study developed a risk prediction model for patients with non-small cell lung cancer. The model performed better than clinical factors in predicting outcomes, and several drugs to which the high-risk group was sensitive were identified. This gene signature can potentially forecast both prognosis and treatment outcomes.

This paper is organized as follows:

• Section 2 details the methodology and the Cancer Genome Atlas-Lung Adenocarcinoma (TCGA-LUAD) and Cancer Genome Atlas-Lung Squamous Cell Carcinoma (TCGA-LUSC) data used for lung cancer prediction.
• Section 3 presents the results of our empirical analysis and discusses the findings.
• Section 4 addresses the practical implementation and integration challenges.
• Finally, Section 5 summarizes the key findings, conclusions, and recommendations.

2. Materials and methods

2.1 Data sources and data processing

We searched the Gene Expression Omnibus (GEO) [18, 19, 20] (http://www.ncbi.nlm.nih.gov/geo/) using the keyword NSCLC and selected the datasets with accession IDs GSE117570 [15], GSE37745 [16], and GSE61676 [17]. The single-cell dataset GSE117570, which includes four tumour tissues and four peritumoral tissues, was generated on the GPL18573 platform (Illumina NextSeq 500 (Homo sapiens)). The gene expression profile data of GSE37745 were generated on the GPL570 platform (Affymetrix Human Genome U133 Plus 2.0 Array). The GPL570 array is a sensitive instrument for gene expression profiling that covers more than 47,000 transcripts, or about 39,000 human genes. It uses multiple probes for every gene for precise measurement, supports robust data normalization, and works with bioinformatics tools for integrated analysis. The dataset GSE61676, generated on the GPL5188 platform (Affymetrix Human Exon 1.0 ST Array), was originally collected to investigate the mechanisms of action of the combination targeted therapy bevacizumab and erlotinib in late-stage non-small cell lung cancer. Table 1 summarizes the attributes of the datasets. Furthermore, UCSC Xena [21] (http://xena.ucsc.edu/) provided transcriptome, survival, and clinical data for LUAD and LUSC tumour tissues ($n = 1014$) and normal lung specimens ($n = 108$). Raw microarray data were normalized using Robust Multichip Average (RMA) from the R package affy, and batch effects were controlled using ComBat from the R package sva [22]. The single-cell RNA-seq data in GSE117570 were quality-checked and analyzed using the R package Seurat [23]. GeneCards [24] (https://www.genecards.org/) yielded 5885 Endoplasmic Reticulum Stress-Related (ERSR) genes in total, which are listed in Table S1.

Table 1
Data set information

ID	GPL	Sample source	Sample size	References	Species
GSE117570	GPL18573	Tumor/adjacent normal	8	PMID: 31033233 PMID: 34295900	Homo sapiens
GSE37745	GPL570	Cancer cells from patient	196	PMID: 23032747 PMID: 29112949 PMID: 26608184 PMID: 33576873 PMID: 35574381	Homo sapiens
GSE61676	GPL5188	Peripheral blood	86	PMID: 28359318	Homo sapiens

2.2 Subset analysis in gene expression

In this work, we used the Seurat FindAllMarkers function to identify marker genes specific to each cell subset. FindAllMarkers identifies differentially expressed marker genes in single-cell RNA sequencing data and exposes options such as threshold parameters, multiple testing correction, data subsetting, parallel processing, and assay and slot selection. By revealing functional differences between subsets and genes that drive specific biological processes, it can also support disease diagnosis and treatment targeting. The expression of subset-specific marker genes was visualized using the DotPlot and VlnPlot functions. The DoHeatmap function was used to display the top 10 or 20 specific genes of each cell subset as a heatmap.

2.3 AUCell: Analysis of ‘gene set’ activity in single-cell RNA-seq data

To score the cells using the intersection of ERSR genes and the differentially expressed genes (DEGs) of the cell subsets, we applied the AUCell_calcAUC function of the AUCell package to the GSE117570 dataset [25]. AUCell_calcAUC scores cells according to the expression of a gene set, here the intersecting DEGs and ERSR genes. It quantifies gene set activity by calculating the Area Under the Curve (AUC) for each gene set within each individual cell. Cells with elevated ER stress can then be identified by visualizing the AUC scores. The high-scoring cells were extracted for a further subset analysis, which yielded ten cell subsets.
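The scoring idea described above can be illustrated with a minimal Python sketch. It is not the AUCell implementation: it assumes a simplified recovery-curve AUC, deterministic tie-breaking, and hypothetical toy expression values, and the function name `recovery_auc` and the parameter `max_rank` are introduced here for illustration only.

```python
def recovery_auc(expression, gene_set, max_rank):
    """Normalized area under the recovery curve of gene_set in one cell.

    expression: dict mapping gene symbol -> expression value for a single cell.
    max_rank: only the top max_rank genes of the ranking are considered.
    """
    # Rank genes from highest to lowest expression; ties are broken by name
    # for determinism (an assumption of this sketch).
    ranked = sorted(expression, key=lambda g: (-expression[g], g))
    max_rank = min(max_rank, len(ranked))
    hits = 0
    area = 0
    for gene in ranked[:max_rank]:
        if gene in gene_set:
            hits += 1
        area += hits  # height of the recovery curve at this rank
    # Best case: every measured set gene sits at the very top of the ranking.
    n_set = min(len(gene_set & set(expression)), max_rank)
    max_area = sum(min(rank, n_set) for rank in range(1, max_rank + 1))
    return area / max_area if max_area else 0.0


# Hypothetical toy cells; the values are illustrative, not from GSE117570.
gene_set = {"FABP5", "CTSL"}
cells = {
    "cell_a": {"FABP5": 9, "CTSL": 8, "ACTB": 3, "GAPDH": 2, "ALB": 0},
    "cell_b": {"FABP5": 0, "CTSL": 1, "ACTB": 9, "GAPDH": 8, "ALB": 7},
    "cell_c": {"FABP5": 5, "CTSL": 0, "ACTB": 9, "GAPDH": 1, "ALB": 0},
}
scores = {name: recovery_auc(expr, gene_set, max_rank=3) for name, expr in cells.items()}

AUC_CUTOFF = 0.1  # cutoff value reported in the Results
high_score_cells = sorted(name for name, s in scores.items() if s > AUC_CUTOFF)
```

In this toy example, cells whose top-ranked genes contain members of the gene set receive a score above the cutoff and would be carried forward as high-score cells.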

2.4 Identification of differentially expressed genes (ERSRDEGs) associated with endoplasmic reticulum stress in TCGA

The ERSR gene expression profiles from TCGA were analyzed to identify ER stress-related differentially expressed genes (ERSRDEGs). The workflow obtained RNA-seq data from The Cancer Genome Atlas (TCGA), normalized expression, performed differential expression analysis focused on ER stress-related genes, and assessed the significance of the resulting ERSRDEGs. Differential expression analysis was performed with the limma package in R using the thresholds $|\log_2 \text{fold change (FC)}| > 0.5$ and $P < 0.05$ [26]. ERSRDEG expression was visualized as heatmaps and volcano plots using the ggplot2 and ComplexHeatmap packages [27, 28]. We then selected the intersection of the DEGs from high-score cell subsets and the ERSRDEGs to construct the prognosis model.
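The thresholding and intersection steps can be sketched in a few lines of Python. This is only an illustration of the filtering logic, not the limma workflow itself; the gene-level statistics, the gene sets, and the function name `filter_degs` are hypothetical.

```python
def filter_degs(results, lfc_cutoff=0.5, p_cutoff=0.05):
    """Return genes with |log2FC| > lfc_cutoff and P < p_cutoff.

    results: dict mapping gene symbol -> (log2_fold_change, p_value).
    """
    return {
        gene
        for gene, (log2fc, p_value) in results.items()
        if abs(log2fc) > lfc_cutoff and p_value < p_cutoff
    }


# Hypothetical differential-expression statistics (not taken from TCGA).
toy_results = {
    "FABP5": (1.20, 0.001),
    "CTSL": (-0.80, 0.010),
    "ACTB": (0.10, 0.001),  # significant, but below the fold-change cutoff
    "JUN": (0.90, 0.200),   # large change, but not significant
}
toy_ersr_genes = {"FABP5", "CTSL", "JUN"}   # stand-in for the ERSR gene list
toy_subset_degs = {"FABP5", "JUN", "CD68"}  # stand-in for high-score subset DEGs

ersr_degs = filter_degs(toy_results) & toy_ersr_genes
model_candidates = ersr_degs & toy_subset_degs
```

Only genes that pass both thresholds, belong to the ERSR list, and appear among the high-score subset DEGs remain as candidates for the prognosis model.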

2.5 Functional enrichment analysis

Gene Ontology (GO) and Kyoto Encyclopaedia of Genes and Genomes (KEGG) pathway enrichment analyses of the genes shared by the DEGs from high-score cell subsets and the ERSRDEGs were performed with clusterProfiler 4.1.1, using $P < 0.05$ as the significance threshold [29, 30, 31]. clusterProfiler is an analysis tool that helps detect and visualize biological themes in large gene lists from high-throughput studies. It performs GO and KEGG pathway enrichment analysis and provides annotations that support hypothesis generation and data interpretation.
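Over-representation analysis of this kind is commonly based on a hypergeometric test. The following Python sketch shows that test for a single term under stated assumptions: the function name and the counts are hypothetical, and no multiple-testing adjustment is applied here.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k) for a hypergeometric draw.

    N: number of background genes
    K: background genes annotated to the term
    n: genes in the query list
    k: query genes annotated to the term
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts: 3 of 3 query genes hit a term that covers 3 of 10 genes.
p_enriched = hypergeom_enrichment_p(k=3, n=3, K=3, N=10)
# With no overlap required, the tail probability is 1 by construction.
p_trivial = hypergeom_enrichment_p(k=0, n=3, K=3, N=10)
```

A term would be reported as enriched when this tail probability falls below the chosen significance threshold.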

2.6 Construction of the prognosis model

Univariate analysis was performed using Cox proportional hazards regression models to find genes linked to prognosis. Cox models are widely used in survival studies because they handle censored data, are straightforward to interpret, and quantify the effect of each predictor on survival as a hazard ratio. LASSO analysis was performed with the glmnet package [32]; LASSO reduces complexity in high-dimensional data, yields sparser models, and guards against overfitting. The resulting model allowed patients to be classified as high- or low-risk. Survival curves were generated using the Kaplan-Meier method, and the predictive model’s accuracy was evaluated with survival Receiver Operating Characteristic (ROC) analysis to determine the Area Under the Curve (AUC) [33]. The GSE37745 dataset was used to validate the predictive model. With its clinical annotations, gene expression profiles, sample size, and public accessibility, GSE37745 is a suitable resource for checking the reproducibility of the results. A nomogram uses a Cox regression model to assign points to each predictor and sums them into a total score for each patient. Clinical factors such as age, stage, and therapy allow personalized forecasts, and the total score predicts clinical outcomes such as survival probability.
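A Cox-based risk score is a weighted sum of gene expression values, with the regression coefficients as weights. The sketch below illustrates this under explicit assumptions: the coefficients and expression values are hypothetical (the fitted coefficients are not given in this section), and splitting patients at the median score is assumed here as one common convention rather than taken from the text.

```python
from statistics import median

# Hypothetical Cox coefficients for the four signature genes.
COEFS = {"FABP5": 0.12, "C5AR1": 0.08, "CTSL": 0.15, "LTA4H": -0.10}


def risk_score(expression, coefs=COEFS):
    """Linear predictor: sum of coefficient * expression over signature genes."""
    return sum(coefs[gene] * expression[gene] for gene in coefs)


def stratify(scores):
    """Label each patient 'high' or 'low' relative to the median score."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical expression profiles for four patients.
patients = {
    "p1": {"FABP5": 1, "C5AR1": 1, "CTSL": 1, "LTA4H": 1},
    "p2": {"FABP5": 0, "C5AR1": 0, "CTSL": 0, "LTA4H": 0},
    "p3": {"FABP5": 2, "C5AR1": 0, "CTSL": 0, "LTA4H": 0},
    "p4": {"FABP5": 0, "C5AR1": 0, "CTSL": 4, "LTA4H": 0},
}
patient_scores = {pid: risk_score(expr) for pid, expr in patients.items()}
risk_groups = stratify(patient_scores)
```

The resulting group labels are what a Kaplan-Meier comparison between high- and low-risk patients would be based on.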

2.7 Independence analysis of prognosis model and construction of nomogram

Based on the univariate and multivariate analysis findings, a nomogram was created to examine the prognostic value of ERSRDEGs. The nomogram, which included key clinical characteristics, and the calibration plots were produced using the rms R package [34]. Model calibration was assessed using calibration curves [35, 36]. Decision Curve Analysis (DCA), an assessment technique that quantifies the clinical net benefit of predictive models, was used to evaluate the Cox regression models [36]. In addition, we evaluated the performance of the Cox regression models using ROC curves and AUC statistics. A modified ROC curve, demonstrating a high AUC ($> 0.7$) for prognosis prediction, was used to evaluate the model’s effectiveness. The study also examined the four genes as potential independent predictive markers for NSCLC. The results demonstrated an association of risk score, N stage, and T stage with patient outcomes. The nomogram, checked with calibration curves, was used to estimate the patients’ 3-, 5-, and 10-year OS.
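The net benefit that DCA reports can be written, for a binary outcome and a threshold probability $p_t$, as the true-positive rate minus the false-positive rate weighted by $p_t/(1-p_t)$. The Python sketch below shows this calculation under simplifying assumptions: it ignores censoring, so it is a binary-outcome simplification rather than the survival version, and the predicted probabilities and outcomes are hypothetical.

```python
def net_benefit(pred_probs, outcomes, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(outcomes)
    true_pos = sum(1 for p, y in zip(pred_probs, outcomes) if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in zip(pred_probs, outcomes) if p >= threshold and y == 0)
    return true_pos / n - false_pos / n * threshold / (1 - threshold)


def net_benefit_treat_all(outcomes, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical predicted risks and observed events (1 = event, 0 = no event).
toy_probs = [0.9, 0.7, 0.4, 0.1]
toy_events = [1, 1, 0, 0]
nb_model = net_benefit(toy_probs, toy_events, threshold=0.5)
nb_all = net_benefit_treat_all(toy_events, threshold=0.5)
```

A decision curve plots these quantities over a range of thresholds; a model is considered clinically useful where its curve lies above both the treat-all and treat-none (net benefit 0) strategies.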

2.8 Immune infiltration analysis

Using the CIBERSORT algorithm, we estimated the abundance of 22 types of tumour-infiltrating immune cells in the tissue samples [37]. CIBERSORT takes gene expression data from microarrays or RNA-seq and deconvolves them into proportions of immune cell types, which can then be related to ER stress-related gene expression patterns and clinical outcomes. The risk score model for patients with NSCLC was validated using the GSE37745 dataset. The 22 cell types included resting and activated NK cells, plasma cells, resting and activated dendritic cells, M1 and M2 macrophages, activated mast cells, eosinophils, neutrophils, and T-cell subsets (CD8, naive CD4, memory CD4, follicular helper, regulatory, and gamma delta T cells, including resting populations). The proportions of the 22 cell types were displayed as box plots to show differences between the high- and low-risk groups. Additionally, immune infiltration in lung cancer was assessed by single-sample gene set enrichment analysis (ssGSEA) using the GSVA package to investigate the infiltration of 28 types of immune cells [38].

2.9 Drug sensitivity

We evaluated each patient’s drug sensitivity using the Genomics of Drug Sensitivity in Cancer (GDSC) database (https://www.cancerrxgene.org/) [39]. The half-maximal inhibitory concentration (IC50) was estimated with the pRRophetic R package. We used the Wilcoxon rank sum test to examine differences in drug sensitivity [40].
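The group comparison can be illustrated with a small Python sketch of the Wilcoxon rank sum (Mann-Whitney) test. It assumes the normal approximation with a tie correction and no continuity correction, so p-values for very small samples will differ from exact tests; the IC50 values and the function name `rank_sum_test` are hypothetical.

```python
from collections import Counter
from math import erfc, sqrt


def rank_sum_test(x, y):
    """Return (U statistic for x, two-sided p-value) using a normal approximation."""
    pooled = sorted(x + y)
    # Assign each distinct value the average of the ranks it occupies.
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x = sum(ranks[v] for v in x)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    tie_term = sum(t**3 - t for t in Counter(pooled).values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - mean_u) / sqrt(var_u)
    return u, erfc(abs(z) / sqrt(2))


# Hypothetical estimated IC50 values for one drug in the two risk groups.
ic50_high_risk = [1.0, 2.0, 3.0, 4.0, 5.0]
ic50_low_risk = [6.0, 7.0, 8.0, 9.0, 10.0]
u_stat, p_value = rank_sum_test(ic50_high_risk, ic50_low_risk)
```

In this toy example every high-risk value lies below every low-risk value, so the test indicates a lower IC50 (greater estimated sensitivity) in the high-risk group.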

2.10 Analysis of the correlation between critical genes and reaction to immunotherapy

To investigate the correlation between the four key genes (C5AR1, CTSL, FABP5, and LTA4H) and response to immunotherapy, we collected 79 immune checkpoint genes from published studies [41] (Table S2). Of these, the 69 genes present in the TCGA dataset were retained. The relationship between the four key genes and the 69 immune checkpoint genes was examined using Spearman correlation; a $P$-value of less than 0.05 was deemed significant. Spearman correlation evaluates the direction and strength of monotonic relationships between continuous variables such as immune cell proportions and gene expression levels, whereas Fisher’s exact test looks for nonrandom associations between categorical variables such as high vs. low expression groups and immunotherapy response. In the GSE61676 dataset, samples were divided into high- and low-expression groups for each of the four key genes. Fisher’s exact test was then used to compare the response to immunotherapy between the high- and low-expression groups; a $P$-value of less than 0.05 was considered significant.
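For a 2 x 2 table of expression group against immunotherapy response, Fisher’s exact test sums hypergeometric probabilities of tables at least as extreme as the one observed. The following Python sketch shows the two-sided version under stated assumptions; the counts are hypothetical and the function name is introduced for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Rows: high / low expression; columns: responders / non-responders.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability of observing x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum all tables that are no more probable than the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: all 3 high-expression samples respond, no low-expression ones do.
p_separated = fisher_exact_two_sided(3, 0, 0, 3)
# Hypothetical counts with no association at all.
p_balanced = fisher_exact_two_sided(1, 1, 1, 1)
```

With samples this small even a perfectly separated table does not reach $P < 0.05$, which illustrates why the test is preferred over asymptotic alternatives when group sizes are limited.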

2.11 Statistical analysis

Statistical analysis was performed in R (v.4.1.1). The association between immune cells and genes was assessed using the Spearman coefficient, a non-parametric rank-based measure that is robust to noisy biological data and can be applied to different data types such as immune cell proportions and gene expression levels. Multiple hypothesis testing was addressed using the Benjamini-Hochberg (BH) procedure, which controls the false discovery rate (FDR) while retaining statistical power and is therefore well suited to high-throughput studies. The procedure relies on $p$-values being uniformly distributed under the null hypothesis, requires choosing a suitable FDR threshold, and adjusts the $p$-values according to the BH formula. Independent samples with a normal distribution were compared using Student’s $t$-test; independent samples with non-normal distributions were compared using the Mann-Whitney test. Here, Student’s $t$-test was used to compare immune cell properties between gene expression groups; the null hypothesis is rejected in favour of the alternative when a significant difference is found.
All $P$ values were two-sided, and $P < 0.05$ was considered significant.
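The BH adjustment mentioned above can be sketched as follows. The sketch assumes the standard step-up formulation, in which each sorted $p$-value is multiplied by $m/\text{rank}$ and monotonicity is then enforced from the largest rank downwards; the input values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value is no larger than the adjusted values of higher ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values from four tests.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adj_p]
```

Tests whose adjusted value falls below the chosen FDR threshold are then reported as significant.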

Figure 1.

Flowchart of the study.

Figure 2.

Cell cluster analysis of GSE117570 dataset. (A) t-SNE was employed to visualize the distribution of the 14 cell subsets, and different colours represented different cell subsets. (B) Violin plots of the top 5 DEGs of each cell subset. The vertical axis with different colours represented different cell subsets. The upper horizontal axis represented the number of each cell subset, and the lower horizontal axis represented the expression level of each gene. (C) Bubble Map of the top 5 DEGs of all cell subsets. The larger the bubble, the larger the percent of differentially expressed genes. Gene expression levels increase with darker blue hues. (D) A heatmap displaying each cell subset’s top ten DEGs. Gene expression levels increase with a deeper red colour. Gene expression levels decrease with increasing purple hue. (E) A heatmap displaying all cell subgroups’ top 20 DEGs. Gene expression levels increase with a deeper red colour. Gene expression levels decrease with increasing purple hue. (F) Bar plots of the cell subsets distribution of 4 patients. Different colours represented different cell subsets.

3. Results

3.1 DEGs of cell subsets

As shown in Fig. 1, we began with a cell subset analysis of the GSE117570 dataset using the Seurat package, which generated 14 cell subsets (Fig. 2A). Differentially expressed genes (DEGs) were then identified for each cell subset. We extracted the top 5 DEGs of each cell subset and visualized them with violin plots (Fig. 2B) and a BubbleMap (Fig. 2C), which showed that the top 5 DEGs were specifically and highly expressed in their respective cell subsets. In the Seurat workflow, the data are pre-processed, normalized, and scaled, cells are clustered according to their gene expression profiles with FindClusters, and marker genes are identified with FindAllMarkers. Furthermore, we visualized the top 10 (Fig. 2D) and top 20 (Fig. 2E) DEGs with heatmaps and found that they were likewise specifically and highly expressed in each cell subset. Bar plots were used to show the distribution of each cell subset in the patients (Fig. 2F). The proportion of each cell subset varied from patient to patient, which indicated heterogeneity among patients.

Darker blue in the data visualizations indicates higher gene expression levels, which may reflect both technical and biological factors. Technical factors include colour gradients and normalization; biological mechanisms include transcription factor activity, epigenetic changes, gene amplification, post-transcriptional regulation, and signalling pathways. The development of non-small cell lung cancer and resistance to existing treatments can be better understood by studying genes related to endoplasmic reticulum stress pathways. Key regulatory nodes and therapeutic targets can be identified by analyzing these pathways using gene expression analysis, functional assays, protein interaction studies, pathway analysis, cellular and molecular phenotyping, in vivo models, and clinical correlation.

3.2 Expression analysis of ERSRDEGs of cell subsets

After intersecting the ERSR genes with the DEGs of the cell subsets, we generated a Venn diagram (Fig. 3A) and obtained 111 genes in the intersection. We visualized the 111 genes with violin plots (Fig. 3B) and a BubbleMap (Fig. 3C) and found that they were mainly expressed in cell subsets 1, 5, 6, 7, 10, 11 and 13. These 111 genes shared between the ERSR genes and the DEGs are relevant to tumour biology and might serve as targets for cancer detection and treatment. Correlation analysis of the 111 genes showed that most of them were positively correlated, while some were negatively correlated (Fig. 3D) ($P < 0.05$).

Figure 3.

Distribution and correlation analysis of intersection genes from the GSE117570 dataset. (A) Venn diagram of ERSRDEGs and cell subsets DEGs. (B) Violin plots of intersection genes. The vertical axis with different colours represented different cell subsets. The lower horizontal axis showed every gene’s expression level, while the upper horizontal axis represented the total number of each cell subgroup. (C) Bubble Map of intersection genes of cell subsets. The larger the bubble, the larger the percent of differentially expressed genes. Darker purple hues indicate higher gene expression levels. (D) Gene intersection heatmap. Red and blue represented positive and negative connections, while the cross inside the circle showed no significant link.

Figure 4.

Enrichment analysis of high-score cell subsets and DEGs of high-score cell subsets from the GSE117570 dataset. (A) Diagram of 1064 high-score cells with AUC cutoff value as 0.1. (B) Bar plot of high-score cell subsets. (C) GO enrichment analysis of the DEGs of high-score cell subsets in Biological Process (BP). (D) GO enrichment analysis of the DEGs of high-score cell subsets in Cellular Component (CC). (E) GO enrichment analysis of the DEGs of high-score cell subsets in Molecular Function (MF). (F) KEGG pathway analysis of the DEGs of high-score cell subsets.

Figure 5.

Reanalysis of high-score cell subsets. (A) t-SNE was utilized to visualize the distribution of the ten cell subsets, and different colours represented different cell subsets. (B) Bubble Map of top 5 DEGs of each cell subset. The percentage of genes with differential expression increases with the size of the bubble. Gene expression levels increase with a richer red colour. (C) Violin plots of the top 5 DEGs of each cell subset. The vertical axis with different colours represented different cell subsets. The lower horizontal axis showed every gene’s expression level, while the upper horizontal axis represented the number of each cell subgroup. (D) Bubble Map of the ERSRDEGs. The larger the bubble, the larger the percent of differentially expressed genes. Darker green hues indicate higher gene expression levels. (E) Violin plots of ERSRDEGs. The vertical axis with different colours represented different cell subsets. The lower horizontal axis showed every gene’s expression level, while the upper horizontal axis represented the number of each cell subgroup.

3.3 Screening of high-score cell subsets

We used the AUCell package to analyze the GSE117570 dataset and select the cell subsets with high-score DEGs. The cutoff value for AUC was set as 0.1. According to the AUCell_calcAUC results, 1064 cells were classified as high-score (Fig. 4A). These 1064 cells were mainly distributed in cell subsets 1, 5, 7, and 13 (Fig. 4B). The clusterProfiler package was used to perform GO and KEGG functional enrichment analysis. The results of the GO enrichment analysis were visualized with BubbleMaps for three categories: molecular function (MF) (Fig. 4E), cellular component (CC) (Fig. 4D), and biological process (BP) (Fig. 4C). The high-score DEGs were enriched in terms including antigen processing and presentation, antigen processing and presentation of peptide antigen, antigen processing and presentation of peptide or polysaccharide antigen via MHC class II, MHC protein complex binding, MHC class II protein complex binding, vacuolar lumen, secretory granule lumen, tertiary granule, and cysteine-type peptidase activity. In the KEGG enrichment analysis, the DEGs were most enriched in the pathways related to rheumatoid arthritis and tuberculosis.

3.4 Reanalysis of high-score cell subsets

We used the criteria for screening cell subsets with high-score DEGs to reanalyze the cell subsets, which generated ten subsets (Fig. 5A). The top 5 DEGs were visualized with a BubbleMap (Fig. 5B) and violin plots (Fig. 5C), and all the DEGs are presented in Table 2. In addition, we visualized the ERSRDEGs in the ten selected cell subsets with a BubbleMap (Fig. 5D) and violin plots (Fig. 5E); the ERSRDEGs are presented in Table 5. Both the high-score DEGs and the ERSRDEGs were highly expressed in cell subset 7.

Table 2
Differentially expressed genes in high-score cell subsets

Gene symbol	p_val	avg_log2FC	pct.1	pct.2	p_val_adj
GPR183	4.63E-07	1.232619	0.955	0.567	0.001457
CAPG	2.11E-08	1.318706	0.921	0.595	6.63E-05
IFITM2	3.51E-11	1.924079	0.972	0.61	1.10E-07
CD55	5.59E-10	1.539309	0.958	0.733	1.76E-06
MT2A	5.94E-09	1.225833	0.925	0.692	1.87E-05
CD48	9.45E-09	1.514708	0.907	0.539	2.97E-05
SAMSN1	2.20E-08	1.123565	0.902	0.646	6.93E-05
C5AR1	3.71E-08	1.051542	0.864	0.558	0.000117
LINC01272	5.08E-08	1.324841	0.864	0.446	0.00016
BIRC3	4.92E-07	1.202303	0.879	0.408	0.001548
BID	5.41E-07	1.09266	0.78	0.561	0.001702
PAG1	8.23E-07	1.409966	0.762	0.327	0.002588
MYO1G	1.02E-05	1.101243	0.64	0.267	0.03201
THBS1	1.49E-07	2.364044	0.83	0.456	0.000469
ACSL1	4.80E-07	1.426615	0.81	0.578	0.001511
SLC2A3	6.14E-07	1.373168	0.837	0.613	0.001932
EREG	1.52E-06	2.19489	0.844	0.425	0.004783
APOC1	9.78E-18	2.611453	1	0.365	3.08E-14
ACP5	2.16E-16	1.805057	0.993	0.366	6.79E-13
FABP5	6.19E-16	1.990278	1	0.479	1.95E-12
IFI6	1.39E-15	1.79155	0.985	0.459	4.37E-12
GCHFR	1.11E-14	2.216214	0.942	0.232	3.48E-11
CD68	1.15E-14	1.657113	1	0.784	3.62E-11
FCGRT	4.19E-14	1.154592	0.985	0.774	1.32E-10
LGALS3	7.51E-14	1.866202	1	0.657	2.36E-10
LY6E	1.99E-13	1.717061	0.978	0.577	6.26E-10
ALDH2	5.83E-13	1.956402	0.993	0.538	1.83E-09
ARL6IP1	3.08E-12	1.251628	0.985	0.575	9.68E-09
CTSD	4.79E-12	1.704567	1	0.734	1.51E-08
S100A13	5.39E-12	1.367514	0.81	0.13	1.69E-08
CD81	9.16E-12	1.187754	0.912	0.337	2.88E-08
PDLIM1	1.13E-11	1.165191	0.664	0.077	3.55E-08
ALOX5AP	1.16E-11	1.625039	1	0.588	3.63E-08
STOM	1.23E-11	1.608843	0.912	0.278	3.85E-08
DYNLL1	1.64E-11	1.158149	1	0.761	5.17E-08
LMNA	1.78E-11	1.249141	0.964	0.457	5.61E-08
CTSC	4.02E-11	1.64876	0.985	0.679	1.27E-07
MGST3	4.60E-11	1.850716	0.956	0.408	1.45E-07
MS4A7	8.06E-11	1.167581	1	0.695	2.53E-07
HDDC2	1.64E-10	1.098528	0.861	0.181	5.16E-07
PLD3	1.93E-10	1.009478	0.81	0.282	6.08E-07
FBP1	2.46E-10	1.605717	0.964	0.422	7.74E-07
LRPAP1	7.06E-10	1.47797	0.964	0.432	2.22E-06
MS4A4A	7.22E-10	1.248829	0.956	0.393	2.27E-06
RETN	9.11E-10	1.71183	0.839	0.196	2.87E-06
TREM1	1.12E-09	1.195052	0.956	0.454	3.54E-06
MRC1	1.66E-09	1.338268	0.985	0.408	5.23E-06
CD59	5.06E-09	1.147143	0.832	0.301	1.59E-05
MCEMP1	5.52E-09	1.598322	0.964	0.275	1.74E-05
TGM2	1.24E-08	1.114192	0.839	0.257	3.89E-05
BSG	1.50E-08	1.581339	0.949	0.509	4.72E-05
CYB5A	1.70E-08	1.112015	0.839	0.225	5.36E-05
SERTAD1	3.46E-08	1.012572	0.847	0.469	0.000109
ANXA1	4.09E-08	1.621098	0.993	0.767	0.000129
PLBD1	1.15E-07	1.061536	0.847	0.363	0.000362
CD9	1.33E-07	1.145487	0.876	0.404	0.000419
MDH1	2.12E-07	1.041116	0.693	0.238	0.000668
PLAC8	5.07E-07	1.0424	0.861	0.381	0.001596
LTA4H	7.22E-07	1.082045	0.92	0.45	0.00227
C9orf16	8.91E-07	1.132596	0.949	0.64	0.002804
CD52	9.59E-07	1.41002	1	0.745	0.003018
JUN	1.02E-06	1.180005	0.956	0.561	0.003203
S100A9	1.78E-15	4.025089	1	0.604	5.59E-12
S100A8	5.95E-14	4.223924	0.984	0.398	1.87E-10
VCAN	1.79E-13	2.440669	0.96	0.35	5.62E-10
NOP10	9.42E-12	1.639488	0.992	0.725	2.96E-08
TPM4	5.20E-11	1.250641	0.895	0.581	1.64E-07
CTSL	1.01E-10	2.673414	0.944	0.461	3.18E-07
CXCL8	2.71E-10	2.330661	0.919	0.512	8.54E-07
MCEMP1	5.85E-10	1.793687	0.855	0.292	1.84E-06
ANPEP	8.45E-10	1.730163	0.758	0.169	2.66E-06
MIR4435-2HG	5.48E-09	2.014855	0.782	0.214	1.72E-05
THBD	1.12E-08	1.760273	0.871	0.389	3.53E-05
ZNF90	2.20E-08	1.481212	0.766	0.264	6.93E-05
STXBP2	5.09E-08	1.248022	0.879	0.582	0.00016
CXCL3	5.60E-08	2.273753	0.75	0.314	0.000176
FLNA	9.32E-08	1.185229	0.855	0.485	0.000293
SEC61G	1.10E-07	1.441876	0.855	0.624	0.000348
NDUFB9	1.11E-07	1.012016	0.855	0.554	0.000349
MIF	1.18E-07	1.865116	0.774	0.496	0.000372
CEBPB	1.31E-07	1.574994	0.879	0.509	0.000412
FCN1	3.34E-07	1.299782	0.903	0.41	0.001052
PET100	4.82E-07	1.01561	0.677	0.349	0.001515
RETN	6.01E-07	1.625516	0.694	0.216	0.001892
FNDC3B	1.75E-06	1.608596	0.75	0.261	0.0055
CXCL2	1.79E-06	1.805453	0.806	0.44	0.005623
ENO1	2.26E-06	1.361811	0.935	0.725	0.007104
BZW1	3.11E-06	1.143589	0.871	0.535	0.009786
PHLDA1	3.34E-06	1.887271	0.823	0.352	0.010502
KDELR2	3.52E-06	1.097423	0.79	0.437	0.011078
C1orf122	3.95E-06	1.203756	0.605	0.219	0.012416
LINC00152	3.98E-06	1.565179	0.726	0.351	0.012517
SPP1	4.87E-06	3.301312	0.71	0.157	0.015334
PPIF	4.94E-06	1.586471	0.839	0.391	0.015533
LINC01272	5.92E-06	1.156673	0.879	0.473	0.018629
PKM	6.70E-06	1.241654	0.952	0.686	0.021065
NDUFA13	6.82E-06	1.091991	0.79	0.566	0.021465
NBEAL1	7.56E-06	1.338007	0.903	0.62	0.023786
MARCKSL1	9.74E-06	1.242786	0.54	0.158	0.030639
CCL20	1.05E-05	1.296372	0.573	0.245	0.032887
TMEM167A	1.24E-05	1.176835	0.815	0.556	0.038958
MAP3K8	1.25E-05	1.145569	0.79	0.518	0.039316
HSPA6	1.69E-09	2.096543	0.941	0.592	5.31E-06
RGS1	4.46E-09	1.571781	0.941	0.526	1.40E-05
BAG3	7.80E-09	1.510879	0.907	0.669	2.45E-05
ATF3	4.62E-08	1.388666	0.924	0.627	0.000145
NR4A1	4.82E-08	1.281168	0.915	0.658	0.000152
PAK1	1.15E-07	1.091995	0.72	0.278	0.000361
DNAJA4	2.01E-07	1.005721	0.822	0.492	0.000632
FOSB	4.10E-07	1.053759	0.949	0.684	0.001288
ZFAND2A	5.31E-07	1.213106	0.864	0.642	0.001671
PMAIP1	1.01E-06	1.094872	0.78	0.425	0.003176
JUN	1.28E-06	1.106467	0.856	0.575	0.004034
SERPINH1	6.13E-06	1.062973	0.61	0.265	0.019274
CD69	6.92E-06	1.092918	0.576	0.175	0.021766
TXN	1.14E-09	2.589908	0.984	0.693	3.60E-06
FNBP1	1.96E-09	1.416149	0.921	0.371	6.15E-06
DUSP4	2.09E-09	2.12398	0.889	0.288	6.58E-06
BIRC3	7.75E-08	2.821854	0.873	0.458	0.000244
CPNE3	2.63E-07	1.69076	0.651	0.192	0.000827
RAB11FIP1	3.47E-07	1.006586	0.825	0.315	0.001093
ID2	2.76E-06	1.92887	0.952	0.676	0.008694
RP11-1143G9.4	1.27E-13	3.235943	1	0.502	3.98E-10
CFD	3.36E-13	2.691383	0.974	0.499	1.06E-09
PLAC8	2.85E-12	2.695592	0.921	0.413	8.97E-09
MCEMP1	2.98E-10	1.649506	0.947	0.323	9.39E-07
CSTA	3.57E-10	1.544119	0.974	0.747	1.12E-06
ALDH2	4.93E-10	1.465781	0.947	0.571	1.55E-06
EGR1	1.93E-08	1.483209	0.842	0.359	6.06E-05
RETN	3.83E-08	1.933791	0.868	0.24	0.00012
CTSA	6.28E-08	1.368507	0.895	0.453	0.000198
PIM1	7.17E-08	1.135482	0.632	0.151	0.000226
TXNIP	7.25E-08	1.422333	1	0.662	0.000228
LYAR	2.81E-07	1.283747	0.763	0.255	0.000885
JUN	3.10E-07	1.261815	0.895	0.589	0.000974
LRPAP1	4.37E-07	1.329878	0.921	0.47	0.001375
SLPI	6.67E-07	1.072981	0.553	0.102	0.0021
GAA	1.21E-06	1.172071	0.763	0.353	0.003808
TMEM173	1.54E-06	1.23804	0.684	0.224	0.004857
UGP2	2.16E-06	1.039598	0.895	0.443	0.006782
FAM46A	1.02E-05	1.214129	0.579	0.325	0.032181
TSPAN13	2.65E-13	2.965592	0.87	0.049	8.35E-10
SLC7A5	8.81E-13	2.140355	0.957	0.298	2.77E-09
UGCG	1.08E-11	2.248421	0.913	0.22	3.41E-08
RAB11FIP1	1.46E-11	2.001587	0.913	0.328	4.60E-08
SEC61B	3.90E-11	2.423084	1	0.749	1.23E-07
GPR183	1.22E-10	2.80076	1	0.661	3.83E-07
LPIN1	2.08E-10	2.093885	0.783	0.133	6.54E-07
HERPUD1	8.52E-10	2.414289	1	0.722	2.68E-06
NR4A3	4.85E-09	2.039546	0.957	0.549	1.53E-05
SRSF7	1.28E-08	1.6215	1	0.568	4.02E-05
PPP1R14B	1.53E-08	2.003566	0.652	0.079	4.81E-05
EZR	1.58E-08	1.915236	0.957	0.635	4.98E-05
PLAC8	2.33E-08	1.715435	0.957	0.417	7.33E-05
IRF7	2.99E-08	2.580387	0.913	0.389	9.42E-05
IRF2BP2	3.15E-08	1.895549	0.913	0.447	9.90E-05
GNA15	5.72E-08	2.211835	0.913	0.485	0.00018
FYTTD1	6.25E-08	1.579253	0.826	0.239	0.000197
MALT1	1.21E-07	1.508184	0.87	0.214	0.00038
PLP2	2.79E-07	1.725175	0.87	0.481	0.000878
HIST1H1C	2.97E-07	2.124209	0.696	0.156	0.000935
HSP90B1	3.45E-07	2.035608	1	0.698	0.001087
DDX18	6.61E-07	1.571776	0.826	0.392	0.002081
SOX4	7.90E-07	3.739845	0.739	0.173	0.002486
ETV3	9.01E-07	1.399639	0.696	0.262	0.002836
PMEPA1	1.49E-06	1.382095	0.522	0.069	0.004674
ANKRD11	3.12E-06	1.773443	0.783	0.379	0.009816
CXCR4	5.07E-06	1.420921	1	0.783	0.015961
SPCS1	5.78E-06	1.294576	1	0.624	0.018182
NOP58	6.43E-06	1.176173	0.696	0.285	0.020216
OFD1	1.35E-05	1.138411	0.435	0.172	0.042592
C9orf142	1.38E-05	1.537247	0.696	0.278	0.043524
C12orf57	1.56E-05	1.265563	0.739	0.292	0.049144

Table 3

GO enrichment analysis

Category	GO	Description	$P$ value	adjust $P$ value	$Q$ value
GO biological processes	GO:0032496	Response to lipopolysaccharide	5.78E-09	8.50E-06	5.42E-06
GO biological processes	GO:0002237	Response to molecule of bacterial origin	1.04E-08	8.50E-06	5.42E-06
GO biological processes	GO:0001819	Positive regulation of cytokine production	1.29E-08	8.50E-06	5.42E-06
GO biological processes	GO:0032103	Positive regulation of response to external stimulus	5.48E-08	2.71E-05	1.73E-05
GO biological processes	GO:0048661	Positive regulation of smooth muscle cell proliferation	7.32E-07	0.000244736	0.000156001
GO cellular components	GO:0030667	Secretory granule membrane	3.60E-06	0.00060485	0.000420666
GO cellular components	GO:0009897	External side of plasma membrane	3.23E-05	0.001878416	0.001306417
GO cellular components	GO:0062023	Collagen-containing extracellular matrix	3.46E-05	0.001878416	0.001306417
GO cellular components	GO:0005793	Endoplasmic reticulum-Golgi intermediate compartment	4.47E-05	0.001878416	0.001306417
GO cellular components	GO:0042581	Specific granule	0.000111361	0.003501638	0.00243535
GO molecular function	GO:0005504	Fatty acid binding	6.78E-06	0.001326	0.000972
GO molecular function	GO:0035259	Glucocorticoid receptor binding	1.06E-05	0.001326	0.000972
GO molecular function	GO:0005518	Collagen binding	6.64E-05	0.005176	0.003792
GO molecular function	GO:0033293	Monocarboxylic acid binding	8.28E-05	0.005176	0.003792
GO molecular function	GO:0061629	RNA polymerase II-specific DNA-binding transcription factor binding	0.000352	0.017478	0.012805

Figure 6.

Differential expression of ERSRDEGs between normal and NSCLC samples. (A) Volcano plot of ERSRDEGs; 1028 were upregulated and 1034 were downregulated. (B) Heatmap of the 58 ERSRDEGs. (C) GO enrichment analysis of ERSRDEGs in BP. (D) GO enrichment analysis of ERSRDEGs in CC. (E) GO enrichment analysis of ERSRDEGs in MF. (F) KEGG pathway analysis of ERSRDEGs.

Figure 7.

Risk score system established using TCGA data. (A) Forest plot of the univariate Cox regression analysis. (B) LASSO coefficient profiles of the four ERSRDEGs associated with survival. (C) Cross-validation used to determine the number of variables in the LASSO regression model. (D) Kaplan-Meier survival curves for the risk groups in the training group. (E) ROC curve of the risk score model in the training group. (F) Kaplan-Meier survival curves for the risk groups in the testing group. (G) ROC curve of the risk score model in the testing group.

Figure 8.

Nomogram created and validated using the TCGA dataset. (A) Forest plot of the univariate Cox regression analysis. (B) Forest plot of the multivariate Cox regression analysis. (C) Calibration curve of the nomogram. (D) Nomogram based on TCGA data. (E–G) ROC curves of the nomogram for 3-, 5-, and 10-year OS. (H–J) DCA of the nomogram for 3-, 5-, and 10-year OS.

3.5 Identification of ERSRDEGs

Analysis of the TCGA data with the R package limma identified 2062 ERSRDEGs with expression differences between tumour and normal tissues (Fig. 6A). In this workflow, RNA-seq expression data from tumour and normal samples are obtained and normalized, a design matrix is constructed, a linear model is fitted, empirical Bayes moderation is applied, and differentially expressed genes are identified. Of the 2062 ERSRDEGs, 1028 were upregulated and 1034 were downregulated. We then intersected the 2062 ERSRDEGs with the DEGs from the reanalysis of high-score cell subsets and obtained 58 genes, which are shown in a heatmap (Fig. 6B). GO enrichment analysis of the 58 genes yielded 15 enriched GO terms (Table 3). In the BP category, the most enriched terms included response to lipopolysaccharide, response to molecule of bacterial origin, and positive regulation of smooth muscle cell proliferation (Fig. 6C). In the CC category, the genes were mainly associated with specific granule, secretory granule membrane, and collagen-containing extracellular matrix.
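The intersection and the Up/Down labelling can be illustrated with a short Python sketch. It is not the authors' R workflow. The logFC values are copied from a few rows of Table 5, `high_score_degs` is a small stand-in set of gene symbols that appear in Table 2, and the helper name `classify_state` is hypothetical. Labelling by the sign of logFC is an assumption, although it is consistent with the State column of Table 5.

```python
def classify_state(log_fc):
    """Label a gene 'Up' when its tumour-vs-normal logFC is positive, else 'Down'.

    Assumption: the State column simply follows the sign of logFC.
    """
    return "Up" if log_fc > 0 else "Down"


# logFC values copied from a few rows of Table 5
ers_log_fc = {
    "RETN": 1.991083,
    "SPP1": -3.74258,
    "FOSB": 3.091937,
    "SLC7A5": -2.24199,
    "C5AR1": 1.682673,
}

# Stand-in for the DEGs from the high-score cell subsets (symbols from Table 2)
high_score_degs = {"RETN", "SPP1", "C5AR1", "GPR183"}

# Genes present in both lists, analogous to the 58-gene intersection
shared = sorted(set(ers_log_fc) & high_score_degs)
states = {gene: classify_state(ers_log_fc[gene]) for gene in shared}
```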

Table 4
KEGG enrichment analysis

Category	KEGG	Description	$P$ value	adjust $P$ value	$Q$ value
KEGG	hsa04657	IL-17 signaling pathway	0.000151939	0.024310185	0.021431347
KEGG	hsa04668	TNF signaling pathway	0.000345093	0.024815489	0.021876813
KEGG	hsa04621	NOD-like receptor signaling pathway	0.00046529	0.024815489	0.021876813
KEGG	hsa04510	Focal adhesion	0.000742619	0.026871123	0.023689016
KEGG	hsa04210	Apoptosis	0.000839723	0.026871123	0.023689016

Table 5

Differentially expressed endoplasmic reticulum-related genes information

Gene symbol	logFC	$P$ value	adjust $P$ value	State
RETN	1.991083	4.33E-69	3.07E-66	Up
SPP1	$-$ 3.74258	4.61E-28	3.19E-26	Down
FOSB	3.091937	2.03E-27	1.34E-25	Up
NR4A3	1.855239	8.21E-20	3.19E-18	Up
CXCL2	2.02553	4.50E-18	1.62E-16	Up
NR4A1	2.31729	2.62E-17	8.75E-16	Up
FBP1	1.979536	3.08E-14	7.21E-13	Up
SLC7A5	$-$ 2.24199	3.33E-14	7.72E-13	Down
THBD	1.981007	7.36E-14	1.65E-12	Up
CD69	1.424426	1.68E-13	3.61E-12	Up
C5AR1	1.682673	4.68E-12	8.24E-11	Up
ALOX5AP	1.595083	3.70E-11	5.71E-10	Up
SLPI	2.10868	5.45E-11	8.21E-10	Up
MARCKSL1	$-$ 1.8124	5.25E-10	7.16E-09	Down
ALDH2	1.809134	7.41E-10	9.88E-09	Up
APOC1	1.551748	6.53E-09	7.66E-08	Up
MAP3K8	1.241514	7.13E-09	8.23E-08	Up
CD55	1.753784	1.77E-08	1.92E-07	Up
ACP5	1.595963	1.99E-08	2.13E-07	Up
CYB5A	1.489301	5.02E-08	5.02E-07	Up
CD68	0.668328	5.19E-08	5.17E-07	Up
ATF3	1.402443	9.90E-08	9.40E-07	Up
MIF	$-$ 1.50124	1.17E-07	1.10E-06	Down
PAK1	$-$ 1.24616	3.22E-06	2.35E-05	Down
TGM2	1.472747	4.56E-06	3.24E-05	Up
TXN	$-$ 1.3075	5.57E-06	3.90E-05	Down
TXNIP	1.557181	6.71E-06	4.61E-05	Up
PMAIP1	$-$ 1.00641	1.07E-05	6.98E-05	Down
SERPINH1	$-$ 1.23685	1.78E-05	0.000112	Down
CD59	1.310003	2.88E-05	0.000172	Up
BIRC3	1.157993	3.73E-05	0.000216	Up
LTA4H	1.169449	4.15E-05	0.000238	Up
SLC2A3	1.143354	4.61E-05	0.000262	Up
SEC61G	$-$ 1.03819	4.74E-05	0.000267	Down
EGR1	1.210433	6.93E-05	0.000374	Up
STOM	1.140679	9.08E-05	0.000476	Up
PPIF	$-$ 1.0095	0.000175	0.00086	Down
FABP5	0.939528	0.000178	0.00087	Up
FCGRT	1.003773	0.000348	0.001608	Up
PLP2	$-$ 1.00005	0.000591	0.002567	Down
ID2	0.832938	0.000592	0.002571	Up
IRF7	$-$ 0.82934	0.000815	0.003398	Down
JUN	0.954773	0.00172	0.00654	Up
VCAN	$-$ 0.94122	0.002731	0.009696	Down
LRPAP1	0.821484	0.002865	0.010148	Up
CTSL	0.826517	0.003404	0.011748	Up
KDELR2	$-$ 0.81301	0.003984	0.013455	Down
ACSL1	0.802969	0.004479	0.014819	Up
CD81	0.842776	0.004522	0.014942	Up
PDLIM1	0.781085	0.006382	0.020016	Up
ANPEP	0.64441	0.006732	0.020998	Up
FLNA	0.902272	0.006744	0.021026	Up
ANXA1	0.798033	0.011075	0.031947	Up
CEBPB	0.694077	0.011775	0.03354	Up
LMNA	0.705779	0.018041	0.047416	Up
S100A8	0.664589	0.020072	0.051608	Up
S100A13	$-$ 0.5489	0.035352	0.08215	Down
THBS1	0.63131	0.040865	0.092382	Up

Furthermore, in the MF category, the enriched GO terms mainly included fatty acid binding and RNA polymerase II-specific DNA-binding transcription factor binding. According to the KEGG enrichment analysis, the TNF and IL-17 signalling pathways were the most highly enriched (Table 4). These pathways are linked to inflammatory responses, infections, cancer, and autoimmune illnesses such as psoriasis and rheumatoid arthritis.

3.6 Establishment of the risk score model

Four genes were selected by univariate Cox regression for calculating the risk score (Fig. 7A), and the risk score model was developed using LASSO-Cox analysis (Fig. 7B and C). In this procedure, genes associated with survival are first screened by univariate Cox regression, the relevant genes are then selected by LASSO with cross-validation, and the selected genes are used to fit the final model. The formula is: Risk score $= (-0.051) \times$ LTA4H expression $+\ 0.050 \times$ CTSL expression $+\ (-0.063) \times$ FABP5 expression $+\ 0.057 \times$ C5AR1 expression. The GSE37745 dataset served as the validation dataset. The median risk score was used as the threshold for classifying patients into high- and low-risk groups. OS differed significantly between the high- and low-risk groups ( $P<$ 0.05) in both the training (Fig. 7D) and testing (Fig. 7F) datasets. In addition, the model’s performance was assessed using ROC curves (Fig. 7E and G). The AUC ( $>$ 0.7) indicated that the model can predict patients’ prognosis.
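The risk score formula and the median split can be written out as a short Python sketch. The coefficients are those reported above. The patient expression values are invented for illustration, the function names are hypothetical, and assigning a score equal to the median to the low-risk group is an assumption, since the text does not say how ties are handled.

```python
from statistics import median

# Coefficients as reported in the risk score formula
COEFFICIENTS = {"LTA4H": -0.051, "CTSL": 0.050, "FABP5": -0.063, "C5AR1": 0.057}


def risk_score(expression):
    """Weighted sum of the four genes' expression values."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def split_by_median(scores):
    """Label each patient 'high' or 'low' relative to the median risk score.

    Assumption: a score exactly equal to the median is labelled 'low'.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Toy expression values (hypothetical patients)
toy_expression = {
    "P1": {"LTA4H": 10.0, "CTSL": 20.0, "FABP5": 10.0, "C5AR1": 20.0},
    "P2": {"LTA4H": 20.0, "CTSL": 10.0, "FABP5": 20.0, "C5AR1": 10.0},
    "P3": {"LTA4H": 10.0, "CTSL": 10.0, "FABP5": 10.0, "C5AR1": 10.0},
}

toy_scores = {pid: risk_score(expr) for pid, expr in toy_expression.items()}
toy_groups = split_by_median(toy_scores)
```

Because the coefficients of LTA4H and FABP5 are negative, higher expression of those two genes lowers the score, whereas higher CTSL or C5AR1 expression raises it.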

3.7 Construction and validation of the nomogram

We used univariate (Fig. 8A) and multivariate (Fig. 8B) Cox regression to examine whether the four-gene risk score is an independent prognostic indicator for NSCLC. Univariate Cox regression is a simple way to assess how each variable affects survival, but it may overstate effects because it does not account for other factors. In contrast, multivariate Cox regression analyzes several variables simultaneously and thereby adjusts for confounding. The results showed a relationship between the outcome of NSCLC patients and the risk score, N stage, and T stage (Table 6). A nomogram was generated using the risk score, N stage, and T stage to estimate the patients’ 3-, 5-, and 10-year OS (Fig. 8D). The ROC curves (Fig. 8E–G) and the DCA (Fig. 8H–J) indicated that the nomogram could predict patients’ 3-, 5-, and 10-year OS. The calibration curves also supported the nomogram’s prognostic value in NSCLC patients (Fig. 8C). A calibration curve compares predicted probabilities with observed outcomes: a well-calibrated model lies close to the 45-degree line, and deviations from that line quantify miscalibration. Calibration curves also help reveal bias during internal and external validation and guide model adjustment.
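The idea behind a calibration curve can be sketched in a few lines of Python: patients are grouped by predicted probability, and the mean prediction in each group is compared with the observed event rate. This is a deliberately simplified illustration that ignores censoring, whereas calibration of a survival nomogram normally uses Kaplan-Meier estimates within each group. The function name `calibration_table` and the toy data are hypothetical.

```python
def calibration_table(predicted, observed, n_bins=2):
    """Compare mean predicted probability with observed event rate per bin.

    Returns one (mean_predicted, observed_rate, deviation) tuple per bin, where
    deviation = observed_rate - mean_predicted (0 on the 45-degree line).
    Simplification: outcomes are treated as uncensored 0/1 values.
    """
    if n_bins < 1 or len(predicted) != len(observed) or len(predicted) < n_bins:
        raise ValueError("need equal-length inputs with at least n_bins patients")
    pairs = sorted(zip(predicted, observed))  # order patients by prediction
    size = len(pairs) // n_bins
    rows = []
    for b in range(n_bins):
        start = b * size
        end = (b + 1) * size if b < n_bins - 1 else len(pairs)
        chunk = pairs[start:end]
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        obs_rate = sum(o for _, o in chunk) / len(chunk)
        rows.append((mean_pred, obs_rate, obs_rate - mean_pred))
    return rows


# Toy predictions and outcomes (hypothetical)
toy_rows = calibration_table(
    [0.1, 0.2, 0.3, 0.6, 0.8, 0.7],
    [0, 0, 1, 1, 1, 0],
)
```

Plotting the first two columns of each row against each other gives the calibration curve; the third column is the vertical distance from the 45-degree line.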

Table 6
Univariate and multivariate Cox analysis

Variable	HR	HR.95L	HR.95H	$p$ value
Univariate Cox analysis
riskScore	2.79739630234182	1.82260987922751	4.29352773818629	2.5247756918211e-06
Age	1.20114072582545	0.951347887733755	1.51652099283398	0.123393900784911
pM	2.07296327167352	1.26742805118518	3.39046995345343	0.0036834320684122
pN	1.89645797854317	1.40481216616489	2.5601663702832	2.91505575845041e-05
pT	1.73621485975339	1.31161252326478	2.2982717729205	0.000115406873078976
Gender	1.17187770682912	0.920845510177068	1.49134392749444	0.197217869982789
Stage	1.90523120702467	1.48597992373633	2.44276917489819	3.7052252200113e-07
Multivariate Cox analysis
riskScore	2.81394300844018	1.83203902158835	4.32211058904429	2.30154418167587e-06
Age	1.28782301161155	1.01670419649674	1.63123956304194	0.0359652532586794
pM	1.78211299535064	0.962269861272801	3.30045328864067	0.0661148736107527
pN	1.7620625070016	1.07104971934632	2.89889836344451	0.0257348980965454
pT	1.52425132301951	1.07252804169453	2.16622969787902	0.0187530404289657
Gender	1.18842406983797	0.931590032934003	1.51606577983891	0.164662336414826
Stage	1.09181646490037	0.659080889364484	1.80867509931441	0.733031480640883
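The columns of Table 6 are linked by the usual Wald relationship for Cox models: the 95% confidence interval is symmetric around log(HR), so the standard error, z statistic, and two-sided p value can be recovered from HR and its interval. The sketch below is a consistency check on the reported multivariate riskScore row, not the authors' code. It assumes a normal approximation with a critical value of 1.96, and the function name `wald_from_ci` is hypothetical.

```python
import math


def wald_from_ci(hr, ci_low, ci_high, z_crit=1.96):
    """Recover SE of log(HR), the Wald z statistic, and the two-sided p value.

    Assumption: the interval was built as exp(log(HR) +/- z_crit * SE).
    """
    log_hr = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_hr / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p


# Multivariate riskScore row of Table 6
se, z, p = wald_from_ci(2.81394300844018, 1.83203902158835, 4.32211058904429)
```

For this row, the recovered p value is of the order of 2.3e-06, in line with the value reported in the table.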

3.8 The relationship between the 4 chosen genes’ expression and immune infiltration

We employed ssGSEA and CIBERSORT to evaluate each sample’s immune cell enrichment levels. According to CIBERSORT profiling, the proportion of gamma delta T cells was the only variable showing a significant difference ( $P<$ 0.05) between the high- and low-risk groups (Fig. 9A). Gamma delta ( $\gamma\delta$ ) T lymphocytes can influence NSCLC by producing cytokines, presenting antigens, and changing the tumour microenvironment. Prolonged activation can lead to exhaustion, which lowers their cytotoxic potential. Understanding these processes can benefit the development of targeted therapies.

Furthermore, as illustrated in Fig. 9B, the ssGSEA data revealed significant differences ( $P<$ 0.05) in the infiltration scores of various immune cells between the two groups. The CIBERSORT results also indicated that, among the 22 assessed immune cell types, M0 macrophages were significantly correlated with other cells ( $P<$ 0.05) (Fig. 9C). In contrast, the ssGSEA data in Fig. 9D showed substantial correlations ( $P<$ 0.05) among all 28 immune cell types. Figure 9E displays a heatmap of the correlation ( $P<$ 0.05) between the infiltration level of immune cells determined by CIBERSORT and the four selected genes (C5AR1, CTSL, FABP5, and LTA4H). We used Spearman correlation to explore the association between the expression levels of these four genes and the level of immune cell infiltration determined by ssGSEA. Only C5AR1 showed statistical significance ( $P<$ 0.05) (Fig. 9F).
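Spearman correlation, as used above for gene expression versus infiltration scores, is the Pearson correlation computed on ranks. The following pure-Python sketch shows the calculation on invented numbers. The function names and the toy data are hypothetical, and in practice a statistics library would also supply the P value.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data (hypothetical): one gene's expression vs one cell type's score
toy_expression = [1.0, 2.5, 3.0, 4.2, 5.1]
toy_infiltration = [0.10, 0.30, 0.20, 0.50, 0.40]
rho = spearman(toy_expression, toy_infiltration)
```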

Figure 9.

Analysis of immune cell infiltration using ssGSEA and CIBERSORT. (A) Box plot of the distribution of 22 immune cell types (CIBERSORT analysis). Purple represents the high-risk group, and pink represents the low-risk group. (B) Box plot of the distribution of 28 immune cell types (ssGSEA analysis). (C) Correlation analysis of the 22 immune cell types in NSCLC. Purple represents positive correlation; green represents negative correlation; a cross in the circle represents no significant correlation. (D) Correlation analysis of the 28 immune cell types in NSCLC. Purple represents positive correlation; green represents negative correlation; a cross in the circle represents no significant correlation. (E) Heatmap of the correlation between the four selected genes and the 22 immune cell types. Pink represents positive correlation; purple represents negative correlation. (F) Heatmap of the correlation between the four selected genes and the 28 immune cell types. Pink represents positive correlation; purple represents negative correlation. *: $P<$ 0.05; **: $P<$ 0.01; ***: $P<$ 0.0001; ****: $P<$ 0.000001.

Figure 10.

Drug sensitivity in high- and low-risk populations as determined by TCGA data. (A–L) Distribution plots of IC50 of 12 drugs with significant differences between high and low-risk groups. The blue colour represented high-risk groups, and the red colour represented low-risk groups.

Figure 11.

Correlation analysis of the four key genes and response to immunotherapy based on the GSE117570 dataset. (A) Heatmap of the correlation between the four key genes and 69 immune checkpoint genes. Blue represents a negative association, and red represents a positive one. (B) Bar plot of C5AR1 expression level and response to immunotherapy. (C) Bar plot of CTSL expression level and response to immunotherapy. (D) Bar plot of FABP5 expression level and response to immunotherapy. (E) Bar plot of LTA4H expression level and response to immunotherapy. *: $P<$ 0.05; **: $P<$ 0.01; ***: $P<$ 0.0001; ****: $P<$ 0.000001.

3.9 Drug sensitivity of high and low-risk groups

The IC50 values of 138 drugs were compared between the risk groups using the pRRophetic package, with $P<$ 0.01 as the threshold. Twelve drugs were ultimately selected. The high-risk group appeared to be more sensitive to these 12 drugs, suggesting that high-risk patients are more likely to benefit from them (Fig. 10).

3.10 Analysis of the correlation between critical genes and reaction to immunotherapy

The correlation analysis showed a significant association ( $P<$ 0.05) between the majority of the 69 immune checkpoint genes and all four critical genes (C5AR1, CTSL, FABP5, and LTA4H) in the TCGA data. The expression level of the critical genes might therefore influence the expression of immune checkpoint genes. To validate this, we used the GSE61676 dataset and found that the expression levels of all four genes were correlated with immune checkpoint gene expression (Fig. 11B–E). As a result, the response to immunotherapy may be predicted from the expression levels of the four genes.

4. Discussion

Owing to the ageing and expanding global population, cancer ranks among the leading causes of mortality [42], and lung cancer remains the leading cause of cancer-related death [43]. As more targeted drugs and immunotherapies have been developed, lung cancer patients have more treatment options, and their OS and quality of life have improved somewhat [44, 45]. However, the long-term survival of NSCLC patients is still poor [5], which calls for deeper investigation of tumorigenesis and tumour progression. Although long-term use of targeted drugs and immunotherapies may cause toxicities and autoimmune disorders, these treatments can improve quality of life and survival for lung cancer patients. Long-term survivorship therefore also depends on patient education, monitoring, supportive care, and research into novel treatments.

Tumours contain many cell types, including tumour, stromal, immune, and fibroblast cells, which interact closely and influence one another. Understanding these relationships requires a multifaceted approach [46], including functional assays to detect the impact of ER stress, multiplex imaging, single-cell RNA-seq, and spatial transcriptomics. Personalized medicine built on such data must also address ethical considerations such as equity of access, privacy and data security, informed consent, genetic discrimination, clinical validity, psychological impact, patient autonomy, and ethical research practice. Examining the cellular composition of tumour tissue and the interactions among its cells is therefore essential for understanding tumour progression and treatment efficacy. It helps identify key players in the Tumour MicroEnvironment (TME) and exposes novel therapeutic targets and ways to modulate immune responses. To predict prognosis and guide therapy, we identified ERSRDEGs between tumour and normal samples and built a 4-gene risk model. The model stratifies patients into high- and low-risk groups whose outcomes differ significantly, which may improve patient classification and treatment planning. Gene signatures are now useful for prognosis prediction and clinical decision support. Several ER stress-related prognostic risk models have been reported for hepatocellular carcinoma [47], oesophageal cancer [48], and diffuse glioma [49], but there has been little such research for NSCLC. The risk signature constructed in the present study showed considerable prognostic accuracy and feasibility.

Furthermore, a nomogram combining the risk score with clinical characteristics was developed to predict 3-, 5-, and 10-year survival in NSCLC patients. We also examined immune infiltration in the different risk groups, providing guidance for immunotherapy in NSCLC, and assessed drug sensitivity to identify prospective medications for patients in each risk category. For NSCLC patients, our signature may serve as a novel biomarker for individualized care and optimal follow-up.

Four ERSRDEGs – FABP5, C5AR1, CTSL, and LTA4H – were found in our study using LASSO Cox regression analysis. FABP5 may have a role in lipid homeostasis and tumour immunology, as evidenced by prior studies that have connected it to various tumours’ growth, metastasis, and proliferation [50, 51, 52]. Yang et al. found that FABP5 regulated the maturation of natural killer cells in the lung, which might control lung cancer metastasis [53]. Furthermore, FABP5 has been suggested as a possible therapeutic target for hepatocellular carcinoma as well as breast cancer [54, 55]. C5AR1 was found to be highly expressed in NSCLC patients [56]. C5AR1 was involved in the non-inflammatory tumour microenvironment and was reported to be related to immune evasion in gastric cancer [57], which might explain our finding regarding the reaction to immunotherapy. According to reports, glioma invasion and proliferation are inhibited by the downregulation of CTSL, which may present a unique approach to glioma treatment [58].

Similarly, our research suggests that CTSL may affect the prognosis of NSCLC. According to a recent study, LTA4H overexpression in laryngeal squamous cell carcinoma affects the tumour’s growth, migration, and metastasis, and LTA4H is also considered a promising target for cancer therapy [59]. In our multivariate Cox regression, the HRs of FABP5 and LTA4H were both less than 1, meaning that their high expression was linked to better survival; they may therefore be potential therapeutic targets in NSCLC. Their precise roles in NSCLC still need to be determined by further research.

Immunotherapy tailored to molecular classification has been effective in treating NSCLC. Individualized immunotherapy, which bases care on the tumour’s molecular profile, is a significant development: its potential advantages include higher response rates, longer overall survival, effective combination regimens, and lower toxicity. This investigation examined the differences in 22 tumour-infiltrating immune cell types between the risk groups. The greater abundance of gamma delta T cells in the high-risk group may have contributed to its poorer OS, although findings from previous research are contradictory [60]. Moreover, macrophages were present in the high-risk group and may be associated with unfavourable OS. As a result, a higher level of immune cell infiltration in the tumour might indicate a poorer outcome.

Moreover, tumour-infiltrating immune cells could also predict the response to immunotherapy. Our risk profile may indirectly indicate the efficacy of immunotherapy. However, it is yet unclear how ER stress affects the anti-tumor immune response in NSCLC.

We also investigated potentially sensitive anti-NSCLC drugs. Notably, we identified numerous small-molecule drugs to which the high-risk group was sensitive. Some of these drugs, such as DMOG, KIN001.135, Nutlin.3a, and PD.0332991, are still under investigation for druggability. Several others have already been approved for the treatment of certain tumours. AG.014699 has been authorized for indications including prostate cancer [61] and recurrent ovarian epithelial, fallopian tube, and primary peritoneal cancer [62]. Lenalidomide has been authorized for some types of lymphoma, such as marginal zone lymphoma, follicular lymphoma, and diffuse large B-cell lymphoma [63, 64]. Sunitinib has been authorized to treat kidney cancer [65], pancreatic neuroendocrine tumours [66], and gastrointestinal stromal tumours. Trials of other medications are ongoing. Verifying the synergistic effect of combination therapy in NSCLC requires further research and clinical studies.

Our present study has several limitations. First, we were unable to compile data on more detailed clinical factors, such as tumour biomarkers and treatment modalities. Second, large multicentre cohorts and external validation of the signature in additional databases will be essential. In addition, because experimental tools for detecting and monitoring ER stress responses are not yet widely used in cancer research [67], more functional experiments are needed to clarify the interaction between the tumour immune microenvironment and ER stress.

5. Conclusion

In summary, a well-validated nomogram based on the four ERS-related genes could predict the prognosis of NSCLC patients more accurately. Our findings also provide useful references for NSCLC immunotherapy and drug treatment. In future research, we will use clinical samples and in vitro molecular investigations to further explore the signature’s predictive value and the underlying mechanism.

Funding

There is no specific funding for this research.

Consent for publication

All authors have checked the outcomes, endorsed the final draft of the manuscript, and consented to its publication.

Data availability

The findings of this study can be supported by experimental data provided by the corresponding author upon request.

Supplementary data

The supplementary files are available to download from https://dx-doi-org.web.bisu.edu.cn/10.3233/THC-241059.

Footnotes

Acknowledgments

The authors sincerely thank the technicians who contributed to this research.

Conflict of interest

The authors declare that they have no competing interests related to this study.

References

1. Siegel, Miller, Wagle, Jemal. Cancer statistics, 2023. CA Cancer J Clin. 2023 Jan; 73(1): 17-48.

2. Zappa, Mousa. Non-small cell lung cancer: Current treatment and future advances. Transl Lung Cancer Res. 2016 Jun; 5(3): 288-300.

3. Raman, Yang CFJ, Deng, D’Amico. Surgical treatment for early-stage non-small cell lung cancer. J Thorac Dis. 2018 Apr; 10(Suppl 7): S898-904.

4. Šutić, Vukić, Baranašić, Försti, Džubur, Samaržija, et al. Diagnostic, Predictive, and Prognostic Biomarkers in Non-Small Cell Lung Cancer (NSCLC) Management. J Pers Med. 2021 Oct 27; 11(11): 1102.

5. Lung Cancer Survival Rates | 5-Year Survival Rates for Lung Cancer [Internet]. [cited 2023 Mar 22]. Available from: https://www.cancer.org/cancer/lung-cancer/detection-diagnosis-staging/survival-rates.html.

6. Dagogo-Jack, Shaw. Tumour heterogeneity and resistance to cancer therapies. Nat Rev Clin Oncol. 2018 Feb; 15(2): 81-94.

7. Negrini, Gorgoulis, Halazonetis. Genomic instability – an evolving hallmark of cancer. Nat Rev Mol Cell Biol. 2010 Mar; 11(3): 220-8.

8. Hanahan, Weinberg. Hallmarks of cancer: The next generation. Cell. 2011 Mar 4; 144(5): 646-74.

9. Eberwine, Sul, Bartfai, Kim. The promise of single-cell sequencing. Nat Methods. 2014 Jan; 11(1): 25-7.

10. Schwarz, Blower. The endoplasmic reticulum: Structure, function, and response to cellular signaling. Cell Mol Life Sci. 2016; 73: 79-94.

11. Corazzari, Gagliardi, Fimia, Piacentini. Endoplasmic reticulum stress, unfolded protein response, and cancer cell fate. Front Oncol. 2017; 7: 78.

12. da Silva, Valentão, Andrade, Pereira. Endoplasmic reticulum stress signaling in cancer and neurodegenerative disorders: Tools and strategies to understand its complexity. Pharmacol Res. 2020 May; 155: 104702.

13. Joo, Liao, Collins, Grissom, Jetten. Farnesol-induced apoptosis in human lung carcinoma cells is coupled to the endoplasmic reticulum stress response. Cancer Res. 2007 Aug 15; 67(16): 7929-36.

14. Thirugnanam, Galety, Pradhan, Agarwal, Shobanadevi, Almufti, Lakshmana Kumar. PIRAP: Medical Cancer Rehabilitation Healthcare Center Data Maintenance Based on IoT-Based Deep Federated Collaborative Learning. International Journal of Cooperative Information Systems. 2023 June; 33(10): 2350005.

15. Song, Hawkins, Wudel, Chou, Forbes, Pullikuth, et al. Dissecting intratumoral myeloid cell plasticity by single-cell RNA-seq. Cancer Med. 2019 Jun; 8(6): 3072-85.

16. Lohr, Hellwig, Edlund, Mattsson JSM, Botling, Schmidt, et al. Identification of sample annotation errors in gene expression datasets. Arch Toxicol. 2015 Dec; 89(12): 2265-72.

17. Baty, Joerger, Früh, Klingbiel, Zappa, Brutsche. 24 h-gene variation effect of combined bevacizumab/erlotinib in advanced non-squamous non-small cell lung cancer using exon array blood profiling. J Transl Med. 2017 Mar 30; 15: 66.

18. Barrett, Wilhite, Ledoux, Evangelista, Kim, Tomashevsky, et al. NCBI GEO: Archive for functional genomics data sets-update. Nucleic Acids Res. 2013 Jan; 41(Database issue): D991-995.

19. Chicco. geneExpressionFromGEO: An R Package to Facilitate Data Reading from Gene Expression Omnibus (GEO). Methods Mol Biol. 2022; 2401: 187-194.

20. Clough, Barrett. The Gene Expression Omnibus Database. Methods Mol Biol. 2016; 1418: 93-110.

21. Goldman, Craft, Hastie, Repečka, McDade, Kamath, et al. Visualizing and interpreting cancer genomics data via the Xena platform. Nat Biotechnol. 2020 Jun; 38(6): 675-8.

22. Gautier, Cope, Bolstad, Irizarry. Affy-analysis of Affymetrix GeneChip data at the probe level. Bioinforma Oxf Engl. 2004 Feb 12; 20(3): 307-15.

23. Hao, Andersen-Nissen, Mauck, Zheng, Butler, et al. Integrated analysis of multimodal single-cell data. Cell. 2021 Jun 24; 184(13): 3573-3587.e29.

24. Stelzer, Rosen, Plaschkes, Zimmerman, Twik, Fishilevich, et al. The GeneCards Suite: From gene data mining to disease genome sequence analyses. Curr Protoc Bioinforma. 2016 Jun 20; 54: 1.30.1-1.30.33.

25. Van de Sande, Flerin, Davie, De Waegeneer, Hulselmans, Aibar, et al. A scalable SCENIC workflow for single-cell gene regulatory network analysis. Nat Protoc. 2020 Jul; 15(7): 2247-76.

26. Ritchie, Phipson, Law, Shi, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015 Apr 20; 43(7): e47.

27. Ito, Murphy. Application of ggplot2 to Pharmacometric Graphics. CPT Pharmacomet Syst Pharmacol. 2013 Oct; 2(10): e79.

28. Eils, Schlesner. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinforma Oxf Engl. 2016 Sep 15; 32(18): 2847-9.

29. Ashburner, Ball, Blake, Botstein, Butler, Cherry, et al. Gene Ontology: A tool for the unification of biology. Nat Genet. 2000 May; 25(1): 25-9.

30. Kanehisa, Furumichi, Tanabe, Sato, Morishima. KEGG: New perspectives on genomes, pathways, diseases, and drugs. Nucleic Acids Res. 2017 Jan 4; 45(D1): D353-61.

31. Wang, Han. clusterProfiler: An R package for comparing biological themes among gene clusters. Omics J Integr Biol. 2012 May; 16(5): 284-7.

32. Engebretsen, Bohlin. Statistical predictions with glmnet. Clin Epigenetics. 2019 Aug 23; 11: 123.

33. Saha-Chaudhuri, Heagerty. Non-parametric estimation of a time-dependent predictive accuracy curve. Biostatistics. 2013 Jan 1; 14(1): 42-59.

34. Park. Nomogram: An analogue tool to deliver digital knowledge. J Thorac Cardiovasc Surg. 2018 Apr; 155(4): 1793.

35. Findlay JWA, Dillard. Appropriate calibration curve fitting in ligand binding assays. AAPS J. 2007 Jun 29; 9(2): E260-267.

36. Vickers, Elkin. Decision curve analysis: A novel method for evaluating prediction models. Med Decis Mak Int J Soc Med Decis Mak. 2006; 26(6): 565-74.

37. Chen, Khodadoust, Liu, Newman, Alizadeh. Profiling Tumor-Infiltrating Immune Cells with CIBERSORT. Methods Mol Biol Clifton NJ. 2018; 1711: 243-59.

38. Hänzelmann, Castelo, Guinney. GSVA: Gene set variation analysis for microarray and RNA-seq data. BMC Bioinformatics. 2013 Jan 16; 14: 7.

39. Yang, Soares, Greninger, Edelman, Lightfoot, Forbes, et al. Genomics of Drug Sensitivity in Cancer (GDSC): A resource for therapeutic biomarker discovery in cancer cells. Nucleic Acids Res. 2013 Jan; 41(Database issue): D955-961.

40. Geeleher, Cox, Huang. pRRophetic: An R package for prediction of clinical chemotherapeutic response from tumor gene expression levels. PloS One. 2014; 9(9): e107468.

41. Liu, Zhang, Guo. Expression profile of immune checkpoint genes and their roles in predicting immunotherapy response. Brief Bioinform. 2021 May 20; 22(3): bbaa176.

42. Lee, Kong, Lee, Won, Jung, et al. Causes of death among cancer patients in the era of cancer survivorship in Korea: Attention to the suicide and cardiovascular mortality. Cancer Med. 2020 Jan 20; 9(5): 1741-52.

43. Thandra, Barsouk, Saginala, Aluru, Barsouk. Epidemiology of lung cancer. Contemp Oncol. 2021; 25(1): 45-52.

44. Billingy, Tromp VNMF, van den Hurk CJG, Becker-Commissaris, Walraven. Health-related quality of life and survival in metastasized non-small cell lung cancer patients with and without a targetable driver mutation. Cancers. 2021 Aug 25; 13(17): 4282.

45. Hechtner, Eichler, Wehler, Buhl, Sebastian, Stratmann, et al. Quality of life in NSCLC survivors – a multicenter cross-sectional study. J Thorac Oncol. 2019 Mar 1; 14(3): 420-35.

46. Quail, Joyce. Microenvironmental regulation of tumor progression and metastasis. Nat Med. 2013 Nov; 19(11): 1423-37.

47. Liu, Wei, Mao, Xin, Duan, et al. Establishment of a prognostic model for hepatocellular carcinoma based on endoplasmic reticulum stress-related gene analysis. Front Oncol. 2021; 11: 641487.

48. Tang, Zhu, Luo. Endoplasmic reticulum stress-related four-biomarker risk classifier for survival evaluation in esophageal cancer. J Oncol. 2022; 2022: 5860671.

49. Huang, Wang, Zeng, et al. Comprehensive analysis of the clinical and biological significances of endoplasmic reticulum stress in diffuse gliomas. Front Cell Dev Biol. 2021; 9: 619396.

50. Guaita-Esteruelas, Gumà, Masana, Borràs. The peritumoural adipose tissue microenvironment and cancer. The roles of fatty acid binding protein 4 and fatty acid binding protein 5. Mol Cell Endocrinol. 2018 Feb 15; 462(Pt B): 107-18.

51. Pan, Xiao, Liao, Chen, Peng, Zhang, et al. Fatty acid binding protein 5 promotes tumor angiogenesis and activates the IL6/STAT3/VEGFA pathway in hepatocellular carcinoma. Biomed Pharmacother Biomedicine Pharmacother. 2018 Oct; 106: 68-76.

52. Wang, Chu, Liang, Huang, Shang, Tan, et al. FABP5 correlates with poor prognosis and promotes tumor cell growth and metastasis in cervical cancer. Tumour Biol J Int Soc Oncodevelopmental Biol Med. 2016 Nov; 37(11): 14873-83.

53. Yang, Kobayashi, Sekino, Kagawa, Miyazaki, Shil, et al. Fatty acid-binding protein 5 controls lung tumor metastasis by regulating the maturation of natural killer cells in the lung. FEBS Lett. 2021 Jul; 595(13): 1797-805.

54. Yousuf, Sofi, Makhdoomi, Mir. Identification and analysis of dysregulated fatty acid metabolism genes in breast cancer subtypes. Med Oncol Northwood Lond Engl. 2022 Oct 12; 39(12): 256.

55. Liu, Sun, Guo, Yang, Zhao, Gao, et al. Lipid-related FABP5 activation of tumor-associated monocytes fosters immune privilege via PD-L1 expression on Treg cells in hepatocellular carcinoma. Cancer Gene Ther. 2022 Dec; 29(12): 1951-60.

56. Pakvisal, Kongkavitoon, Sathitruangsak, Pornpattanarak, Boonsirikamchai, Ouwongprayoon, et al. Differential expression of immune-regulatory proteins C5AR1, CLEC4A, and NLRP3 on peripheral blood mononuclear cells in early-stage non-small cell lung cancer patients. Sci Rep. 2022 Nov 2; 12(1): 18439.

57. Shen, Xiang, Zhang, Shi, et al. C5aR1 shapes a non-inflammatory tumor microenvironment and mediates immune evasion in gastric cancer. Bosn J Basic Med Sci. 2022 Dec 10;

58. Qian, Zhang. Methionine deprivation inhibits glioma growth through downregulation of CTSL. Am J Cancer Res. 2022; 12(11): 5004-18.

59. Ren, Wang, Zhang, Zhou, Wang, Zhao, et al. LTA4H extensively associates with mRNAs and lncRNAs, indicative of its novel regulatory targets. PeerJ. 2023; 11: e14875.

60. Raverdeau, Cunningham, Harmon, Lynch. γδ T cells in cancer: A small population of lymphocytes with big implications. Clin Transl Immunol. 2019 Oct 10; 8(10): e01080.

61. Rubraca (Rucaparib) Approved in the US as Monotherapy Treatment for Patients with BRCA1/2-Mutant, Metastatic Castration-Resistant Prostate Cancer (mCRPC) Who Have Been Treated with Androgen Receptor-Directed Therapy and a Taxane-Based Chemotherapy [Internet]. 2020 [cited 2023 Mar 27]. Available from: https://www.businesswire.com/news/home/20200515005527/en/Rubraca%C2%AE-Rucaparib-Approved-in-the-U.S.-as-Monotherapy-Treatment-for-Patients-with-BRCA12-Mutant-Metastatic-Castration-Resistant-Prostate-Cancer-mCRPC-Who-Have-Been-Treated-with-Androgen-Receptor-Directed-Therapy-and-a-Taxane-Based-Chemotherapy.

62. Rubraca (rucaparib) is Approved in the US as a Maintenance Treatment for Recurrent Ovarian Cancer [Internet]. [cited 2023 Mar 27]. Available from: https://www.drugs.com/newdrugs/rubraca-rucaparib-approved-u-s-maintenance-recurrent-ovarian-cancer-4721.html.

63. FDA Approves Monjuvi (tafasitamab-cxix) in Combination with Lenalidomide for the Treatment of Adult Patients with Relapsed or Refractory Diffuse Large B-cell Lymphoma (DLBCL) | Business Wire [Internet]. [cited 2023 Mar 27]. Available from: https://www.businesswire.com/news/home/20200731005497/en/FDA-Approves-Monjuvi%C2%AE-tafasitamab-cxix-Combination-Lenalidomide-Treatment.

64. FDA Approves REVLIMID (Lenalidomide) In Combination With Rituximab For the Treatment of Adult Patients with Previously Treated Follicular Lymphoma or Marginal Zone Lymphoma [Internet]. 2019 [cited 2023 Mar 27]. Available from: https://www.businesswire.com/news/home/20190528005626/en/FDA-Approves-REVLIMID%C2%AE-Lenalidomide-In-Combination-With-Rituximab-For-the-Treatment-of-Adult-Patients-with-Previously-Treated-Follicular-Lymphoma-or-Marginal-Zone-Lymphoma.

65. FDA Approves Sutent for Resistant GIST and Kidney Cancer [Internet]. [cited 2023 Mar 27]. Available from: https://www.cancernetwork.com/view/fda-approves-sutent-resistant-gist-and-kidney-cancer.

66. SUTENT Receives US FDA Approval for Advanced Pancreatic Neuroendocrine Tumors | Pfizer [Internet]. [cited 2023 Mar 27]. Available from: https://www.pfizer.com/news/press-release/press-release-detail/sutent_receives_u_s_fda_approval_for_advanced_pancreatic_neuroendocrine_tumors.

67. Chen, Cubillos-Ruiz. Endoplasmic reticulum stress signals in the tumor and its microenvironment. Nat Rev Cancer. 2021 Feb; 21(2): 71-88.
