A Splicing Transcriptome-Wide Association Study Identifies Candidate Altered Splicing for Prostate Cancer Risk

Abstract

Prostate cancer (PCa) represents a major public health burden among men. Many genetic susceptibility factors for PCa remain unknown. In this study, we performed a large splicing transcriptome-wide association study (spTWAS) using three modeling strategies to develop genetic prediction models of alternative splicing, with the aim of identifying novel susceptibility loci and splicing introns for PCa risk. We assessed 79,194 cases and 61,112 controls of European ancestry in the PRACTICAL, CRUK, CAPS, BPC3, and PEGASUS consortia. We identified 120 splicing introns of 97 genes showing an association with PCa risk at the false discovery rate (FDR)-corrected threshold (FDR <0.05). Of them, 33 genes were enriched in PCa-related diseases and function categories. Fine-mapping analysis suggested that 21 splicing introns of 19 genes were likely causally associated with PCa risk. Thirty-five splicing introns of 34 novel genes were identified to be related to PCa susceptibility for the first time, and 11 of the genes were enriched in a cancer-related network. Our study identified novel loci and splicing introns associated with PCa risk, which can improve our understanding of the etiology of this common malignancy.

Introduction

Prostate cancer (PCa) is a common malignancy and cause of mortality among men. An estimated 1.6 million men are diagnosed with PCa and 366,000 deaths are attributed to it annually (Pernar et al., 2018), and the growing and aging population is projected to result in almost 2.3 million new PCa cases and 740,000 related deaths annually by 2040 (Culp et al., 2020). Strong evidence from family studies supports a genetic predisposition to PCa (Eeles et al., 2008). For PCa, previous genome-wide association studies (GWASs) have identified 269 genetic risk variants, which explain <44% of its familial relative risk (Wu et al., 2019b). Transcriptome-wide association studies (TWASs), which integrate gene expression and genomic data sets, have been applied to identify candidate disease susceptibility genes for multiple diseases, including PCa (Gusev et al., 2018; Mancuso et al., 2018; Wu et al., 2019b; Wu et al., 2018).

To date, previous TWASs have identified more than 800 candidate susceptibility genes for PCa risk (Emami et al., 2019; Fiorica et al., 2020; Gusev et al., 2019; He et al., 2022; Liu et al., 2022; Mancuso et al., 2018; Wu et al., 2019b). However, these loci and genes still fall short in fully explaining the genetic susceptibility of PCa, which calls for additional efforts to identify novel genetic factors for PCa.

As a posttranscriptional regulatory mechanism, alternative splicing of pre-messenger RNA (pre-mRNA) molecules can produce many distinct mRNAs (Raj et al., 2018) that change the structure of transcripts and their encoded proteins (Stamm et al., 2005). It is the main source of protein diversity, because 90% of human genes undergo alternative splicing (Martinez-Montiel et al., 2018). It has been reported that more than 15,000 alternative splicing events are associated with cancer biology (Marasco and Kornblihtt, 2023). Such alternative splicing events have become a distinguishing feature of cancer, with strong potential for developing effective prognostic and therapeutic options (Martinez-Montiel et al., 2018). Alternative splicing has also been implicated in normal tissue development (Chen and Weiss, 2015), and aberrant splicing can promote the proliferation, growth, and survival of cancer cells (Bradley and Anczukow, 2023; Chen and Weiss, 2015).

Aberrant alternative splicing yielding protein isoforms can also affect cell phenotypes and survival of PCa patients (Rajan et al., 2009). The critical role of alternative splicing in PCa development has already been suggested in previous studies (Carstens et al., 1997; Gusev et al., 2019; Mancuso et al., 2018; Munkley et al., 2017; Paschalis et al., 2018). For example, the androgen receptor (AR) transcription factor is a major driver of PCa pathology (Munkley et al., 2017), and AR splicing variants have been implicated in the development and progression of PCa (Martinez-Montiel et al., 2018). Fibroblast growth factor receptor 2 (FGF-R2) splicing has been shown to be associated with androgen insensitivity in human prostate tumors, and loss of FGF-R2 isoform b can lead to progression of human PCa (Carstens et al., 1997; Paschalis et al., 2018).

However, a comprehensive understanding of splicing events across the genome related to PCa risk is largely lacking. Five TWASs of PCa risk have identified more than 800 candidate susceptibility genes (Emami et al., 2019; He et al., 2022; Liu et al., 2022; Mancuso et al., 2018; Wu et al., 2019b). However, these studies largely focused on overall gene expression rather than splicing expression. Although splicing expression was also examined by Mancuso et al. (2018), only genetic prediction models of splicing expression derived from prostate tumor tissue were used. For studying PCa susceptibility, normal tissue is more appropriate, because splicing expression may have changed during tumor development for many introns (Liu et al., 2020).

Herein, leveraging a large available reference data set (the Genotype-Tissue Expression, GTEx) for normal prostate tissue, we performed a comprehensive splicing TWAS (spTWAS) using complementary alternative splicing expression genetic prediction model building methods elastic net (ENET) (Zou and Hastie, 2005), least absolute shrinkage and selection operator (LASSO) (Baselmans et al., 2019), and minimax concave penalty (MCP) (Zhang, 2010), to identify splicing introns for PCa risk.

Materials and Methods

Building splicing intron expression genetic prediction models

Three sets of splicing intron expression genetic prediction models were established using the complementary methods ENET (Zou and Hastie, 2005), LASSO (Baselmans et al., 2019), and MCP (Zhang, 2010), which have been widely used in TWASs to predict genetically regulated expression levels (Schaid et al., 2018). For each model, we considered single nucleotide polymorphisms (SNPs) within the gene's cis-regulatory region (within a 1-Mb window of the gene's transcription start site) or enhancer–promoter interaction regions as candidate predictors. The enhancer–promoter interaction regions were determined using GeneHancer (Fishilevich et al., 2017), a comprehensive database that integrates reported enhancers from four genome-wide resources: ENCODE, Ensembl, FANTOM, and VISTA. Models with satisfactory prediction accuracy (R² > 0.01) were retained, and their R² values were compared to determine the best-performing prediction model. In total, models for 13,427 splicing introns were used in further analyses of PCa-risk associations.
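The model-selection rule described above (keep models with R² > 0.01, then use the one with the highest R²) can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study; the function name, the intron identifiers, and the R² values are hypothetical.

```python
from typing import Dict, Optional, Tuple

R2_THRESHOLD = 0.01  # prediction-accuracy cutoff quoted in the text


def select_best_model(
    r2_by_method: Dict[str, float], threshold: float = R2_THRESHOLD
) -> Optional[Tuple[str, float]]:
    """Return (method, R2) for the best model above the cutoff, or None."""
    passing = {m: r2 for m, r2 in r2_by_method.items() if r2 > threshold}
    if not passing:
        return None  # no method predicts this intron well enough
    best = max(passing, key=passing.get)
    return best, passing[best]


# Hypothetical R2 values for three introns (illustration only).
example = {
    "intron_A": {"ENET": 0.032, "LASSO": 0.041, "MCP": 0.027},
    "intron_B": {"ENET": 0.004, "LASSO": 0.008, "MCP": 0.015},
    "intron_C": {"ENET": 0.002, "LASSO": 0.001, "MCP": 0.009},
}
selected = {intron: select_best_model(r2) for intron, r2 in example.items()}
```

In this toy input, the third intron has no model above the cutoff and would be dropped from the association analysis.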

Association analyses of predicted splicing intron expression with PCa risk

Associations of genetically predicted splicing intron expression with PCa risk were investigated using the summary statistics generated from 79,194 PCa cases and 61,112 controls of European ancestry included in the PRACTICAL, CRUK, CAPS, BPC3, and PEGASUS consortia (Schumacher et al., 2018; Wu et al., 2019a). The detailed information for this data set has been described elsewhere (Wu et al., 2020; Wu et al., 2019b).

In brief, 46,939 cases and 27,910 controls were genotyped using OncoArray with 570,000 SNPs (http://epi.grants.cancer.gov/oncoarray). Data from several previous PCa GWASs of European ancestry were also included: UK stage 1 (1854 cases and 1894 controls) and stage 2 (3650 cases and 3940 controls), CaPS 1 (478 cases and 428 controls) and CaPS 2 (1458 cases and 512 controls), BPC3 (2068 cases and 2993 controls), NCI PEGASUS (4600 cases and 2941 controls), and iCOGS (20,219 cases and 20,440 controls). Logistic regression summary statistics were then meta-analyzed using an inverse-variance fixed-effect method.
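For readers less familiar with inverse-variance fixed-effect meta-analysis, the pooling of per-study log odds ratios for a single SNP can be sketched as below. The function name and the input values are hypothetical, and the sketch ignores practical steps such as allele harmonization and quality control.

```python
import math
from typing import Sequence, Tuple


def fixed_effect_meta(
    betas: Sequence[float], ses: Sequence[float]
) -> Tuple[float, float, float]:
    """Pool per-study effects with inverse-variance weights.

    Returns the pooled effect, its standard error, and the Z score.
    """
    if len(betas) != len(ses) or len(betas) == 0:
        raise ValueError("betas and ses must be non-empty and of equal length")
    weights = [1.0 / (se * se) for se in ses]  # w_i = 1 / se_i^2
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    return beta, se, beta / se


# Hypothetical log odds ratios and standard errors from three studies.
pooled_beta, pooled_se, pooled_z = fixed_effect_meta(
    [0.10, 0.14, 0.08], [0.02, 0.05, 0.04]
)
```

Studies with smaller standard errors receive larger weights, so the pooled standard error is never larger than that of the most precise study.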

The associations of predicted splicing intron expression with PCa risk were further estimated using intron expression prediction weights, summary statistics of SNP-PCa risk associations, and an SNP-correlation (linkage disequilibrium [LD]) matrix with 1000G EUR population as reference, using the S-PrediXcan framework (Gusev et al., 2016).
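The summary-statistics association test can be illustrated with a simplified form of the statistic, Z = Σ w_i z_i / sqrt(wᵀ R w), where w are the prediction weights, z_i are the SNP-level GWAS Z scores, and R is the LD correlation matrix. The sketch below assumes the weights refer to standardized genotypes; S-PrediXcan expresses the same quantity in terms of SNP variances and covariances. The function name is hypothetical and this is not the software used in the study.

```python
import math
from typing import Sequence


def twas_z(
    weights: Sequence[float],
    snp_z: Sequence[float],
    ld: Sequence[Sequence[float]],
) -> float:
    """Summary-statistics TWAS Z score: w'z / sqrt(w' R w)."""
    n = len(weights)
    if len(snp_z) != n or len(ld) != n or any(len(row) != n for row in ld):
        raise ValueError("weights, snp_z, and ld must have matching dimensions")
    numerator = sum(w * z for w, z in zip(weights, snp_z))
    # Variance of the predicted expression under the LD reference panel.
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    if variance <= 0.0:
        raise ValueError("predicted expression has zero variance")
    return numerator / math.sqrt(variance)
```

With a single SNP in the model, the statistic reduces to the GWAS Z score of that SNP (up to the sign of the weight), which is why introns predicted by one SNP mirror the underlying GWAS signal.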

The false discovery rate (FDR)-corrected p-value threshold was used to determine significant associations between genetically predicted splicing intron expression and PCa risk.

Fine-mapping analyses to prioritize putatively causal splicing introns

FOCUS (fine-mapping of causal gene sets) analyses of the associated splicing introns were performed to prioritize the most likely causal splicing introns for PCa risk, as described elsewhere (Mancuso et al., 2019). In summary, three inputs were used to run FOCUS: sQTL weights from each corresponding splicing intron prediction model, GWAS summary statistics for PCa risk, and LD estimated from the plink-formatted 1000G.EUR.QC reference. After FOCUS output the posterior probability for each splicing intron, the default 90%-credible set was used to identify the most likely causal splicing introns.
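The idea of a 90%-credible set can be illustrated with a simplified sketch: rank the candidates in a region by posterior probability and add them until the cumulative normalized probability reaches 0.90. This is an assumption-laden illustration, not the FOCUS implementation; the function name, the candidate labels, and the probabilities are hypothetical.

```python
from typing import Dict, List


def credible_set(posteriors: Dict[str, float], level: float = 0.90) -> List[str]:
    """Smallest ranked set whose normalized posterior mass reaches `level`."""
    total = sum(posteriors.values())
    if total <= 0.0:
        raise ValueError("posterior probabilities must have positive mass")
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen: List[str] = []
    cumulative = 0.0
    for name, prob in ranked:
        chosen.append(name)
        cumulative += prob / total
        if cumulative >= level:
            break
    return chosen


# Hypothetical posteriors for one risk region, including a null model.
region = {"intron_1": 0.70, "intron_2": 0.25, "intron_3": 0.04, "NULL": 0.01}
prioritized = credible_set(region)
```

In this toy region the first two introns already account for at least 90% of the posterior mass, so the third intron and the null model are left out of the credible set.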

Functional enrichment analysis

The “Canonical Pathway,” “Disease and Functions,” and “Network” functions of the “Core Analysis” in Ingenuity Pathway Analysis (IPA) (Krämer et al., 2014) were used to assess the enriched pathways, functional categories, and networks of the identified PCa risk genes. For the “Canonical Pathway” analysis, the enrichment p-value was calculated using the right-tailed Fisher's exact test. “Disease and Functions” displayed the annotated biological functions and/or linked diseases of the genes. The “Network” function constructed networks based on the Ingenuity Knowledge Base from published studies.
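The right-tailed Fisher's exact test used for pathway enrichment is equivalent to the upper tail of a hypergeometric distribution, which can be computed directly. The sketch below is illustrative only; the function name and the counts in the example are hypothetical and do not come from the IPA knowledge base.

```python
from math import comb


def fisher_right_tail(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) when n genes are drawn from N, of which K are in the pathway.

    k: identified genes that fall in the pathway
    n: total identified genes
    K: pathway size in the background
    N: background size
    """
    if not (0 <= k <= n <= N and 0 <= K <= N):
        raise ValueError("counts are inconsistent")
    upper = min(n, K)
    total = comb(N, n)
    # math.comb returns 0 when the lower argument exceeds the upper one,
    # so impossible configurations contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical example: 3 of 97 genes fall in a 20-gene pathway,
# against a background of 20,000 genes.
p_enrichment = fisher_right_tail(3, 97, 20, 20_000)
```

A small p-value indicates that the overlap between the identified genes and the pathway is larger than expected by chance under random sampling from the background.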

Ethics approval

This study used deidentified data and no ethics approval or consent was required.

Data Availability

The PCa GWAS summary statistics from the PRACTICAL consortium are available at http://practical.icr.ac.uk/blog/?page_id=8164. The full association results from our analyses and the code are available upon request.

Results

Splicing intron expression genetic prediction models

In this study, three sets of splicing intron expression prediction models, developed using three complementary methods (ENET, LASSO, and MCP), were used to detect associations of predicted expression of splicing introns with PCa risk. Satisfactory prediction models were available for 4366 splicing introns using the ENET method, 3812 using LASSO, and 5249 using MCP; these were tested for their associations with PCa risk.

Associations of predicted expression of splicing introns with PCa risk

For each splicing intron with two or three expression prediction models developed, only the model showing the highest prediction performance was used to assess the splicing intron of interest. Based on the association analyses, 120 splicing introns of 97 genes were significantly associated with PCa risk at the FDR-corrected threshold p < 0.05. Of them, 38 splicing introns of 31 genes were associated at the Bonferroni-corrected threshold p < 4.90 × 10⁻⁶ (0.05/10,207), after excluding 49 splicing introns of 35 genes located in LD-extensive regions (Fig. 1 and Table 1 and Supplementary Tables S1–S3). Of the 120 splicing introns, higher predicted expression of 63 splicing introns was associated with increased PCa risk. Conversely, lower predicted expression of 57 splicing introns was associated with increased PCa risk.
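The Bonferroni threshold and the conversion from a Z score to a two-sided p-value can be reproduced with a few lines of Python. This is an illustrative sketch rather than the analysis code; the function names are hypothetical, and a standard normal null distribution is assumed for Z.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold for a family-wise error rate alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def two_sided_p(z: float) -> float:
    """Two-sided p-value of a Z score under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Threshold for the 10,207 tests quoted in the text.
threshold = bonferroni_threshold(0.05, 10_207)

# Z score reported for the TCEA3 intron in Table 1.
p_tcea3 = two_sided_p(-3.74)
```

Because the tabulated Z scores are rounded to two decimals, p-values recomputed this way agree with the tabulated ones only approximately.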

FIG. 1.

Manhattan plot of association results from the prostate cancer transcriptome-wide association study using splicing intron expression genetic prediction models. The x axis represents the genomic position of each tested splicing intron, and the y axis represents Z value of the association. Each dot represents the genetically predicted expression of one specific splicing intron. The upper dark line represents p = 9.96 × 10⁻⁷ for the Bonferroni correction threshold (0.05/50,220), and the lower gray line represents p = 7.53 × 10⁻⁴ for the false discovery rate-corrected threshold.

Table 1.

Associations of 35 Splicing Introns of 34 Genes That Have Not Been Reported in Previous Transcriptome-Wide Association Study or Splicing Transcriptome-Wide Association Study of Prostate Cancer Risk and at Novel Loci That Have Not Been Reported in Genome-Wide Association Studies of Prostate Cancer Risk

Chromosome	Gene	Type ^a	Intron (start-end, bp)	Model	Number of SNPs in model ^b	Z score	p ^c	p-Value after FDR	Closest risk SNP ^d	Distance to the risk SNP (kb)
1	TCEA3	Protein	23,381,474–23,382,123	MCP	3	−3.74	1.87 × 10⁻⁴	1.69 × 10⁻²	rs10803412	7005
	XKR8	Protein	27,963,693–27,964,185	LASSO	18	−3.69	2.27 × 10⁻⁴	1.99 × 10⁻²	rs10803412	11,587
	SNAP47	Protein	227,731,747–227,732,389	MCP	2	3.46	5.49 × 10⁻⁴	3.71 × 10⁻²	rs1775148	21,974
2	THAP4	Protein	241,584,725–241,601,896	ENET	10	3.60	3.21 × 10⁻⁴	2.60 × 10⁻²	rs2074840	540
3	HEMK1	Protein	50,577,875–50,579,844	LASSO	1	3.43	5.99 × 10⁻⁴	3.89 × 10⁻²	rs34680713	956
	CNBP	Protein	129,171,508–129,171,634	MCP	16	5.09	3.62 × 10⁻⁷	1.03 × 10⁻⁴	rs56325233	919
7	PRKAR1B	Protein	711,527–727,210	MCP	1	−4.41	1.03 × 10⁻⁵	1.57 × 10⁻³	rs527510716	1217
	NSUN5P2	Other	72,950,196–72,954,359	LASSO	11	3.39	7.07 × 10⁻⁴	4.47 × 10⁻²	rs6955627	19,623
	GTPBP10	Protein	90,372,228–90,374,302	MCP	8	3.47	5.21 × 10⁻⁴	3.61 × 10⁻²	rs6955627	2203
	LRGUK	Protein	134,221,918–134,247,556	MCP	9	3.97	7.14 × 10⁻⁵	8.01 × 10⁻³	rs6465657	36,406
9	HABP4	Protein	96,484,659–96,488,089	MCP	3	3.39	7.01 × 10⁻⁴	4.47 × 10⁻²	rs142727307	14,394
	LCN2	Protein	128,149,663–128,150,238	LASSO	3	−3.53	4.17 × 10⁻⁴	3.06 × 10⁻²	rs1571801	3722
11	PNPLA2	Protein	819,905–821,628	ENET	1	4.25	2.16 × 10⁻⁵	2.98 × 10⁻³	rs1881502	686
	SCYL1	Protein	65,526,043–65,526,124	MCP	3	3.55	3.84 × 10⁻⁴	2.86 × 10⁻²	rs12785905	1426
15	LOXL1	Protein	73,942,962–73,946,417	LASSO	8	−3.71	2.06 × 10⁻⁴	1.82 × 10⁻²	rs12913603	3274
16	DNAJA3	Protein	4,446,084–4,446,886	ENET	1	−3.67	2.39 × 10⁻⁴	2.05 × 10⁻²	rs7188897	50,022
	ANKS3	Protein	4,698,941–4,699,052	LASSO	1	−4.61	4.07 × 10⁻⁶	7.69 × 10⁻⁴	rs7188897	49,770
17	MIR22HG	lncRNA	1,714,014–1,716,130	MCP	3	3.46	5.50 × 10⁻⁴	3.71 × 10⁻²	rs461251	1095
	RAI1	Protein	17,681,793–17,724,028	LASSO	8	4.23	2.36 × 10⁻⁵	3.09 × 10⁻³	rs72811270	5096
			17,683,703–17,684,065	ENET	9	−3.46	5.32 × 10⁻⁴	3.65 × 10⁻²	rs72811270	5098
	SHMT1	Protein	18,335,675–18,340,043	LASSO	55	−3.55	3.83 × 10⁻⁴	2.86 × 10⁻²	rs72811270	5750
	ACACA	Protein	37,121,490–37,122,531	MCP	6	4.25	2.10 × 10⁻⁵	2.94 × 10⁻³	rs11263763	1018
	DDX42	Protein	63,787,270–63,792,412	MCP	1	3.97	7.34 × 10⁻⁵	8.14 × 10⁻³	rs17765344	5314
18	ZADH2	Protein	75,202,340–75,208,850	LASSO	14	3.51	4.56 × 10⁻⁴	3.29 × 10⁻²	rs9959454	1562
19	SSBP4	Protein	18,430,930–18,431,647	MCP	1	3.67	2.42 × 10⁻⁴	2.06 × 10⁻²	rs11666569	1217
	U2AF1L4	Protein	35,744,981–35,745,125	MCP	2	3.59	3.35 × 10⁻⁴	2.67 × 10⁻²	rs59710626	2803
	ARHGAP33	Protein	35,785,483–35,786,413	LASSO	7	−3.97	7.09 × 10⁻⁵	8.01 × 10⁻³	rs59710626	2762
	PLD3	Protein	40,348,768–40,366,419	MCP	2	3.48	4.97 × 10⁻⁴	3.50 × 10⁻²	rs11672691	1619
	B9D2	Protein	41,361,203–41,363,432	ENET	1	3.56	3.73 × 10⁻⁴	2.82 × 10⁻²	rs11672691	622
	BCKDHA	Protein	41,423,169–41,424,502	MCP	2	−3.96	7.43 × 10⁻⁵	8.15 × 10⁻³	rs11672691	561
	DMPK	Protein	45,770,640–45,771,568	MCP	1	3.43	5.95 × 10⁻⁴	3.89 × 10⁻²	rs61088131	3070
	NTN5	Protein	48,663,543–48,663,761	LASSO	4	−4.89	1.01 × 10⁻⁶	2.45 × 10⁻⁴	rs2659124	2691
20	SNTA1	Protein	33,408,888–33,412,186	ENET	25	3.47	5.24 × 10⁻⁴	3.61 × 10⁻²	rs6141551	595
21	PCBP3	Protein	45,935,305–45,940,030	LASSO	12	3.72	2.02 × 10⁻⁴	1.81 × 10⁻²	rs1041449	3034
22	SLC2A11	Protein	23,877,171–23,882,459	LASSO	7	−3.37	7.53 × 10⁻⁴	4.61 × 10⁻²	rs2238776	4119

^a Protein, protein-coding gene; lncRNA, long noncoding RNA; Other, transcribed unprocessed pseudogene.

^b Number of SNPs, among those located in either the gene-body region or the enhancer–promoter region, that have nonzero weights in the model.

^c Bonferroni-corrected significant introns shown in bold.

^d The closest risk SNPs identified in previous GWASs for PCa risk.

ENET, elastic net; FDR, false discovery rate; GWAS, genome-wide association study; LASSO, least absolute shrinkage and selection operator; lncRNA, long noncoding RNA; MCP, minimax concave penalty; PCa, prostate cancer; SNPs, single nucleotide polymorphisms.

Of the 120 identified splicing introns, 67 are located in 46 genes that have been reported in previous TWASs (Supplementary Table S1) and/or PCa spTWASs (Supplementary Table S2). In addition, 18 splicing introns of 17 genes are located within 500 kb of GWAS-identified PCa risk variants but have not yet been reported in previous TWASs or spTWASs (Supplementary Table S3).

Importantly, 35 splicing introns of 34 novel genes were identified to be associated with PCa risk for the first time (Table 1). These 34 genes include 32 protein-coding genes (TCEA3, XKR8, SNAP47, THAP4, HEMK1, CNBP, PRKAR1B, GTPBP10, LRGUK, HABP4, LCN2, PNPLA2, SCYL1, LOXL1, DNAJA3, ANKS3, RAI1, SHMT1, ACACA, DDX42, ZADH2, SSBP4, U2AF1L4, ARHGAP33, PLD3, B9D2, BCKDHA, DMPK, NTN5, SNTA1, PCBP3, and SLC2A11), 1 transcribed unprocessed pseudogene (NSUN5P2), and 1 long noncoding RNA (lncRNA; MIR22HG). Of these, 16 genes have already been reported to be related to PCa risk in previous studies, including TCEA3, SNAP47, THAP4, HEMK1, PRKAR1B, GTPBP10, LCN2, PNPLA2, RAI1, ACACA, U2AF1L4, ARHGAP33, B9D2, BCKDHA, DMPK, and PCBP3 (Supplementary Table S4) (Boutros et al., 2015; Brikun et al., 2018; Chen and Hu, 2019; Cheung et al., 2017; Ding et al., 2015; Jin et al., 2016; Jo et al., 2017; Martinez-Marin et al., 2017; Roussigne et al., 2003; Schroder et al., 2022; Shahabi et al., 2016; Svensson et al., 2014; Wang et al., 2020; Zahalka et al., 2017; Zhang et al., 2022; Zhao et al., 2020).

Of the identified novel splicing introns, an association between higher predicted expression and increased PCa risk was observed for 22 splicing introns. Conversely, an association between lower predicted expression and increased PCa risk was detected for 13 introns.

Fine-mapping analysis results

Of the 120 identified splicing introns of 97 genes associated with PCa risk, 21 splicing introns of 19 genes were included in the FOCUS 90%-credible sets and were thus prioritized as likely causal splicing introns (Table 2) (Mancuso et al., 2019). Of them, two splicing introns (chr16:4446084-4446886:DNAJA3 and chr16:4698941-4699052:ANKS3) of two genes (DNAJA3 and ANKS3) were reported for the first time.

Table 2.

Twenty-One Splicing Introns of 19 Genes Prioritized by FOCUS (Fine-Mapping of Causal Gene Sets) Analysis as Putatively Causal Units

Chromosome	Gene	Intron (start-end, bp)	Number of SNPs in model	TWAS p-value	FOCUS posterior probability
1	CREB3L4	153,969,753–153,972,744	23	1.66 × 10⁻⁵	0.99
2	EHBP1	62,979,335–62,987,931	1	2.27 × 10⁻⁷	0.97
	VAMP8	85,577,649–85,581,576	31	9.23 × 10⁻⁷	1.00
	VAMP5	85,584,493–85,591,725	4	4.17 × 10⁻⁶	1.00
		85,584,493–85,592,948	5	2.46 × 10⁻⁸	1.00
	MLPH	237,527,516–237,534,564	9	6.34 × 10⁻¹¹	0.99
3	GATA2-AS1	128,490,069–128,501,609	2	3.56 × 10⁻⁶	1.00
4	PPA2	105,399,164–105,446,383	1	6.74 × 10⁻⁴	1.00
7	HOTTIP	27,202,638–27,204,808	1	2.24 × 10⁻⁸	1.00
	HIBADH	27,632,445–27,649,473	3	7.21 × 10⁻⁹	0.99
	TRRAP	98,933,387–98,935,579	1	5.94 × 10⁻²⁵	1.00
		98,933,402–98,935,579	3	2.79 × 10⁻²⁰	1.00
9	C9orf78	129,828,252–129,829,205	2	1.83 × 10⁻⁶	0.96
11	KCNQ1	2,776,054–2,777,976	1	2.46 × 10⁻¹⁰	1.00
	PPP6R3	68,610,023–68,613,066	1	5.82 × 10⁻²⁸	1.00
	TPCN2	69,159,351–69,159,446	12	5.77 × 10⁻²³	1.00
16	DNAJA3	4,446,084–4,446,886	1	2.39 × 10⁻⁴	1.00
	ANKS3	4,698,941–4,699,052	1	4.07 × 10⁻⁶	1.00
17	SYNRG	37,538,420–37,539,192	1	3.21 × 10⁻⁴	1.00
	SUPT4H1	58,347,241–58,351,402	2	3.21 × 10⁻⁷	1.00
19	PRRG2	49,583,304–49,583,542	1	8.99 × 10⁻⁷	1.00

TWAS, transcriptome-wide association study.

IPA results

The “Core Analysis” function of IPA (version 01-20-04; Ingenuity System, Inc.), including “Canonical Pathway,” “Disease and Functions,” and “Network” analyses, was performed for the 97 genes identified in our spTWAS. For the “Disease and Functions” analysis, we mainly focused on cancer, lipid metabolism, and PCa-related diseases and function categories. Twenty-two genes (ABHD12, ACACA, ADAM15, BCKDHA, CASP8, CREB3L4, FAAH, FARP2, GNB1L, HIBADH, MKNK1, NTN5, PLD3, PNPLA2, PRKAR1B, PTGR3, SEC61A1, SEC61A2, SHMT1, SIRT2, VAMP8, and YWHAQ) enriched in 32 canonical pathways are shown in Supplementary Table S5. These canonical pathways include triacylglycerol degradation (enriched genes ABHD12, FAAH, and PNPLA2; p = 1.66 × 10⁻³) and autophagy (enriched genes CREB3L4, GNB1L, PRKAR1B, and VAMP8; p = 1.07 × 10⁻²). Previous research has shown that autophagy can regulate lipolysis and cell survival by degrading lipid droplets in PCa cells (Kaini et al., 2012).

Based on the “Disease and Functions” analysis, 89 identified genes were enriched for cancer, lipid metabolism, and 2 PCa-related functional categories (p < 0.05) (Supplementary Table S6). Functions of 89 associated genes were related to cancer, 16 were related to lipid metabolism, and 33 were related to PCa. Except for NSUN5P2, MIR22HG, and ZADH2, 31 novel genes (TCEA3, XKR8, SNAP47, THAP4, HEMK1, CNBP, PRKAR1B, GTPBP10, LRGUK, HABP4, LCN2, PNPLA2, SCYL1, LOXL1, DNAJA3, ANKS3, RAI1, SHMT1, ACACA, DDX42, SSBP4, U2AF1L4, ARHGAP33, PLD3, B9D2, BCKDHA, DMPK, NTN5, SNTA1, PCBP3, and SLC2A11) were enriched in cancer categories (p = 3.29 × 10⁻²–9.97 × 10⁻⁶). Reassuringly, nine of them, namely, SNAP47, LCN2, SCYL1, DNAJA3, RAI1, DDX42, PLD3, BCKDHA, and DMPK, were also enriched in the PCa functional category (p = 1.17 × 10⁻²).

Based on the “Network” analysis, six networks of spTWAS-identified genes (Score ≥13) and three networks of spTWAS-identified novel genes were identified (Score ≥15) (Supplementary Table S7 and Supplementary Figs. S1 and S2). Interestingly, 11 novel PCa susceptibility genes (ACACA, BCKDHA, DNAJA3, GTPBP10, HABP4, LCN2, LOXL1, NSUN5P2, PLD3, PNPLA2, and SLC2A11) were enriched in a cancer-related network: cancer, dermatological diseases and conditions, organismal injury, and abnormalities (score = 25, Fig. 2). Five genes, namely, ACACA, DNAJA3, LCN2, LOXL1, and PNPLA2 were in the node of this network.

FIG. 2.

Splicing TWAS identified 11 novel genes enriched in a cancer-related network. Eleven novel genes were highlighted in the solid dark line. TWAS, transcriptome-wide association study.

Discussion

In this study, we conducted a comprehensive spTWAS to identify candidate splicing introns with genetically predicted expression to be associated with PCa risk. The spTWAS analysis identified 97 genes, including 34 novel genes that were not reported in any previous TWAS and/or spTWAS (Emami et al., 2019; He et al., 2022; Liu et al., 2022; Mancuso et al., 2018; Wu et al., 2019b). The identified genes were enriched in triacylglycerol degradation and autophagy pathways, and lipid metabolism, cancer, and PCa functional categories, which provides novel knowledge for better understanding the etiology of PCa.

Some of the novel genes identified in our study have already been reported in the literature as potentially playing a role in PCa development. For example, TCEA3 encodes transcription elongation factor A3, a protein that is downregulated in PCa tissue, and the minor intron splicing efficiency of TCEA3 could increase the development of lethal PCa (Augspach et al., 2021). Another gene, ACACA, encodes acetyl-CoA carboxylase alpha, one of the rate-limiting enzymes for fatty acid synthesis (Shafi et al., 2015). Higher expression of ACACA in PCa tumor tissue than in normal tissue has been reported (Swinnen et al., 2000), and downregulation of ACACA could inhibit the malignant progression of PCa (Zhang et al., 2021). Further functional studies are needed to better characterize the potential roles of these genes and splicing introns in prostate tumorigenesis.

In PCa cells, lipid degradation can promote cell survival (Itkonen et al., 2017) by providing a main bioenergetic pathway through fatty acid oxidation (Liu, 2006). In the “Canonical Pathway” analysis, the top enriched canonical pathway was triacylglycerol degradation, with three enriched genes (ABHD12, FAAH, and PNPLA2). Besides these 3 genes, 13 other genes (ACACA, ACTN4, CASP8, CERS2, CNBP, LCN2, MKNK1, NPNT, PIBF1, SIRT2, SLC45A3, THADA, and XKR8) involved in lipid metabolism were uncovered by the “Disease and Functions” analysis. These genes are involved in triacylglycerol degradation (such as ABHD12, FAAH, and PNPLA2), fatty acid synthesis (such as ACACA), and lipid transport (such as LCN2). These results suggest that lipid metabolism might play a key role in PCa progression (Wu et al., 2014).

Of the identified genes, most (89/97) were implicated in cancer. It should be emphasized that the functions of nine novel genes (SNAP47, LCN2, SCYL1, DNAJA3, RAI1, DDX42, PLD3, BCKDHA, and DMPK) had been linked to PCa. Moreover, 11 novel PCa susceptibility genes were enriched in a cancer-related network. In particular, five genes (ACACA, DNAJA3, LCN2, LOXL1, and PNPLA2) were in the node of this network, implicating potential roles of these genes in PCa development. The splicing introns of the abovementioned genes could potentially serve as candidate markers for PCa risk, and further studies are needed to better characterize them.

Three methods for developing splicing expression genetic prediction models, namely, LASSO, MCP, and ENET, were leveraged in the current study. This provides a valuable opportunity to detect PCa-associated splicing introns with different genetic regulatory mechanisms. LASSO, MCP, and ENET are popular penalization methods, each of which performs optimally under different genetic architectures. To further enhance the prediction accuracy, we incorporated enhancer regions as they play key regulatory roles in splicing (Lee and Rio, 2015). Our study leveraging these complementary models can lead to more powerful discovery.
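To make the difference between the three penalization methods concrete, the penalty applied to a single SNP weight β can be written out in their standard textbook forms. The sketch below is illustrative; the function names are hypothetical, the elastic net uses a glmnet-style mixing parameter alpha, and the default values of alpha and gamma are arbitrary choices, not settings from the study.

```python
def lasso_penalty(beta: float, lam: float) -> float:
    """LASSO: lam * |beta|, growing linearly without bound."""
    return lam * abs(beta)


def enet_penalty(beta: float, lam: float, alpha: float = 0.5) -> float:
    """Elastic net: mix of L1 and L2 penalties (alpha = 1 gives LASSO)."""
    return lam * (alpha * abs(beta) + 0.5 * (1.0 - alpha) * beta * beta)


def mcp_penalty(beta: float, lam: float, gamma: float = 3.0) -> float:
    """Minimax concave penalty: LASSO-like near zero, flat for large |beta|."""
    b = abs(beta)
    if b <= gamma * lam:
        return lam * b - b * b / (2.0 * gamma)
    return 0.5 * gamma * lam * lam  # constant beyond gamma * lam
```

Because the MCP levels off for large weights, it shrinks strong effects less than LASSO, whereas the quadratic term of the elastic net tends to spread weight across correlated SNPs; this is one way in which the three methods can suit different genetic architectures.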

There are several strengths of the current study. We leveraged splicing expression prediction models of normal prostate tissue instead of prostate tumor tissue. For many introns, the splicing expression patterns change during tumor development (Ghigna et al., 2008). The directions of associations for a large proportion of the identified associated splicing introns were consistent across the ENET, LASSO, and MCP models (Supplementary Table S8), indicating that our findings are robust. The strategies we used helped to identify more genes than those identified in He et al. (2022), in which a splicing susceptible transcription factor (sTF) TWAS was conducted and six novel genes (NOL10, WTAP, BRI3, USP39, SNRPC, and CCND1) were reported.

On the other hand, several potential limitations need to be considered to interpret the results of the present study appropriately. One limitation is that the identified associations may not necessarily imply causality. This is a known limitation of the conventional TWAS design (Wainberg et al., 2019). Future functional investigation is needed to better understand whether the identified splicing introns and genes play a causal role in prostate tumorigenesis. Another limitation is that we were not able to evaluate whether the associations of our identified splicing introns differ according to family history of PCa and tumor stage/grade, owing to a lack of relevant information. Additional work investigating these is needed to better understand their relationship. A third limitation is that other potential factors, such as smoking, alcohol consumption, obesity, and family history of PCa, were not included in our model building, which may confound our association results.

Further study is needed to better evaluate the potential influence of such factors in detecting PCa-related splicing introns.

Conclusions

In summary, we performed a comprehensive spTWAS and identified novel susceptibility splicing introns for PCa risk, including associated splicing introns located in 34 novel genes. These findings improve our understanding of the etiology of this common malignancy.

Footnotes

Authors' Contributions

L.W. and C.W. conceived and jointly supervised the study. Y.S., Y.E.B., and J.Z. contributed to the study design, performed statistical analyses, and wrote the article. Z.Z., H.Z., and C.C. contributed to data analysis and result interpretation. All authors contributed to the article revision and approved the final article.

Author Disclosure Statement

The authors declare they have no conflicting financial interests.

Funding Information

This study is supported by the University of Hawaii Cancer Center, the Teacher Training Project of Longyan University, and the Provincial Key Science and Technology Project jointly funded by the Fujian Provincial Department of Industry and Information and the Fujian Provincial Department of Education, China (grant 2021G02015).

References

Augspach

, Drake

, Roma

, et al. Minor intron splicing efficiency increases with the development of lethal prostate cancer. bioRxiv, 2021; doi: 10.1101/2021.12.09.471104

Baselmans

, Jansen

, Ip

, et al. Multivariate genome-wide analyses of the well-being spectrum. Nat Genet, 2019; 51(3):445–451; doi: 10.1038/s41588-018-0320-8

Boutros

, Fraser

, Harding

, et al. Spatial genomic heterogeneity within localized, multifocal prostate cancer. Nat Genet, 2015; 47(7):736–745; doi: 10.1038/ng.3315

Bradley

, Anczukow

. RNA splicing dysregulation and the hallmarks of cancer. Nat Rev Cancer, 2023; 23(3):135–155; doi: 10.1038/s41568-022-00541-7

Brikun

, Nusskern

, Decatus

, et al. A panel of DNA methylation markers for the detection of prostate cancer from FV and DRE urine DNA. Clin Epigenetics, 2018; 10:91; doi: 10.1186/s13148-018-0524-x

Carstens

, Eaton

, Krigman

, et al. Alternative splicing of fibroblast growth factor receptor 2 (FGF-R2) in human prostate cancer. Oncogene, 1997; 15(25):3059–3065; doi: 10.1038/sj.onc.1201498

Chen

, Weiss

. Alternative splicing in cancer: Implications for biology and therapy. Oncogene, 2015; 34(1):1–14; doi: 10.1038/onc.2013.570

Chen

, Hu

. Identification of prognosis biomarkers of prostatic cancer in a cohort of 498 patients from TCGA. Curr Probl Cancer, 2019; 43(6):100503; doi: 10.1016/j.currproblcancer.2019.100503

Cheung

, de Rooy

, Levinger

, et al. Actin alpha cardiac muscle 1 gene expression is upregulated in the skeletal muscle of men undergoing androgen deprivation therapy for prostate cancer. J Steroid Biochem Mol Biol, 2017; 174:56–64; doi: 10.1016/j.jsbmb.2017.07.029

10.

Culp

, Soerjomataram

, Efstathiou

, et al. Recent global patterns in prostate cancer incidence and mortality rates. Eur Urol, 2020; 77(1):38–52; doi: 10.1016/j.eururo.2019.08.005

11.

Ding

, Fang

, Tong

, et al. Over-expression of lipocalin 2 promotes cell migration and invasion through activating ERK signaling to increase SLUG expression in prostate cancer. Prostate, 2015; 75(9):957–968; doi: 10.1002/pros.22978

12.

Eeles

, Kote-Jarai

, Giles

, et al. Multiple newly identified loci associated with prostate cancer susceptibility. Nat Genet, 2008; 40(3):316; doi: 10.1038/ng.90

13.

Emami

, Kachuri

, Meyers

, et al. Association of imputed prostate cancer transcriptome with disease risk reveals novel mechanisms. Nat Commun, 2019; 10(1):1–11; doi: 10.1038/s41467-019-10808-7

14.

Fiorica

, Schubert

, Morris

, et al. Multi-ethnic transcriptome-wide association study of prostate cancer. PLoS One, 2020; 15(9):e0236209; doi: 10.1371/journal.pone.0236209

15.

Fishilevich

, Nudel

, Rappaport

, et al. GeneHancer: Genome-wide integration of enhancers and target genes in GeneCards. Database (Oxford), 2017; 2017:bax028; doi: 10.1093/database/bax028

16. Ghigna, Valacca, Biamonti. Alternative splicing and tumor progression. Curr Genomics, 2008; 9(8):556–570; doi: 10.2174/138920208786847971

17. Gusev, Ko, Shi, et al. Integrative approaches for large-scale transcriptome-wide association studies. Nat Genet, 2016; 48(3):245; doi: 10.1038/ng.3506

18. Gusev, Lawrenson, Lin, et al. A transcriptome-wide association study of high-grade serous epithelial ovarian cancer identifies new susceptibility genes and splice variants. Nat Genet, 2019; 51(5):815–823; doi: 10.1038/s41588-019-0395-x

19. Gusev, Mancuso, Won, et al. Transcriptome-wide association study of schizophrenia and chromatin activity yields mechanistic disease insights. Nat Genet, 2018; 50(4):538–548; doi: 10.1038/s41588-018-0092-1

20. He, Wen, Beeghly, et al. Integrating transcription factor occupancy with transcriptome-wide association analysis identifies susceptibility genes in human cancers. Nat Commun, 2022; 13(1):7118; doi: 10.1038/s41467-022-34888-0

21. Itkonen, Brown, Urbanucci, et al. Lipid degradation promotes prostate cancer cell survival. Oncotarget, 2017; 8(24):38264–38275; doi: 10.18632/oncotarget.16123

22. Jin, Jung, DebRoy, et al. Identification and validation of regulatory SNPs that modulate transcription factor chromatin binding and gene expression in prostate cancer. Oncotarget, 2016; 7(34):54616–54626; doi: 10.18632/oncotarget.10520

23. Jo, Oh, Kim, et al. A genetic variant in SLC28A3, rs56350726, is associated with progression to castration-resistant prostate cancer in a Korean population with metastatic prostate cancer. Oncotarget, 2017; 8(57):96893–96902; doi: 10.18632/oncotarget.18298

24. Kaini, Sillerud, Zhaorigetu, et al. Autophagy regulates lipolysis and cell survival through lipid droplet degradation in androgen-sensitive prostate cancer cells. Prostate, 2012; 72(13):1412–1422; doi: 10.1002/pros.22489

25. Krämer, Green, Pollard Jr., et al. Causal analysis approaches in ingenuity pathway analysis. Bioinformatics, 2014; 30(4):523–530; doi: 10.1093/bioinformatics/btt703

26. Lee, Rio. Mechanisms and regulation of alternative pre-mRNA splicing. Annu Rev Biochem, 2015; 84:291–323; doi: 10.1146/annurev-biochem-060614-034316

27. Liu, Zhou, Sun, et al. A transcriptome-wide association study identifies candidate susceptibility genes for pancreatic cancer risk. Cancer Res, 2020; 80(20):4346–4354; doi: 10.1158/0008-5472.CAN-20-1353

28. Liu, Zhu, Zhou, et al. A transcriptome-wide association study identifies novel candidate susceptibility genes for prostate cancer risk. Int J Cancer, 2022; 150(1):80–90; doi: 10.1002/ijc.33808

29. Liu. Fatty acid oxidation is a dominant bioenergetic pathway in prostate cancer. Prostate Cancer Prostatic Dis, 2006; 9(3):230–234; doi: 10.1038/sj.pcan.4500879

30. Mancuso, Freund, Johnson, et al. Probabilistic fine-mapping of transcriptome-wide association studies. Nat Genet, 2019; 51(4):675–682; doi: 10.1038/s41588-019-0367-1

31. Mancuso, Gayther, Gusev, et al. Large-scale transcriptome-wide association study identifies new prostate cancer risk regions. Nat Commun, 2018; 9(1):1–11; doi: 10.1038/s41467-018-06302-1

32. Marasco, Kornblihtt. The physiology of alternative splicing. Nat Rev Mol Cell Biol, 2023; 24(4):242–254; doi: 10.1038/s41580-022-00545-z

33. Martinez-Marin, Jarvis, Nelius, et al. PEDF increases the tumoricidal activity of macrophages towards prostate cancer cells in vitro. PLoS One, 2017; 12(4):e0174968; doi: 10.1371/journal.pone.0174968

34. Martinez-Montiel, Rosas-Murrieta, Anaya Ruiz, et al. Alternative splicing as a target for cancer treatment. Int J Mol Sci, 2018; 19(2):545; doi: 10.3390/ijms19020545

35. Munkley, Livermore, Rajan, et al. RNA splicing and splicing regulator changes in prostate cancer pathology. Hum Genet, 2017; 136(9):1143–1154; doi: 10.1007/s00439-017-1792-9

36. Paschalis, Sharp, Welti, et al. Alternative splicing in prostate cancer. Nat Rev Clin Oncol, 2018; 15(11):663–675; doi: 10.1038/s41571-018-0085-0

37. Pernar, Ebot, Wilson, et al. The epidemiology of prostate cancer. Cold Spring Harb Perspect Med, 2018; 8(12):a030361; doi: 10.1101/cshperspect.a030361

38. Raj, Li, Wong, et al. Integrative transcriptome analyses of the aging brain implicate altered splicing in Alzheimer's disease susceptibility. Nat Genet, 2018; 50(11):1584–1592; doi: 10.1038/s41588-018-0238-1

39. Rajan, Elliott, Robson, et al. Alternative splicing and biological heterogeneity in prostate cancer. Nat Rev Urol, 2009; 6(8):454–460; doi: 10.1038/nrurol.2009.125

40. Roussigne, Cayrol, Clouaire, et al. THAP1 is a nuclear proapoptotic factor that links prostate-apoptosis-response-4 (Par-4) to PML nuclear bodies. Oncogene, 2003; 22(16):2432–2442; doi: 10.1038/sj.onc.1206271

41. Schaid, Chen, Larson. From genome-wide associations to candidate causal variants by statistical fine-mapping. Nat Rev Genet, 2018; 19(8):491–504; doi: 10.1038/s41576-018-0016-z

42. Schroder, Pinoe-Schmidt, Weiskirchen. Lipocalin-2 (LCN2) deficiency leads to cellular changes in highly metastatic human prostate cancer cell line PC-3. Cells, 2022; 11(2):260; doi: 10.3390/cells11020260

43. Schumacher, Al Olama, Berndt, et al. Association analyses of more than 140,000 men identify 63 new prostate cancer susceptibility loci. Nat Genet, 2018; 50(7):928–936; doi: 10.1038/s41588-018-0142-8

44. Shafi, Putluri, Arnold, et al. Differential regulation of metabolic pathways by androgen receptor (AR) and its constitutively active splice variant, AR-V7, in prostate cancer cells. Oncotarget, 2015; 6(31):31997–32012; doi: 10.18632/oncotarget.5585

45. Shahabi, Lewinger, Ren, et al. Novel gene expression signature predictive of clinical recurrence after radical prostatectomy in early stage prostate cancer patients. Prostate, 2016; 76(14):1239–1256; doi: 10.1002/pros.23211

46. Stamm, Ben-Ari, Rafalska, et al. Function of alternative splicing. Gene, 2005; 344:1–20; doi: 10.1016/j.gene.2004.10.022

47. Svensson, Ceder, Iglesias-Gato, et al. REST mediates androgen receptor actions on gene repression and predicts early recurrence of prostate cancer. Nucleic Acids Res, 2014; 42(2):999–1015; doi: 10.1093/nar/gkt921

48. Swinnen, Vanderhoydonc, Elgamal, et al. Selective activation of the fatty acid synthesis pathway in human prostate cancer. Int J Cancer, 2000; 88(2):176–179; doi: 10.1002/1097-0215(20001015)88:2<176::aid-ijc5>3.0.co;2-3

49. Wainberg, Sinnott-Armstrong, Mancuso, et al. Opportunities and challenges for transcriptome-wide association studies. Nat Genet, 2019; 51(4):592–599; doi: 10.1038/s41588-019-0385-z

50. Wang, Lin, Yan, et al. Identification of a robust five-gene risk model in prostate cancer: A robust likelihood-based survival analysis. Int J Genomics, 2020; 2020:1097602; doi: 10.1155/2020/1097602

51. Wu, Shi, Long, et al. A transcriptome-wide association study of 229,000 women identifies new candidate susceptibility genes for breast cancer. Nat Genet, 2018; 50(7):968; doi: 10.1038/s41588-018-0132-x

52. Wu, Shu, Bao, et al. Analysis of over 140,000 European descendants identifies genetically predicted blood protein biomarkers associated with prostate cancer risk. Cancer Res, 2019a; 79(18):4592–4598; doi: 10.1158/0008-5472.CAN-18-3997

53. Wu, Wang, Cai, et al. Identification of novel susceptibility loci and genes for prostate cancer risk: A transcriptome-wide association study in over 140,000 European descendants. Cancer Res, 2019b; 79(13):3192–3204; doi: 10.1158/0008-5472.CAN-18-3536

54. Wu, Yang, Guo, et al. An integrative multi-omics analysis to identify candidate DNA methylation biomarkers related to prostate cancer risk. Nat Commun, 2020; 11(1):1–11; doi: 10.1038/s41467-020-17673-9

55. Wu, Daniels, Lee, et al. Lipid metabolism in prostate cancer. Am J Clin Exp Urol, 2014; 2(2):111–120.

56. Zahalka, Arnal-Estape, Maryanovich, et al. Adrenergic nerves activate an angio-metabolic switch in prostate cancer. Science, 2017; 358(6361):321–326; doi: 10.1126/science.aah5072

57. Zhang C-H. Nearly unbiased variable selection under minimax concave penalty. Ann Stat, 2010; 38(2):894–942; doi: 10.1214/09-AOS729

58. Zhang, Liu, Cai, et al. Down-regulation of ACACA suppresses the malignant progression of prostate cancer through inhibiting mitochondrial potential. J Cancer, 2021; 12(1):232–243; doi: 10.7150/jca.49560

59. Zhang, Ding, Peng, et al. Identification of biomarkers for immunotherapy response in prostate cancer and potential drugs to alleviate immunosuppression. Aging (Albany NY), 2022; 14(11):4839–4857; doi: 10.18632/aging.204115

60. Zhao, Chang, Gu, et al. Systematic profiling of alternative splicing signature reveals prognostic predictor for prostate cancer. Cancer Sci, 2020; 111(8):3020–3031; doi: 10.1111/cas.14525

61. Zou, Hastie. Regularization and variable selection via the elastic net. J R Stat Soc B Stat Methodol, 2005; 67(2):301–320; doi: 10.1111/j.1467-9868.2005.00527.x

Supplementary Material

Please find the following supplemental material available below.

For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.

For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.