Abstract
In an effort to validate the zebrafish model for studies of human tumors, our previous transcriptome analyses revealed striking molecular similarities between zebrafish and human liver neoplasia. However, as biological processes function at modular levels such as pathways and cascades, it is also important to capture conservation at the modular level and in the regulatory programs controlling these modules. In this study, we performed comparative transcriptome analyses with two categories of modules, biological modules and transcription factor target modules, using gene set enrichment analysis to compare carcinogen-induced liver tumors in zebrafish with four types of human tumors. We observed conservation of enriched modules that are associated with tumorigenesis, such as cell cycle, metastasis, and hypoxia, in these tumors. More importantly, we identified conserved regulatory programs linking these cancer-related modules with transcription factors and oncogenes such as Myc, E2F, STAT, and YY1. Taken together, our analyses revealed that carcinogen-induced liver tumors in zebrafish capture cancerous hallmarks observed in human tumors not only in cancer-related biological modules but also at the level of transcriptional programs, further implicating a conserved tumorigenesis mechanism in zebrafish and human.
Introduction
In our previous transcriptome analyses, we showed molecular similarities between zebrafish liver tumors and human liver neoplasia.4,5 However, the complex functions in a biological system are often realized via the concerted activities of genes whose roles are integrated and coordinated by common regulatory modules, or sets of coregulated genes that share a common function. Thus, gene expression profiles are sometimes referred to as “signatures” or “portraits,” as these profiles show expression patterns that are as unique and discernible as paintings.3 Do tumors in zebrafish capture conservation at these modular levels and in the regulatory programs that control the activities of these modules? In this study, we performed comparative transcriptome analyses of zebrafish tumor data with four human tumor types, that is, liver,6,7 gastric,8 prostate,9,10 and lung.2,11 Overall, our analyses revealed conserved molecular modules related to cancerous hallmarks as well as a transcriptional regulatory program that regulates transcription of genes in these modules, suggesting a conserved tumorigenesis mechanism in zebrafish liver tumors and human tumors.
Materials and Methods
Generation of carcinogen-induced liver tumor and transcriptome in the zebrafish
The generation of carcinogen-induced liver tumors in zebrafish and the microarray study have been described in our previous work.4 Briefly, zebrafish fry were treated with 7,12-dimethylbenz(a)anthracene or dibenzo(a,l)pyrene, and 10 tumor samples were collected for microarray analyses. These tumors include hepatocellular carcinoma, cholangiocellular carcinoma, hepatocellular adenoma, and mixed carcinoma. The zebrafish microarray data have been submitted to the National Center for Biotechnology Information Gene Expression Omnibus database (GEO Accession Number: GSE 3519) and are compliant with the MIAME standard. The sources and acquisition of the microarray data of human cancers used in this study have also been described in our previous work.4
Definition of modules
Two categories of modules are used in this study, the biological module and the transcription factor target (TFT) module, both of which are part of the Molecular Signatures Database (MSigDB) on the gene set enrichment analysis (GSEA) website (www.broad.mit.edu/gsea/msigdb/index.jsp). The biological module consists of genes that function in the same canonical pathways, or genes that are coexpressed in response to chemical and/or genetic perturbations. This module contains a total of 1892 gene sets collected from various sources such as online databases, publications in PubMed, and knowledge from domain experts; some gene sets represent the same pathway but come from different sources and differ slightly in gene membership. The second module is the TFT module, which contains genes that share a transcription factor binding site as defined in the TRANSFAC database (version 7.4, www.gene-regulation.com/). There are 500 gene sets in this module. A detailed description of the genes in each gene set in these two categories of modules can be found on the MSigDB website of GSEA (www.broad.mit.edu/gsea/msigdb/index.jsp).
Analysis of whole transcriptomics profile using GSEA
GSEA, as described in detail by Subramanian et al.,12 was used. The zebrafish genes were mapped to human homologous genes as previously described.4 The human homologs of zebrafish genes from the carcinogen-induced tumor transcriptome data and the genes from the four types of human tumors were ranked based on their p-values from Student's t-test. The “GSEAPreranked” option of GSEA was used. The ranking metric was log10 (1/P), where P is the p-value of a gene from the microarray data. Up-regulated genes carry positive values of log10 (1/P), whereas down-regulated genes carry negative values of log10 (1/P). The genes were then ranked in descending order of their log10 (1/P) values.
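The signed ranking metric described above can be illustrated with a short Python sketch. It is only a minimal example under stated assumptions: the function name `signed_log_rank`, the input layout, and the gene labels are hypothetical, and the direction of regulation is assumed to be supplied separately (for example from the sign of the t statistic), since the text does not specify how it was encoded.

```python
import math

def signed_log_rank(genes):
    """Rank genes by log10(1/P), signed by direction of regulation.

    genes: list of (symbol, p_value, direction) tuples, where direction > 0
    marks an up-regulated gene and direction < 0 a down-regulated gene.
    Assumes every p_value is strictly greater than zero.
    """
    scored = []
    for symbol, p_value, direction in genes:
        score = math.log10(1.0 / p_value)
        if direction < 0:
            score = -score  # down-regulated genes carry negative values
        scored.append((symbol, score))
    # Descending order: most significant up-regulated genes first,
    # most significant down-regulated genes last.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored

# Hypothetical toy input, not data from the study.
example = [("geneA", 0.001, 1), ("geneB", 0.05, -1),
           ("geneC", 0.01, 1), ("geneD", 0.0001, -1)]
ranked_example = signed_log_rank(example)
```

A list produced this way has the shape of the two-column ranked list that a preranked enrichment analysis takes as input.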
An enrichment score (ES), which reflects the degree to which a predefined gene set is overrepresented at the top or bottom of the ranked whole transcriptome profile, was calculated by walking down the ranked profile, increasing a running-sum statistic when a gene in the predefined gene set was encountered and decreasing it when a gene outside the set was encountered. The ES is the maximum deviation from zero encountered in the random walk and corresponds to a weighted Kolmogorov-Smirnov-like statistic. The statistical significance (nominal p-value) of the ES was estimated using an empirical phenotype-based permutation test procedure. The phenotype labels were permuted and the ES of the gene set was recomputed for the permuted data to generate a null distribution for the ES. The empirical, nominal p-value of the observed ES was then calculated relative to this null distribution. The estimated significance level was adjusted for multiple hypothesis testing. The ES for each gene set was first normalized to the size of the set, yielding a normalized ES (NES) with the following relation:
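In the GSEA formulation of Subramanian et al.,12 this normalization divides the observed ES of a gene set by the mean of the ESs obtained for the same set across the permutations, using only permuted scores that have the same sign as the observed one. Written out, with S denoting the gene set and \pi indexing the permutations (notation introduced here for illustration and assumed, not taken from the original text):

```latex
\mathrm{NES}(S) \;=\; \frac{\mathrm{ES}(S)}{\operatorname{mean}_{\pi}\,\bigl|\mathrm{ES}(S,\pi)\bigr|},
\qquad \text{taken over permutations } \pi \text{ with }
\operatorname{sign}\,\mathrm{ES}(S,\pi) = \operatorname{sign}\,\mathrm{ES}(S)
```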
The number of permutations used was 1000. Modules with a nominal p-value <0.05 were considered statistically significant. Positive and negative NES values indicate pathways whose activities are up- and down-regulated, respectively.
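The running-sum statistic and the permutation-based normalization described above can be sketched in a few lines of Python. This is a minimal illustration, not the GSEA software itself: the function names, the `weight` parameter, and the toy gene labels are hypothetical. Because the sketch starts from an already ranked list, the null distribution is built by drawing random gene sets of the same size, which is a substitute for the phenotype-label permutation named in the text.

```python
import random

def enrichment_score(ranked, gene_set, weight=1.0):
    """Signed maximum deviation from zero of a weighted running sum.

    ranked: list of (gene, score) sorted in descending order of score.
    """
    members = set(gene_set)
    hit_total = sum(abs(s) ** weight for g, s in ranked if g in members)
    n_miss = sum(1 for g, _ in ranked if g not in members)
    if hit_total == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for gene, score in ranked:
        if gene in members:
            running += abs(score) ** weight / hit_total  # step up on a hit
        else:
            running -= 1.0 / n_miss                      # step down on a miss
        if abs(running) > abs(best):
            best = running
    return best

def normalized_es(ranked, gene_set, n_perm=1000, seed=0):
    """Return (NES, nominal p) using random same-size gene sets as the null."""
    rng = random.Random(seed)
    genes = [g for g, _ in ranked]
    size = sum(1 for g in genes if g in set(gene_set))
    observed = enrichment_score(ranked, gene_set)
    null = [enrichment_score(ranked, rng.sample(genes, size))
            for _ in range(n_perm)]
    # Compare only against null scores with the same sign as the observed ES.
    same_sign = [e for e in null if e * observed > 0]
    if not same_sign:
        return observed, 1.0
    mean_abs = sum(abs(e) for e in same_sign) / len(same_sign)
    p_value = sum(1 for e in same_sign
                  if abs(e) >= abs(observed)) / len(same_sign)
    return observed / mean_abs, p_value

# Hypothetical toy ranking, not data from the study.
toy_ranked = [("g1", 4.0), ("g2", 3.0), ("g3", 2.0),
              ("g4", -1.0), ("g5", -2.0), ("g6", -3.0)]
toy_es = enrichment_score(toy_ranked, {"g1", "g2"})
```

A gene set concentrated at the top of the ranking yields a positive score and one concentrated at the bottom yields a negative score, which matches the sign convention used for up- and down-regulated modules.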
Results and Discussion
Identification of enriched molecular modules involved in tumorigenesis from transcriptome data
There are five categories of modules available from the GSEA website (www.broad.mit.edu/gsea/msigdb/index.jsp): C1 for positional gene sets for chromosomal loci of genes, C2 for curated gene sets, C3 for motif gene sets, C4 for computational gene sets, and C5 for gene ontology gene sets. In this study, we aimed to explore the conservation of modules for pathways, for signatures associated with various perturbations such as gene knockdown, and for TFTs in tumors of zebrafish and human; thus, modules from C1 and C4 are not relevant here. Since our previous analyses4,5 had already explored cancerous hallmarks in zebrafish at the gene ontology level, modules from the C5 category were not used in this analysis; only biological and TFT modules from the C2 and C3 categories were used. The definitions of the two modules are given in the Materials and Methods section. Activities of modules are indicated by their respective NESs, with positive and negative NES values indicating up-regulation and down-regulation of a module, respectively.
Conserved biological modules between the zebrafish liver tumor and various human tumors
Comparisons of up- and down-regulated enriched modules between the zebrafish liver tumor and human tumors are shown in Figure 1A. Modules that are enriched in the zebrafish liver tumor and in both human datasets for liver, lung, gastric, and prostate tumors are called liver conserved modules (LvCMs), lung conserved modules (LgCMs), gastric conserved modules (GsCMs), and prostate conserved modules (PrCMs), respectively. The identities of all these conserved up-regulated and down-regulated modules are listed in Supplemental Tables S1 and S2 (available online at www.liebertonline.com), respectively.
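The selection of conserved modules can be expressed as a simple set operation. The Python sketch below is only illustrative: it assumes that a module counts as conserved when it is significantly enriched (nominal p-value <0.05) in the same direction in the zebrafish dataset and in every human dataset of a given tissue, and the function names, data layout, module labels, and numbers are all hypothetical.

```python
def enriched(results, alpha=0.05):
    """Map each significantly enriched module to 'up' or 'down'.

    results: dict of module name -> (NES, nominal p-value).
    """
    calls = {}
    for module, (nes, p_value) in results.items():
        if p_value < alpha and nes != 0:
            calls[module] = "up" if nes > 0 else "down"
    return calls

def conserved_modules(zebrafish, human_datasets, alpha=0.05):
    """Modules enriched in the same direction in zebrafish and all human sets."""
    reference = enriched(zebrafish, alpha)
    human_calls = [enriched(dataset, alpha) for dataset in human_datasets]
    return {module: direction
            for module, direction in reference.items()
            if all(calls.get(module) == direction for calls in human_calls)}

# Hypothetical NES and p-values, for illustration only.
zebrafish_liver = {"MODULE_A": (2.1, 0.001), "MODULE_B": (1.8, 0.02),
                   "MODULE_C": (-1.9, 0.01)}
human_liver_1 = {"MODULE_A": (1.9, 0.004), "MODULE_B": (1.5, 0.04),
                 "MODULE_C": (-2.0, 0.002)}
human_liver_2 = {"MODULE_A": (2.3, 0.001), "MODULE_B": (-1.4, 0.03),
                 "MODULE_C": (-1.6, 0.03)}
lvcm_example = conserved_modules(zebrafish_liver,
                                 [human_liver_1, human_liver_2])
```

In this toy input, MODULE_B is dropped because its direction differs in one human dataset, while MODULE_A and MODULE_C are retained as up- and down-regulated conserved modules.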

Figure 1. Conserved biological modules between the zebrafish liver tumor and various human tumors.
The numbers of enriched modules shown in Figure 1A suggest that the molecular portraits of zebrafish and human liver tumors are the closest. They also show that the molecular portraits of liver tumors are similar to those of lung and gastric tumors but least similar to those of prostate tumors. This implies that the cancerous hallmarks of liver tumors are similar to those of lung and gastric tumors. However, more enriched modules are shared between tumors of the same type from different laboratories; this may reflect enrichment of genes expressed in the same cell type relative to other cell types in normal tissue.
Enriched up- and down-regulated biological LvCMs were compared with those in human lung, gastric, and prostate tumors. The comparative results shown in Figure 1B and C indicate that some LvCMs are also enriched mainly in lung and gastric tumors and least in prostate tumors, further implying that the molecular portraits of liver tumors are more similar to those of lung and gastric tumors than to those of prostate tumors. Up-regulated LvCMs fall into categories such as cell cycle and DNA replication, survival and tumorigenesis, mRNA and protein syntheses, hypoxia, and proteasome. The categories of down-regulated LvCMs are liver functions, fatty acid metabolism, amino acid metabolism, and survival and tumorigenesis. A list of leading edge genes (genes whose expression is significant and that contribute to the calculation of the ES in module enrichment) for up-regulated biological LvCMs that are conserved in at least two other human tumor types is given in Supplemental Table S3. These LvCMs, which are highly conserved in other human tumor types, are mainly involved in cell cycle, DNA replication, metastasis, mRNA processing, and disruption by chemicals.
Up-regulation of mRNA processing and translation modules in LvCMs is characteristic of rapid cell growth and proliferation in liver tumors.13 In addition, down-regulation of modules associated with liver function indicates disruption of liver function in liver tumors, which may be due to the undifferentiated state of the tumors, as suggested by the up-regulation of modules associated with metastasis and an undifferentiated state (Fig. 1B, C). Deregulation of liver functions during tumor progression supports our observation of down-regulation of genes associated with liver function modules.6,13,14
Conserved modules of TFT in both zebrafish liver tumor and human tumors revealed core regulatory mechanism in tumorigenesis
Comparisons of up- and down-regulated enriched TFT modules between the zebrafish liver tumor and human tumors are shown in Figure 2A. Among them, only up-regulated TFT modules are conserved between the zebrafish liver tumor and all human tumor types. These conserved TFT modules are called TFT LvCMs, TFT LgCMs, TFT GsCMs, and TFT PrCMs for TFT modules conserved in both human datasets of liver, lung, gastric, and prostate tumors, respectively. Consistent with the conserved biological modules, the numbers of enriched TFT modules in Figure 2A suggest that the molecular portraits of zebrafish liver tumor are similar to those of lung tumors, followed by gastric tumors, and least similar to those of prostate tumors. This implies that the regulatory program of tumorigenesis in liver tumors is similar to that in lung and gastric tumors.

Figure 2. Conserved transcription factor target (TFT) modules between the zebrafish liver tumor and various human tumors.
TFT LvCMs were compared with TFT LgCMs, TFT GsCMs, and TFT PrCMs. Most of these TFT LvCMs are also conserved in one or two other human tumor types, indicating similar transcriptional regulatory programs of tumorigenesis driven by transcription factors such as E2F and NRF2. As shown in Figure 2B, many TFT modules associated with E2F are up-regulated in more than one type of tumor. The transcription factor E2F has been shown to play a crucial role in governing cell proliferation by controlling the expression of many genes required for cell cycle progression.15,16 In addition, expression of E2F is regulated by Myc. Thus, E2F may play an important role in tumorigenesis in these tumors, and the genes it regulates could serve as hallmarks for understanding tumor progression and for diagnostic development. It is also of interest to further characterize the tumorigenic roles of Myc and E2F in the biological modules that represent cancerous hallmarks in zebrafish.
Conserved transcriptional regulatory programs linking to cancerous hallmark characteristics in zebrafish liver tumor
Overexpression of the Myc proto-oncogene has been implicated in the pathogenesis of most types of human cancers.17 Numerous studies have shown that Myc is involved in diverse cancers and that its continuous expression is required for maintaining the transformed state of cancers.18–21 Highly invasive and malignant liver cancers exhibit rapid and sustained tumor regression upon Myc inactivation, and loss of Myc expression results in the differentiation of tumors as well as activation of apoptosis.21 However, reactivation of Myc was capable of immediately restoring neoplastic properties in hepatocellular carcinoma in mice, showing that Myc acts synergistically with other oncogenes such as Ras at early and intermediate stages to promote tumorigenesis.21,22
Myc-associated modules such as MENSSEN_MYC_UP, MYC_TARGETS, and ZELLER_MYC_UP are up-regulated (Supplemental Table S1), and LEE_MYC_DN, LEE_MYC_E2F1_DN, and LEE_MYC_TGFA_DN are down-regulated (Supplemental Table S2), in both the zebrafish liver tumor and the two human liver tumor datasets.6,7 However, only one enriched Myc-regulated module (MENSSEN_MYC_UP or MYC_ONCOGENIC_SIGNATURE) is up-regulated in common between the zebrafish liver tumor and human lung2,11 or gastric tumors,4,8 respectively. No Myc-regulated modules are enriched in common between human prostate cancers and the zebrafish liver tumor. This observation is consistent with results from other research groups showing that Myc is disproportionately overexpressed in specific tumor types.23
The activity profiles of transcription factors suggested by the TFT LvCMs were compared with the biological LvCMs in the zebrafish. Representative modules were used because more than one module represents the activity of each of these transcription factors. V$E2F1_Q6, V$YY1_02, V$STAT1_02, V$GABP_B, V$ELK1_02, V$NRF2_01, V$SREBP1_01, and V$ARNT_02 were used as representative TFT modules for E2F1, YY1, STAT1, GABPB, ELK1, NRF1, SREBP1, and ARNT, respectively. Leading edge genes that were common between biological and TFT LvCMs in zebrafish were noted, and the results are shown in Figure 3. The gradient of gray boxes indicates the number of common genes in these two categories of modules, with light and dark colors representing 1–2 and 3–7 common genes, respectively. Biological modules such as RCC_NL_UP, PENG_RAPAMYCIN_DN, PENG_GLUTAMINE_DN, LEI_MYB_REGULATED_GENES, BASSO_REGULATORY_HUBS, and CHANG_SERUM_RESPONSE_UP are modules whose associated genes are regulated by diverse transcription factors in TFT LvCMs, suggesting that these modules are more liver specific and may not be involved in broad cancerous mechanisms, as suggested by the lower number of modules shared with LgCMs, GsCMs, and PrCMs in Figure 1B and C.
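The comparison of leading edge genes between the two categories of modules amounts to counting shared genes for each pair of modules and binning the counts into the two shades. The Python sketch below illustrates this under stated assumptions: the function names, the input layout, and the module and gene labels are hypothetical, and counts above 7 are treated as dark here only because the text reports no larger value.

```python
def shared_leading_edge(biological, tft):
    """Count leading edge genes shared by each biological/TFT module pair.

    biological, tft: dict of module name -> set of leading edge genes.
    Returns dict of (biological module, TFT module) -> shared gene count,
    keeping only pairs with at least one gene in common.
    """
    counts = {}
    for bio_name, bio_genes in biological.items():
        for tft_name, tft_genes in tft.items():
            n_shared = len(bio_genes & tft_genes)
            if n_shared > 0:
                counts[(bio_name, tft_name)] = n_shared
    return counts

def shade(n_shared):
    """Bin a shared gene count into the two gray levels described above."""
    if n_shared <= 0:
        return "none"
    if n_shared <= 2:
        return "light"  # 1-2 common genes
    return "dark"       # 3-7 common genes (larger counts assumed dark)

# Hypothetical leading edge sets, for illustration only.
bio_modules = {"BIO_MODULE_1": {"g1", "g2", "g3", "g4"},
               "BIO_MODULE_2": {"g9"}}
tft_modules = {"TFT_MODULE_X": {"g1", "g2", "g3"},
               "TFT_MODULE_Y": {"g4", "g8"}}
overlap_example = shared_leading_edge(bio_modules, tft_modules)
shading_example = {pair: shade(n) for pair, n in overlap_example.items()}
```

Only biological modules that share at least one gene with a TFT module appear in the result, which mirrors the restriction to biological LvCMs that show conservation with at least one TFT LvCM.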

Figure 3. Comparative analyses of enriched up-regulated biological LvCMs and TFT LvCMs of zebrafish liver tumor. Only biological LvCMs that show conservation with at least one TFT LvCM are shown. The gradient of gray boxes indicates the number of common genes in these two categories of modules, with light gray and dark gray indicating 1–2 and 3–7 common genes, respectively.
E2F is a transcriptional target of Myc24,25 and is required for the onset of S phase in the cell cycle,26 regulation of RNA processing,27 chromatin remodeling,28 hypoxia,29 and metastasis30 in tumorigenesis. This is consistent with our transcriptome analyses, which link E2F to most of the enriched biological modules associated with cancerous hallmarks, as shown in Figure 3. The wide connections of E2F to these cancerous hallmark modules also support the observation that E2Fs are disproportionately overexpressed in diverse tumor types.22
Interestingly, YY1, a ubiquitous transcription factor known to have a fundamental role in normal biological processes such as embryogenesis, differentiation, replication, and cellular proliferation as well as in tumorigenesis,31,32 also shows connections with many cancerous hallmarks of LvCMs in our study. In addition, YY1 was shown to correlate with Myc in inducing hepatocarcinogenesis in mice,33 further implying that YY1 acts synergistically with Myc and E2F to enhance tumorigenesis.
The results in Figure 3 also identify STAT as a regulator of tumorigenesis processes such as hypoxia, metastasis, mRNA processing, and protein synthesis. STATs are latent cytoplasmic transcription factors that convey signals from cytokine and growth factor receptors to the nucleus and are known to be frequently overactivated in a variety of human solid tumors.34 It has been shown that constitutive activation of STAT is accompanied by up-regulation of Myc.35
In addition, transcription factors such as GABP, ELK, NRF, SREBP, and ARNT are also linked to cancerous hallmarks of LvCMs, although not as widely as E2F. GABP has been shown to be involved in breast cancers,36 and ELK is a downstream effector of the MAPK pathway in prostate cancer.37 NRF, SREBP, and ARNT are transcription factors involved in cellular metabolism: NRF in biogenesis,38 SREBP in adipogenesis,39 and ARNT in xenobiotic metabolism.40 Activation of genes regulated by these transcription factors may activate pathways that regulate these cancerous processes and maintain the transformed state of tumors in the liver.
Taken together, our data suggest that zebrafish liver tumor cells have acquired essential characteristics of cancers that are conserved in human liver tumors at the level of biological and TFT modules. While the general transcriptome characteristics largely reflect the outcome of physiology/pathology, this study also focused on the analysis of TFT modules to explore more upstream events and thus provides some mechanistic insights into the physiology/pathology. Our findings point to a conserved core transcriptional regulatory mechanism of tumorigenesis that involves the interplay and synergistic activities of Myc, E2F, YY1, and STAT in both zebrafish and human. Thus, further mechanistic studies using both reverse and forward genetics approaches in zebrafish can be used to validate the regulatory mechanisms of these modules in liver tumorigenesis.
Footnotes
Acknowledgment
This work was supported by Biomedical Research Council of Singapore.
Disclosure Statement
No competing financial interests exist.
References
