Abstract
Background
Epigenetics is an important mRNA expression regulator. However, how distinct epigenetic factors, such as microRNAs (miRNAs) and promoter methylation, cooperatively regulate mRNA expression is rarely discussed. Recently, apparent miRNA regulation of promoter methylation was identified by bioinformatic analysis; however, it has not yet been experimentally confirmed. If miRNA regulation of other epigenetic factors were identified, it would reveal another layer of epigenetic regulation. In this paper, histone modifications (H3K4me1, H3K4me3, H3K27me3, H3K27ac, H3K9ac, and H2AZ) during mammalian spermatogenesis were studied and the apparent miRNA-target-specific histone modification was investigated by bioinformatic analyses of publicly available datasets.
Results
We identified several miRNAs’ target genes that are significantly associated with histone modification during mammalian spermatogenesis. MiRNAs that target genes associated with the most significant histone modifications are expressed before or during spermatogenesis; thus the results were convincing.
Conclusions
In this paper, we identified apparent miRNA regulation of histone modifications using a bioinformatics approach. The biological mechanisms of this effect should be further experimentally investigated.
Introduction
Epigenetic regulation of messenger RNA (mRNA) expression is involved in a wide variety of biological events including disease and cancer progression, development, and even evolution. Although epigenetics were recognized as important events several years ago, our present understanding of their mechanisms remains incomplete. One reason for this deficiency is our lack of knowledge about cooperative mechanisms between distinct epigenetic factors. For example, although both microRNAs (miRNAs) and promoter methylation are related to the suppression of mRNA expression, whether these two factors regulate mRNA expression cooperatively is not known. Although histone modification can both suppress and trigger mRNA expression, it is not known what would result from simultaneous suppressive and expressive histone modification. Without this kind of knowledge, our understanding of how epigenetic factors regulate target genes will remain incomplete. Recently, Su et al. 1 found that miRNAs more frequently target genes with less methylated promoters. More recently, the author used bioinformatics to identify apparent reciprocal regulation of target genes between promoter methylation and miRNAs,2,3 which has some support from the evolutionary point of view. 4 Thus, seeking more examples of apparent miRNA regulation of other epigenetic factors will deepen our understanding of the epigenetic regulation of mRNAs.
Histone modification is believed to affect mRNA expression through chromatin remodeling. For example, promoter methylation and methylation of histone are generally supposed to suppress transcription of genes. As both are equally methylated, it is possible that miRNA regulation of both processes exists. A relationship between promoter methylation and histone modification has been suggested.5,6
In this paper, we analyze publicly available histone modification data obtained during mammalian spermatogenesis. 7 We targeted spermatogenesis because histone modification is believed to play critical roles in this process 8 and miRNAs are also thought to be generally important for differentiation. 9 Thus, cooperative regulation of mRNAs between miRNAs and histone modification likely takes place during spermatogenesis.
Various histone modifications of multiple development stages in spermatogenesis were investigated for being targeted by miRNAs. We found that the targeted genes of most miRNAs were significantly associated with more or less histone modification during spermatogenesis. The frequent observation of apparent miRNA-target-specific histone modification reflected the apparent dependence of histone modification on the frequency of being targeted by an individual miRNA. In addition, miRNAs whose target genes are most significantly associated with histone modification around the transcription start site (TSS) were previously reported to be expressed before or during spermatogenesis. This supported our analysis, and experimental verification of these results is awaited.
Before reporting our findings, we would like to emphasize that the primary known functions of miRNA regulation of target genes take place only post-transcriptionally; ie, in the cytoplasm. miRNAs function via base-pairing with complementary sequences within mRNA molecules. As a result, these mRNA molecules are silenced by one or more of the following processes: 1) cleavage of the mRNA strand into two pieces, 2) destabilization of the mRNA through shortening of its poly(A) tail, and 3) less efficient translation of the mRNA into proteins by ribosomes. Thus, at the moment, there are no known mechanisms that directly relate apparent miRNA-target-specific histone modification to their functionalities.
Results and Discussion
Mouse miRNAs whose target genes are associated with differential histone modification during spermatogenesis around the TSS
The number of mouse miRNAs whose target genes are associated with significantly more or less modified histones, as inferred by the MiRaGE method. Statistical tests employed were the Kolmogorov-Smirnov (Ks) test, the t-test (t), and the Wilcoxon rank sum test (wilcox). Adjusted P-values less than 0.05 were considered significant. The full list of P-values are available in Supplementary File 1.
Mouse samples can be better clustered using P-values than using the logarithmic ratio of histone modifications
To demonstrate how the obtained P-values are biologically informative, we applied hierarchical clustering (unweighted pair group method with arithmetic mean, UPGMA) to mouse samples using distances with negatively signed correlation coefficients between P-values obtained by the t-test [Fig. 1A]. In this way, the obtained clusters are easily interpreted biologically. First, the samples were primarily clustered by their biological meanings. For example, H3K27ac_SC, H3AZ_SC, and H3K9ac_SC were clustered together (Cluster A) according to the tissue they were sampled from. H3K27me3 ST and H3K27me3_SG (Cluster B), H3K27ac ST and H3K27ac_SG (Cluster C), and H3K4me3 ST and H3K4me3_SG (Cluster D) all have the same histone modifications within the cluster. H3K4me1_ST and H3kme1_SC were clustered, and both shared histone modifications. H3K4me3_SC was clustered together with H3K4me1_ST and H3k4me1_SC (Cluster E), which were all sampled from spermatocytes. These were all low-level clusterings. Higher level clustering also had some biological significance. Cluster F comprised seven samples (H3K27ac_ST, H3K27ac_SG, H3K4me3 ST, H3K4me3_SG, H3K4me1_ST, H3k4me1_SC, and H3K4me3_SC). H3K4me3, H3K27ac, and H3K4me1 are thought to be transcription activators, and they were clustered with correlation coefficients greater than 0.4 (red broken line), thus showing that they are positively correlated. Conversely, two H3K27me3 samples (Cluster B) that were clustered with them have negative correlation coefficients. This observation is coincident with the fact that H3K27me3 is generally believed to be a transcription silencer. Although two other transcription activators, H3K9ac_SC and H3K27ac_SC (Cluster A), were not directly clustered with the seven transcription activators (Cluster F), this was the only discrepancy observed. Thus, clustering using P-values provides biologically relevant information. However, this might be because histone modification itself is biologically informative, and clustering by P-values simply reflects this. To further investigate this point, we performed UPGMA with the negative correlation coefficients between the logarithmic ratio of histone modifications, Δxij [Fig. 1B]. We employed the logarithmic ratio of histone modifications because this ratio was used for MiRaGE computation of P-values. If the biologically informative clusters obtained by the P-values simply reflected biologically informative histone modification, clustering using the logarithmic ratio of histone modifications should produce a result that is at least as good as that obtained by P-values. In contrast to our expectation, it was clear that the clusters obtained using the logarithmic ratio of histone modifications were less biologically informative than those obtained using P-values. The only conserved cluster between Figures 1A and 1B was cluster B. Cluster E was partially conserved (Cluster E’). All other clusters were broken into smaller parts. Although there were some newly formed clusters, they were not convincing. Cluster G comprised H3K4me3_SG, H3K4me3_ST, and H3K4me3_SC, which were sampled from three tissues. Cluster H mostly comprised samples taken from spermatocytes, but they were clustered with another sample taken from spermatocytes (Cluster A), which is broken, because Cluster H was now clustered with H3K27ac_ST, which was sampled from other tissues. The only advantage of Figure 1B compared to Figure 1A is that transcription silencers (H3K27me3) were clustered together and separately from transcription activators. Additionally, the clustering in Figure 1A is more compact than that in Figure 1B. In Figure 1A, if we consider three clusters, namely clusters A, B, and F, these are all clusters with correlation coefficients greater than 0.4. Although we also observed “three” clusters in Figure 1B, they are not clustered together with correlation coefficients greater than 0.4. Also, in Figure 1A, 10 samples other than H3K27ac_SC and H3K4me3_SC are clustered with at least one other sample and the correlation coefficients greater than 0.7 (blue broken line), while only 5 samples belonging to Cluster E’ and G are clustered together with any other samples and have correlation coefficients larger than 0.7. Thus, we concluded that the good clustering observed in Figure 1A could not be fully explained by biologically informative histone modifications. Converting histone modification to P-values that represent apparent miRNA-target-specific histone modification added more biological significance. This suggested indirectly that the apparent miRNA-target-specific histone modification may not be an artifact generated by the MiRaGE algorithm, but represents a genuine biological effect. Similar results to those for the P-values were obtained by either the Wilcoxon test or the Kolmogorov-Smirnov test (Figs. S1 and S2).
Hierarchical clustering (UPGMA) of mouse samples. (A) Negative correlation coefficients of P-values as distance. (B) Negative correlation coefficients of logarithmic ratio of histone modification (ΔXij) as distance.
miRNAs whose target genes are most frequently associated with significant differential histone modifications were previously reported to be regulated during spermatogenesis
Mouse miRNAs whose target genes are most frequently associated with more or less modified histones compared with the control.
Among those listed, six miRNAs (mmu-miR-291a-3p, mmu-miR-294–3p, mmu-miR-295–3p, mmu-miR-302a-3p, mmu-miR-302b-3p, and mmu-miR-302d-3p) are often cited as miRNAs that exhibit “sternness”. For example, mmu-miR-302a/b/ds’ human homologs hsa-miR-302a/b/d were reported to be embryonic stem (ES) cell-specific miRNAs. 12 Mouse miR-291–3p, miR-294, and miR-295 were also reported to be mouse ES-cell-specific miRNAs. 13 However, mmu-miR-22's human homolog, hsa-miR-22, was reported to be upregulated during differentiation. 14 Mmu-miR-23b's human homolog, hsa-miR-23b was reported to play critical roles in spermatogenesis. 15 Mmu-miR-122's human homolog, hsa-miR-122, was also reported to play critical roles in sperm abnormalities. 16 Mmu-miR-23a's human homolog was reported to be poorly expressed in abnormal semen compared with normal semen. 17
Thus, all of the selected 10 miRNAs were reported to be expressed in either spermatogenesis-related differentiated or undifferentiated cells. In particular, three miRNAs, mmu-miR-23a/b-3p and mmu-miR-122-5p were reported to be related to spermatogenesis. This also suggested that apparent miRNA-target-specific histone modification may not be an artifact caused by bioinformatics analysis, but is a potentially real biological outcome.
Alternatively, if the P-values computed by Kolmogorov-Smirnov test were employed, mmu-miR-29a/b/c and mmu-miR-24 were added to miRNAs with more histone modification in addition to the miRNAs listed in Table 2, while no miRNAs in Table 2, except mmu-miR-338–3p, were listed as miRNAs with less histone modification. Mmu-miR-29b was reported to be upregulated in sexually immature mouse testis. 18 MiR-29a/b/c were reported to play certain roles in spermatogenesis in rats. 19 Aberrant expression of miR-29a/b and miR-24 was reported in spermatogenesis-related experiment. 20 MiR-338 was also identified as a testes-specific miRNA in mouse and Xenopus. 21 Thus, although replacing P-values with those from the Kolmogorov-Smirnov test modified Table 2 a little and some miRNAs are additionally identified, they are also reported to be related to spermatogenesis.
Apparent mimA-target-specific histone modification is unlikely to be an artifact caused by bioinformatic analysis
Number of mouse miRNAs whose target genes are associated with significantly more or less modified histones, as inferred by the MiRaGE method when the gene IDs were shuffled.
A shuffle test where the genes are shuffled simultaneously for all samples was also used for confirmation. This means, if the ith histone modification of the jth sample xij is exchanged with xi'j, the ith histone modification is always exchanged with the i'th histone modification for all samples other than jth sample, too. Hereafter, we call this kind of shuffling “simultaneous shuffling”. Thus, this simultaneous shuffling did not affect the correlation coefficients between sample histone modifications. For example, the hierarchical clustering shown in Figure 1B remained unchanged. Conversely, shuffling completely destroyed the inference of apparent miRNA-target-specific histone modification, as shown in Table 3. Thus, Figure 1A computed from the P-values would change drastically. Figure 2 shows a typical hierarchical clustering using P-values computed from simultaneously shuffled histone modifications. Interestingly, it is almost identical to Figure 1B computed from the logarithmic ratio of histone modifications, Δxij. This suggests that the difference between Figures 1A and 1B did not come from an artifact caused by the MiRaGE procedure, but reflects a biologically significant event captured by the MiRaGE procedure. This also supports the view that our findings are not artifactual.
Hierarchical clustering (UPGMA) of simultaneously shuffled mouse samples Notations are the same as Figure 1 (A). P-values were computed by (A) t-test, (B) Wilcoxon test, and (C) Kolmogorov-Smirnov test.
Finally, we tested what would happen if multivariate analysis was employed. In the MiRaGE method, all mRNA expression is attributed to the considered miRNA. This is clearly an overestimation or double count, because mRNA expression is affected by all miRNAs that target each gene, not just the considered miRNA. To address this concern, we tried to infer mRNA expression (logarithmic ratio Δxij) as a function of the target gene table of miRNAs, using lasso.
22
Lasso tries to infer mRNA expression as a function of all miRNAs that target each mRNA and a shrinkage algorithm excludes irrelevant miRNAs; therefore, double counts and/or overestimation of apparent miRNA-target-specific histone modification are expected to be suppressed. Figure 3 shows the comparison between P-values obtained by the MiRaGE method and the regression coefficients obtained by lasso. It is clear that they are well and significantly correlated, although some differences could be observed. Lasso and MiRaGE methods employ numerical inference; therefore, some discrepancies are unavoidable without experimental validation. Critically, we can state that the overestimation and/or double count possibly included in the MiRaGE method do not destroy the inference. Although the results for the P-values obtained by the Wilcoxon and Kormogorov–Smirnov tests are available in Figures S3 and S4, they are essentially the same as that for the P-values obtained by the t-test.
Comparison between P-values and regression coefficients. Horizontal axis shows the regression coefficients attributed to each miRNA by lasso. Vertical axis shows the P-values (G < G’) obtained and attributed to each miRNA by the MiRage method, using the t-test. COR, Spearman correlation coefficients; P, attributed P-values to correlation coefficients.
Apparent miRNA-target-specific histone modification is also observable in human sperm
Number of human sperm miRNAs whose target genes are associated with significantly more or less modified histones, as inferred by the MiRaGE method.
In order to see if conserved miRNAs share the same apparent miRNA-target-specific histone modification commonly, we have drawn Venn diagrams between histone modifications (Fig. 4, see Methods). There are many conserved miRNAs associated with common apparent miRNA-target-specific histone modification (individual names available in Supplementary File 2). It also supports the reliability of apparent miRNA-target-specific histone modification.
Venn diagrams that compare apparent microRNA-target-specific histone modifications between human and mouse. Apparent microRNA-target-specific histone modifications were compared between conserved miRNAs of human (sperm) and mouse (spermatocytes, spermatogonia, and spermatids). Full lists of individual miRNA names were available in Supplementary File 2. See Methods for more details. (A) G < G’, H3K4me1, (B) G > G’, H3K4me1, (C) G < G’, H3K27ac, and (D) G > G’, H3K27ac.
Dependence of histone modification on the number of miRNA target genes caused apparent miRNA-target-specific histone modification
Although the apparent miRNA-target-specific histone modification observed in this study is
biologically informative and thus feasible, not likely caused by an artifact of the algorithms employed, and universal between at least humans and mice, and is thus expected to be universal in mammalian spermatogenesis,
P-values attributed to histone modification compared between genes targeted by more or less miRNAs than the mean.

schematics explaining the P-values of the dependence of histone modification on miRNAs. There are four miRNAs that target six genes. Although in this example we assumed that histone modifications increase as more miRNAs target the genes, it could be that histone modifications decrease as more miRNAs target the genes, depending upon the types of considered histone modification. The sizes of circles reflect the amount of histone modification. If we consider miR-A, genes 1, 2, and 4 are classified as target genes and 3, 5, and 6 are off-target genes. Thus, target genes are regarded as being associated with more histone modifications. Genes with less histone modification are rarely classified as target genes; therefore, it is more likely that genes targeted by an miRNA have more histone modifications than off-target genes.
Confirmation using an independent dataset
Number of mouse spermatocyte miRNAs whose target genes are associated with histones having more or less PRDM9-dependent H3K4me3 sites, as inferred by the MiRaGE method.
Biological function of apparent miRNA-target-specific histone modification
Although the apparent miRNA-target-specific histone modification seems to be a real phenomenon, it would be more convincing if we could suggest what its biological function is. Unfortunately, we have no suggestions regarding the biological function of apparent miRNA-target-specific histone modification, nor do we have any suggestions regarding experimental confirmation of apparent miRNA-target-specific histone modification. Recently, Ihara et al. 26 investigated the enrichment of histones, which is believed to be caused by histone modification and to affect gene expression in spermatogenesis. Thus, observation of apparent miRNA-target-specific histone enrichment in these data would support its existence.
Number of mouse spermatocyte miRNAs whose target genes are associated with histone enrichment, as inferred by the MiRaGE method.
Biological origin of long-range interaction between MIRNA regulation and histone modification
One may wonder whether the findings here are biologically realistic, because miRNA regulation of target genes and histone modification take place at the opposite ends of a gene (the former at the 3’ UTR and the latter at the 5’ UTR). Although there have been no reports that studied this kind of long-range interaction between miRNA regulation and histone modification, some studies have reported the long-range interactions between histone modification and RNAi, which contributes to post-transcriptional suppression similar to miRNA. 27 For example, Zofall et al. 28 found that histone H2A.Z cooperates with RNAi and heterochromatin factors to suppress antisense RNAs. The location of H2A.Z on the DNA is far from where RNAi binds to the DNA: H2A.Z binds to the 5’ UTR of the transcript, while the RNAi should bind to the 3’ UTR of considered transcript. Zofall et al. 29 also reported that RNAi nucleation and DNA-binding factor nucleation spread in opposite directions, starting from opposite ends of the gene to cooperate with heterochromatin assembly factors. Conversely, Yu et al found 30 that sequences in the 3’ UTR of an mRNA-coding gene inhibited the ability of siRNAs to promote heterochromatin formation, which takes place in the 5’ UTR of the gene. Although no similar processes have been reported in humans or mice (all studies were restricted to fission yeast), it would not be surprising if similar mechanisms are found later and long-range interactions between miRNA regulation and histone modification turn out to be real.
Conclusions
In this paper, we used bioinformatics to identify the apparent miRNA-target-specific histone modification in mouse and human spermatogenesis. Such histone modification was biologically informative and feasible, and was unlikely to be caused by an artifact generated by the algorithm used. Also, such histone modification reflected the dependence of histone modifications on the frequency of being targeted by an individual miRNA; therefore, it is likely a real biological effect. However, the mechanism of apparent miRNA-target-specific histone modification is unknown and requires further experimentation and analysis.
Methods
Brief explanation of the MiRaGE method
This is a brief explanation of the MiRaGE method. For more details, see the previous publications.10,11
Although MiRaGE was originally invented to infer miRNA regulation of target gene expression from target mRNA expression, it has also been used successfully to infer miRNA regulation of promoter methylation.2,3 Briefly, MiRaGE first computes the amount of differential mRNA expression/promoter methylation/histone modification of the ith gene at the jth sample, Δxij (see below for actual definition). It then computes the differential values to group the genes into two sets: genes targeted by the considered miRNA, or genes not targeted by the considered miRNA but targeted by any other miRNAs. The exclusion from the analysis of genes not targeted by any miRNAs is due to the interrelations between genes targeted by any miRNAs. Thus, genes targeted by any miRNAs should be considered separately from genes targeted by no miRNAs. For example, each miRNA must compete with the protein machinery, such as AGO proteins, that mediates its function. If the amounts of certain miRNAs increased, these miRNAs might occupy protein machineries that otherwise could be used by other miRNAs. 31 This would affect the expression of mRNAs not targeted by these miRNAs but targeted by any other miRNAs. However, this does not affect genes targeted by no miRNAs. Conversely, if certain mRNAs’ expressions increased, they would absorb miRNAs that otherwise could bind to other mRNAs. 32 Thus, the amount of genes targeted by any miRNAs affects the expression of other genes targeted by any miRNAs but does not affect that of genes not targeted by any other miRNAs. Thus, mRNAs targeted by no miRNAs were excluded from the analysis. Then, the P-values computed by the three statistical tests, ie, the t-test, the Wilcoxon rank sum test, and the Kolmogorov-Smirnov test, were attributed to individual miRNAs. Despite the simplicity of this methodology, it has worked well for various applications.33–36 The obtained P-values were adjusted using the Benjamini– Hochberg criterion. 37
In the following, we explain MiRaGE methods along the line by which it was used in this study. Suppose xij is the histone modification (or enrichment) of the ith gene and jth treated sample. The differential histone modification between treated and control samples would then be
Histone modification data studied
Mouse samples downloaded from GEO.
Association of histone modification with genes was computed by integration of histone modifications between 1000 bases downstream and upstream from the TSS in the hg19 human genome and the mm9 mouse genome.
Comparisons of apparent microRNA-target-specific histone modifications between conserved miRNAs of human and mouse
In order to compare the apparent miRNA-target-specific histone modification between conserved miRNAs of human and mouse, we first listed the apparent miRNA-target-specific histone modification that were recognized as significant by all three statistical tests, ie, t-test, Wilcoxon rank sum test, and Kolmogorov-Smirnov test, for both H3Kme1 and H3K27ac, using adjusted P-values provided in Supplementary File 1. Then, conserved miRNAs between human and mouse were listed based upon miRviewer. 41 Since three and four independent experiments were performed for H3Kme1 and H3K27ac, respectively, based upon these results, Venn diagrams (Fig. 4) were drawn and full lists of identified miRNAs were included in Supplementary File 2.
Independent histone modification data for confirmation
Independent histone modification data was downloaded from GEO using GEO ID: GSE52628. For B6 and KI cell lines, GSE52628_H3K4me3_B6_merge_ChIP.BedGraph.gz and GSE52628_H3K4me3_KI_merge_ChIP.BedGraph.gz in Supplementary file were used for the MiRaGE method, respectively. Because no control data were available in these datasets, xic was constantly taken to be 1 independent of i.
Histone enrichment dataset
Histone enrichment data was downloaded from GEO using GEO ID: GSE56281. GSM1358290_mtm6.sorted.bedGraph.gz in Supplementary file of GSM1358290, GSM1358291_mtm7.sorted. bedGraph.gz in Supplementary file of GSM1358291, and GSM1358292_mtm8.sorted.bedGraph.gz in Supplementary file of GSM1358292 were first downloaded. Because no control data were available in these datasets, xic was constantly taken to be 1 independent of i. GSE56281_D23_IP.sorted.bedGraph.gz and GSE56281_D23_input.sorted.bedGraph.gz in Supplementary file of GSE56281 were downloaded and used as treated and control data, respectively. Also, in these datasets, the values had already been converted to be logarithmic values; therefore, exp(xij) was used instead of xij, while xic has not been converted, thus xic was used for input.
Inference of apparent microRNA-target-specific histone modification using lasso
To infer the apparent miRNA-target-specific histone modification by multivariate analysis, we employed lasso.
22
Δxij is modeled as
Statistical analysis using functions/packages in R
Most statistical analyses in this study were performed with various packages/functions in R. 42 Hierarchical clustering (UPGMA) was performed by the hclust function with the setting method = “average”. Lasso was performed by the lars function in the package lars with default settings. Three statistical tests used the three functions: t.test, ks.test, and wilcox. test in R. Adjusted P-values were computed using the p.adjust function with the setting method = “BH”.
Author Contributions
Conceived and designed the experiments: YHT. Analyzed the data: YHT. Wrote the first draft of the manuscript: YHT. Agree with manuscript results and conclusions: YHT. Developed the structure and arguments for the paper: YHT. Made critical revisions: YHT. The author reviewed and approved of the final manuscript.
Supplementary Data
Supplementary File 1
P-values and adjusted P-values. P-values (PG<G’ and PG>G’) and adjusted P-values for mouse and human samples. Each sheet name indicates which sheet includes what kind of data.
Supplementary File 2
Individual miRNA names presented in Venn diagram (Fig. 4).
Individual names of miRNAs associated with apparent microRNA-target-specific histone modifications shown in Figure 4. MiRNAs conserved between human and mouse are shown in red.
Supplementary File 3
P-values attributed to histone modification compared between genes targeted by more or less miRNAs than the mean.
P-values shown in Table 5 for t-test, Wilcoxon test, and Kolmogorov-Smirnov test.
Supplementary File 4
P-values and adjusted P-values.
P-values (PG<G’ and PG>G’) and adjusted P-values for independent histone modification data for confirmation.
Supplementary File 5
P-values and adjusted P-values.
P-values (PG<G’ and PG>G,) and adjusted P-values for histone enrichment.
Supplementary File 6
Supplementary Figures S1, S2, S3 and S4.
