A Novel Method for Assessing the Statistical Significance of RNA-RNA Interactions Between Two Long RNAs

Abstract

RNA-RNA interactions are key mechanisms through which noncoding RNA (ncRNA) regions exert biological functions. Computational prediction of RNA-RNA interactions is an essential method for detecting novel RNA-RNA interactions because their comprehensive detection by biological experimentation is still quite difficult. Many RNA-RNA interaction prediction tools have been developed, but they tend to produce many false positives. Accordingly, assessment of the statistical significance of computationally predicted interactions is an important task. However, there is no method to evaluate the statistical significance of RNA-RNA interactions that is applicable to interactions between two long RNA sequences. We developed a method to calculate the p-value for the minimal interaction energy between two long RNA sequences. The developed method depends on the fact that minimum interaction energies of RNA-RNA interactions between long RNAs follow a Gumbel distribution when repeat sequences in RNAs are masked. To show the usefulness of the developed method, we applied it to whole human 5′-untranslated region (UTR) and 3′-UTR sequences to detect novel 5′-UTR-3′-UTR interactions. We thus identified two significant 5′-UTR-3′-UTR interactions. Specifically, the human small proline-rich repeat protein 3 shows conserved 5′-UTR-3′-UTR interactions with some nucleotide variations preserving base pairings among primates. Our developed method enables us to detect statistically significant RNA-RNA interactions between long RNAs such as long ncRNAs. Statistical significance estimates help in identification of interactions for experimental validation and provide novel insights into the function of ncRNA regions.

1. Introduction

Recent large-scale transcriptome studies have revealed that most regions of the human genome are transcribed into RNAs but not translated into proteins, and that many of these noncoding RNA (ncRNA) regions have various biological functions (Clark et al., 2011; Hon et al., 2017). One of the main biophysical mechanisms through which these ncRNA regions exert their functions is interaction with other RNA regions through complementary base pairing (Madhani and Christine, 1994; Bachellerie et al., 2002; Guile and Esteller, 2015). For example, microRNAs suppress the expression of target mRNAs by binding the 3′-untranslated region (UTR) of targets (Ameres and Zamore, 2013). As another example, a long-range intramolecular interaction between the 5′-UTR and 3′-UTR of p53 mRNA plays a key role in its translational control (Chen and Kastan, 2010). These examples suggest that identification of RNA-RNA interactions is an important step in the functional assessment of ncRNA regions.

Although several experimental methods to infer RNA-RNA interactions have been developed (Engreitz et al., 2014; Lu et al., 2016; Nguyen et al., 2016), computational prediction of RNA-RNA interaction is still an essential technique. At present, many RNA-RNA interaction prediction tools have been developed (Tafer et al., 2011; Alkan et al., 2017; Kato et al., 2017; Mann et al., 2017). These programs output optimal or suboptimal interactions under optimization conditions of each program. As such, even when RNA pairs that do not interact [e.g., pairs of randomly generated RNA sequences (Tjaden et al., 2006)] are given as input data, these programs output interactions that are optimal or suboptimal solutions in input sequences. To eliminate likely false-positive interactions, a cutoff method based on a threshold score is frequently used. However, selection of an appropriate cutoff score is a difficult task, so this method may be highly arbitrary. Although some tools (Rehmsmeier et al., 2004; Tjaden et al., 2006; Wright et al., 2013) can assess the statistical significance of detected RNA-RNA interactions, these programs can be applied to only small RNAs and cannot be used to assess interactions between long RNAs.

In the assessment of statistical significance, there are at least four distinctions between short and long RNA interactions. The first issue is the consideration of repeat sequences. Repeat sequences are generally masked in sequence alignments because they are frequently aligned to nonhomologous regions, which produces incorrect homology predictions. On the other hand, the simple complementarity of repeat sequences between two RNAs may make them important components of RNA-RNA interactions (Johnson and Guigó, 2014). Repeat sequences need no attention in RNA-RNA interaction prediction for short RNAs, which do not contain them, but their effect cannot be ignored when predicting interactions between long RNAs because some repeat sequences are actually involved in such interactions (Gong and Maquat, 2011). However, when repeat sequences are allowed to take part in RNA-RNA interactions, the null distribution of RNA-RNA interaction scores may not follow the theoretical statistical distribution because these elements strongly decrease sequence randomness. Therefore, we have to investigate how the null distribution changes depending on whether repeat sequences are included in RNA-RNA interaction predictions.

The second issue is sequence length. While sequence lengths of short RNAs are approximately uniform and consist of several tens of bases, those of long RNAs are highly diverse, ranging from 200 bases to several tens of thousands of bases. Therefore, we have to consider the influence of differences in sequence lengths on null distribution. The third issue is the maximal span parameter. When considering the secondary structure of long RNAs, researchers generally restrict maximal spans between bases that form base pairs to reduce the computation time (Bernhart et al., 2006; Kiryu et al., 2008; Lange et al., 2012). As short maximal spans prevent the formation of intramolecular RNA secondary structures, this parameter should influence the potential for forming intermolecular RNA-RNA interactions. The fourth issue is the inequality in lengths of the two RNA sequences. In sequence alignment, the configurations of null distribution change as the lengths of two sequences become unequal (Altschul and Erickson, 1986). We have to investigate whether this trend also occurs in RNA-RNA interactions.

In the present study, we developed a method for assessing statistical significance of RNA-RNA interactions between two long RNA sequences. As the criterion for determining whether two RNA regions interact, we used interaction energy, which was calculated as the summation of hybridization energy and accessibility energy. In brief, hybridization energy is the stabilized energy based on intermolecular base pairs, and accessibility energy is the energy required to inhibit the regions from forming intramolecular stem structures. When several local RNA-RNA interactions were detected between two RNAs, we used an interaction with the minimum interaction energy. We first investigated influences of the aforementioned four factors on the null distribution of minimum interaction energies. Next, we implemented the method to evaluate the statistical significance of predicted RNA-RNA interactions between two long RNAs. Then, to validate usefulness of our developed method, we investigated whether novel human 5′-3′ UTR interactions were detected by our method. Our approach discovered a likely 5′-3′ UTR interaction associated with the small proline-rich repeat protein 3 (SPRR3), which shows conservation with some nucleotide variations preserving base pairings among primates.

2. Methods

2.1. Method for evaluating the null distribution of minimal interaction energy between two ribonucleic acid sequences

To evaluate whether minimal interaction energy between two RNA sequences is statistically significant, a null distribution of energies is required. In other words, we have to calculate the minimal interaction energies between many unrelated RNA pairs and specify a distribution function of these energies. In this research, for each experiment, we randomly cut out 6000 RNA sequences with certain sequence lengths from long noncoding RNA (lncRNA) transcripts annotated by Gencode, ver.25 (Harrow et al., 2012), and created 3000 RNA pairs from these 6000 RNA sequences. The reason for choosing lncRNA is as follows. LncRNA has a low expression level and shows a tissue-specific expression pattern (Cabili et al., 2011; Iwakiri et al., 2017) and thus the number of true lncRNA-lncRNA interactions should be limited. Actually, there are few experimentally verified lncRNA-lncRNA interactions at present (Nguyen et al., 2016). Therefore, the generated dataset should contain almost no true interacting RNA pairs. We used the longest lncRNA transcript for each lncRNA gene when the lncRNA gene has several splicing isoforms.

Then, we applied the RIblast program to these RNA pairs (Fukunaga and Hamada, 2017). RIblast is currently the only program to predict RNA-RNA interactions in long RNAs using accessibility energy. We used only an interaction with minimum interaction energy among the detected local interactions for each RNA pair. We set the seed length parameter, interaction energy threshold, and output energy threshold of RIblast to 3, 0.0, and 0.0, respectively, to enumerate as many interactions as possible. The other RIblast parameters were set to default values unless otherwise specified. Note that RIblast uses a maximal span parameter W to calculate accessibility energies (Kiryu et al., 2011). Next, we investigated whether the empirical distribution for absolute values of calculated minimum interaction energies fits a Gumbel distribution, an extreme value distribution that is widely used in bioinformatic applications (Smith et al., 1985; Altschul and Erickson, 1986; Rehmsmeier et al., 2004; Tjaden et al., 2006). The cumulative distribution function F(x) is defined as

\begin{align*} F(x) = \exp\left[-\exp\left\{-\left(\frac{x-\mu}{\eta}\right)\right\}\right] \end{align*}

The location parameter $\mu$ and scaling parameter $\eta$ were estimated by the following formulas based on the moment method:

\begin{align*} \eta = \frac{\sqrt{6v}}{\pi} \end{align*}

\begin{align*} \mu = u - \gamma\eta \end{align*}

where v, u, and $\gamma$ are the sample variance, the absolute value of the sample mean, and the Euler–Mascheroni constant (0.57721…), respectively.
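The moment-method estimate above can be illustrated with a minimal sketch. The function names (`fit_gumbel_moments`, `gumbel_cdf`, `sample_gumbel`) and the synthetic data are hypothetical and are not part of the published implementation. The sketch also assumes that the sample variance uses the n − 1 denominator, which the text does not specify.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler–Mascheroni constant


def fit_gumbel_moments(energies):
    """Estimate Gumbel (mu, eta) from minimum interaction energies by the moment method."""
    u = abs(statistics.fmean(energies))   # absolute value of the sample mean
    v = statistics.variance(energies)     # sample variance (n - 1 denominator: an assumption)
    eta = math.sqrt(6.0 * v) / math.pi    # scaling parameter
    mu = u - EULER_GAMMA * eta            # location parameter
    return mu, eta


def gumbel_cdf(x, mu, eta):
    """Cumulative distribution function F(x) of the Gumbel distribution."""
    return math.exp(-math.exp(-(x - mu) / eta))


def sample_gumbel(rng, mu, eta):
    """Draw one Gumbel variate by inverting F(x); clamping avoids log(0)."""
    p = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return mu - eta * math.log(-math.log(p))


# Synthetic check: energies are negative, so their absolute values are Gumbel distributed.
rng = random.Random(0)
true_mu, true_eta = 7.5, 1.2
energies = [-sample_gumbel(rng, true_mu, true_eta) for _ in range(3000)]
mu_hat, eta_hat = fit_gumbel_moments(energies)
print(f"mu = {mu_hat:.3f}, eta = {eta_hat:.3f}")
```

With 3000 synthetic pairs, the same number of pairs as in each experiment, the estimates should land close to the generating parameters. This gives a rough feel for the sampling error of the tabulated values.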

First, we investigated the influence of repeat sequences. When we included repeat sequences in the analysis, we randomly cut out 6000 RNA sequences of 400 nucleotides from lncRNA transcripts and executed RIblast without masking repeats. When we masked repeat sequences, we randomly cut out 6000 sequences from lncRNA transcripts so that the length of the nonrepeat region was 400 bases and ran RIblast with hard masking of repeats. Note that the total lengths of these masked sequences, including repeats, were therefore 400 bases or more; Supplementary Figure S1 shows the distribution of the total sequence lengths. Annotation of the GRCh38 assembly in the UCSC genome browser database was used as the repeat sequence annotation (Tyner et al., 2017). In addition, tandem repeats were further annotated using TANTAN (r = 0.02) (Frith, 2010).

Next, we investigated the influence of the sequence length and maximal span W on the shape of the null distribution. We used six sequence lengths between 100 and 800 and 15 maximal span values between 0 and 300. Then, we created 3000 RNA pairs for each length. Note that we used the same dataset for different maximal span values to exclude effects arising from differences between datasets. In these experiments, the sequence length means the length of the nonrepeat region, and RIblast was executed with hard masking of repeats.

Finally, we inspected the effect of uneven lengths of the two RNA sequences on the configuration of the null distribution. We used four sequence length pairs such that the product of lengths of the two sequences was uniform: (360, 360), (240, 540), (180, 720), and (120, 1080). Then, we created RNA pairs for each sequence length pair. In this study, the sequence length refers to the length of the nonrepeat region, and we used RIblast with hard masking of repeats.

2.2. Evaluation method for the p-value calculation method

We investigated whether the p-values that our calculation method produces for a random dataset follow a uniform distribution. We randomly cut out 6000 sequences from lncRNA transcripts and created 3000 RNA pairs. In this study, sequence lengths were randomly determined from 100 to 1000 nt for each pair. We obtained the minimum interaction energy for each pair using RIblast and applied the p-value calculation method to the minimum interaction energies. We used four maximal span values between 50 and 200 and compared the shape of the p-value distributions with that of the uniform distribution using the QQ plot. Note that the two sequence lengths in a pair are the same, and we used the same dataset for different maximal spans.
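A minimal sketch of the uniformity check described above. The function names and the plotting position i/(n + 1) are assumptions made for illustration; the text states only that a QQ plot against the uniform distribution was used.

```python
import random


def uniform_qq_points(p_values):
    """Pair each sorted p-value with its expected Uniform(0, 1) quantile."""
    ordered = sorted(p_values)
    n = len(ordered)
    # Plotting position i / (n + 1) is one common convention (an assumption here).
    return [((i + 1) / (n + 1), p) for i, p in enumerate(ordered)]


def max_qq_deviation(p_values):
    """Largest vertical distance of the QQ points from the line y = x."""
    return max(abs(expected - observed)
               for expected, observed in uniform_qq_points(p_values))


# Synthetic illustration: well-calibrated p-values versus anti-conservative ones.
rng = random.Random(1)
null_p = [rng.random() for _ in range(3000)]   # uniform, as expected under the null
skewed_p = [p ** 2 for p in null_p]            # too many small p-values

print(max_qq_deviation(null_p), max_qq_deviation(skewed_p))
```

QQ points that stay near the diagonal indicate calibrated p-values. A systematic bend below the diagonal, as in the skewed sample, would indicate that significance is being overstated.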

2.3. Method for detecting 5′-3′ untranslated region interactions

We applied our statistical significance assessment method for RNA-RNA interaction to the detection of novel 5′-3′ UTR interactions. We created a positive dataset of human 5′-3′ UTR pairs as follows. First, to mask tandem repeats, we applied TANTAN (r = 0.02) to the human genome (GRCh38) in the UCSC genome browser database while preserving preannotated repeats. Next, we extracted all UTR regions annotated by Gencode, ver.25, from the genome sequence. In this study, we used UTR pairs with the largest sum of sequence lengths of 5′-UTR and 3′-UTR for each gene when genes have multiple UTR annotations. However, for the TP53 gene, we used the ENST00000610292.4 transcript, which contains a known 5′-3′ UTR interaction region (Chen and Kastan, 2010). Finally, by excluding UTR pairs in which the nonrepeat region of either UTR was shorter than 100 bases, we obtained 12,839 UTR pairs. We applied RIblast with hard masking of repeats to all of the obtained UTR pairs and calculated the p-value of the minimum interaction energy for each UTR pair. All RIblast parameters were set to their default values. We used the Benjamini–Hochberg method for controlling false discovery rates (FDRs) under multiple hypothesis testing ($\alpha < 0.05$) (Benjamini and Hochberg, 1995). In this method, all p-values are sorted in ascending order. Then, each p-value is multiplied by 12,839 (the number of tests) and divided by its rank in the sorted order. If the adjusted p-value is less than 0.05, we regarded it as statistically significant.
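The adjustment described above can be sketched as follows. The function name and the example p-values are hypothetical. The text describes only the multiply-by-n and divide-by-rank step; the running minimum that keeps adjusted values monotone is the usual step-up form of the Benjamini–Hochberg procedure and is an assumption about the intended calculation.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # indices by ascending p-value
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so that each adjusted value is capped
    # by the adjusted values of the p-values ranked after it (step-up rule).
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for five UTR pairs (illustration only).
example_p = [0.001, 0.008, 0.039, 0.041, 0.60]
adjusted = benjamini_hochberg(example_p)
significant = [q < 0.05 for q in adjusted]
print(adjusted, significant)
```

In the study itself, n would be 12,839, the number of UTR pairs tested.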

3. Results

3.1. Parameter estimation of null distribution of minimum interaction energy

First, we investigated the dependence of null distribution of minimum interaction energy on consideration of repeat sequences (Fig. 1). We verified that the empirical null distribution corresponds reasonably well with the Gumbel distribution when we masked repeat sequences (Fig. 1A, C, E). However, when we included repeat sequences in the analysis, the null distribution did not follow the Gumbel distribution as a consequence of extremely low interaction energy (Fig. 1B, D, F). These results suggest that we can assess statistical significance of minimal interaction energies based on the Gumbel distribution when repeat sequences are masked, but not when repeat sequences are included in the analyses. Accordingly, we excluded repeat sequences in the following analysis.

FIG. 1.

The influence of repeat sequences on the shape of the null distribution of minimum interaction energies. (A, B) The x-axis represents the minimum interaction energy and the y-axis represents its density. Black and gray represent the empirical distribution and Gumbel distribution, respectively. We sampled 5000 values from a Gumbel distribution whose parameters were estimated from real data using the moment method and drew the density distribution from the 5000 values. Repeat sequences were (A) masked or (B) included in the analysis. (C, D) The QQ plot between the empirical distribution and the Gumbel distribution. The x-axis represents minimum interaction energies of the empirical distribution and the y-axis represents those of the Gumbel distribution. If the points lie on the line y = x, the empirical distribution follows the Gumbel distribution. Repeat sequences were (C) masked or (D) included in the analysis. (E, F) The plot of a log-log transform of the empirical distribution function, where F(x) denotes the empirical distribution function. The x-axis represents minimum interaction energies and the y-axis represents the log-log transform of the empirical distribution function. If the plot is a straight line, the distribution is close to a Gumbel distribution. Repeat sequences were (E) masked or (F) included in the analysis.
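The straight-line criterion used for panels (E, F) follows from the Gumbel cumulative distribution function given in Section 2.1. A short derivation, assuming that the log-log transform is taken as $-\ln(-\ln F(x))$ (the caption does not state the sign convention) and that x denotes the absolute value of the minimum interaction energy:

```latex
\begin{align*}
F(x) &= \exp\left[-\exp\left\{-\left(\frac{x-\mu}{\eta}\right)\right\}\right] \\
-\ln F(x) &= \exp\left\{-\left(\frac{x-\mu}{\eta}\right)\right\} \\
-\ln\left(-\ln F(x)\right) &= \frac{x-\mu}{\eta} = \frac{1}{\eta}\,x - \frac{\mu}{\eta}
\end{align*}
```

Under this convention the transformed values are linear in x with slope $1/\eta$ and intercept $-\mu/\eta$, so curvature in the plot indicates a departure from the Gumbel distribution.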

Second, we evaluated the effect of sequence length and maximal span W on the parameters of the Gumbel distribution. We first found that the null distribution does not follow the Gumbel distribution when the maximal span is small (W = 0 or 10; Supplementary Figs. S2–S4). This is probably explained by the following observations. Small maximal spans ignore accessibility energies and thus tend to form long, local base-pairing interactions (Supplementary Fig. S5). As the length of the interaction region with minimum interaction energy approaches the total sequence length, the minimum interaction energy may no longer be an extreme value from many independent samples. Therefore, we set the maximal span parameter to 20 or more in the following analysis. We verified that the null distribution follows the Gumbel distribution regardless of the sequence length and maximal span when the maximal span was set to 20 or more (Supplementary Figs. S2–S4).

Next, we investigated the Gumbel distribution parameters for 78 combinations of sequence lengths and maximal spans using the moment method (Tables 1 and 2). When we investigated the dependence of the parameters $\mu$ and $\eta$ on the sequence length with a fixed maximal span, we found that the parameters $\mu$ and $\eta$ have linear relationships with the logarithm of the sequence length and the untransformed sequence length, respectively (Fig. 2A, B). These results indicate that the Gumbel distribution parameters depend on sequence length, but they are accurately predictable from sequence lengths using linear regression.
On the other hand, when we estimated the dependence of Gumbel distribution parameters on maximal span with a fixed sequence length, we discovered that $\mu$ and $\eta$ monotonically decrease with increasing W (Fig. 2C, D). These results suggest that Gumbel distribution parameters are dependent on maximal span, but they are estimable by interpolation.

FIG. 2.

The effect of the maximal span parameter W and the sequence length on Gumbel distribution parameters. (A) The relationship between location parameter μ and the logarithm of the sequence length when W is 70. The x-axis represents the logarithm of the sequence length and the y-axis represents μ. (B) The relationship between scaling parameter η and the sequence length when W is 70. The x-axis represents the sequence length and the y-axis represents η. (C) The relationship between location parameter μ and the maximal span W when the sequence length is 400. The x-axis represents the maximal span and the y-axis represents μ. (D) The relationship between scaling parameter η and the maximal span W when the sequence length is 400. The x-axis represents the maximal span and the y-axis represents η.

Table 1.

Dependence of the Location Parameter $\mu$ on Maximal Span W and Sequence Length

	Sequence length
W	100	200	300	400	600	800
20	6.68	8.62	9.86	10.80	12.21	13.11
30	5.89	7.48	8.45	9.20	10.31	11.10
40	5.53	6.95	7.81	8.46	9.41	10.12
50	5.29	6.64	7.44	8.02	8.89	9.55
60	5.16	6.43	7.20	7.76	8.58	9.20
70	5.06	6.28	7.02	7.57	8.36	8.96
80	4.99	6.18	6.89	7.44	8.21	8.80
90	4.94	6.13	6.82	7.36	8.10	8.67
100	4.88	6.06	6.77	7.30	8.02	8.57
150	4.83	5.92	6.61	7.05	7.79	8.30
200	4.81	5.85	6.52	6.95	7.66	8.17
250	4.79	5.82	6.48	6.90	7.58	8.10
300	4.79	5.80	6.44	6.86	7.54	8.03

Table 2.

Dependence of the Scaling Parameter $\eta$ on Maximal Span W and Sequence Length

	Sequence length
W	100	200	300	400	600	800
20	1.534	1.796	1.834	2.008	2.150	2.235
30	1.307	1.443	1.477	1.619	1.759	1.798
40	1.206	1.281	1.335	1.413	1.532	1.600
50	1.142	1.207	1.251	1.334	1.418	1.456
60	1.109	1.158	1.184	1.274	1.328	1.364
70	1.081	1.117	1.145	1.201	1.286	1.330
80	1.068	1.099	1.127	1.172	1.254	1.272
90	1.046	1.085	1.113	1.142	1.221	1.244
100	1.041	1.073	1.090	1.122	1.191	1.238
150	1.029	1.039	1.048	1.061	1.153	1.170
200	1.032	0.998	0.997	1.035	1.093	1.136
250	1.031	1.001	0.987	1.009	1.073	1.106
300	1.031	0.996	0.998	0.998	1.060	1.081

Third, we checked the influence of uneven lengths of two RNA sequences on the Gumbel distribution parameters. Supplementary Table S1 shows the estimated parameters $\eta$ and $\mu$ for sequence length pairs such that the product of the two sequence lengths is uniform. We discovered that parameter $\eta$ becomes larger as the length inequality between the two sequences increases, while parameter $\mu$ is independent of the inequality.
This means that the statistical significance of small and large minimum interaction energies tends to be underestimated and overestimated, respectively, when there is a large difference between two RNA sequence lengths. However, we did not consider this bias when developing our statistical significance assessment method, and further research will be required to address it.

3.2. The calculation method of statistical significance

Based on the above analysis results, we constructed a method to calculate the p-value for the minimal interaction energy in RNA-RNA interactions between long RNAs as follows. We defined N and M as the lengths of two RNA sequences and e as the minimum interaction energy. In addition, we defined W as the maximal span parameter used when assessing RNA-RNA interactions. Additionally, \documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\usepackage{upgreek}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document} $${ \mu _{w , l}}$$ \end{document} and \documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\usepackage{upgreek}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document} $${ \eta _{w , l}}$$ \end{document} were defined with the parameters \documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\usepackage{upgreek}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document} $$\mu$$ \end{document} and \documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\usepackage{upgreek}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document} $$\eta$$ \end{document} describing the null distribution when the maximal span is w, and the square root of the product of two RNA sequence lengths is l. 
First, we calculated 12 values ( \documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\usepackage{upgreek}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document} $${ \mu _{W , 100}}$$ \end{document} , \documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\usepackage{upgreek}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document} $${ \mu _{W , 200}}$$ \end{document} , …, \documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\usepackage{upgreek}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document} $${ \mu _{W , 800}}$$ \end{document} , \documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\usepackage{upgreek}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document} $${ \eta _{W , 100}}$$ \end{document} , …, \documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\usepackage{upgreek}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document} $${ \eta _{W , 800}}$$ \end{document} ) from the values in Tables 1 and 2 using the linear interpolation method. In this study, if W is smaller than 20, the corresponding p-value is not calculated. 
If W is larger than 300, we used the parameters corresponding to W = 300. Second, we calculated regression parameters of the linear regression formula modeling the relationship between log l and $\mu_{W,l}$ ($\mu_{W,l} = a \log l + b$, where a and b are regression parameters) and those for modeling the relationship between l and $\eta_{W,l}$ ($\eta_{W,l} = cl + d$, where c and d are regression parameters) based on the 12 values calculated above.
Finally, we obtained $\mu_{W,\sqrt{MN}}$ and $\eta_{W,\sqrt{MN}}$ using the linear regression formulas and calculated p-values for minimum interaction energy according to

$$p\left(X \le e\right) = 1 - \exp\left[-\exp\left\{-\left(\frac{e - \mu_{W,\sqrt{MN}}}{\eta_{W,\sqrt{MN}}}\right)\right\}\right].$$
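The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the RIblast_pv implementation: the function names `gumbel_params` and `p_value`, the table layout, and the use of NumPy's `interp` and `polyfit` are choices made for this sketch, the natural logarithm is assumed for log l, and the fitted values of Tables 1 and 2 must be supplied by the caller.

```python
import math
import numpy as np


def gumbel_params(W, M, N, spans, lengths, mu_table, eta_table):
    """Return (mu, eta) for maximal span W and sequence lengths M and N.

    mu_table[i][j] and eta_table[i][j] are the Gumbel parameters fitted for
    spans[i] and lengths[j]; spans must be sorted in increasing order.
    """
    if W < 20:
        raise ValueError("p-values are not calculated for W < 20")
    W = min(W, 300)  # spans above 300 reuse the parameters for W = 300
    lengths = np.asarray(lengths, dtype=float)
    mu_table = np.asarray(mu_table, dtype=float)
    eta_table = np.asarray(eta_table, dtype=float)

    # Step 1: linear interpolation over W, one value per tabulated length.
    # np.interp clamps to the end values outside the tabulated span range.
    mu_W = np.array([np.interp(W, spans, col) for col in mu_table.T])
    eta_W = np.array([np.interp(W, spans, col) for col in eta_table.T])

    # Step 2: least-squares fits mu = a*log(l) + b and eta = c*l + d.
    a, b = np.polyfit(np.log(lengths), mu_W, 1)
    c, d = np.polyfit(lengths, eta_W, 1)

    # Step 3: evaluate both fits at the effective length l = sqrt(M*N).
    l = math.sqrt(M * N)
    return a * math.log(l) + b, c * l + d


def p_value(e, mu, eta):
    """Evaluate 1 - exp(-exp(-(e - mu) / eta)) for interaction energy e."""
    z = min(-(e - mu) / eta, 700.0)  # cap the exponent to avoid overflow
    return -math.expm1(-math.exp(z))  # numerically stable 1 - exp(-exp(z))
```

Under the formula as written, p approaches 0 for strongly negative e only when η is negative, so the sign of the tabulated η values matters when supplying `eta_table`.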

We evaluated the correctness of our p-value calculation method. Figure 3A–D shows QQ plots comparing the theoretical uniform distribution with the distribution of p-values obtained from the random dataset for various maximal span values. The points lie on the diagonal, which indicates that the obtained p-values follow a uniform distribution for all maximal span values.

FIG. 3.

QQ plots between the empirical distribution of p-values and the uniform distribution. The x-axis represents calculated p-values, and the y-axis represents values sampled from the uniform distribution. If the points lie on the line y = x, the empirical distribution of p-values follows the uniform distribution. The maximal span values were (A) 50, (B) 100, (C) 150, and (D) 200.
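A calibration check of this kind can be sketched as follows. The function names and the simulated p-values are hypothetical, and the sketch uses deterministic plotting positions (i − 0.5)/n for the uniform quantiles rather than sampled uniform values, which is an assumption and may differ from how Figure 3 was drawn.

```python
import numpy as np


def qq_points(p_values):
    """Return (empirical, theoretical) quantiles for a QQ plot against U(0, 1)."""
    empirical = np.sort(np.asarray(p_values, dtype=float))
    n = empirical.size
    # Plotting positions (i - 0.5) / n stand in for the uniform quantiles.
    theoretical = (np.arange(1, n + 1) - 0.5) / n
    return empirical, theoretical


def max_qq_deviation(p_values):
    """Largest distance between paired quantiles, i.e. from the line y = x."""
    empirical, theoretical = qq_points(p_values)
    return float(np.max(np.abs(empirical - theoretical)))


rng = np.random.default_rng(0)
calibrated = rng.random(5000)  # stand-in for well-calibrated p-values
skewed = calibrated ** 2       # stand-in for p-values that are too small
print(max_qq_deviation(calibrated), max_qq_deviation(skewed))
```

A small maximum deviation corresponds to points lying close to the diagonal, whereas the skewed sample departs from it clearly.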

The source code for the p-value calculation method is freely available at https://github.com/fukunagatsu/RIblast_pv

3.3. Detection of novel 5′-3′ untranslated region interactions

We showcase one application that demonstrates the usefulness of our method for assessing the statistical significance of RNA-RNA interactions between long RNAs. While eukaryotic mRNAs generally form circular structures through the interaction between the 5′ cap structure and the 3′ poly-A tail (Wells et al., 1998), some mRNAs have another circularization mechanism: a 5′-3′ UTR interaction based on complementary base pairings. Although this mechanism has been frequently reported in RNA viruses (Nicholson and White, 2014) and bacteria (de los Mozos et al., 2013), there are few reports in eukaryotes. The only experimentally confirmed example among human mRNAs is the TP53 mRNA, in which the 5′-3′ UTR interaction region exerts translational regulation (Chen and Kastan, 2010). In the present study, we tried to detect novel human 5′-3′ UTR interactions using our method.

When we controlled the FDR using the Benjamini–Hochberg method ($\alpha < 0.05$), two significant 5′-3′ UTR pairs were detected in the positive dataset: SPRR3 (q-value = 0.00032) and transmembrane protein 74B (Tmem74b) (q-value = 0.01499). SPRR3 is a component of epidermal differentiation complexes and is specifically expressed in oral and esophageal epithelia (Hohl et al., 1995; Kypriotou et al., 2012). In contrast, Tmem74b has not been heavily studied. Unfortunately, the TP53 transcript, which is the only positive example of human UTR-UTR interactions, could not be detected using our method. Figure 4 shows the predicted interactions in these two transcripts; the distances between the two interaction regions of SPRR3 and Tmem74b are 756 and 1084 nucleotides, respectively. We then investigated sequence conservation of the interaction regions using multiple alignments of 100 vertebrate genomes provided by the UCSC genome browser database (Tyner et al., 2017). Both interaction regions of SPRR3 were aligned with the rhesus monkey and marmoset genomes, but were not aligned with genomes of species that are more remotely related than the mouse. We found that these regions have some base-pair substitutions that preserve base pairings among primates (Fig. 4A). This evolutionary constraint suggests that the interaction actually occurs in vivo and has a biological function. In contrast, the interaction regions of Tmem74b were not aligned with the genomes of other organisms.
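For reference, the Benjamini–Hochberg adjustment that turns p-values into q-values can be sketched as below. This is a generic illustration of the procedure, not the code used in the study; the function name `bh_qvalues` and the example p-values are hypothetical.

```python
import numpy as np


def bh_qvalues(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale the i-th smallest p-value by n / i.
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity by taking running minima from the largest rank down.
    q_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(n)
    q[order] = np.minimum(q_sorted, 1.0)
    return q


q = bh_qvalues([0.01, 0.04, 0.03, 0.005])
significant = q < 0.05  # pairs retained at an FDR threshold of 0.05
```

A UTR pair is reported as significant when its q-value falls below the chosen threshold.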

FIG. 4.

Visualization of statistically significant UTR-UTR interactions. (A) The predicted interaction associated with the small proline-rich repeat protein 3. Rhesus and marmoset sequences were derived from the UCSC genome browser database. Nucleotides in gray represent mutation sites from human sequences. (B) The predicted interaction associated with transmembrane protein 74B. These regions could not be aligned with genomes of other organisms. UTR, untranslated region.
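To make the notion of substitutions that preserve base pairings concrete, the following sketch classifies each base pair of an interaction when a second species is compared with a reference. The function name, the labels, and the toy sequences used with it are hypothetical and are not the SPRR3 sequences; treating the G-U wobble pair as a valid pair is also an assumption.

```python
# Watson-Crick pairs plus the G-U wobble pair (counting wobble is an assumption).
PAIRING = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def classify_pair_changes(ref_5, ref_3, other_5, other_3):
    """Classify each base pair of an interaction in a second species.

    Position i of the 5' strings pairs with position i of the 3' strings, so
    the 3'-UTR segments must already be written in pairing order.
    """
    if not (len(ref_5) == len(ref_3) == len(other_5) == len(other_3)):
        raise ValueError("all four segments must have the same length")
    labels = []
    for r5, r3, o5, o3 in zip(ref_5, ref_3, other_5, other_3):
        changed = (r5 != o5) + (r3 != o3)
        if changed == 0:
            labels.append("unchanged")
        elif (o5, o3) not in PAIRING:
            labels.append("disrupted")
        elif changed == 2:
            labels.append("compensatory")  # both bases changed, pair kept
        else:
            labels.append("consistent")    # one base changed, pair kept
    return labels


labels = classify_pair_changes("GCAU", "CGUA", "GUAC", "CGCG")
```

Positions labeled "compensatory" or "consistent" are the kind of substitution that keeps the duplex intact despite sequence change.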

4. Discussion and Conclusion

We developed a p-value calculation method for assessing predicted interactions between two long RNAs. Our method uses the Gumbel distribution, which matches the empirical null distribution of minimum interaction energies of RNA-RNA interactions. Using our method, we comprehensively predicted human 5′-3′ UTR interactions and found that the UTR interaction in SPRR3 is conserved among primates, with some substitutions preserving base pairings.

While we focused on minimum interaction energy in this research, recent studies have reported that the summation of interaction energies of all detected local interactions between two RNAs is an effective predictor of long ncRNA interactions (Terai et al., 2016; Iwakiri et al., 2017). These studies emphasized the importance of a statistical significance assessment method for the summed interaction energies. Karlin and Altschul developed a p-value calculation method for the sum of the local sequence alignment scores of multiple regions (Karlin and Altschul, 1993). However, their method assumes that each segment is independent of the other segments, so it cannot be applied directly to RNA-RNA interaction predictions, in which a region may interact with multiple sites. More work is needed to overcome this complication.

Although our method identified two human 5′-3′ UTR interactions, we could not recover the known TP53 5′-3′ UTR interactions. This is because the interaction region in TP53 is relatively short and thus the interaction energy is relatively large (Chen and Kastan, 2010). In addition, we excluded repeat sequences in the analysis, but there may be functional UTR-UTR interactions occurring through the repeat sequences of expressed transcripts. As such, we presume that functional UTR-UTR interactions are more widespread in the human transcriptome than indicated by the analysis in this research. The development of a more sensitive bioinformatic method for RNA-RNA interaction predictions is an important research topic.

Acknowledgments

This work was supported by JSPS KAKENHI, Grant Numbers JP16J00129 and JP17H05605 to T.F. and JP16H05879 to M.H. Computations in this research were performed using the supercomputing facilities at the National Institute of Genetics in Research Organization of Information and Systems.

Author Disclosure Statement

The authors declare that no competing financial interests exist.

References

Alkan, Wenzel, Palasca, et al. 2017. RIsearch2: Suffix array-based large-scale prediction of RNA-RNA interactions and siRNA off-targets. Nucleic Acids Res. 45, e60.

Altschul S.F., and Erickson B.W. 1986. A nonlinear measure of subalignment similarity and its significance levels. Bull. Math. Biol. 48, 617–632.

Ameres S.L., and Zamore P.D. 2013. Diversifying microRNA sequence and function. Nat. Rev. Mol. Cell Biol. 14, 475–488.

Bachellerie J.P., Cavaillé, and Hüttenhofer. 2002. The expanding snoRNA world. Biochimie 84, 775–790.

Benjamini, and Hochberg. 1995. Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. R. Stat. Soc. Series B Stat. Methodol. 57, 289–300.

Bernhart S.H., Hofacker I.L., and Stadler P.F. 2006. Local RNA base pairing probabilities in large sequences. Bioinformatics 22, 614–615.

Cabili M.N., Trapnell C., Goff, et al. 2011. Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. Genes Dev. 25, 1915–1927.

Chen, and Kastan M.B. 2010. 5′-3′-UTR interactions regulate p53 mRNA translation and provide a target for modulating p53 induction after DNA damage. Genes Dev. 24, 2146–2156.

Clark M.B., Amaral P.P., Schlesinger F.J., et al. 2011. The reality of pervasive transcription. PLoS Biol. 9, 1000625.

de los Mozos I.R., Vergara-Irigaray, Segura, et al. 2013. Base pairing interaction between 5′- and 3′-UTRs controls icaR mRNA translation in Staphylococcus aureus. PLoS Genet. 9, 1004001.

Engreitz J.M., Sirokman, McDonel, et al. 2014. RNA-RNA interactions enable specific targeting of noncoding RNAs to nascent pre-mRNAs and chromatin sites. Cell 159, 188–199.

Frith M.C. 2010. A new repeat-masking method enables specific detection of homologous sequences. Nucleic Acids Res. 39, e23.

Fukunaga, and Hamada. 2017. RIblast: An ultrafast RNA-RNA interaction prediction system based on a seed-and-extension approach. Bioinformatics 33, 2666–2674.

Gong, and Maquat L.E. 2011. lncRNAs transactivate STAU1-mediated mRNA decay by duplexing with 3′ UTRs via Alu elements. Nature 470, 284–288.

Guil, and Esteller. 2015. RNA-RNA interactions in gene regulation: The coding and noncoding players. Trends Biochem. Sci. 40, 248–256.

Harrow, Frankish, Gonzalez J.M., et al. 2012. GENCODE: The reference human genome annotation for The ENCODE project. Genome Res. 22, 1760–1774.

Hohl, de Viragh P.A., Arniguet-Baras, et al. 1995. The small proline-rich proteins constitute a multigene family of differentially regulated cornified cell envelope precursor proteins. J. Invest. Dermatol. 104, 902–909.

Hon C.-C., Ramilowski J.A., Harshbarger, et al. 2017. An atlas of human long non-coding RNAs with accurate 5′ ends. Nature 543, 199–204.

Iwakiri, Terai, and Hamada. 2017. Computational prediction of lncRNA-mRNA interactions by integrating tissue specificity in human transcriptome. Biol. Direct 12, 15.

Johnson, and Guigó. 2014. The RIDL hypothesis: Transposable elements as functional domains of long noncoding RNAs. RNA 20, 959–976.

Karlin, and Altschul S.F. 1993. Applications and statistics for multiple high-scoring segments in molecular sequences. Proc. Natl. Acad. Sci. U. S. A. 90, 5873–5877.

Kato, Mori, Sato, et al. 2017. An accessibility-incorporated method for accurate prediction of RNA-RNA interactions from sequence data. Bioinformatics 33, 202–209.

Kiryu, Kin, and Asai. 2008. Rfold: An exact algorithm for computing local base pairing probabilities. Bioinformatics 24, 367–373.

Kiryu, Terai, Imamura, et al. 2011. A detailed investigation of accessibilities around target sites of siRNAs and miRNAs. Bioinformatics 27, 1788–1797.

Kypriotou, Huber, and Hohl. 2012. The human epidermal differentiation complex: Cornified envelope precursors, S100 proteins and the fused genes family. Exp. Dermatol. 21, 643–649.

Lange S.J., Maticzka, Möhl, et al. 2012. Global or local? Predicting secondary structure and accessibility in mRNAs. Nucleic Acids Res. 40, 5215–5226.

Lu, Zhang Q.C., Lee, et al. 2016. RNA duplex map in living cells reveals higher-order transcriptome structure. Cell 165, 1267–1279.

Madhani H.D., and Guthrie. 1994. Dynamic RNA-RNA interactions in the spliceosome. Annu. Rev. Genet. 40, 248–256.

Mann, Wright P.R., and Backofen. 2017. IntaRNA 2.0: Enhanced and customizable prediction of RNA-RNA interactions. Nucleic Acids Res. W1, W435–W439.

Nicholson B.L., and White K.A. 2014. Functional long-range RNA-RNA interactions in positive-strand RNA viruses. Nat. Rev. Microbiol. 12, 493–504.

Nguyen T.C., Cao, Yu, et al. 2016. Mapping RNA-RNA interactome and RNA structure in vivo by MARIO. Nat. Commun. 7, 12023.

Rehmsmeier, Steffen, Höchsmann, et al. 2004. Fast and effective prediction of microRNA/target duplexes. RNA 10, 1507–1517.

Smith T.F., Waterman M.S., and Burks. 1985. The statistical distribution of nucleic acid similarities. Nucleic Acids Res. 13, 645–656.

Tafer, Amman, Eggenhofer, et al. 2011. Fast accessibility-based prediction of RNA-RNA interactions. Bioinformatics 27, 1934–1940.

Terai, Iwakiri, Kameda, et al. 2016. Comprehensive prediction of lncRNA-RNA interactions in human transcriptome. BMC Genomics 17, 12.

Tjaden, Goodwin S.S., Opdyke J.A., et al. 2006. Target prediction for small, noncoding RNAs in bacteria. Nucleic Acids Res. 34, 2791–2802.

Tyner, Barber G.P., Casper, et al. 2017. The UCSC Genome Browser database: 2017 update. Nucleic Acids Res. 45, 626–634.

Wells S.E., Hillner P.E., Vale R.D., et al. 1998. Circularization of mRNA by eukaryotic translation initiation factors. Mol. Cell 2, 135–140.

Wright R.R., Richter A.S., Papenfort, et al. 2013. Comparative genomics boosts target prediction for bacterial small RNAs. Proc. Natl. Acad. Sci. U. S. A. 110, 3487–3496.
