In this section, we will introduce the RNA mutually constrained folding problem, which predicts two alternative RNA structures of the minimum sum of free energies with the consideration of step-wise folding constraint effect. We will first present the basic problem formulation in section 2.1, and then present a straightforward solution to the problem using Nussinov's energy model in section 2.2. We will further introduce a more realistic algorithm with sophisticated Turner's energy model and an improved time complexity in section 2.3. Finally, we will discuss how we handle reactivities and pseudoenergies in section 2.4.
2.1. Problem formulation
We begin the formulation of the RNA mutually constrained folding problem by introducing the inputs of the algorithm. The algorithm requires three inputs: (1) the RNA sequence S with length l, from which the alternative structures are to be predicted; (2) a set of observed mixture reactivities
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$R = \{r_0 , r_1 , \ldots , r_{l - 1} \}$$
\end{document}
, where ri is the reactivity for the ith nucleotide; and (3) the expected partition of these two alternative structures ψ, where 0 ≤ ψ ≤ 1.
The outputs of the algorithm are two simultaneously predicted RNA structures TA and TB. We expect that the predicted structures are thermodynamically stable, or, the sum of the real free energies of the two structures, say E⋅(TA) + E⋅(TB) (a solid dot is used to indicate real energy), is minimized. At the same time, we also expect that the discrepancy between the expected and the observed reactivity profile, say |R−mixture(R(TA), R(TB))|, is minimized. Here, R(TA) refers to the expected reactivity profile of structure TA alone. The discrepancy of the reactivity profiles is usually quantified by the pseudoenergy (Deigan et al., 2009; Low and Weeks, 2010), and we adopt this measurement as well. For example, let TA(i) be the structural configuration of the ith nucleotide in TA, and assume TA(i) is paired. If the probing enzyme/chemical reagent is not reactive to the paired nucleotides, we should expect rather low reactivity observed at this nucleotide, that is,
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$\hat{r}_i = 0$$
\end{document}
. The following formula (reformulated from Deigan et al., 2009 and Low and Weeks, 2010) has been proposed to compute the pseudoenergy E∘ (a void dot is used to indicate pseudoenergy):
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
\begin{align*}E^{\circ} (\mid r_i - \hat{r}_i \mid) = m * \log
(\mid r_i - \hat{r}_i \mid + 1) + b , \tag{1}\end{align*}
\end{document}
where m is positive and b is a negative constant. If
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$\mid r_i - \hat{r}_i \mid = 0$$
\end{document}
(i.e., no discrepancy),
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$E^{\circ} (\mid r_i - \hat{r}_i \mid) = b$$
\end{document}
, and a favorable pseudoenergy is returned to indicate that the assumption of TA(i) being paired is correct. If
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$\mid r_i - \hat{r}_i \mid > 0 , E^{\circ} (\mid r_i - \hat{r}_i \mid) > b$$
\end{document}
, and a less favorable pseudoenergy is returned to question the assumption. We can simplify the notation of pseudoenergy as E∘(ri) since
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$\hat{r}$$
\end{document}
is known. We also assume the pseudoenergies to be additive as the real energies, that is,
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$E^{\circ} (T_{i , j}) = \sum^j_{k = i}E^{\circ} (r_k)$$
\end{document}
. In this case, the pseudo energy can be incorporated into the real free energy, that is, E(TA) = E⋅(TA) + E∘(TA), and our algorithm will minimize E(TA) + E(TB) to simultaneously consider thermodynamic stability and reactivity discrepancy. Now, we can formally define the RNA mutually constrained folding problem as follows:
Input: an RNA sequence S, a set of mixture reactivities R, and an expected partition ψ.
Output: two RNA structures TA and TB of S, such that the sum of free energies (including both real and pseudo energies) E(TA) + E(TB) is minimized.
The key to the solution of this problem is based on the understanding of how TA and TB are mutually constrained, that is, how the folding of one structure may exert constraint on, or be constrained by, the other structure. Recall that ri is the observed reactivity at the ith nucleotide, and let it be a mixture (with a ratio ψ) of the reactivity from structure A (defined as
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$r_i^A$$
\end{document}
) and structure B (defined as
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$r_i^B$$
\end{document}
). In other words,
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$r_i = f (\psi , r_i^A , r_i^B)$$
\end{document}
. Also assume that we can represent the observed reactivity
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$r_i^B$$
\end{document}
by its expectation
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$\hat{r}_i^B$$
\end{document}
, which is determined by the structural configuration TB(i), i.e.,
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$r_i^B = \hat{r}_i^B = g (T^B (i))$$
\end{document}
. Therefore,
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$r_i = f_g (\psi , r_i^A , T^B (i))$$
\end{document}
and finally,
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$r_i^A = f^{\prime}_g (\psi , r_i , T^B (i))$$
\end{document}
. For the sake of simplicity, we will write
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$r_i^A = f (T^B (i))$$
\end{document}
, as ψ and ri are not variables and are given as the inputs. In this case, the pseudoenergy used for folding TA is affected by the structure of TB and vise versa. In this case, TA and TB are mutually constrained.
With the understanding of mutual constraints, we can easily devise a brute force solution, which takes exponential time. We can enumerate all possible TB as constraints and fold TA correspondingly. The pair of TA and TB that results in the minimum sum of free energies can be taken as the optimal solution. However, this approach is computationally expensive, as there are an exponential number of possible structures that are to be enumerated as the structural constraint TB. To resolve this issue, we can break down the problem into smaller subproblems using dynamic programming formulation. Let
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$(i_1 \ldots j_1) ^A$$
\end{document}
be a substructure in TA (which begins with Si1 and ends with Sj1), and
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$(i_2 \ldots j_2) ^B$$
\end{document}
be a substructure in TB (which begins with Si2 and ends with Sj2). These two substructures may or may not overlap each other. If the optimal structures (with the minimum sum of energies) adopted by both subregions, say Ti1,j1;i2,j2, are known, we can use them as structural constraints to fold nearby RNA sequences. We can reach the final solution using this approach by extending i1, j1 to 0 and i2, j2 to l−1.
To guarantee optimality, we need to consider all possible constraining orders when computing Ti1,j1;i2,j2. For example, let “−” indicate a void region that exerts no structural constraint, T−,−;i2,k2 → Ti1,j1;i2,k2 → Ti1,j1;i2,j2 (where i2 ≤ k2 ≤ j2) represents the following constraining order: (1)
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$(i_2 \ldots k_2) ^B$$
\end{document}
is folded without constraint, (2)
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$(i_1 \ldots j_1) ^A$$
\end{document}
is folded by applying
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$(i_2 \ldots k_2) ^B$$
\end{document}
as a structural constraint, and (3)
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$(k_2 + 1 \ldots j_2) ^B$$
\end{document}
is folded by applying
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$(i_1 \ldots j_1) ^A$$
\end{document}
as a structural constraint. Other constraining orders are possible as well. For example, T−,−;i2,k2 → Ti1,k1;i2,k2 → Ti1,k1;i2,j2 → Ti1,j1;i2,j2 (where i1 ≤ k1 ≤ j1) can act as an alternative constraining order to be traversed during the computation of Ti1,j1;i2,j2. In summary, all subregions (Ti1,j1;i2,j2) and all constraining orders need to be taken into account in the algorithm. We present a combinatorial solution using the Nussinov's energy model (Nussinov et al., 1978) in the following section.
2.2. An algorithm with Nussinov's energy model
In this section, we will introduce a solution for the RNA mutually constrained folding problem using Nussinov's energy model (Nussinov et al., 1978) to facilitate the understanding of the major idea. The object function of Nussinov's RNA folding formulation is to maximize the number of base pairs, instead of minimizing the free energy, of the predicted RNA structure. Therefore, denote F(i1,j1;i2,j2) as the maximum number of base pairs within the subregions
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$(i_1 \ldots j_1) ^A$$
\end{document}
and
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$(i_2 \ldots j_2) ^B$$
\end{document}
, where
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$F (i_1 , j_1; i_2 , j_2) = F (T_{i_1 , j_1; i_2 , j_2}^A) + F (T_{i_1 , j_1; i_2 , j_2}^B)$$
\end{document}
, and
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$T_{i_1 , j_1; i_2 , j_2}^A$$
\end{document}
is the structure for the subregion
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$(i_1 \ldots j_1) ^A$$
\end{document}
of Ti1,j2;i2,j2. Note that F also contains both real and pseudo base pairs, that is, F = F⋅ + F∘. Also denote
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$F^A (i_1 , j_1 , T_{i_2 , j_2}^B)$$
\end{document}
as the maximal number of base pairs within the subregions
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$(i_1 \ldots j_1) ^A$$
\end{document}
given the structure
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$T_{i_2 , j_2}^B$$
\end{document}
as a structural constraint. For the sake of clarity, we underline the terms that correspond to the terminal cases, whose values can be directly computed or looked up. We can compute F(i1,j1;i2,j2) using the following recursive function:
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
\begin{align*}
& F (i_1 , j_1; i_2 , j_2) \\& = \max \begin{cases}0 \quad [ {\rm
if} \ i_1 = j_1 \ {\rm and} \ i_2 = j_2 ] , \\ F (i_1 + 1 , j_1
\,- \, 1; i_2 , j_2) \, + \, 1 \, + \, \underline{F^{\circ}} (f
(T_{i_1 + 1 , j_1 - 1; i_2 , j_2}^B (i_1))) +
\underline{F^{\circ}} (f (T_{i_1 + 1 , j_1 - 1; i_2 , j_2}^B
(j_1))) [{\rm if} \ i_1 \ pairs with \ j_1 ] ,
\\ F (i_1 , j_1; i_2 + 1 , j_2 - 1) \, + \, 1 \, +\,
\underline{F^{\circ}} (f (T_{i_1 , j_1; i_2 + 1 , j_2 - 1}^A
(i_2))) + \underline{F^{\circ}} (f (T_{i_1 , j_1; i_2 + 1 , j_2 -
1}^A (j_2))) [{\rm if} \ i_2 \ {pairs with} \ j_2 ] ,
\\ \max_{i_1 < k_1 \leq j_1} \{F (k_1 + 1 , j_1; i_2 , j_2) +
F^A (i_1 , k_1 , T_{{k_1 + 1 , j_1; i_2 , j_2}}^B) \} , \\
\max_{i_1 \leq k_1 < j_1} \{F (i_1 , k_1 - 1; i_2 , j_2) + F^A
(k_1 , j_1 , T_{{i_1 , k_1 - 1; i_2 , j_2}}^B) \} , \\ \max_{i_2
< k_2 \leq j_2} \{F (i_1 , j_1; k_2 + 1 , j_2) + F^B (i_2 , k_2 ,
T_{{i_1 , j_1; k_2 + 1 , j_2}}^A) \} , \\ \max_{i_2 \leq k_2 <
j_2} \{F (i_1 , j_1; i_2 , k_2 - 1) + F^B (k_2 , j_2 , T^A_{i_1 ,
j_1; i_2 , k_2 - 1}) \}. \end{cases} \tag{2}
\end{align*}
\end{document}
The first case described in Equation (2) corresponds to a boundary case where no base pair is formed. The second and third cases correspond to paired cases, where the outmost nucleotides (i1 and j1, or i2 and j2, respectively) form a base pair. In this case, “1” is added to indicate the base pair that has formed, and how well the observed reactivity supports the pair is evaluated by pseudo base pairs (
F
∘
s). The last four cases try all possible branching points with different constraining orders. Take the fourth case as an example; the last added structural component (i.e.,
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$(i_1 \ldots k_1) ^A$$
\end{document}
) will be predicted using the existing optimal substructure (i.e.,
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$T_{{k_1 + 1 , j_1; i_2 , j_2}}^B$$
\end{document}
) as a constraint by the traditional Nussinov's folding algorithm (Nussinov et al., 1978) with soft constraints.
The optimal structural configuration for a region with a structural constraint, for example,
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$F^A (i , j , T_{{i_1 , j_1; i_2 , j_2}}^B)$$
\end{document}
[similar for
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$F^B (i , j , T_{{i_1 , j_1; i_2 , j_2}}^A)$$
\end{document}
], can be computed as follows:
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
\begin{align*} & F^A (i , j , T_{{i_1 , j_1; i_2 , j_2}}^B) \\ &
= \max \begin{cases}0 \quad\quad [ {\rm if} \ i = j ] , \\ F^A (i
+ 1 , j - 1 , T_{{i_1 , j_1; i_2 , j_2}}^B) + 1 +
\underline{F^{\circ}} (f (T_{{i_1 , j_1; i_2 , j_2}}^B (i))) +
\underline{F^{\circ}} (f (T_{{i_1 , j_1; i_2 , j_2}}^B (j))) \quad
[ {\rm if} \ i \ {pairs with} \ j ] , \\ \max_{i < k \leq j}
\{F^A (i , k - 1 , T_{{i_1 , j_1; i_2 , j_2}}^B) + F^A (k , j ,
T_{i_1 , j_1; i_2 , j_2}^B) \} , \end{cases} & (3) \end{align*}
\end{document}
A direct implementation of this algorithm leads to an O(l8) time complexity and O(l5) space complexity. Indeed, to complete the algorithm, we need to fill up a four-dimensional dynamic programming table F, which requires O(l4) time. For each entry in F, O(l) time is used for traversing all branching k1 and k2, and O(l3) is used to compute the constrained folding FA and FB. Therefore, the overall time complexity would be O(l8). The space complexity is O(l5). Note that the algorithm needs to maintain the four-dimensional table F, in addition, and O(l) space is also required for each entry of F to record the corresponding optimal structure that would be used as a structural constraint in the future folding steps. Hence the overall space complexity is O(l5).
2.3. An improved algorithm with Turner's energy model
The time and space complexity of the previous algorithm are prohibitively high and are not feasible for most real RNA sequences. Therefore, we need to devise a more efficient algorithm. At the same time, we need to consider the more realistic Turner's energy model (Turner et al., 1988). Inspired by the idea of RNAscf (Bafna et al., 2006), we observe that the major scaffolds of RNA secondary structures can be represented by stacks. A stack, built from a number of continuously nested base pairs, form the regular A-form helix of the RNA structure that stabilizes the structure. Note that we only consider the significant stacks, that is, those with more than four base pairs and eight hydrogen bonds, as the number of these significant stacks is usually small and less than the length of the RNA sequence (Bafna et al., 2006). Therefore, at each folding step we will add a stack or a structural component enclosed by a stack. Thus we can achieve significant speedup compared to the previous algorithm with base-pair resolution.
We begin the exposition of the algorithm by introducing basic definitions of stacks and their relationships. An RNA structure can be represented by a set of significant stacks; denote the set as
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$${\cal P}$$
\end{document}
. A stack p can be uniquely determined by three indices: the leftmost endpoint l(p), the rightmost endpoint r(p), and the width of the stack w(p). The nucleotides at l(p) and r(p) form the outmost (smallest 5′ and largest 3′ indices) base pair of p, while l(p) + w(p) − 1 and r(p) − w(p) + 1 form the innermost base pair of p. To simplify the notations, we also say that lI(p) and rI(p) form the innermost base pair of p. The stacks can be partially ordered by increasing rightmost endpoints and decreasing leftmost endpoints. With such partial ordering, we can denote the ith stack in
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$${\cal P}$$
\end{document}
as pi.
Let pi and pj be two stacks in
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$${\cal P}$$
\end{document}
and assume that i < j. If pi is enclosed by pj, that is, lI(pj) < l(pi) and rI(pj) > r(pi), denote their relationship as pi <
I
p
j
. If pi is juxtaposed to pj, that is, r(pi) < l(pj), denote their relationship as pi <
J
pj. If there is no stack pk such that pi < J pk and pk < J pj, we say that pi is directly before pj. Note that there may exist more than one stack that are directly before pj, therefore denote the stacks that are directly before pj as a set
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$${\cal F} (p_j)$$
\end{document}
. The size of the set
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$${\cal F} (p_j)$$
\end{document}
is expected to be a constant when only the significant stacks are considered (Bafna et al., 2006).
Since Turner's energy model also considers the free energies of loops (unpaired regions) formed between stacks, we define the loop regions as follows. Denote the hairpin loop formed by stack pi as L(pi), which refers to the region
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$(l_I (p_i) + 1 \ldots r_I (p_i) - 1)$$
\end{document}
. Denote the internal/bulge loops formed between stacks pi and pj as Ll(pi, pj) and Lr(pi, pj) if pi < I pj, which refer to the two regions
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$(l_I (p_j) + 1 \ldots l (p_i) - 1)$$
\end{document}
and
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$(r (p_i) + 1 \ldots r_I (p_j) - 1)$$
\end{document}
, respectively. If not specified, L(pi, pj) is used to represent both loops. Denote the multibranch loop formed between stacks pi and pj as L(pi, pj) if pi < J pj, which refers to the region
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$(r (p_i) + 1 \ldots l (p_j) - 1)$$
\end{document}
. Finally, we can also represent a loop region by explicitly giving the sequence region, for example,
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$L (i \ldots j)$$
\end{document}
is a loop starting from the ith nucleotide and ending with the jth nucleotide.
Let the minimum free energy of the regions enclosed by stacks pi and pj (including these two stacks) be E(pi; pj), and E = E⋅ + E∘. If we artificially add a stack p*, where l(p*) = 0, r(p*) = l − 1 and w(p*) = 0, we can retrieve the global optimal solution from E(p*;p*). For clarity, we explicitly write E(pi;pj) as
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$E (p_i^A; p_j^B)$$
\end{document}
to indicate that pi is presented in the structure A and pj is presented in the structure B. Also, denote
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$E_h (p_i^A; p_j^B)$$
\end{document}
as the minimum free energy for the most recent hairpin loop folding event,
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$E_l (p_i^A; p_j^B)$$
\end{document}
for the most recent internal/bulge loop folding event, and
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$E_m (p_i^A; p_j^B)$$
\end{document}
for the most recent multibranch loop folding event. Therefore:
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
\begin{align*}E (p_i^A; p_j^B) = \min \{0 , E_h (p_i^A; p_j^B) , E_l (p_i^A; p_j^B) , E_m (p_i^A; p_j^B) \} , \tag{4}\end{align*}
\end{document}
where the first case “0” is a boundary case where no structure is formed.
To compute the hairpin loop energy
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$E_h (p_i^A; p_j^B)$$
\end{document}
, denote
es
(p) as the free energy of a stack p, and
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$\underline{e_h} (L (p_i^A))$$
\end{document}
as the free energy for the hairpin loop
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$L (p_i^A)$$
\end{document}
(recall that the underlined terms indicate the terminal cases that can be directly computed or looked up). Denote
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$\underline{E^{uc}} (p_i^A)$$
\end{document}
as the minimum free energy for the stack
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p_i^A$$
\end{document}
and the region enclosed by it when folded without mutual constraint. The matrix
Euc
can be precomputed by using the traditional minimum free energy folding algorithms (Zuker and Sankoff, 1984; Hofacker et al., 1994; Reuter and Mathews, 2010), while the reactivities are used as soft constraints [extending the recursive function for computing
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$F^A (i , j , T_{{i_1 , j_1; i_2 , j_2}}^B)$$
\end{document}
with Turner's energy model]. Note that we only need to precompute the matrix once, and all required unconstrained folding results can be retrieved. Let the structure that corresponds to
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$\underline{E^{uc}} (p_i^A)$$
\end{document}
be
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$T^{uc} (p_i^A)$$
\end{document}
. For pseudoenergies, denote
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$\underline{E^{\circ}} (p_i^A; T^B)$$
\end{document}
as the pseudoenergy of adopting
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p_i^A$$
\end{document}
as a stack into the structure given the constraint TB, and
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$\underline{E^{\circ}} (L (p_i^A) ; T^B)$$
\end{document}
as the pseudoenergy of adopting the loop region given the constraint. (We do not consider loop pseudoenergy if both structures are unpaired at this region.) The recursive function for computing
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$E_h (p_i^A; p_j^B)$$
\end{document}
considers two cases, where (1)
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p_j^B$$
\end{document}
, or (2)
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p_i^A$$
\end{document}
is recently added as a hairpin loop:
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
\begin{align*}E_h (p_i^A; p_j^B) = \min \begin{cases} \underline{E^{uc}} (p_i^A) + \underline{E^{\circ}} (p_j^B; T_{{p_i^A}}^{uc}) + \underline{E^{\circ}} (L (p_j^B) ; T_{{p_i^A}}^{uc}) + \underline{e_s} (p_j^B) + \underline{e_h} (L (p_j^B)) , \\ \underline{E^{uc}} (p^B_j) + \underline{E^{\circ}} (p_i^A; T_{{p_j^B}}^{uc}) + \underline{E^{\circ}} (L (p_i^A) ; T_{{p_j^B}}^{uc}) + \underline{e_s} (p_i^A) + \underline{e_h} (L (p_i^A)) .\end{cases} & (5) \end{align*}
\end{document}
To consider the internal/bulge loop case, denote
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$\underline{e_l} (L (p_x^A , p_i^A))$$
\end{document}
as the free energy for the internal/bulge loop formed by
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p_x^A$$
\end{document}
and
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p_i^A$$
\end{document}
, if
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p_x^A <_{I} p_i^A$$
\end{document}
. The recursive function for computing
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$E_l (p_i^A; p_j^B)$$
\end{document}
considers two cases, where (1)
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p_j^B$$
\end{document}
or (2)
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p_i^A$$
\end{document}
is recently added as an internal/bulge loop:
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
\begin{align*}
E_l{(p_i^A; p_j^B)} = \min \begin{cases} \min_{p_y^B < _I p_j^B}
\{E (p_i^A; p_y^B) + \underline{E^o} (p_j^B; T_{{p^A_i{o ;}
p^B_y}}^A) + \underline{E^{o}} (L (p_y^B; p_j^B) ; T_{{p^A_i{o ;}
p^B_y}}^A) + \underline{e_s} (p_j^B) + \underline{e_l} (L (p_y^B ,
p_j^B)) \} ,\\ \min_{p_x^A < _I p_i^A} \{E (p_x^A; p_j^B) +
\underline{E^{o}} (p_i^A; T_{{p^A_x{o;} p^B_j}}^B) +
\underline{E^{o}} (L (p_x^A; p_i^A) ; T_{{p^A_x{o;} p^B_j}}^B) +
\underline{e_s} (p_i^A) + \underline{e_l} (L (p_x^A , p_i^A)) \}.
\end{cases} {(6)}\end{align*}
\end{document}
To compute the multibranch loop case, we have to introduce a new three-dimensional matrix Em1. Em1 stores the minimum free energy formed between an opened multibranch loop and a closed loop. The opened multibranch loop can be viewed as a chain, which is formally defined as a set of juxtaposing stacks and their enclosed structural components. Therefore, the entry
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$E_{m1} (p_i^A , p_x^A; p_j^B)$$
\end{document}
is the optimal structural configuration formed between the chain that is ended with
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p_x^A$$
\end{document}
and enclosed by
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p_i^A$$
\end{document}
(
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p_i^A$$
\end{document}
itself is NOT included in the chain), and the structural component that is enclosed by
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p_j^B$$
\end{document}
(where
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p_j^B$$
\end{document}
itself is included). Let
ema
be the multibranch loop closing penalty,
emb
be the unpaired region extension penalty (applied on the length of the loop L, |L|), and
emc
be the bonus free energy for adding a new branch. The recursive function for computing
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$E_m (p_i^A; p_j^B)$$
\end{document}
considers two cases, where (1)
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p_j^B$$
\end{document}
, or (2)
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p_i^A$$
\end{document}
is recently added as a multibranch loop:
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
\begin{align*}E_m (p^A_i , p^B_j) = \min
\begin{cases} \min_{p^B_y < _I p^B_j} \{E_{m1} (p^A_i; p^B_j ,
p^B_y) + \underline{E^{o}} (p^B_j , T^A_{p^A_i; p^B_j , p^B_y}) +
\underline{E^{o}} (L_r (p^B_y , p^B_j) , T^A_{p^A_i; p^B_j ,
p^B_y}) \\ \qquad \,\,+ \underline{e_s} (p^B_j) +
\underline{e_{ma}} + \underline{e_{mb}} * \mid L_r (p^B_y , p^B_j)
\mid + \underline{e_{mc}} \} , \\ \min_{p^A_x < _I p^A_i}
\{E_{m1} (p^A_i , p^A_x; p^B_j) + \underline{E^{o}} (p^A_i ,
T^B_{p^A_i , p^A_x; p^B_j}) + \underline{E^{o}} (L_r (p^A_x ,
p^A_i) , T^B_{p^A_i , p^A_x; p^B_j}) \\ \qquad \,\,+
\underline{e_s} (p^A_i) + \underline{e_{ma}} + \underline{e_{mb}}
* \mid L_r (p^A_x , p^A_i) \mid + \underline{e_{mc}} \}
.\end{cases} \tag {7}\end{align*}
\end{document}
To compute
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$E_{m1} (p^A_i , p^A_x; p^B_j)$$
\end{document}
, we introduce another matrix Em2 that corresponds to the minimum free energy configuration formed between two chains. The two chains that correspond to the entry
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$E_{m2} (p^A_i , p^A_x; p^B_j , p^B_y)$$
\end{document}
are the ones that ended with
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p^A_x$$
\end{document}
(enclosed by
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p^A_i$$
\end{document}
) and
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p^B_y$$
\end{document}
(enclosed by
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p^B_j$$
\end{document}
), respectively. Let the term ‘
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$E (p^A_i , p^B_j) .E^{\bullet} (A)$$
\end{document}
’ refer to the real free energy of the structural component formed in A as recorded in
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$E (p^A_i , p^B_j)$$
\end{document}
. For boundary cases, denote
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$\underline{E^{uc}_m} (p^A_i , p^A_x)$$
\end{document}
as the unconstrained free energy of the chain that is enclosed at
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p^A_i$$
\end{document}
and ended at
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p^A_x$$
\end{document}
. Note that the
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$\underline{E^{uc}_m}$$
\end{document}
matrix is auxiliary to
Euc
matrix (Zuker and Sankoff, 1984; Hofacker et al., 1994), which can also be precomputed for a constant time look-up. Let the corresponding structure of
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$\underline{E^{uc}_m} (p^A_i , p^A_x)$$
\end{document}
be
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$T^{uc}_{p^A_i , p^A_x}$$
\end{document}
. The recursive function for computing
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$E_{m1} (p^A_i , p^A_x; p^B_j)$$
\end{document}
considers five cases: where the closed loop
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p^B_j$$
\end{document}
is recently added as (1) a hairpin loop, (2) an internal/bulge loop, or (3) a multi-branch loop, respectively; or the last component
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p^A_x$$
\end{document}
in the chain is recently added as (4) an extension, or (5) the beginning of the chain.
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
\begin{align*}E_{m1} (p^A_i , p^A_x; p^B_j) = \min
\begin{cases} \underline{E^{uc}_m} (p^A_i , p^A_x) +
\underline{E^{o}} (p^B_j , T^{uc}_{p^A_i , p^A_x}) +
\underline{E^{o}} (L (p^B_j) , T^{uc}_{p^A_i , p^A_x}) +
\underline{e_s} (p^B_j) + \underline{e_h} (L (p^B_j)) , \\
\min_{p^B_y < _I p^B_j} \{E_{m1} (p^A_i , p^A_x; p^B_y) +
\underline{E^{o}} (p^B_i , T^A_{p^A_i , p^A_x; p^B_y}) +
\underline{E^{o}} (L (p^B_y , p^B_j) , T^A_{p^A_i , p^A_x; p^B_y})
\\ \qquad \,\,+ \underline{e_s} (p^B_j) + \underline{e_l} (L
(p^B_y , p^B_j)) \} , \\ \min_{p^B_y < _I p^B_j} \{E_{m2} (p^A_i
, p^A_x; p^B_j , p^B_y) + \underline{E^{o}} (p^B_j , T^A_{p^A_i ,
p^A_x; p^B_j , p^B_y}) + \underline{E^{o}} (L_r (p^B_y , p^B_j) ,
T^A_{p^A_i , p^A_x; p^B_j , p^B_y}) \\ \qquad \,\,+
\underline{e_s} (p^B_j) + \underline{e_{ma}} +
\underline{e_{mb}} * \mid L_r (p^B_y , p^B_j) \mid \} , \\
\min_{p^A_u \in {\cal F} (p^A_x)} \{E_{m1} (p^A_i , p^A_u; p^B_j)
+ \underline{E^{o}} (L (p^A_u , p^A_x) , T^B_{p^A_i , p^A_u;
p^B_j}) + \underline{e_{mb}} * \mid L (p^A_u , p^A_x) \mid +
\underline{e_{mc}} \\ \qquad \,\,+ \min_{p^B_v \in T^B_{p^A_i ,
p^A_u; p^B_j}} \{E (p^A_x; p^B_v) .E^{\bullet} (A) +
\underline{E^{o}} (T^A_{p^A_x; p^B_v} , T^B_{p^A_i , p^A_u;
p^B_j}) \} \} , \\ \underline{E^{uc}} (p^B_j) +
\underline{E^{o}} (L_l (p^A_x , p^A_i) , T^{uc}_{p^B_j}) +
\min_{p^B_v \in T^{uc}_{p^B_j}} \{E (p^A_x , p^B_v) .E^{\bullet}
(A) + \underline{E^{o}} (T^A_{p^A_x; p^B_v} , T^{uc}_{p^B_j}) \}
\\ \qquad \,\,+ \underline{e_{mb}} * \mid L_l (p^A_x , p^A_i) \mid
+ \underline{e_{mc}}.\end{cases} \tag {8}\end{align*}
\end{document}
In the last two cases, note that we do not fold the structure enclosed by
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p^A_x$$
\end{document}
from scratch as we did in the naive algorithm. Instead, we assume that its structure is majorally determined by only one structural constraint. Let the structural constraint be enclosed by
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p^B_v$$
\end{document}
. By search all
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p^B_v \in T^B_{p^A_i , p^A_u; p^B_j , p^B_y}$$
\end{document}
, we will identify this structural constraint from
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$T^B_{p^A_i , p^A_u; p^B_j , p^B_y}$$
\end{document}
. Once we have identified
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p^B_v$$
\end{document}
, which encloses the structural constraint, we can retrieve the corresponding structure that is enclosed by
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p^A_x$$
\end{document}
from
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$E (p^A_x; p^B_v)$$
\end{document}
. Note that we omitted the cases where
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p^A_x$$
\end{document}
adopts no structure, which can be computed easily by adjusting the length to be applied on
emb
and discard
emc
.
Note that the recursive function for computing
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$E_{m1} (p^A_i; p^B_j , p^B_y)$$
\end{document}
can be easily derived based on the symmetricity. Therefore, we omit the exposition of this part. Finally, the recursive function for computing
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$E_{m2} (p^A_i , p^A_x; p^B_j , p^B_y)$$
\end{document}
considers four cases, where (1)
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p^B_y$$
\end{document}
from the chain is recently added as an extension of the existing chain, (2)
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p^A_x$$
\end{document}
is added as an extension, (3)
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p^B_y$$
\end{document}
is added as the beginning of the chain, or (4)
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p^A_x$$
\end{document}
is added as the beginning of the chain:
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
\begin{align*}
\begin{split}&E_{m2} (p^A_i , p^A_x; p^B_j , p^B_y) \\ &= \min
\begin{cases} \min_{p^B_v \in {\cal F} (p^B_y)} \{E_{m2} (p^A_i ,
p^A_x; p^B_j , p^B_v) + \underline{E^{o}} (L (p^B_v , p^B_y) ,
T^A_{p^A_i , p^A_x; p^B_j , p^B_v}) + \underline{e_{mb}} * \mid L
(p^B_v , p^B_y) \mid + \underline{e_{mc}} \\ \qquad + \min_{p^A_u
\in T^A_{p^A_i , p^A_x; p^B_j , p^B_v}} \{E (p^A_u , p^B_y)
.E^{\bullet} (B) + \underline{E^{o}} (T^B_{p^A_u; p^B_y} ,
T^A_{p^A_i , p^A_x; p^B_j , p^B_v}) \} \} , \\ \min_{p^A_u \in
{\cal F} (p^A_x)} \{E_{m2} (p^A_i , p^A_u; p^B_j , p^B_y) +
\underline{E^{o}} (L (p^A_u , p^A_x) , T^B_{p^A_i , p^A_u; p^B_j ,
p^B_y}) + \underline{e_{mb}} * \mid L (p^A_u , p^A_x) \mid +
\underline{e_{mc}} \\ \qquad + \min_{p^B_v \in T^B_{p^A_i , p^A_u;
p^B_j , p^B_y}} \{E (p^A_x , p^B_v) .E^{\bullet} (A) +
\underline{E^{o}} (T^A_{p^A_x; p^B_v} , T^B_{p^A_i , p^A_u; p^B_j
, p^B_y}) \} \} , \\ \underline{E^{uc}_m} (p^A_i , p^A_x) +
\underline{E^{o}} (L_l (p^B_y , p^B_j) , T^{uc}_{p^A_i , p^A_x}) +
\underline{e_{mb}} * \mid L_l (p^B_y , p^B_j) \mid +
\underline{e_{mc}} \\ \qquad + \min_{p^A_u \in T^{uc}_{p^A_i ,
p^A_x}} \{E (p^A_u , p^B_y) .E^{\bullet} (B) + \underline{E^{o}}
(T^B_{p^A_u; p^B_y} , T^{uc}_{p^A_i , p^A_x}) \} , \\
\underline{E^{uc}_m} (p^B_j , p^B_y) + \underline{E^{o}} (L_l
(p^A_x , p^A_i) , T^{uc}_{p^B_j , p^B_y}) + \underline{e_{mb}} *
\mid L_l (p^A_x , p^A_i) \mid + \underline{e_{mc}} \\ \qquad +
\min_{p^B_v \in T^{uc}_{p^B_j , p^B_y}} \{E (p^A_x , p^B_v)
.E^{\bullet} (A) + \underline{E^{o}} (T^A_{p^A_x; p^B_v} ,
T^{uc}_{p^B_j , p^B_y}) \} . \end{cases}\end{split} \tag
{9}\end{align*}
\end{document}
Note that we also omitted the cases where
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p^A_x$$
\end{document}
or
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p^B_y$$
\end{document}
adopts no structure, which can be computed by adjusting the loop size for
emb
and discarding
emc
.
The time complexity of the improved algorithm is O(n5), where n is the number of significant stacks predicted from the input RNA sequence. It is shown that n < l (Bafna et al., 2006), and thus the improved algorithm is feasible for most real RNAs. The algorithm will fill up the two-dimensional matrix E. To compute an entry in E, say
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$E (p^A_i , p^B_j)$$
\end{document}
, one needs to compute three matrices:
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$E_{m1} (p^A_i , p^A_x; p^B_j)$$
\end{document}
,
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$E_{m1} (p^A_i; p^B_j , p^B_y)$$
\end{document}
, and
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$E_{m2} (p^A_i , p^A_x; p^B_j , p^B_y)$$
\end{document}
. Since
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p^A_i$$
\end{document}
and
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p^B_j$$
\end{document}
are determined, the variables become
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p^A_x$$
\end{document}
and
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p^B_y$$
\end{document}
. Therefore, we can use O(n) time to compute the Em1 matrix, and O(n2) time to compute the Em2 matrix. Since we need to traverse a number of constraining structural components for computing each entry of Em1 and Em2, the time complexities add up to O(n2) and O(n3), respectively. Hence, the overall time complexity for this algorithm is O(n5).
The space complexity of the improved algorithm is O(n3l). Consider the fact that the matrix Em2 is only referred to by the computation of Em1 with the same enclosing base pairs (
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p^A_i$$
\end{document}
and
\documentclass{aastex}\usepackage{amsbsy}\usepackage{amsfonts}\usepackage{amssymb}\usepackage{bm}\usepackage{mathrsfs}\usepackage{pifont}\usepackage{stmaryrd}\usepackage{textcomp}\usepackage{portland, xspace}\usepackage{amsmath, amsxtra}\pagestyle{empty}\DeclareMathSizes{10}{9}{7}{6}\begin{document}
$$p^B_j$$
\end{document}
), it can be discarded immediately once the corresponding entries in Em1 are computed. Therefore, we only need to store E and Em1. For each entry in E and Em1, O(l) space is used to record the optimal structures. As a result, the overall space complexity for this algorithm is O(n3l) (note that Em1 is a three-dimensional matrix).