Abstract
Metabolites regulate their own production by directly interacting with highly conserved regions of mRNA that are capable of forming discrete tertiary structures. Such regions of mRNA are called riboswitches. The thiamine pyrophosphate (TPP) riboswitch is the most widespread riboswitch across organisms. TPP is an essential coenzyme that is synthesized by the coupling of pyrimidine (hydroxymethyl pyrimidine) and thiazole (hydroxyethyl thiazole). The TPP riboswitch was searched for across all available phyla of the plant kingdom, using as reference Arabidopsis thaliana, a model organism in which the TPP riboswitch has already been found. The aptameric domain of the TPP riboswitch is conserved at the sequence as well as the structure level in all chosen plant species. The noncoding RNA sequence that acts as a riboswitch folds into a stem-loop hairpin secondary structure with minimum free energy, which is predicted by several computational tools. Most of the secondary structures are similar but not identical, because of variation in sequence length. The aptamer lies in the 3′ untranslated region, and the TPP ligand binds to it at the P2, P4, and P5 stems and at the region between junctions J2/3 and J4/5. The sequence of these regions is the same in all predicted tertiary structures of the riboswitch in representative plant species, from green algae to flowering plants, and the residues situated in these junctions are directly involved in binding thiamine pyrophosphate and are conserved in all the representative species.
1. Introduction
Various types of riboswitches have been characterized in prokaryotes (Mandal et al., 2003; Cromie et al., 2006). The thiamine pyrophosphate (TPP) riboswitch is the only riboswitch that has been identified in eukaryotic organisms.
In all cases, the binding of TPP to the RNA molecule downregulates the expression of thiamine biosynthetic genes and thereby decreases TPP biosynthesis. The sequence of the TPP aptameric domain is highly conserved in all three major kingdoms of life (Hammann and Westhof, 2007). The TPP riboswitch is located 70 bp downstream of the polyadenylation signal (Bocobza and Aharoni, 2008). In the ligand-free state the RNA is only partially folded, and conformational changes occur upon binding of the ligand (Ali et al., 2010). A well-defined 2.9 Å resolution crystal structure of the TPP riboswitch in its natural ligand-bound state, showing the reorganization of the aptameric sequence upon ligand binding, is available for Arabidopsis thaliana (Thore et al., 2006).
The TPP riboswitches are present in a variety of plant species where they reside in the 3′ UTRs of THIC genes. Formation of THIC transcripts with alternative 3′ UTR lengths is dependent on riboswitch function and mediates feedback regulation of THIC expression in response to changes in cellular TPP levels.
This study was undertaken to explore the existence of the TPP riboswitch and its structure and function in several plant species. We found that the length of the TPP riboswitch ranges from 77 to 130 nt, and that the structure at the secondary level also varies. The tertiary structure is well conserved, and it was visualized with PyMOL (Schrodinger, 2010) to understand the functioning of the TPP riboswitch in the different categories of plants.
2. Materials and Methods
The genomic sequences of the following organisms were taken from the NCBI RefSeq genome FTP site. All the plant species used as material are shown in Table 1.
Monocot angiosperm plant category.
Dicot angiosperm plant category.
The nucleotide sequence of the TPP riboswitch of Arabidopsis thaliana and other reference plant species was extracted from the NCBI database. This reference sequence was used for a homology search against the nucleotide database using the BLAST algorithm (Kent, 2002). Here, the lowest E-value signifies the best match of a sequence to the reference. The FASTA files of the selected sequences that are homologous to the reference sequence were then extracted from the NCBI database. The TPP riboswitch was predicted with the RibEX (Abreu-Goodger and Merino, 2005) and RiboSW (Chang et al., 2009) tools and then validated against the Rfam database of noncoding RNA families (Griffiths-Jones et al., 2005). Conservation of the riboswitch across organisms was assessed by multiple sequence alignment on the MUSCLE server (Edgar, 2004). The secondary structure of the putative riboswitches was elucidated with the RNA secondary structure prediction server (Reuter and Mathews, 2010). The LOGO representation of these sequences was obtained with the WebLogo software (Crooks et al., 2004). The tertiary structure of the aptameric region of the TPP riboswitch in different plants was predicted with the RNAComposer web server (Popenda et al., 2012) and visualized with PyMOL.
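The hit-selection step can be illustrated with a minimal sketch. It assumes the homology search results were saved in the standard 12-column tabular BLAST format (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E-value, bit score). The function name best_hits, the E-value cutoff, and the example records are hypothetical and are not taken from this study.

```python
def best_hits(tabular_text, max_evalue=1e-5):
    """Return (subject_id, evalue) pairs sorted from best to worst match.

    Assumes 12-column tabular BLAST output; column 2 is the subject ID
    and column 11 is the E-value. The cutoff is an illustrative choice.
    """
    best = {}
    for line in tabular_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        subject, evalue = fields[1], float(fields[10])
        if evalue > max_evalue:
            continue  # discard weak matches
        # keep only the lowest E-value seen for each subject sequence
        if subject not in best or evalue < best[subject]:
            best[subject] = evalue
    return sorted(best.items(), key=lambda item: item[1])


# Hypothetical records, for illustration only.
example = "\n".join([
    "ref\tsubjA\t95.0\t100\t5\t0\t1\t100\t1\t100\t2e-30\t120",
    "ref\tsubjB\t80.0\t90\t18\t0\t1\t90\t1\t90\t1e-08\t60",
    "ref\tsubjA\t70.0\t50\t15\t0\t1\t50\t1\t50\t3e-04\t40",
    "ref\tsubjC\t60.0\t40\t16\t0\t1\t40\t1\t40\t0.5\t20",
])
hits = best_hits(example)
```

Sorting by ascending E-value places the closest homolog to the reference first, which matches the selection criterion described above.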
3. Results and Discussion
In eukaryotes, the primary focus was on the THIC gene, which is known to be repressed by excess thiamine in Arabidopsis thaliana and other plant species. The TPP riboswitch is located in the 3′ UTR of the corresponding mRNA.
It is established that the TPP riboswitch is phylogenetically conserved in all phyla of the plant kingdom except Pteridophyta (ferns), for which conservation could not be assessed. The sequence of the THIC gene for fern is not available in the NCBI database or any other public database. Bocobza et al. (2007) reported the TPP riboswitch in Pteridophyta (fern) as well. Personal communication with Dr. Bocobza confirmed that their result was based on fern sequence data from their lab (not made available to public databases).
It is observed that the sequence as well as the structure of the TPP riboswitch is conserved within a given class of organisms. The length of the aptameric region is between 77 and 130 nt.
The structure of the TPP riboswitch, taken from different candidates, has five or more stems, namely P1, P2, P3, P4, and P5. The sequence-level conservation of all the TPP riboswitches is shown in Figure 1. The P1 forward strand (GCAC) and P2 forward strand (AGGG) are conserved in all species, as are the P1 (GUGC) and P2 (CCCU) backward strands. The forward and backward strands of P4 (CCU, AGG) and P5 (CACG, CCUG) are also conserved in all plant species. The P3 stem is the most variable region and is responsible for ligand binding. Highly conserved TPP-binding aptamers are present in the 3′ UTRs of the THIC genes of the plant species Arabidopsis thaliana. The collection of plant TPP aptamer representatives was expanded by sequencing THIC genes from additional plant species and by conducting database searches for nucleotide sequences that conform to the TPP aptamer consensus sequence.
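As an illustration of how the forward and backward strands of a stem relate, the following minimal sketch counts the positions at which two strands can pair when aligned antiparallel. The function name paired_positions and the decision to count G-U wobble pairs alongside Watson-Crick pairs are assumptions made for this example, not part of the analysis reported here.

```python
# Watson-Crick pairs plus the G-U wobble pair (an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def paired_positions(forward, backward):
    """Count base pairs between two stem strands, both written 5' to 3'.

    The backward strand is reversed so that the two strands are compared
    in antiparallel orientation, as they are in a folded stem.
    """
    if len(forward) != len(backward):
        raise ValueError("strands must have the same length")
    return sum((f, b) in PAIRS for f, b in zip(forward, reversed(backward)))


# Stem motifs as listed in the text: (forward, backward).
stems = {"P1": ("GCAC", "GUGC"), "P2": ("AGGG", "CCCU"), "P4": ("CCU", "AGG")}
report = {name: paired_positions(f, b) for name, (f, b) in stems.items()}
```

For the P1, P2, and P4 motifs every position pairs, so the backward strand is the exact reverse complement of the forward strand.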

Multiple sequence alignment of the TPP riboswitch in different plants. Alignment of TPP aptamer sequences from various plant species reveals high conservation of sequences. Nucleotides forming stems P1 through P5 are highlighted in color. Sequences are derived from Ostreococcus lucimarinus, Micromonas pusilla, Chlamydomonas reinhardtii, Physcomitrella patens, Selaginella moellendorffii, Picea sitchensis, Phoenix dactylifera, Zea mays, Sorghum bicolor, Poa secunda, Brachypodium distachyon, Triticum aestivum, Oryza brachyantha, Oryza sativa Japonica, Oryza sativa, Oryza sativa Indica, Fragaria vesca, Prunus persica, Malus x domestica, Populus trichocarpa, Linum usitatissimum, Ricinus communis, Jatropha curcas, Cajanus cajan, Glycine max, Cucumis sativus, Arabidopsis lyrata, Arabidopsis thaliana, Eutrema parvulum, Brassica rapa, Raphanus sativus, Carica papaya, Theobroma cacao, Lactuca sativa, Ocimum basilicum, Solanum lycopersicum, Solanum tuberosum, Solanum pimpinellifolium, Nicotiana benthamiana, Nicotiana tabacum. TPP, thiamine pyrophosphate.
An alignment of all available TPP aptamer sequences from plants reveals a high level of conservation of nucleotide sequence, a secondary structure consisting of stems P1 through P5 (Fig. 1), and a variable length of the P3 stem among plant species. The sequence logo for the different plant species is shown in Figure 2. The secondary structure of the Arabidopsis thaliana TPP riboswitch with five stems (P1, P2, P3, P4, and P5) is shown in Figure 3. Most of the secondary structures are similar but not identical, because sequence length varies from one species to another.
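The per-position conservation that a sequence logo displays can be sketched as follows. This is a simplified illustration, not the WebLogo implementation: it computes the information content of each alignment column in bits (maximum 2 bits for a four-letter RNA alphabet) and omits the small-sample correction that logo software normally applies. The function name and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_information(aligned, alphabet="ACGU"):
    """Information content (bits) of each column of an RNA alignment.

    Gaps and other symbols outside the alphabet are ignored. No
    small-sample correction is applied (a simplification of this sketch).
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    max_bits = math.log2(len(alphabet))
    bits = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned if seq[col] in alphabet)
        total = sum(counts.values())
        if total == 0:
            bits.append(0.0)  # column contains only gaps
            continue
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        bits.append(max_bits - entropy)
    return bits


# Toy alignment: column 1 fully conserved, column 2 maximally variable.
info = column_information(["GA", "GC", "GG", "GU"])
```

A fully conserved column reaches the 2-bit maximum, while a column in which all four bases are equally frequent carries no information.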

The nucleotide sequence of the TPP riboswitch in plant species.

Secondary structure of the aptameric region of the TPP riboswitch in Arabidopsis thaliana with five stems, P1, P2, P3, P4, and P5.
The noncoding RNA sequence can fold back on itself to form many possible secondary structures, but the actual secondary structure of an RNA sequence is taken to be the one with the minimum free energy. Secondary structures with minimum free energy in different plant species are predicted by several computational tools. The free energy of an RNA secondary structure is the sum of the energies of all its loops, and the energy of each loop depends on its size and degree. The size and degree of a loop are the number of unpaired bases and the number of base pairs in the loop, respectively (Lu et al., 2009). Since there are some differences at the sequence level of the TPP riboswitch, the minimum free energy of the two structures of the same organism differs, as given in Table 2.
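The loop decomposition behind this energy model can be sketched as below. The example assumes the structure is given in dot-bracket notation, where matching parentheses mark paired bases and dots mark unpaired bases, and it reports only the size and degree of each loop. It does not assign energies, which in practice come from experimentally derived parameter tables. The function names are hypothetical.

```python
def parse_pairs(dot_bracket):
    """Map each paired position to its partner (0-based indices)."""
    stack, partner = [], {}
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            j = stack.pop()
            partner[i], partner[j] = j, i
    if stack:
        raise ValueError("unbalanced structure")
    return partner


def loops(dot_bracket):
    """List (i, j, size, degree) for the loop closed by each pair (i, j).

    size   = unpaired bases directly inside the loop
    degree = base pairs in the loop, counting the closing pair
    """
    partner = parse_pairs(dot_bracket)
    result = []
    for i in sorted(partner):
        j = partner[i]
        if j < i:
            continue  # visit each pair once, from its 5' side
        size, degree, k = 0, 1, i + 1
        while k < j:
            if k in partner:      # an inner pair opens here
                degree += 1
                k = partner[k] + 1  # jump past the inner helix
            else:
                size += 1
                k += 1
        result.append((i, j, size, degree))
    return result


hairpin = loops("((...))")
multiloop = loops("(.(..).(...))")
```

A loop of degree 1 is a hairpin loop, degree 2 corresponds to a stacked pair, bulge, or interior loop, and degree 3 or more is a multibranch loop. Summing a per-loop energy over this list would give the total free energy described in the text.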
MFE, minimum free energy of the thiamine pyrophosphate riboswitch.
The 3D view of the aptameric region of the TPP riboswitch of Arabidopsis thaliana (PDB ID: 3D2G) in its ligand-bound state, visualized with PyMOL, is shown in Figure 4. The figure shows that the TPP ligand mainly binds to the G11, G28, G48, G64, C65, and G66 residues of the aptameric part of the RNA. All these residues are part of the P2, P4, and P5 stems and the region between junctions J2/3 and J4/5.
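A view of this kind can be reproduced with a short PyMOL script, sketched here under several assumptions: that PyMOL's Python module is installed, that the residue numbers quoted above match the numbering in the deposited 3D2G file, and that coloring the site red in stick representation is an acceptable rendering. The helper names and the selection name tpp_site are hypothetical.

```python
# Residues reported in the text as contacting TPP. Their correspondence to
# the numbering in the PDB file is an assumption of this sketch.
BINDING_RESIDUES = [11, 28, 48, 64, 65, 66]


def resi_selection(residues):
    """Build a PyMOL selection expression such as 'resi 11+28'."""
    if not residues:
        raise ValueError("need at least one residue number")
    return "resi " + "+".join(str(r) for r in sorted(set(residues)))


def highlight_binding_site(pdb_id="3d2g", residues=BINDING_RESIDUES):
    """Fetch the structure and highlight the binding residues.

    PyMOL is imported lazily so that resi_selection can be used and
    tested on machines where PyMOL is not installed.
    """
    from pymol import cmd

    cmd.fetch(pdb_id)                                   # download the entry
    cmd.select("tpp_site", resi_selection(residues))    # named selection
    cmd.show("sticks", "tpp_site")
    cmd.color("red", "tpp_site")


selection = resi_selection(BINDING_RESIDUES)
```

Calling highlight_binding_site() from within a PyMOL session would load the structure and mark the listed residues. If the file contains more than one chain, the selection may need a chain qualifier.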

Three-dimensional view of the aptameric region with ligand binding state of the TPP riboswitch.
The ligand-free tertiary structures of the aptameric region of the TPP riboswitch in a few representative plant species, namely Micromonas pusilla (green algae), Physcomitrella patens (moss), Triticum aestivum (monocot), and Brassica rapa (dicot), are shown in Figure 5. These predicted structures are similar to the tertiary structure of the TPP riboswitch of Arabidopsis thaliana (3D2G) and can bind the ligand at the corresponding position. The residues in these stem regions are conserved in all the plant species.

Tertiary structures of some representative plant species of different phyla.
This study has conclusively established that the aptameric region of the TPP riboswitch is conserved at the sequence as well as the structural levels for all plant species ranging from green algae to flowering plants.
Footnotes
Author Disclosure Statement
No competing financial interests exist.
