Analysis of secondary structural elements in human microRNA hairpin precursors
BMC Bioinformatics volume 17, Article number: 112 (2016)
Abstract
Background
MicroRNAs (miRNAs) regulate gene expression by targeting complementary mRNAs for destruction or translational repression. Aberrant expression of miRNAs has been associated with various diseases including cancer, thus making them interesting therapeutic targets. The composite of secondary structural elements that comprise miRNAs could aid the design of small molecules that modulate their function.
Results
We analyzed the secondary structural elements, or motifs, present in all human miRNA hairpin precursors and compared them to highly expressed human RNAs with known structures and other RNAs from various organisms. Amongst human miRNAs, there are 3808 unique motifs, many residing in processing sites. Further, we identified motifs in miRNAs that are not present in other highly expressed human RNAs, making them desirable targets for small molecules. MiRNA motifs were incorporated into a searchable database that is freely available.
We also analyzed the most frequently occurring bulges and internal loops for each RNA class and found that the smallest loops possible prevail. However, the distribution of loops and the preferred closing base pairs were unique to each class.
Conclusions
Collectively, we have completed a broad survey of motifs found in human miRNA precursors, highly expressed human RNAs, and RNAs from other organisms. Interestingly, unique motifs were identified in human miRNA processing sites, binding to which could inhibit miRNA maturation and hence function.
Background
MicroRNAs (miRNAs) regulate gene expression by targeting mRNAs for destruction or translational repression [1–4]. Aberrant miRNA expression is associated with diseases [5, 6] including cancers [7], cardiovascular diseases [8], and HIV [9, 10]. In addition to being employed to explore mRNA and protein function in vivo [5, 11], miRNAs are also being explored as therapeutic targets [12, 13], in particular because overexpression of oncogenic miRNAs aids initiation and progression of various tumors [14–16]. Different strategies have been used to inhibit oncogenic miRNAs, including antisense or sponge oligonucleotides that bind mature miRNAs [17, 18] and small molecules that inhibit miRNA processing [19–21]. A major liability of oligonucleotide-based therapeutics is poor tissue-specific delivery and cellular uptake [17]. Small molecules have been neglected for targeting RNA in general because it was speculated that RNA structural flexibility leads to a lack of binding specificity. However, recent successful examples of using small molecules to target different RNAs [22, 23] have stimulated increasing interest in using small molecules to target miRNAs.
Usually, small molecules bind to non-canonically paired regions of RNA [22], such as bulges, internal loops, and hairpin loops (Fig. 1), as they provide enlarged major grooves for small molecule entry and partially exposed bases that can be exploited to increase specificity [13, 24]. Thus, miRNA hairpin precursors, which fold into stem loop structures that display various types of loops (Fig. 1) [25], are ideal candidates for small molecule binding. MiRNA processing occurs in both the nucleus (via Drosha) and the cytoplasm (via Dicer/transactivating response RNA-binding protein (TRBP)) [26]. Therefore, small molecules that localize to either compartment could inhibit miRNA maturation.
The number of known miRNA sequences has expanded tremendously [27, 28] because of the development of deep-sequencing technology. To develop specific small molecules that inhibit the processing of a single or few miRNAs, it is essential to identify unique secondary structural elements, or motifs. That is, it is important to know which motifs occur and their frequencies. In this study, we built a database of motifs found in human miRNA secondary structures. We examined the frequency of these motifs and which motifs are preferred at processing sites. It is still a mystery how the Dicer/TRBP complex achieves accuracy in processing pre-miRNAs with such huge diversity (more than a thousand different sequences in human). MiRNA processing sites (where the miRNA strands are cleaved) are presumed to be important. This analysis was then completed for RNAs with known structures, including highly expressed human RNAs. We hope that this analysis will eventually help our understanding of miRNA processing and improve identification of potential target sites for small molecules.
Methods
MiRNA hairpin precursor sequences and structures
All Homo sapiens miRNA and mature miRNA sequences were obtained from miRBase v.17 [27] (http://www.mirbase.org/). The secondary structures of miRNA hairpin precursors were predicted by RNAstructure [29], which uses a free energy minimization algorithm [30]. Please note that miRNA hairpin precursor structure determination via free energy minimization is the standard in the field [25].
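RNAstructure writes predicted structures in connectivity table (CT) format. As a minimal sketch, assuming a CT file whose first structure is the lowest free energy one, the following reads that first structure and converts its pairing column to dot-bracket notation. The function name `ct_to_dot_bracket` and the toy CT text are illustrative and not part of the study; pseudoknots are not handled.

```python
def ct_to_dot_bracket(ct_text: str):
    """Return (sequence, dot-bracket) for the first structure in CT-format text."""
    lines = [ln for ln in ct_text.strip().splitlines() if ln.strip()]
    length = int(lines[0].split()[0])  # header line starts with the sequence length
    seq, dots = [], []
    for ln in lines[1:length + 1]:
        fields = ln.split()
        # CT columns: index, base, index-1, index+1, pairing partner (0 = unpaired), index
        idx, base, partner = int(fields[0]), fields[1], int(fields[4])
        seq.append(base)
        if partner == 0:
            dots.append(".")
        else:
            dots.append("(" if partner > idx else ")")
    return "".join(seq), "".join(dots)


# Hypothetical 8-nt hairpin, for illustration only.
TOY_CT = """
8  ENERGY = -1.0  toy_hairpin
1 G 0 2 8 1
2 C 1 3 7 2
3 A 2 4 0 3
4 A 3 5 0 4
5 A 4 6 0 5
6 A 5 7 0 6
7 G 6 8 2 7
8 C 7 0 1 8
"""

sequence, structure = ct_to_dot_bracket(TOY_CT)
print(sequence)
print(structure)
```

Dot-bracket strings of this kind are a convenient intermediate form for enumerating loops and their closing base pairs.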
Other RNA sequences and structures
A previously constructed database of other RNA structures was also analyzed in order to make comparisons to miRNAs [31]. The database contains 1349 RNAs including 123 small subunit rRNAs [32], 223 large subunit rRNAs [32, 33], 309 5S rRNAs [34], 484 tRNAs [35], 91 signal recognition particles [36], 16 RNase P RNAs [37], 100 group I introns [38, 39], and three group II introns [40]. We also analyzed highly expressed human RNAs with known structures including 5S rRNA, 16S rRNA, 23S rRNA, 7SL (signal recognition particle), RNase P RNA, U4/U6 snRNA, and 465 non-redundant tRNAs [41].
Motif nomenclature
The motifs predicted in miRNA hairpin precursor secondary structures include bulges, internal loops, hairpins, and multibranch loops (Figs. 1 and 2). Bulges are divided into two categories: 5′ bulge loops and 3′ bulge loops. The designation as 5′ or 3′ is determined by the position of the unpaired nucleotide relative to the first hairpin loop in the miRNA's secondary structure (that is, whether it is 5′ or 3′ to the hairpin loop).
A motif includes closing base pair(s) and non-canonically paired nucleotides. Sequences are always written 5′ → 3′. Closing base pairs are indicated with parentheses (for example, (GC)), and both nucleotides are always designated due to the possibility of GU pairs. The nucleotide 5′ to the loop is always listed first. Base pairs are listed at the beginning and end of the motif sequence for bulges and internal loops, only at the beginning of hairpin loops, and between all unpaired regions of multibranch loops. A "/" separates the two sides of bulges and internal loops. Please see Figs. 1 and 2 and the Results and discussion for examples.
Determination of statistical significance: are two motifs' occurrence frequencies significantly different?
In order to determine if a particular motif is over- or under-represented, its statistical significance was calculated by a Z-score of type 1 error. That is, when Motif 1 occurred with probability p1 in a sample size n1, and Motif 2 occurred with probability p2 in a sample size n2, it is hypothesized that Motif 1 and Motif 2 occur with the same frequency. To test this hypothesis, we calculate a Z-score using Eqs. 1 and 2.
If the Z-score is > 2, the hypothesis is rejected, and Motifs 1 and 2 have significantly different occurrence frequencies; if the Z-score is < 2, then no conclusion can be drawn.
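The test can be sketched in code under one stated assumption: that Eqs. 1 and 2 take the standard pooled two-proportion form, Φ = (n1·p1 + n2·p2)/(n1 + n2) and Z = (p1 − p2)/sqrt(Φ(1 − Φ)(1/n1 + 1/n2)). This form is consistent with the "pooled population comparison" named in the Results, but it is not confirmed by the text as reproduced here. The function name is illustrative, and the sample size of 1424 per group is inferred rather than stated.

```python
import math


def pooled_z_score(p1: float, n1: int, p2: float, n2: int) -> float:
    """Two-proportion Z-score using a pooled frequency estimate (assumed form)."""
    pooled = (n1 * p1 + n2 * p2) / (n1 + n2)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return (p1 - p2) / std_err


# Illustration with frequencies reported in the Results: paired start
# processing sites (57.7 %) versus paired end processing sites (49.0 %).
# n = 1424 per group is an assumption, not a value given in the text.
z = pooled_z_score(0.577, 1424, 0.490, 1424)
print(f"Z = {z:.2f}; frequencies differ significantly: {abs(z) > 2}")
```

Under these assumed sample sizes, the result lands close to the Z-score of 4.68 reported for this comparison; the small gap is expected given the rounded percentages and the inferred group sizes.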
Determination of miRNA processing sites
The processing sites of a miRNA are defined as the first and last nucleotides in the mature miRNA. The mature miRNA was mapped onto miRNA hairpin precursors, and the motifs or paired regions containing the two end nucleotides were selected. If the site contains unpaired or non-canonically paired nucleotides, the processing site could be the unpaired nucleotides or the closing base pair. If the processing site is in a paired region, the base pairs next to the processing site are also included.
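The mapping step can be illustrated with a short sketch. It assumes the precursor structure is available as a dot-bracket string and that the mature sequence occurs exactly once in the precursor; the function name and the toy sequences are hypothetical. It only reports whether each end nucleotide is paired, not which motif contains it.

```python
def processing_sites(precursor: str, structure: str, mature: str) -> dict:
    """Locate the first and last mature-miRNA nucleotides in the precursor.

    Returns 0-based positions and whether each position is base paired.
    """
    if len(precursor) != len(structure):
        raise ValueError("sequence and structure lengths differ")
    start = precursor.find(mature)
    if start < 0:
        raise ValueError("mature sequence not found in precursor")
    end = start + len(mature) - 1
    return {
        "start": (start, structure[start] != "."),
        "end": (end, structure[end] != "."),
    }


# Toy hairpin with a single bulged U on the 5' strand (illustrative only).
toy_precursor = "GGGUACCCAAAAGGGUCCC"
toy_structure = "(((.((((....)))))))"
print(processing_sites(toy_precursor, toy_structure, "UACCC"))
```

In this toy case the start site falls on the bulged nucleotide and the end site on a paired nucleotide, which are two of the site categories distinguished in the analysis.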
Results and discussion
A database of human miRNA hairpin precursor motifs
The number of human (Homo sapiens) miRNA sequences deposited in miRBase [27] has doubled in the past few years. As of August 2014, there were 1881 human miRNA sequences in miRBase. Although the secondary structures of most miRNAs have not been determined experimentally, a uniform system for miRNA annotation has been developed that employs secondary structure determination via free energy minimization [25, 29]. That is, the structures of miRNA hairpin precursors are accurately predicted from sequence. Therefore, RNAstructure [29], a free energy minimization algorithm that employs experimentally determined thermodynamic values, was used to predict the secondary structures of miRNA hairpin precursors. Only the lowest free energy structure was considered in our analysis. All non-canonically paired regions except the dangling ends for each hairpin precursor secondary structure were extracted and listed in the motif database. The database contains the following information for each motif: the miRNA ID/accession number, motif type (bulge, internal loop, hairpin, etc.), unpaired motif (single stranded nucleotides only), motif (unpaired nucleotides and the closing base pair(s)), and motif with closing base pairs and first non-nearest neighbor.
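The extraction step can be sketched as follows. This is an illustrative reimplementation, not the authors' code: it assumes a pseudoknot-free dot-bracket structure, names motifs with the 5′-strand nucleotide of each closing pair first, and labels a bulge as 5′ or 3′ by the strand that carries the unpaired nucleotides (which matches the hairpin-relative definition only for single-hairpin structures). Dangling ends are skipped because only regions enclosed by a base pair are visited.

```python
def pair_table(structure: str) -> dict:
    """Map each paired position to its partner (0-based, no pseudoknots)."""
    stack, pairs = [], {}
    for pos, char in enumerate(structure):
        if char == "(":
            stack.append(pos)
        elif char == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            opener = stack.pop()
            pairs[opener], pairs[pos] = pos, opener
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def extract_motifs(seq: str, structure: str) -> list:
    """List (motif type, motif name) for every loop closed by a base pair."""
    pairs = pair_table(structure)
    motifs = []
    for i in sorted(p for p in pairs if pairs[p] > p):
        j = pairs[i]
        # Collect the helices that branch directly off the loop closed by (i, j).
        inner, k = [], i + 1
        while k < j:
            if k in pairs:
                inner.append((k, pairs[k]))
                k = pairs[k] + 1
            else:
                k += 1
        closing = f"({seq[i]}{seq[j]})"
        if not inner:
            motifs.append(("hairpin", closing + seq[i + 1:j]))
        elif len(inner) == 1:
            a, b = inner[0]
            five, three = seq[i + 1:a], seq[b + 1:j]
            if not five and not three:
                continue  # stacked base pairs, no loop
            if five and three:
                kind = "internal loop"
            else:
                kind = "5' bulge" if five else "3' bulge"
            name = f"{closing}{five or '-'}/{three or '-'}({seq[a]}{seq[b]})"
            motifs.append((kind, name))
        else:
            parts, prev = [closing], i + 1
            for a, b in inner:
                parts += [seq[prev:a], f"({seq[a]}{seq[b]})"]
                prev = b + 1
            # The closing pair is repeated at the end in the opposite orientation.
            parts += [seq[prev:j], f"({seq[j]}{seq[i]})"]
            motifs.append(("multibranch loop", "".join(parts)))
    return motifs


# Toy inputs (hypothetical sequences, for illustration only).
print(extract_motifs("GGGUACCCAAAAGGGUCCC", "(((.((((....)))))))"))
print(extract_motifs("CAGAAAUUGAAACCG", "(.(...).(...).)"))
```

The second toy structure is built so that its three-way junction has the same unpaired strands and closing pairs as the multibranch loop example given in the nomenclature.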
Motif nomenclature
A motif includes unpaired or non-canonically paired regions (denoted in red) and its closing base pair(s) (denoted in black). Bulges and internal loops have two closing base pairs, hairpins have one closing pair, and multibranch loops have three or more. Examples of the nomenclature used are provided in Figs. 1 and 2. For example, the 5′ bulge loop in Fig. 1 is indicated as (GC)G/-(UA) while the 3′ bulge loop is named (GC)-/U(UA). Likewise, Internal Loop 1 is named (GC)C/A(AU); Internal Loop 2 is (UA)A/AA(AU); and the hairpin is named (GU)UUUAGU. For multibranch loops, the base pairs and the unpaired strands are written in order from the 5′ to the 3′ end. Since the 5′ closing base pair is also the 3′ closing base pair, it is repeated but in the opposite orientation. Thus, the multibranch loop in Fig. 2e is named (CG)A(GU)U(GC)C(GC) (5′ and 3′ closing base pairs denoted in bold). This nomenclature was developed such that the same unpaired regions with different closing base pairs can be distinguished from each other, for example (AU)U/-(GC) and (CG)-/U(UA); or (CG)C/A(UA) and (AU)A/C(GC) (Fig. 2).
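To make the naming scheme concrete, the following sketch parses bulge, internal loop, and hairpin names and reports the motif type and size. The regular expressions and the function name `describe_motif` are illustrative assumptions about how the names could be read by software; multibranch loop names are not handled.

```python
import re

# (pair)5'-strand/3'-strand(pair), with "-" marking a strand with no unpaired nucleotides.
TWO_STRAND = re.compile(r"^\(([ACGU]{2})\)([ACGU]+|-)/([ACGU]+|-)\(([ACGU]{2})\)$")
# (pair)loop for hairpins.
HAIRPIN = re.compile(r"^\(([ACGU]{2})\)([ACGU]+)$")


def describe_motif(name: str) -> dict:
    """Classify a motif name and list its closing base pairs."""
    match = TWO_STRAND.match(name)
    if match:
        pair_a, five, three, pair_b = match.groups()
        n5 = 0 if five == "-" else len(five)
        n3 = 0 if three == "-" else len(three)
        if n5 and n3:
            kind = f"{n5}x{n3} internal loop"
        elif n5:
            kind = f"{n5}-nt 5' bulge"
        elif n3:
            kind = f"{n3}-nt 3' bulge"
        else:
            raise ValueError("both strands are empty: " + name)
        return {"kind": kind, "closing_pairs": [pair_a, pair_b]}
    match = HAIRPIN.match(name)
    if match:
        pair, loop = match.groups()
        return {"kind": f"{len(loop)}-nt hairpin", "closing_pairs": [pair]}
    raise ValueError("unrecognized motif name: " + name)


for example in ["(GC)G/-(UA)", "(GC)-/U(UA)", "(GC)C/A(AU)", "(UA)A/AA(AU)", "(GU)UUUAGU"]:
    print(example, describe_motif(example))
```

The five names passed in are the Fig. 1 examples quoted above.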
General survey of motifs in precursor miRNAs
(A searchable database of motifs found in human miRNA hairpin precursors based on our analysis is available at: http://www.scripps.edu/disney/software.html.) The motifs present in miRNA hairpin precursor secondary structures are quite diverse. Of all miRNAs, only 32 (2.2 %) have fully paired stems (absence of non-canonically paired regions). The remaining 97.8 % have 1–14 motifs in the stem. There are a total of 7436 non-canonically paired motifs including 3862 internal loops, 1546 hairpin loops, 1089 5′ bulge loops, 922 3′ bulge loops, and 17 multibranch loops (Fig. 3a).
There are 2334 unique motifs (occur only once) if the base pairs and their orientations are not considered (31.4 % of total). If closing pairs and their orientations are considered, then there are 3808 unique motifs (51.2 % of total). Previous studies have shown that loop closing pairs can dramatically affect loop structure [42, 43]. Not surprisingly, changing a loop's closing pairs can affect small molecule affinity [44, 45]. Many motifs appeared only once, providing a potential specific target site for small molecules. Further analysis was only completed on bulges and internal loops since the diversity of the hairpin loops was too large (see bar labeled "others" in Fig. 3a) and the sample size of multibranch loops was too small (17 motifs) for meaningful analysis (Fig. 3a).
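The two ways of counting unique motifs can be illustrated with a small sketch. It assumes that "not considering base pairs" means stripping the parenthesized closing pairs from each name and comparing what remains; the function names and the five-motif toy list are hypothetical.

```python
import re
from collections import Counter


def strip_closing_pairs(name: str) -> str:
    """Drop parenthesized closing base pairs, leaving only unpaired strands."""
    return re.sub(r"\([ACGU]{2}\)", "", name)


def count_singletons(names: list) -> tuple:
    """Count motifs occurring exactly once, without and with closing pairs."""
    loops_only = Counter(strip_closing_pairs(n) for n in names)
    with_pairs = Counter(names)
    once_loops = sum(1 for c in loops_only.values() if c == 1)
    once_pairs = sum(1 for c in with_pairs.values() if c == 1)
    return once_loops, once_pairs


# Toy motif list (illustrative only, not taken from the database).
toy_motifs = ["(UA)U/-(GC)", "(GC)U/-(GC)", "(UA)U/-(GC)", "(GC)C/A(AU)", "(GU)UUUAGU"]
print(count_singletons(toy_motifs))
```

As in the full data set, taking closing pairs into account splits otherwise identical loops apart and so raises the number of motifs that occur only once.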
General survey of motifs in other types of RNAs
The motifs present in other RNAs are also diverse. There are a total of 26213 non-canonically paired motifs: 6937 bulges, 8457 internal loops, and 10819 hairpins. For highly expressed human RNAs with known structures, there are 2712 total motifs including 157 5′ bulges, 123 3′ bulges, 378 internal loops, 1521 hairpins, and 534 multibranch loops. Differences were observed in the distribution of motifs between other types of RNAs and human miRNAs. For example, the percentage of large hairpins is significantly less in other RNAs as compared to miRNAs (Fig. 3b). In contrast, the percentage of 4-nucleotide hairpins and 2-nucleotide bulges is much greater (Fig. 3b).
Small loops prevail in bulges and internal loops
As listed in Table 1 and shown in Fig. 3, the most highly represented bulges and internal loops for precursor miRNAs are the smallest possible size: 1-nucleotide bulges and 1 × 1 nucleotide internal loops. Specifically, 69.3 % of 5′ bulge loops and 71.4 % of 3′ bulge loops are one-nucleotide bulges. Not surprisingly, the four possible 1-nucleotide bulges are the four most prevalent bulge loops. Two-nucleotide bulges are next most prevalent (15.0 % for 5′ bulges and 12.6 % for 3′ bulges). Likewise, small bulges and internal loops prevail in other types of RNAs and highly expressed human RNAs. For example, 1- and 2-nucleotide bulges account for ~92 % of all bulges of other RNAs and 85 % of human RNAs.
For internal loops in precursor miRNAs, 55.4 % of the 3860 internal loops are 1 × 1 nucleotide internal loops. The second most prevalent internal loop size is 2 × 2 (11.2 %) followed by 1 × 2 and 2 × 1 internal loops (8.9 %) (Fig. 3a). This overall trend is similar for other RNAs: 1 × 1 loops account for 39.8 % of all loops while 2 × 2 and 1 × 2 / 2 × 1 nucleotide loops account for 11.8 % and 15.1 %, respectively. In highly expressed human RNAs, 1 × 1 loops account for 49.7 % of all loops while 2 × 2 and 1 × 2 / 2 × 1 nucleotide loops account for 6.9 % and 7.7 %, respectively. Since smaller bulges and internal loops are thermodynamically more stable than their larger counterparts [46–51], it is not surprising that they are more highly represented.
Nucleotide preferences in single nucleotide 5′ bulge and 3′ bulge loops in precursor miRNAs
From thermodynamic studies, 1-nucleotide pyrimidine bulges (C or U) are more stable than 1-nucleotide purine bulges (A or G) independent of bulge position (5′ or 3′) [51]. Thus, one might expect that pyrimidine bulges would occur more frequently than purine bulges and that the position of the bulge (5′ or 3′) would not influence the order of frequency. In order to investigate if miRNA hairpin precursors have a preference for certain nucleotides and if this preference is position-dependent, we employed a pooled population comparison, a statistical approach that affords a confidence interval that the preference is not random (see Methods). For example, when "Motif 1" occurs with a certain probability within a given sample size, a random distribution assumes that "Motif 2" occurs with a similar probability. To reject this hypothesis, a Z-score is calculated, which represents the confidence that an increased or decreased frequency of a motif did not occur randomly and thus is truly enriched or depleted.
As shown in Fig. 4 and listed in Table 1, the order of single nucleotide occurrence in 5′ bulges is U > A > C > G while in 3′ bulges the order is A ≈ U > C > G. (Please note that ">" indicates the two frequencies of occurrence are significantly different with Z-score > 2 while "≈" indicates Z-score < 2.) These orders are not correlated to the order of 1-nucleotide bulge thermodynamic stabilities (C ≈ U > A ≈ G). Furthermore, the occurrences of U in 5′ bulges and 3′ bulges are similar (0.236 and 0.233, respectively) as are the occurrences of C or G in 5′ bulges and 3′ bulges. However, A occurs more frequently as a 3′ bulge than a 5′ bulge with Z-score = 2.08. For highly expressed human RNAs, the trends are: 5′ bulge nucleotide: A ≈ C ≈ U > G; 3′ bulge nucleotide: A ≈ C > U ≈ G, although none of these differences is statistically significant.
The distribution of nucleotides in 1-nucleotide bulges is similar for human miRNAs and other highly expressed human RNAs; indeed, there are no statistically significant differences between them. In contrast, 1-nucleotide A bulges appear more often in RNAs from other organisms while 1-nucleotide C bulges appear less often (Fig. 4b).
The structure of an RNA in general and bulges in particular [52] can be dynamic, resulting in multiple folds. Thus, the thermodynamically optimal state of an unbound RNA target may not be the same as the three dimensional structure of a protein- or small molecule-bound state. This may be advantageous for targeting RNA as the RNA's structure may remodel to accommodate ligand binding in a conformational selection mechanism.
Bulges prefer different closing base pairs
For each frequently occurring bulge, there are diverse combinations of closing base pairs, and their frequencies are dependent upon the bulged nucleotide. For example, there are 25 different closing base pair combinations for 5′ bulge U, and the occurrences of these closing pair combinations are different, ranging from 1 to 39 (Fig. 5a).
We analyzed all 5′ 1-nucleotide bulges to determine if there is a preference for the most frequently occurring closing base pair combinations. Figure 5b shows that each 5′ bulge prefers different closing base pair combinations. In some cases, the position of the bulge also influences the preferred closing base pairs; that is, whether it is a 5′ or 3′ bulge (Fig. 5c). For example, 5′ bulge (UA)U/-(GC) occurs 39 times (2nd most prevalent) while 3′ bulge (UA)-/U(GC) occurs only 11 times (7th most prevalent).
As shown schematically in Fig. 2, the same motif (including closing base pairs) could be placed in different orientations in the miRNA's structure. Since their thermodynamic stabilities are the same, we inquired if the direction affects the frequency of occurrence. For example, 5′ bulge (UA)U/-(GC) is the same as 3′ bulge (CG)-/U(AU). The 5′ bulge (UA)U/-(GC) was observed 39 times in human miRNAs (the most frequent base pair combination). However, the 3′ bulge (CG)-/U(AU) was not observed.
There are examples in which the directionality of a motif does not affect occurrence. For example, 5′ bulge (GC)U/-(GC) occurs 24 times; the corresponding 3′ bulge, (CG)-/U(CG), also occurs 24 times. A more sophisticated analysis will be required in order to determine why directionality matters for some motifs but not others.
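The relationship between a motif and its opposite orientation can be expressed as a simple string transformation: the two strands swap sides, the two closing pairs swap positions, and each closing pair is written in reverse. The sketch below is an illustration of that rule, with a hypothetical function name; the counts in `observed` are the ones quoted above, with unobserved motifs treated as zero.

```python
import re

MOTIF = re.compile(
    r"^\(([ACGU])([ACGU])\)([ACGU]+|-)/([ACGU]+|-)\(([ACGU])([ACGU])\)$"
)


def flip_orientation(name: str) -> str:
    """Rewrite a bulge or internal loop name as seen from the opposite strand."""
    match = MOTIF.match(name)
    if match is None:
        raise ValueError("not a bulge or internal loop name: " + name)
    a, b, five, three, c, d = match.groups()
    # Strands swap sides; closing pairs swap positions and are each reversed.
    return f"({d}{c}){three}/{five}({b}{a})"


# Occurrence counts quoted in the text for human miRNA precursors.
observed = {"(UA)U/-(GC)": 39, "(GC)U/-(GC)": 24, "(CG)-/U(CG)": 24}
for name in ("(UA)U/-(GC)", "(GC)U/-(GC)"):
    flipped = flip_orientation(name)
    print(name, observed[name], "vs", flipped, observed.get(flipped, 0))
```

Applying the function twice returns the original name, which is a quick consistency check on the rule.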
As observed for miRNA precursors, each 5′ and 3′ 1-nucleotide bulge in highly expressed human RNAs has a different distribution of observed closing base pairs (Additional file 1: Figure S1). Because of the small sample size (n = 88 for 3′ bulges and n = 121 for 5′ bulges), statistically significant differences were not observed. The most frequently occurring 5′ bulge was (UA)A/-(GC) (n = 11) while the most frequently occurring 3′ bulge was (UG)-/G(UA) (n = 7). Interestingly, the 5′ bulge (UA)A/-(GC) was not observed as a 3′ bulge (CG)-/A(AU). Another frequently occurring 5′ bulge, (GC)U/-(GC) (n = 7), was also not observed as a 3′ bulge. The most frequently occurring 3′ bulge, (UG)-/G(UA), was only observed once as the corresponding 5′ bulge.
Nucleotide preferences for 1 × 1 nucleotide internal loops
The ten possible 1 × 1 nucleotide internal loops are the ten most frequently occurring internal loops in miRNA hairpin precursors (Table 1). They can be divided into three groups based on their frequencies of occurrence. In order for an internal loop to be placed in a particular group, its Z-score must be > 2 when compared to the loops in the other groups (Table 2 and Fig. 4). Group 1 contains the most frequently occurring loops including G/G, A/C, C/A, and U/U; Group 2 (second most frequently occurring) includes U/C and C/U; and Group 3 (least frequently occurring) includes A/A, C/C, G/A, and A/G. It is important to point out that A/C and C/A are the same motif in different orientations, as are U/C and C/U, and G/A and A/G. Evidently, the direction of the unpaired nucleotides does not matter. For 1 × 1 nucleotide loops in which both nucleotides are the same, the order of occurrence is G/G ≈ U/U > A/A ≈ C/C, which is different from the order observed for bulge loops.
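The count of ten follows from simple enumeration: of the 16 ordered nucleotide combinations, the six that form Watson-Crick or GU pairs are excluded. A short sketch, assuming GU and UG are treated as pairs (as the nomenclature implies) and using illustrative variable names:

```python
from itertools import product

# Combinations that would form a base pair rather than a 1x1 internal loop.
PAIRED = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

loops = [f"{x}/{y}" for x, y in product("ACGU", repeat=2) if (x, y) not in PAIRED]
print(len(loops), loops)

# Loops that are the same motif viewed in the opposite orientation.
mirrored = sorted({tuple(sorted((loop, loop[::-1]))) for loop in loops if loop != loop[::-1]})
print(mirrored)
```

The enumeration yields the four symmetric loops (A/A, C/C, G/G, U/U) plus three orientation pairs, matching the ten loops assigned to the three groups above.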
Differences in frequency are observed when comparing 1 × 1 nucleotide internal loops in highly expressed human RNAs and other RNAs. For example, G/G loops appear more frequently in highly expressed human RNAs and less frequently in RNAs from other organisms as compared to miRNAs. A/C, A/G, and U/U loops appear more frequently in other RNAs than in miRNA precursors.
1 × 1 nucleotide internal loops also have preferences for closing base pairs
Previous studies have shown that loop closing base pairs affect loop thermodynamic stability and structure [46, 48, 49]. We therefore investigated if the five most frequently occurring 1 × 1 nucleotide loops (G/G, A/C, C/A, U/U, and U/C) in miRNAs have closing base pair preferences. In this analysis, AU and UA, GC and CG, and GU and UG closing base pairs were grouped together. (Thus, AU indicates AU and UA closing pairs; GC indicates GC and CG closing pairs; and GU indicates GU and UG closing pairs.) The results are summarized in Fig. 6. Interestingly, G/G, A/C, and U/C have the same order of preference for 5′ closing base pairs: AU > GC > GU. C/A and U/U prefer GC > AU > GU for the 5′ closing pair. In contrast, A/C, U/U, and U/C have the same 3′ closing base pair preferences: GC > AU > GU. Unique trends are observed for G/G (AU > GC ≈ GU) and C/A (AU ≈ GC > GU).
As was observed with bulges, directionality affects frequency in some cases. For example, C/A and A/C internal loops have different preferences for the 5′ closing base pair. Similarly, internal loops (UA)C/A(GC) and (CG)A/C(AU) are the same loop. However, (UA)C/A(GC) occurs 29 times while internal loop (CG)A/C(AU) occurs 14 times. The difference in the frequency of occurrence is statistically significant (Z-score = 2.32).
Since the most frequently occurring 1 × 1 nucleotide loops were similar in highly expressed human RNAs and RNAs from other organisms with known structures, we also studied closing base pair preferences for those RNAs. Unlike miRNA precursors, the five loops each have unique preferences for 5′ and 3′ closing base pairs (Fig. 6). For highly expressed human RNAs, an analysis of the closing base pairs of all 1 × 1 nucleotide loops reveals that GU closing pairs are discriminated against as both 5′ and 3′ closing pairs as compared to GC pairs (Z-score = 2.92 and 2.77, respectively). There is no statistically significant difference between GC and AU closing pairs or between AU and GU closing pairs. There are statistically significant differences in the closing base pairs for the five loops when comparing human miRNA precursors to RNAs from other organisms (n = 12; Fig. 6). The most statistically significant difference is the preference for 3′ GC closing pairs for A/C internal loops (p < 0.0001).
MiRNA processing sites
Presumably, the functionally important sites in miRNA hairpin precursors are the processing sites, where precursors are cleaved by Dicer and Drosha to form the mature miRNA. How do Dicer and Drosha determine the exact sites to cleave? Are they chosen by a specific sequence, motif, or proximity to up/downstream elements? We therefore analyzed the secondary structures of Dicer and Drosha processing sites.
The site corresponding to the 5′ end of the mature RNA is referred to as the start processing site while the 3′ end of the mature RNA is referred to as the end processing site (Fig. 1). The processing site nucleotide can be paired (including loop closing base pairs), a bulged nucleotide, an internal loop nucleotide, a hairpin nucleotide, or at the terminal ends. Of all start processing site nucleotides, 57.7 % are paired (including loop closing pairs) while 49.0 % of end processing site nucleotides are paired. This difference is statistically significant; that is, it can be stated that start processing site nucleotides occur more frequently as paired than end processing site nucleotides do (Z-score = 4.68). There are also a small number of processing sites in terminal ends: 17 start processing sites and 28 end processing sites.
We next determined the number of unique motifs that reside in Dicer and Drosha processing sites. If considering only loop nucleotides, there are 507 unique Dicer (n = 334) and Drosha (n = 173) processing sites. This corresponds to 17.8 % of all processing sites, 21.7 % of all unique miRNA motifs, and 6.8 % of all miRNA motifs. Of the 507 unique Dicer and Drosha processing sites, 39 are present in highly expressed human RNAs. If closing base pairs also confer uniqueness, then there are 752 unique Dicer (n = 451) and Drosha (n = 301) sites, corresponding to 26.4 % of all processing sites, 19.7 % of all unique miRNA motifs, and 10.1 % of all miRNA motifs. The majority of unique Dicer processing sites reside in internal loops (38.4 % when considering closing base pairs) or hairpins (44.3 % when considering closing base pairs), while the majority of unique Drosha sites reside in internal loops (85.4 % when considering closing base pairs). Of these sites, 742 are unique to human miRNAs as compared to highly expressed human RNAs.
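The percentages in this paragraph can be cross-checked against the motif totals reported earlier (7436 motifs in all, 2334 unique without closing pairs, 3808 unique with closing pairs). The sketch below repeats that arithmetic; the total number of processing sites is not stated in the text, so the value derived here is an inference from the reported percentages, and the variable names are illustrative.

```python
def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


TOTAL_MOTIFS = 7436
UNIQUE_LOOPS_ONLY = 2334   # unique motifs, closing pairs ignored
UNIQUE_WITH_PAIRS = 3808   # unique motifs, closing pairs considered

sites_loops_only = 334 + 173   # unique Dicer + Drosha sites, loop nucleotides only
sites_with_pairs = 451 + 301   # unique Dicer + Drosha sites, with closing pairs

print(pct(sites_loops_only, UNIQUE_LOOPS_ONLY), pct(sites_loops_only, TOTAL_MOTIFS))
print(pct(sites_with_pairs, UNIQUE_WITH_PAIRS), pct(sites_with_pairs, TOTAL_MOTIFS))

# Inferred, not stated: total processing sites implied by 17.8 % and 26.4 %.
implied_total_a = round(sites_loops_only / 0.178)
implied_total_b = round(sites_with_pairs / 0.264)
print(implied_total_a, implied_total_b)
```

Both percentages point to the same implied total, which suggests the reported figures are internally consistent.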
Conclusions
In this study, we constructed a database of the secondary structural elements (motifs) found in human miRNA hairpin precursor secondary structures. Analysis of this database reveals that small loops prevail in bulges and internal loops. Interestingly, loops and bulges have significantly different preferences for loop nucleotides, which also dictate preferences for closing base pairs and closing base pair combinations. The origins of these preferences are not clear, but they likely affect the binding of proteins and small molecules. We also examined the motifs present at miRNA processing sites. More than half of the 5′ (start) and 3′ (end) processing sites are in paired regions. Hopefully, the database and its analysis will facilitate the development of small molecules that specifically bind and modulate miRNA function, in particular, those that are associated with cancer or other diseases.
References
Krol J, Loedige I, Filipowicz W. The widespread regulation of microRNA biogenesis, function and decay. Nat Rev Genet. 2010;11(9):597–610.
Kim VN, Han J, Siomi MC. Biogenesis of small RNAs in animals. Nat Rev Mol Cell Bio. 2009;10(2):126–39.
Jones-Rhoades MW, Bartel DP, Bartel B. MicroRNAs and their regulatory roles in plants. Annu Rev Plant Biol. 2006;57:19–53.
He L, Hannon GJ. MicroRNAs: Small RNAs with a big role in gene regulation. Nat Rev Genet. 2004;5(7):522–31.
Cui QH, Lu M, Zhang QP, Deng M, Miao J, Guo YH, et al. An analysis of human microRNA and disease associations. PLoS One. 2008;3(10):e3420.
Sander C, Betel D, Wilson M, Gabow A, Marks DS. The microRNA.org resource: targets and expression. Nucleic Acids Res. 2008;36:D149–53.
Calin GA, Croce CM. MicroRNA signatures in human cancers. Nat Rev Cancer. 2006;6(11):857–66.
Olson EN, Small EM. Pervasive roles of microRNAs in cardiovascular biology. Nature. 2011;469(7330):336–42.
Benkirane M, Triboulet R, Mari B, Lin YL, Chable-Bessia C, Bennasser Y, et al. Suppression of microRNA-silencing pathway by HIV-1 during virus replication. Science. 2007;315(5818):1579–82.
Huang J, Wang F, Argyris E, Chen K, Liang Z, Tian H, et al. Cellular microRNAs contribute to HIV-1 latency in resting primary CD4+ T lymphocytes. Nat Med. 2007;13(10):1241–7.
John B, Enright AJ, Aravin A, Tuschl T, Sander C, Marks DS. Human microRNA targets (vol 2, pg 1862, 2005). PLoS Biol. 2005;3(7):1328–8.
Garzon R, Marcucci G, Croce CM. Targeting microRNAs in cancer: rationale, strategies and challenges. Nat Rev Drug Discov. 2010;9(10):775–89.
Calin GA, Zhang S, Chen L, Jung EJ. Targeting microRNAs with small molecules: from dream to reality. Clin Pharmacol Ther. 2010;87(6):754–8.
Petrocca F, Visone R, Onelli MR, Shah MH, Nicoloso MS, de Martino I, et al. E2F1-regulated microRNAs impair TGFbeta-dependent cell-cycle arrest and apoptosis in gastric cancer. Cancer Cell. 2008;13(3):272–86.
Frankel LB, Christoffersen NR, Jacobsen A, Lindow M, Krogh A, Lund AH. Programmed cell death 4 (PDCD4) is an important functional target of the microRNA miR-21 in breast cancer cells. J Biol Chem. 2008;283(2):1026–33.
Meng F, Henson R, Wehbe-Janek H, Ghoshal K, Jacob ST, Patel T. MicroRNA-21 regulates expression of the PTEN tumor suppressor gene in human hepatocellular cancer. Gastroenterology. 2007;133(2):647–58.
Aagaard L, Rossi JJ. RNAi therapeutics: principles, prospects and challenges. Adv Drug Deliv Rev. 2007;59(2–3):75–86.
Loya CM, Lu CS, Van Vactor D, Fulga TA. Transgenic microRNA inhibition with spatiotemporal specificity in intact organisms. Nat Methods. 2009;6(12):897–903.
Bose D, Jayaraj G, Suryawanshi H, Agarwala P, Pore SK, Banerjee R, et al. The tuberculosis drug streptomycin as a potential cancer therapeutic: inhibition of miR-21 function by directly targeting its precursor. Angew Chem Int Ed Engl. 2012;51(4):1019–23.
Velagapudi SP, Disney MD. Two-dimensional combinatorial screening enables the bottom-up design of a microRNA-10b inhibitor. Chem Commun (Camb). 2014;50(23):3027–9.
Velagapudi SP, Gallo SM, Disney MD. Sequence-based design of bioactive small molecules that target precursor microRNAs. Nat Chem Biol. 2014;10(4):291–7.
Thomas JR, Hergenrother PJ. Targeting RNA with small molecules. Chem Rev. 2008;108(4):1171–224.
Guan L, Disney MD. Recent advances in developing small molecules targeting RNA. ACS Chem Biol. 2012;7(1):73–86.
Tran T, Disney MD. Two-dimensional combinatorial screening of a bacterial rRNA A-site-like motif library: defining privileged asymmetric internal loops that bind aminoglycosides. Biochemistry. 2010;49(9):1833–42.
Ambros V, Bartel B, Bartel DP, Burge CB, Carrington JC, Chen X, et al. A uniform system for microRNA annotation. RNA. 2003;9(3):277–9.
Lee Y, Jeon K, Lee JT, Kim S, Kim VN. MicroRNA maturation: stepwise processing and subcellular localization. EMBO J. 2002;21(17):4663–70.
Griffiths-Jones S, Kozomara A. miRBase: integrating microRNA annotation and deep-sequencing data. Nucleic Acids Res. 2011;39:D152–7.
Griffiths-Jones S, Saini HK, van Dongen S, Enright AJ. miRBase: tools for microRNA genomics. Nucleic Acids Res. 2008;36:D154–8.
Mathews DH, Disney MD, Childs JL, Schroeder SJ, Zuker M, Turner DH. Incorporating chemical modification constraints into a dynamic programming algorithm for prediction of RNA secondary structure. Proc Natl Acad Sci U S A. 2004;101(19):7287–92.
Mathews DH, Turner DH. Prediction of RNA secondary structure by free energy minimization. Curr Opin Struct Biol. 2006;16(3):270–8.
Mathews DH, Sabina J, Zuker M, Turner DH. Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J Mol Biol. 1999;288(5):911–40.
Gutell RR. Collection of small subunit (16S- and 16S-like) ribosomal RNA structures: 1994. Nucleic Acids Res. 1994;22(17):3502–7.
Schnare MN, Damberger SH, Gray MW, Gutell RR. Comprehensive comparison of structural characteristics in eukaryotic cytoplasmic large subunit (23 S-like) ribosomal RNA. J Mol Biol. 1996;256(4):701–19.
Szymanski M, Barciszewska MZ, Erdmann VA, Barciszewski J. 5S ribosomal RNA database. Nucleic Acids Res. 2002;30(1):176–8.
Sprinzl M, Horn C, Brown M, Ioudovitch A, Steinberg S. Compilation of tRNA sequences and sequences of tRNA genes. Nucleic Acids Res. 1998;26(1):148–53.
Larsen N, Samuelsson T, Zwieb C. The signal recognition particle database (SRPDB). Nucleic Acids Res. 1998;26(1):177–8.
Brown JW. The ribonuclease P database. Nucleic Acids Res. 1998;26(1):351–2.
Damberger SH, Gutell RR. A comparative database of group I intron structures. Nucleic Acids Res. 1994;22(17):3508–10.
Waring RB, Davies RW. Assessment of a model for intron RNA secondary structure relevant to RNA self-splicing--a review. Gene. 1984;28(3):277–91.
Michel F, Umesono K, Ozeki H. Comparative and functional anatomy of group II catalytic introns--a review. Gene. 1989;82(1):5–30.
Juhling F, Morl M, Hartmann RK, Sprinzl M, Stadler PF, Putz J. tRNAdb 2009: compilation of tRNA sequences and tRNA genes. Nucleic Acids Res. 2009;37(Database issue):D159–62.
SantaLucia Jr J, Turner DH. Structure of (rGGCGAGCC)2 in solution from NMR and restrained molecular dynamics. Biochemistry. 1993;32(47):12612–23.
Wu M, Turner DH. Solution structure of (rGCGGACGC)2 by two-dimensional NMR and the iterative relaxation matrix approach. Biochemistry. 1996;35(30):9677–89.
Pushechnikov A, Lee MM, Childs-Disney JL, Sobczak K, French JM, Thornton CA, et al. Rational design of ligands targeting triplet repeating transcripts that cause RNA dominant disease: application to myotonic muscular dystrophy type 1 and spinocerebellar ataxia type 3. J Am Chem Soc. 2009;131(28):9767–79.
Tran T, Disney MD. Molecular recognition of 6′-N-5-hexynoate kanamycin A and RNA 1x1 internal loops containing CA mismatches. Biochemistry. 2011;50(6):962–9.
Chen G, Znosko BM, Jiao X, Turner DH. Factors affecting thermodynamic stabilities of RNA 3 x 3 internal loops. Biochemistry. 2004;43(40):12865–76.
Freier SM, Kierzek R, Jaeger JA, Sugimoto N, Caruthers MH, Neilson T, et al. Improved free-energy parameters for predictions of RNA duplex stability. Proc Natl Acad Sci U S A. 1986;83(24):9373–7.
Schroeder SJ, Burkard ME, Turner DH. The energetics of small internal loops in RNA. Biopolymers. 1999;52(4):157–67.
Schroeder SJ, Turner DH. Thermodynamic stabilities of internal loops with GU closing pairs in RNA. Biochemistry. 2001;40(38):11509–17.
Zhu J, Wartell RM. The effect of base sequence on the stability of RNA and DNA single base bulges. Biochemistry. 1999;38(48):15986–93.
Znosko BM, Silvestri SB, Volkman H, Boswell B, Serra MJ. Thermodynamic parameters for an expanded nearest-neighbor model for the formation of RNA duplexes with single nucleotide bulges. Biochemistry. 2002;41(33):10406–17.
Stelzer AC, Frank AT, Kratz JD, Swanson MD, Gonzalez-Hernandez MJ, Lee J, et al. Discovery of selective bioactive small molecules by targeting an RNA dynamic ensemble. Nat Chem Biol. 2011;7(8):553ā9.
Acknowledgments
This work was funded by the National Institutes of Health (R01-GM097455 to MDD and R15-GM085699 to BMZ) and The Scripps Research Institute.
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
BL completed data analysis on miRNAs and drafted the manuscript; JLC completed data analysis on internal loop closing base pairs and highly expressed human RNAs; BMZ completed data analysis on other RNAs; DW and SMG wrote scripts to parse and analyze miRNA motifs; MF constructed the searchable database and web server; MDD conceived of the study, participated in its design and coordination, and helped to draft the manuscript. All authors read and approved the final manuscript.
Additional file
Additional file 1: Figure S1.
Analysis of the closing base pairs for 1-nucleotide bulges, both 5′ and 3′, in highly expressed human RNAs with known structures. As observed for 5′ and 3′ bulges in miRNA precursors, each bulge has preferred 5′ and 3′ closing base pairs. Further, the distribution of closing base pairs is different for miRNA precursors and other human RNAs (Fig. 5). (PDF 319 kb)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Liu, B., Childs-Disney, J.L., Znosko, B.M. et al. Analysis of secondary structural elements in human microRNA hairpin precursors. BMC Bioinformatics 17, 112 (2016). https://doi.org/10.1186/s12859-016-0960-6
DOI: https://doi.org/10.1186/s12859-016-0960-6