Tuesday, February 17, 2015

NGS: Notes on microRNAs

Abstract: Names and statistics of miRNAs in the miRBase database.



microRNAs (miRNAs) are small non-coding RNAs that function in RNA silencing and post-transcriptional regulation of gene expression. The mature form of a miRNA is usually 21-22 nucleotides long and is processed from a longer hairpin precursor (Figure 1).
Figure 1. Biogenesis of miRNA (http://en.wikipedia.org/wiki/MicroRNA#Nomenclature)

How are miRNAs named?
Here is a note from miRBase: “Please note that miRNA names are able to convey only limited information, and are entirely unsuitable to encode information about complex sequence relationships. You should not therefore rely on the name to tell you all you need to know about the sequence. Sensible database approaches should instead use dedicated fields and annotation to describe such relationships, such as the "family" data provided here.”

Example 1:
hsa-mir-21: UGUCGGGUAGCUUAUCAGACUGAUGUUGACUGUUGAAUCUCAUGGCAACACCAGUCGAUG
GGCUGUCUGACA
hsa-miR-21-5p: UAGCUUAUCAGACUGAUGUUGA

hsa-miR-21-3p: CAACACCAGUCGAUGGGCUGU

Note: The prefix ‘hsa’ stands for Homo sapiens. ‘mir’ denotes the precursor hairpin and ‘miR’ the mature form of the miRNA. The number 21 is an identifier assigned sequentially, so it reflects the order in which the miRNA was named rather than a date; lower numbers were generally reported earlier than higher ones. The two mature miRNAs hsa-miR-21-5p and hsa-miR-21-3p originate from opposite arms of the same precursor, hsa-mir-21, and the suffix records the arm: -5p for the 5' (left) arm and -3p for the 3' (right) arm.
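
The arm assignment in Example 1 can be checked directly from the sequences. The following is a minimal Python sketch, assuming that a mature sequence centred in the first half of the hairpin belongs to the 5' arm and one centred in the second half belongs to the 3' arm. The function name arm_of is made up for illustration and is not part of miRBase.

```python
# Precursor and mature sequences copied from Example 1.
PRECURSOR = (
    "UGUCGGGUAGCUUAUCAGACUGAUGUUGACUGUUGAAUCUCAUGGCAACACCAGUCGAUG"
    "GGCUGUCUGACA"
)
MATURES = {
    "hsa-miR-21-5p": "UAGCUUAUCAGACUGAUGUUGA",
    "hsa-miR-21-3p": "CAACACCAGUCGAUGGGCUGU",
}


def arm_of(precursor, mature):
    """Guess the arm (5p or 3p) from where the mature sequence sits.

    Assumption for this sketch: the hairpin loop is near the middle of
    the precursor, so the midpoint of the precursor separates the arms.
    """
    start = precursor.find(mature)
    if start < 0:
        raise ValueError("mature sequence not found in precursor")
    centre = start + len(mature) / 2
    return "5p" if centre < len(precursor) / 2 else "3p"


for name, seq in MATURES.items():
    # Print the 0-based start position and the inferred arm.
    print(name, PRECURSOR.find(seq), arm_of(PRECURSOR, seq))
```

The midpoint rule is only a heuristic; real hairpins are not always symmetric, so the miRBase annotation remains the reference for arm assignment.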

Example 2:
hsa-mir-24-1: CUCCGGUGCCUACUGAGCUGAUAUCAGUUCUCAUUUUACACACUGGCUCAGUUCAGCAGG
AACAGGAG

hsa-mir-24-2:
CUCUGCCUCCCGUGCCUACUGAGCUGAAACACAGUUGGUUUGUGUACACUGGCUCAGUUC
AGCAGGAACAGGG

Note: The suffixes -1 and -2 indicate that hsa-mir-24-1 and hsa-mir-24-2 are transcribed from genes at different genomic loci but give rise to an identical mature miRNA, hsa-miR-24-3p. The mature forms of hsa-mir-24-1 are hsa-miR-24-1-5p and hsa-miR-24-3p; those of hsa-mir-24-2 are hsa-miR-24-2-5p and hsa-miR-24-3p.

Example 3:
hsa-mir-124-1:

AGGCCUCUCUCUCCGUGUUCACAGCGGACCUUGAUUUAAAUGUCCAUACAAUUAAGGCAC
GCGGUGAAUGCCAAGAAUGGGGCUG

hsa-mir-124-2:

AUCAAGAUUAGAGGCUCUGCUCUCCGUGUUCACAGCGGACCUUGAUUUAAUGUCAUACAA
UUAAGGCACGCGGUGAAUGCCAAGAGCGGAGCCUACGGCUGCACUUGAA

hsa-mir-124-3:

UGAGGGCCCCUCUGCGUGUUCACAGCGGACCUUGAUUUAAUGUCUAUACAAUUAAGGCAC
GCGGUGAAUGCCAAGAGAGGCGCCUCC
Note: The precursors hsa-mir-124-1, hsa-mir-124-2, and hsa-mir-124-3 share identical mature miRNAs.
hsa-miR-124-5p: CGUGUUCACAGCGGACCUUGAU

hsa-miR-124-3p: UAAGGCACGCGGUGAAUGCC
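
Whether the three hsa-mir-124 precursors really contain the same mature sequences can be verified with a simple substring test. This is a small Python sketch using only the sequences listed in Example 3; the variable and function names are illustrative.

```python
# Precursor sequences copied from Example 3 (line breaks removed).
PRECURSORS = {
    "hsa-mir-124-1": (
        "AGGCCUCUCUCUCCGUGUUCACAGCGGACCUUGAUUUAAAUGUCCAUACAAUUAAGGCAC"
        "GCGGUGAAUGCCAAGAAUGGGGCUG"
    ),
    "hsa-mir-124-2": (
        "AUCAAGAUUAGAGGCUCUGCUCUCCGUGUUCACAGCGGACCUUGAUUUAAUGUCAUACAA"
        "UUAAGGCACGCGGUGAAUGCCAAGAGCGGAGCCUACGGCUGCACUUGAA"
    ),
    "hsa-mir-124-3": (
        "UGAGGGCCCCUCUGCGUGUUCACAGCGGACCUUGAUUUAAUGUCUAUACAAUUAAGGCAC"
        "GCGGUGAAUGCCAAGAGAGGCGCCUCC"
    ),
}
MATURES = {
    "hsa-miR-124-5p": "CGUGUUCACAGCGGACCUUGAU",
    "hsa-miR-124-3p": "UAAGGCACGCGGUGAAUGCC",
}


def shared_by_all(mature, precursors):
    """Return True if the mature sequence occurs in every precursor."""
    return all(mature in seq for seq in precursors.values())


for name, seq in MATURES.items():
    print(name, shared_by_all(seq, PRECURSORS))
```

A substring match only shows that the sequence is present in each hairpin; it does not by itself show that each locus expresses that product.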


Example 4:
hsa-mir-125a: UGCCAGUCUCUAGGUCCCUGAGACCCUUUAACCUGUGAGGACAUCCAGGGUCACAGGUGA
GGUUCUUGGGAGCCUGGCGUCUGGCC

hsa-mir-125b-1:
UGCGCUCCUCUCAGUCCCUGAGACCCUAACUUGUGAUGUUUACCGUUUAAAUCCACGGGU
UAGGCUCUUGGGAGCUGCGAGUCGUGCU
hsa-mir-125b-2: ACCAGACUUUUCCUAGUCCCUGAGACCCUAACUUGUGAGGUAUUUUAGUAACAUCACAAG
UCAGGCUCUUGGGACCUAGGCGGAGGGGA
Note: The letter suffixes in 125a and 125b indicate that the mature forms of the three precursors have closely related sequences that differ at only a few positions. Compare hsa-miR-125a-5p with hsa-miR-125b-5p:
hsa-miR-125a-5p: UCCCUGAGACCCUUUAACCUGUGA


hsa-miR-125b-5p: UCCCUGAGACCCUAACUUGUGA


hsa-miR-125a-3p: ACAGGUGAGGUUCUUGGGAGCC
hsa-miR-125b-1-3p: ACGGGUUAGGCUCUUGGGAGCU
hsa-miR-125b-2-3p: UCACAAGUCAGGCUCUUGGGAC
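
One quick way to see how closely hsa-miR-125a-5p and hsa-miR-125b-5p are related is to measure how far the two sequences agree from the 5' end. The sketch below is a plain Python illustration with a made-up helper name, not a substitute for a proper alignment.

```python
# Mature 5p sequences copied from Example 4.
MIR_125A_5P = "UCCCUGAGACCCUUUAACCUGUGA"
MIR_125B_5P = "UCCCUGAGACCCUAACUUGUGA"


def common_prefix_length(a, b):
    """Count identical leading bases shared by two sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


shared = common_prefix_length(MIR_125A_5P, MIR_125B_5P)
print(shared, MIR_125A_5P[:shared])
```

Because the two sequences differ in length, a position-by-position mismatch count would be misleading here; an alignment tool would be the better choice for a real comparison.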


Example 5:
hsa-let-7a-5p: UGAGGUAGUAGGUUGUAUAGUU

hsa-let-7a-3p: CUAUACAAUCUACUGUCUUUC

hsa-let-7a-2-3p: CUGUACAGCCUCCUAGCUUUCC

hsa-let-7b-5p: UGAGGUAGUAGGUUGUGUGGUU

hsa-let-7b-3p: CUAUACAACCUACUGCCUUCCC

hsa-let-7c-5p: UGAGGUAGUAGGUUGUAUGGUU

hsa-let-7c-3p: CUGUACAACCUUCUAGCUUUCC

hsa-let-7d-5p: AGAGGUAGUAGGUUGCAUAGUU

hsa-let-7d-3p: CUAUACGACCUGCUGCCUUUCU

hsa-let-7e-5p: UGAGGUAGGAGGUUGUAUAGUU

hsa-let-7e-3p: CUAUACGGCCUCCUAGCUUUCC

hsa-let-7f-5p: UGAGGUAGUAGAUUGUAUAGUU

hsa-let-7f-1-3p: CUAUACAAUCUAUUGCCUUCCC

hsa-let-7f-2-3p: CUAUACAGUCUACUGUCUUUCC

hsa-let-7g-5p: UGAGGUAGUAGUUUGUACAGUU

hsa-let-7g-3p: CUGUACAGGCCACUGCCUUGC

hsa-let-7i-5p: UGAGGUAGUAGUUUGUGCUGUU

hsa-let-7i-3p: CUGCGCAAGCUACUGCCUUGCU
Note: Names labelled with ‘let’ instead of ‘miR’ are an exception to the usual naming rule; the let-7 family keeps its original name for historical reasons.
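
The name patterns seen in Examples 1 to 5 can be split into their parts with a regular expression. The following Python sketch is an assumption-laden illustration: the pattern and the function parse_mirna_name are made up for this post, cover only the name shapes shown here (three-letter species prefix, mir/miR/let, number, optional letter, optional copy number, optional arm), and are not an official miRBase grammar. As the miRBase note above warns, names carry limited information, so this should not replace dedicated annotation fields.

```python
import re

# Hypothetical pattern covering only the name shapes used in this post.
NAME_PATTERN = re.compile(
    r"(?P<species>[a-z]{3})"      # e.g. hsa
    r"-(?P<kind>mir|miR|let)"     # precursor, mature, or let family
    r"-(?P<number>\d+)"           # e.g. 21, 125, 7
    r"(?P<letter>[a-z])?"         # related sequences: a, b, ...
    r"(?:-(?P<copy>\d+))?"        # distinct loci: -1, -2, ...
    r"(?:-(?P<arm>[35]p))?"       # arm of origin: -5p or -3p
)


def parse_mirna_name(name):
    """Split a miRNA name into its parts, or return None if it does not fit."""
    match = NAME_PATTERN.fullmatch(name)
    return match.groupdict() if match else None


for example in ("hsa-miR-21-5p", "hsa-mir-125b-2", "hsa-let-7f-1-3p"):
    print(example, parse_mirna_name(example))
```

Parts that are absent from a name come back as None, which makes it easy to tell, for instance, a precursor name (no arm) from a mature name.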

Statistics of miRNAs
The miRBase database (http://www.mirbase.org/index.shtml) is a searchable database of published miRNA sequences and annotation. Release 21 (June, 2014) contains 28,645 entries representing hairpin precursor miRNAs, expressing 35,828 mature miRNA products, in 223 species.


Figure 2. Statistics of precursors.
Figure 3. Statistics of mature miRNAs.

Of the 1,881 human precursors, 1,828 have sequences that are unique across miRBase, 32 have sequences identical to other human precursors, and 19 share sequences with precursors from other species. Of the 2,588 human mature miRNAs, 2,565 have sequences that are unique across miRBase, 15 have sequences identical to other human mature miRNAs, and 8 share sequences with mature miRNAs from other species.
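
To put these counts in proportion, the share of unique sequences can be computed from the totals quoted above. This is a small Python sketch; the helper name percent_unique is made up for illustration.

```python
def percent_unique(unique, total):
    """Return the percentage of sequences that are unique."""
    return 100 * unique / total


# Counts quoted in the paragraph above (human entries in miRBase release 21).
precursor_pct = percent_unique(1828, 1881)
mature_pct = percent_unique(2565, 2588)

print(f"unique precursors: {precursor_pct:.1f}%")
print(f"unique matures:    {mature_pct:.1f}%")
```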

Human precursors range in length from a few dozen to more than one hundred bases. Human mature miRNAs are shorter, mostly 21-22 bases (Figure 4).
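
A length distribution of this kind can be tabulated with a counter. The Python sketch below uses only six mature sequences quoted in the examples of this post, so it illustrates the method and is not the data behind Figure 4.

```python
from collections import Counter

# A handful of mature sequences copied from Examples 1, 3 and 4.
MATURES = {
    "hsa-miR-21-5p": "UAGCUUAUCAGACUGAUGUUGA",
    "hsa-miR-21-3p": "CAACACCAGUCGAUGGGCUGU",
    "hsa-miR-124-5p": "CGUGUUCACAGCGGACCUUGAU",
    "hsa-miR-124-3p": "UAAGGCACGCGGUGAAUGCC",
    "hsa-miR-125a-5p": "UCCCUGAGACCCUUUAACCUGUGA",
    "hsa-miR-125b-5p": "UCCCUGAGACCCUAACUUGUGA",
}

# Count how many sequences fall at each length.
length_counts = Counter(len(seq) for seq in MATURES.values())

for length in sorted(length_counts):
    print(length, length_counts[length])
```

Running the same counter over a full FASTA file of mature sequences would give the complete distribution.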


Figure 4. Distribution of sequence lengths of miRNA precursors and mature miRNAs.

References
1. A uniform system for microRNA annotation. RNA. 2003 Mar;9(3):277-9.
2. miRBase: annotating high confidence microRNAs using deep sequencing data. Nucleic Acids Res. 2014;42:D68-D73.



Writing date: 2012.12.01, 2015.02.16