
Non-coding RNA Overview

Most non-coding RNA (ncRNA) annotation in Ensembl has been generated using RFAM. A small handpicked set is also available. This ncRNA set, as well as a detailed description of the annotation methods, can be obtained from ftp.genetics.wustl.edu.

The following non-coding RNA gene types are annotated, along with their pseudogenes:

tRNA: nuclear transfer RNA
Mt-tRNA: mitochondrially-derived tRNA located in the nuclear genome
rRNA: ribosomal RNA
scRNA: small cytoplasmic RNA
snRNA: small nuclear RNA
snoRNA: small nucleolar RNA
miRNA: microRNA precursors
misc_RNA: miscellaneous other RNA

Annotation Details

Most ncRNA is annotated by aligning genomic sequence against RFAM using BLASTN. The BLAST hits are clustered, filtered by E-value, and used to seed Infernal searches of each locus with the corresponding RFAM covariance models. Seeding reduces the search space, since scanning the entire genome with every RFAM covariance model would be extremely CPU-intensive. The resulting BLAST hits are then used as supporting evidence for ncRNA genes.
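The filter-and-cluster step can be illustrated with a minimal Python sketch. It is not the Ensembl pipeline code: the `Hit` record, the `filter_and_cluster` function, and the `max_evalue` and `max_gap` defaults are hypothetical names and values chosen for illustration only. Hits above the E-value cutoff are dropped, and nearby hits to the same family on the same sequence are merged into one locus that could then seed a covariance-model search.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One BLASTN hit (hypothetical record layout, for illustration)."""
    chrom: str
    start: int
    end: int
    evalue: float
    family: str  # identifier of the matched RFAM family


def filter_and_cluster(hits, max_evalue=1e-5, max_gap=50):
    """Drop weak hits, then merge nearby hits to the same family into loci.

    max_evalue and max_gap are assumed, illustrative thresholds.
    """
    kept = sorted(
        (h for h in hits if h.evalue <= max_evalue),
        key=lambda h: (h.chrom, h.family, h.start),
    )
    loci = []
    for h in kept:
        last = loci[-1] if loci else None
        if (last is not None and last.chrom == h.chrom
                and last.family == h.family
                and h.start <= last.end + max_gap):
            # Extend the current locus and keep its best (lowest) E-value.
            loci[-1] = Hit(h.chrom, last.start, max(last.end, h.end),
                           min(last.evalue, h.evalue), h.family)
        else:
            loci.append(h)
    return loci


example_hits = [
    Hit("1", 100, 200, 1e-10, "RF00001"),
    Hit("1", 220, 300, 1e-8, "RF00001"),   # close to the first hit: merged
    Hit("1", 5000, 5100, 0.5, "RF00001"),  # weak E-value: filtered out
    Hit("2", 10, 90, 1e-20, "RF00005"),
]
example_loci = filter_and_cluster(example_hits)
```

Each resulting locus covers only a small window of the genome, which is why the subsequent covariance-model search is far cheaper than a whole-genome scan.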

miRNA is predicted by running BLASTN on genomic sequence slices against miRBase sequences. The BLAST hits are clustered and filtered by E-value, and the aligned genomic sequence is then checked for possible secondary structure using RNAfold. If there is evidence that the genomic sequence could form a stable hairpin structure, the locus is used to create a miRNA gene model. The resulting BLAST hit is used as supporting evidence for the miRNA gene.
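As a rough illustration of the hairpin check, the sketch below tests a predicted structure for a single stable stem-loop. It assumes the dot-bracket string and minimum free energy (in kcal/mol) have already been obtained from a folding tool such as RNAfold. The function name `is_single_hairpin` and the `min_pairs` and `max_mfe` thresholds are hypothetical and are not the criteria used by the annotation pipeline.

```python
def is_single_hairpin(structure, mfe, min_pairs=16, max_mfe=-15.0):
    """Return True if a dot-bracket structure looks like one stable hairpin.

    structure: dot-bracket string, e.g. "((((....))))"
    mfe:       minimum free energy in kcal/mol (more negative = more stable)
    min_pairs, max_mfe: assumed, illustrative thresholds
    """
    opens = structure.count("(")
    closes = structure.count(")")
    # The structure must be balanced and have enough base pairs in the stem.
    if opens != closes or opens < min_pairs:
        return False
    # Unbranched stem-loop: every '(' must come before every ')'.
    if structure.rfind("(") > structure.find(")"):
        return False
    # Stable enough under the assumed energy cutoff.
    return mfe <= max_mfe


hairpin = "(" * 18 + "." * 4 + ")" * 18
two_stems = "((((....))))((((....))))"

hairpin_ok = is_single_hairpin(hairpin, -25.0)
weak_ok = is_single_hairpin(hairpin, -5.0)
branched_ok = is_single_hairpin(two_stems, -25.0, min_pairs=4)
```

Here `hairpin_ok` is true, while the weakly folded and two-stem cases are rejected. A real check would likely also consider loop size and bulges.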