What is CANTATAdb 2.0?

CANTATAdb 2.0 is a database of long non-coding RNAs (lncRNAs) identified computationally in 36 plant species and 3 algae. Below are lncRNA counts per species.

How were these lncRNAs discovered?

We mapped reads from hundreds of RNA-Seq libraries to the corresponding plant genomes and re-annotated these genomes using gene prediction software, with known annotation data as a reference. We then applied several filters to discriminate between non-coding and protein-coding transcripts.
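
To illustrate what a filter for separating non-coding from protein-coding transcripts can look like, here is a minimal Python sketch. It is not the CANTATAdb pipeline: the two criteria (a minimum transcript length of 200 nt and a longest forward-strand ORF of at most 100 codons) are conventions often used for lncRNAs and are assumed here for illustration only, and the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_codons(seq: str) -> int:
    """Return the length, in codons (stop codon excluded), of the longest
    ATG-initiated ORF in the three forward reading frames.

    Only ORFs closed by an in-frame stop codon are counted.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    best = 0
    for frame in range(3):
        in_orf = False
        count = 0
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if not in_orf:
                if codon == "ATG":
                    in_orf = True
                    count = 1  # the start codon itself
            elif codon in STOP_CODONS:
                best = max(best, count)
                in_orf = False
            else:
                count += 1
    return best


def is_candidate_lncrna(seq: str, min_length: int = 200,
                        max_orf_codons: int = 100) -> bool:
    """Apply two assumed filters: the transcript is long enough and
    does not contain a long open reading frame."""
    return len(seq) >= min_length and longest_orf_codons(seq) <= max_orf_codons


if __name__ == "__main__":
    noncoding_like = "AC" * 150                    # 300 nt, no start codon
    coding_like = "ATG" + "GCC" * 150 + "TAA"      # one long ORF
    print(is_candidate_lncrna(noncoding_like))
    print(is_candidate_lncrna(coding_like))
```

A real pipeline would typically add further evidence, such as similarity searches against known proteins, before calling a transcript non-coding.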

What's inside the database?

CANTATAdb 2.0 collects 239,631 lncRNAs predicted in 39 species. Among other data, the database presents lncRNA sequences, expression values across RNA-Seq libraries, genomic locations, hypothetical peptides encoded by lncRNAs, and BLAST search results against Swiss-Prot proteins and non-coding RNAs from NONCODE.

Why did we create this database?

A number of model plant species lack comprehensive datasets of lncRNAs and their annotations, which hinders further study of these transcripts. CANTATAdb 2.0 is one of the largest and most comprehensive databases of plant lncRNAs.

Contact information

Any questions, remarks, suggestions? Feel free to contact us:

miszcz [at] amu.edu.pl

Website of the Laboratory of Integrative Genomics
(group of Prof. Izabela Makalowska)
Laboratory of Integrative Genomics
Institute of Anthropology
Adam Mickiewicz University in Poznan
ul. Umultowska 89, 61-674 Poznan, Poland

