annotate_my_genomes: an easy-to-use pipeline to improve genome annotation and uncover neglected genes by hybrid RNA sequencing

dc.contributor.author: Farkas, Carlos
dc.contributor.author: Recabal, Antonia
dc.contributor.author: Mella, Andy
dc.contributor.author: Candia Herrera, Daniel
dc.contributor.author: González Olivero, Maryori
dc.contributor.author: Haigh, Jody Jonathan
dc.contributor.author: Tarifeño Saldivia, Estefanía
dc.contributor.author: Caprile, Teresa
dc.date.accessioned: 2024-09-03T19:21:25Z
dc.date.available: 2024-09-03T19:21:25Z
dc.date.issued: 2022
dc.description.abstract: Background: The advancement of hybrid sequencing technologies is increasingly expanding genome assemblies that are often annotated using hybrid sequencing transcriptomics, leading to improved genome characterization and the identification of novel genes and isoforms in a wide variety of organisms. Results: We developed an easy-to-use genome-guided transcriptome annotation pipeline that uses assembled transcripts from hybrid sequencing data as input and distinguishes between coding and long non-coding RNAs by integration of several bioinformatic approaches, including gene reconciliation with previous annotations in GTF format. We demonstrated the efficiency of this approach by correctly assembling and annotating all exons from the chicken SCO-spondin gene (containing more than 105 exons), including the identification of missing genes in the chicken reference annotations by homology assignments. Conclusions: Our method helps to improve the current transcriptome annotation of the chicken brain. Our pipeline, implemented on Anaconda/Nextflow and Docker, is an easy-to-use package that can be applied to a broad range of species, tissues, and research areas, helping to improve and reconcile current annotations. The code and datasets are publicly available at https://github.com/cfarkas/annotate_my_genomes
dc.facultad: Facultad de Salud y Ciencias Sociales
dc.format.extent: 14 pages
dc.format.extent: 5.121Mb
dc.format.mimetype: PDF
dc.identifier.citation: GigaScience, 11, 14 p.
dc.identifier.doi: 10.1093/gigascience/giac099
dc.identifier.issn: 2047-217X
dc.identifier.uri: http://repositorio.udla.cl/xmlui/handle/udla/1635
dc.identifier.uri: https://academic.oup.com/gigascience?login=false
dc.language.iso: eng
dc.publisher: Oxford University Press
dc.rights: Creative Commons Attribution License (CC BY)
dc.source: GigaScience
dc.subject: Genome Annotation pipeline
dc.subject: Hybrid sequencing
dc.subject: SCO-spondin
dc.subject: Transcriptome annotation
dc.title: annotate_my_genomes: an easy-to-use pipeline to improve genome annotation and uncover neglected genes by hybrid RNA sequencing
dc.type: Article
dc.udla.catalogador: CBM
dc.udla.index: WoS
dc.udla.index: Science Citation Index Expanded
dc.udla.index: Scopus
dc.udla.index: Academic Search Ultimate
dc.udla.index: Natural Science Collection
dc.udla.index: DOAJ
dc.udla.index: Biological Science Database
dc.udla.index: BIOSIS
dc.udla.index: CAB Abstracts
dc.udla.index: EMBASE
dc.udla.index: MEDLINE

Files

Original bundle

Name: 510.pdf
Size: 5.12 MB
Format: Adobe Portable Document Format
