annotate_my_genomes: an easy-to-use pipeline to improve genome annotation and uncover neglected genes by hybrid RNA sequencing

Farkas, Carlos; Recabal, Antonia; Mella, Andy; Candia Herrera, Daniel; González Olivero, Maryori; Haigh, Jody Jonathan; Tarifeño Saldivia, Estefanía; Caprile, Teresa

doi:https://doi.org/10.1093/gigascience/giac099

dc.contributor.affiliation	Universidad de Las Américas, Facultad de Salud y Ciencias Sociales, Chile
dc.contributor.author	Farkas, Carlos
dc.contributor.author	Recabal, Antonia
dc.contributor.author	Mella, Andy
dc.contributor.author	Candia Herrera, Daniel
dc.contributor.author	González Olivero, Maryori
dc.contributor.author	Haigh, Jody Jonathan
dc.contributor.author	Tarifeño Saldivia, Estefanía
dc.contributor.author	Caprile, Teresa
dc.date.accessioned	2024-09-03T19:21:25Z
dc.date.available	2024-09-03T19:21:25Z
dc.date.issued	2022
dc.description.abstract	Background: The advancement of hybrid sequencing technologies is increasingly expanding genome assemblies that are often annotated using hybrid sequencing transcriptomics, leading to improved genome characterization and the identification of novel genes and isoforms in a wide variety of organisms. Results: We developed an easy-to-use genome-guided transcriptome annotation pipeline that uses assembled transcripts from hybrid sequencing data as input and distinguishes between coding and long non-coding RNAs by integration of several bioinformatic approaches, including gene reconciliation with previous annotations in GTF format. We demonstrated the efficiency of this approach by correctly assembling and annotating all exons from the chicken SCO-spondin gene (containing more than 105 exons), including the identification of missing genes in the chicken reference annotations by homology assignments. Conclusions: Our method helps to improve the current transcriptome annotation of the chicken brain. Our pipeline, implemented on Anaconda/Nextflow and Docker, is an easy-to-use package that can be applied to a broad range of species, tissues, and research areas, helping to improve and reconcile current annotations. The code and datasets are publicly available at https://github.com/cfarkas/annotate_my_genomes
dc.facultad	Facultad de Salud y Ciencias Sociales
dc.format.extent	14 pages
dc.format.extent	5.121Mb
dc.format.mimetype	application/pdf
dc.identifier.citation	GigaScience, 11, 14 p.
dc.identifier.doi	https://doi.org/10.1093/gigascience/giac099
dc.identifier.issn	2047-217X
dc.identifier.ror	https://ror.org/0166e9x11
dc.identifier.uri	https://repositorio.udla.cl/xmlui/handle/udla/1635
dc.identifier.uri	https://academic.oup.com/gigascience?login=false
dc.language.iso	eng
dc.publisher	Oxford University Press
dc.rights	Creative Commons Attribution License (CC BY)
dc.source	GigaScience
dc.subject	Genome Annotation pipeline
dc.subject	Hybrid sequencing
dc.subject	SCO-spondin
dc.subject	Transcriptome annotation
dc.title	annotate_my_genomes: an easy-to-use pipeline to improve genome annotation and uncover neglected genes by hybrid RNA sequencing
dc.type	Article
dc.udla.catalogador	CBM
dc.udla.index	WoS
dc.udla.index	Science Citation Index Expanded
dc.udla.index	Scopus
dc.udla.index	Academic Search Ultimate
dc.udla.index	Natural Science Collection
dc.udla.index	DOAJ
dc.udla.index	Biological Science Database
dc.udla.index	BIOSIS
dc.udla.index	CAB Abstracts
dc.udla.index	EMBASE
dc.udla.index	MEDLINE

Files

Original bundle

Name: 510.pdf
Size: 5.12 MB
Format: Adobe Portable Document Format

Collections

Research