annotate_my_genomes: an easy-to-use pipeline to improve genome annotation and uncover neglected genes by hybrid RNA sequencing

Farkas, Carlos; Recabal, Antonia; Mella, Andy; Candia Herrera, Daniel; González Olivero, Maryori; Haigh, Jody Jonathan; Tarifeño Saldivia, Estefanía; Caprile, Teresa

dc.contributor.author	Author	Farkas, Carlos
dc.contributor.author	Author	Recabal, Antonia
dc.contributor.author	Author	Mella, Andy
dc.contributor.author	Author	Candia Herrera, Daniel
dc.contributor.author	Author	González Olivero, Maryori
dc.contributor.author	Author	Haigh, Jody Jonathan
dc.contributor.author	Author	Tarifeño Saldivia, Estefanía
dc.contributor.author	Author	Caprile, Teresa
dc.date.accessioned	Date Accessioned	2024-09-03T19:21:25Z
dc.date.available	Date Available	2024-09-03T19:21:25Z
dc.date.issued	Date Issued	2022
dc.identifier.citation	Bibliographic Citation	GigaScience, 11, 14 p.
dc.identifier.issn	ISSN	2047-217X
dc.identifier.uri	URI	http://repositorio.udla.cl/xmlui/handle/udla/1635
dc.identifier.uri	URI	https://academic.oup.com/gigascience?login=false
dc.description.abstract	Abstract	Background: The advancement of hybrid sequencing technologies is increasingly expanding genome assemblies that are often annotated using hybrid sequencing transcriptomics, leading to improved genome characterization and the identification of novel genes and isoforms in a wide variety of organisms. Results: We developed an easy-to-use genome-guided transcriptome annotation pipeline that uses assembled transcripts from hybrid sequencing data as input and distinguishes between coding and long non-coding RNAs by integration of several bioinformatic approaches, including gene reconciliation with previous annotations in GTF format. We demonstrated the efficiency of this approach by correctly assembling and annotating all exons from the chicken SCO-spondin gene (containing more than 105 exons), including the identification of missing genes in the chicken reference annotations by homology assignments. Conclusions: Our method helps to improve the current transcriptome annotation of the chicken brain. Our pipeline, implemented on Anaconda/Nextflow and Docker, is an easy-to-use package that can be applied to a broad range of species, tissues, and research areas, helping to improve and reconcile current annotations. The code and datasets are publicly available at https://github.com/cfarkas/annotate_my_genomes
dc.format.extent	dc.format.extent	14 pages
dc.format.extent	dc.format.extent	5.121Mb
dc.format.mimetype	dc.format.mimetype	PDF
dc.language.iso	Language ISO	eng
dc.publisher	Publisher	Oxford University Press
dc.rights	Rights	Creative Commons Attribution License (CC BY)
dc.source	Sources	GigaScience
dc.subject	Subject	Genome Annotation pipeline
dc.subject	Subject	Hybrid sequencing
dc.subject	Subject	SCO-spondin
dc.subject	Subject	Transcriptome annotation
dc.title	Title	annotate_my_genomes: an easy-to-use pipeline to improve genome annotation and uncover neglected genes by hybrid RNA sequencing
dc.type	Document Type	Article
dc.udla.catalogador	dc.udla.catalogador	CBM
dc.udla.index	dc.udla.index	WoS
dc.udla.index	dc.udla.index	Science Citation Index Expanded
dc.udla.index	dc.udla.index	Scopus
dc.udla.index	dc.udla.index	Academic Search Ultimate
dc.udla.index	dc.udla.index	Natural Science Collection
dc.udla.index	dc.udla.index	DOAJ
dc.udla.index	dc.udla.index	Biological Science Database
dc.udla.index	dc.udla.index	BIOSIS
dc.udla.index	dc.udla.index	CAB Abstracts
dc.udla.index	dc.udla.index	EMBASE
dc.udla.index	dc.udla.index	MEDLINE
dc.identifier.doi	dc.identifier.doi	10.1093/gigascience/giac099
dc.facultad	dc.facultad	Facultad de Salud y Ciencias Sociales

Files in this item

Name: 510.pdf
Size: 5.121Mb
Format: PDF

This item appears in the following Collection(s)

Investigación