A novel SARS-CoV-2 viral sequence bioinformatic pipeline has found genetic evidence that the viral 3′ untranslated region (UTR) is evolving and generating increased viral diversity

dc.contributor.author: Farkas, Carlos
dc.contributor.author: Mella, Andy
dc.contributor.author: Turgeon, Maxime
dc.contributor.author: Haigh, Jody Jonathan
dc.date.accessioned: 2022-05-25T19:48:12Z
dc.date.available: 2022-05-25T19:48:12Z
dc.date.issued: 2021-06-21
dc.description.abstract: An unprecedented amount of SARS-CoV-2 sequencing has been performed; however, novel bioinformatic tools to cope with and process these large datasets are needed. Here, we have devised a bioinformatic pipeline that takes SARS-CoV-2 genome sequencing data in FASTA/FASTQ format as input and outputs a single Variant Call Format (VCF) file that can be processed to obtain variant annotations and perform downstream population genetic testing. As proof of concept, we have analyzed over 229,000 SARS-CoV-2 viral sequences collected up until November 30, 2020. We have identified over 39,000 variants worldwide, with increased polymorphisms spanning the ORF3a gene as well as the 3′ untranslated regions (UTRs), specifically in the conserved stem-loop region of SARS-CoV-2, which is accumulating greater observed viral diversity relative to chance variation. Our analysis pipeline has also discovered the existence of low-frequency SARS-CoV-2 hypermutation (in less than 2% of genomes), likely arising through host immune responses and not due to sequencing errors. Among annotated nonsense variants with a population frequency over 1%, recurrent inactivation of the ORF8 gene was found. This inactivation is present in the newly identified B.1.1.7 SARS-CoV-2 lineage that originated in the United Kingdom. Almost all VOC-containing genomes possess one stop codon in the ORF8 gene (Q27∗); however, 13% of these genomes also contain another stop codon (K68∗), suggesting that ORF8 loss does not interfere with SARS-CoV-2 spread and may play a role in its increased virulence. We have developed this computational pipeline to assist researchers in the rapid analysis and characterization of SARS-CoV-2 variation.
dc.format.extent: 14 pages
dc.format.extent: 4.852 MB
dc.format.mimetype: PDF
dc.identifier.citation: Frontiers in Microbiology, 12, 16 p.
dc.identifier.doi: https://doi.org/10.3389/fmicb.2021.665041
dc.identifier.issn: 1664-302X
dc.identifier.uri: http://repositorio.udla.cl/xmlui/handle/udla/1077
dc.identifier.uri: https://www.frontiersin.org/journals/microbiology
dc.publisher: Frontiers Media S.A.
dc.source: Frontiers in Microbiology
dc.subject: 3′UTR
dc.subject: Nucleotide diversity (π)
dc.subject: Tajima’s D-statistic
dc.subject: Viral evolution
dc.subject: VCF
dc.title: A novel SARS-CoV-2 viral sequence bioinformatic pipeline has found genetic evidence that the viral 3′ untranslated region (UTR) is evolving and generating increased viral diversity
dc.type: Article
dc.udla.catalogador: CBM
dc.udla.index: SCOPUS
dc.udla.privacidad: Public document

Files

Original bundle

Name:
Farkas et al.2021.A Novel SARS-CoV-2 Viral Sequence Bioinformatic Pipeline Has Found Genetic Evidence That the Viral 3′ Untranslated Region (UTR) Is Evolving and Generating Increased Viral Diversity.pdf
Size:
4.85 MB
Format:
Adobe Portable Document Format
Description:
Farkas et al.2021.A Novel SARS-CoV-2 Viral Sequence Bioinformatic Pipeline Has Found Genetic Evidence That the Viral 3′ Untranslated Region (UTR) Is Evolving and Generating Increased Viral Diversity

License bundle

Name:
license.txt
Size:
1.71 KB
Format:
Item-specific license agreed upon to submission