HOUSTON – (March 14, 2022) – Details about variants hiding in the deluge of SARS-CoV-2 genetic sequences would be good to know, if only researchers could get to them.
A new program developed at Rice University’s George R. Brown School of Engineering will make it possible, at least for “intrahost variants,” those that appear in genome data from the same COVID-19-positive person.
A Rice team led by computer scientist Todd Treangen and graduate student Yunxi Liu has developed Variabel, which accurately identifies “low-frequency variants” of the virus that causes COVID-19.
Finding these clues could be key to identifying potentially devastating variants before they have a chance to spread, Treangen said.
The data is freely available, but there’s a lot of it. The research makes low-frequency variant mining available for an estimated half-million SARS-CoV-2 genomes gathered by Oxford Nanopore Technologies (ONT), which offers an affordable platform for rapid sequencing of single, long molecules of DNA or RNA.
“Variabel directly enables the use of affordable nanopore sequencing technology for the identification of within-host variation after viral infection,” said Treangen, whose work has focused on infectious disease monitoring since long before the COVID-19 pandemic.
The lab had similar success in testing Variabel on sequence data from patients infected with Ebola and norovirus.
The open-source program, detailed in Nature Communications, is available for download at https://gitlab.com/treangenlab/variabel.
The researchers claim the key to Variabel is its ability to distinguish true variants from sequencing errors in the ONT process.
To validate Variabel, they compared data taken over time from single positive patients as well as sequences from cross-patient datasets, produced by ONT and another sequencing technique, Illumina. Over time, a single patient can host as many as a billion copies of a virus.
By comparing results before and after applying Variabel to the data, they found the program was able to correct the great majority of sequencing errors.
“Variabel opens the door to portable, affordable and rapid characterization of within-host variation, which ultimately could aid in the discovery of future mutations specific to variants of concern,” said Treangen, whose lab, along with Rice’s Ken Kennedy Institute, hosted a March 11 symposium to discuss scientific advances spurred by the pandemic. The virtual symposium can be viewed online here: http://www.youtube.com/watch?v=YaNm7QBmxD8.
Co-authors of the paper are Rice undergraduate Joshua Kearney and software engineer Bryce Kille, and Baylor College of Medicine postdoctoral associate Medhat Mahmoud and Fritz Sedlazeck, an associate professor at the Human Genome Sequencing Center. Treangen is an assistant professor of computer science.
The National Institute of Allergy and Infectious Diseases (1U19AI144297, 1P01AI152999-01), a C3.ai Digital Transformation Institute COVID-19 award, the Centers for Disease Control and Prevention (75D30121C11180), the National Science Foundation (1338099) and Rice’s Center for Research Computing supported the research.
-30-
Read the paper at http://dx.doi.org/10.1038/s41467-022-28852-1.
This news release can be found online at https://news.rice.edu/news/2022/covid-19-variants-cant-hide-variabel.
Follow Rice News and Media Relations via Twitter @RiceUNews.
Related materials:
SARS-CoV-2 genomic diversity and the implications for qRT-PCR diagnostics and transmission: https://genome.cshlp.org/content/early/2021/03/19/gr.268961.120.full.pdf
Variabel: https://gitlab.com/treangenlab/variabel
Treangen Lab: https://www.treangenlab.com
Department of Computer Science: https://csweb.rice.edu
George R. Brown School of Engineering: https://engineering.rice.edu
Images for download:
https://news-network.rice.edu/news/files/2022/03/0314_VARIABEL-1-WEB.jpg
An illustration shows what differentiates intrahost single-nucleotide variants (iSNVs), which arise within a single host, from single-nucleotide polymorphisms that spread from host to host. Rice University computer scientists have introduced Variabel, which uses sequencing data to identify low-frequency intrahost variants of SARS-CoV-2 from public data sets. (Credit: Illustration courtesy of the Treangen Lab/Rice University)
Located on a 300-acre forested campus in Houston, Rice University is consistently ranked among the nation’s top 20 universities by U.S. News & World Report. Rice has highly respected schools of Architecture, Business, Continuing Studies, Engineering, Humanities, Music, Natural Sciences and Social Sciences and is home to the Baker Institute for Public Policy. With 4,052 undergraduates and 3,484 graduate students, Rice’s undergraduate student-to-faculty ratio is just under 6-to-1. Its residential college system builds close-knit communities and lifelong friendships, just one reason why Rice is ranked No. 1 for lots of race/class interaction and No. 1 for quality of life by the Princeton Review. Rice is also rated as a best value among private universities by Kiplinger’s Personal Finance.
Journal: Nature Communications
Article Title: Rescuing low frequency variants within intra-host viral populations directly from Oxford Nanopore sequencing data
Article Publication Date: 14-Mar-2022
COI Statement: Fritz Sedlazeck received research support from PacBio and ONT. The remaining authors declare no competing interests.