RAISIN: A Pipeline Intended to Better Characterize Variants on the Amino Acid Level

Abstract

During the SARS-CoV-2 pandemic, the rapid emergence of new viral variants created a need for a system that could capture and characterize these variants quickly and effectively, with a greater-than-typical focus on their functional amino acid implications. As SARS-CoV-2 continues to mutate and other viruses show the potential to cause the next epidemic, it is more important than ever to actively track these variants and their latent effects. Current variant callers report variants and their allelic frequencies relative to a reference sequence, but very few natively generate reading-frame-aware translations. Manually identifying the resulting amino acid variants becomes increasingly time-consuming as the variants get more complex. To address this issue, ATCC presents RAISIN (Retrieving Amino acid Implications from Sequencing Iterations), a simple, fast, and accurate variant annotation pipeline built for characterizing and notating variants given a pair of Illumina sequencing FASTQs and an NCBI reference accession number.

RAISIN was initially created to better characterize variants in non-SARS-CoV-2 viruses, such as Lassa mammarenavirus, Marburgvirus, other human coronaviruses, and influenza. During the analysis, RAISIN retrieves the FASTA and annotation files from RefSeq or GenBank for a given accession number. Illumina sequencing reads are then mapped to the reference, and variants are called. RAISIN then iterates through the genetic elements in the annotation file to generate the codon-aware amino acid implications of each variant. As its final outputs, the pipeline returns the consensus sequence, nucleotide variant profile, and anticipated amino acid mutations. For our analysis, a range of viruses was run through RAISIN to test its comprehensiveness and its ability to generate anticipated amino acid mutations. In initial testing, several amino acid changes were observed and notated for certain human coronaviruses and Marburgviruses. For example, in one Marburgvirus sample, a total of 28 variants were found, of which 4 resulted in amino acid changes compared to the reference. In summary, the RAISIN pipeline is fast and flexible, enabling researchers to identify and describe variants accurately and to detect changes due to serial passaging, host adaptation, or the emergence of new viral mutants.
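To illustrate what a codon-aware amino acid implication looks like in practice, the sketch below translates a single-nucleotide variant inside one coding sequence. It is not RAISIN's implementation, which is not described at this level of detail here. The function name `annotate_snv`, the toy reference sequence, and the coordinates are hypothetical, and the sketch assumes a forward-strand, unspliced coding sequence, the standard genetic code, and single-base substitutions only (no insertions or deletions).

```python
# Standard genetic code (NCBI translation table 1), in the canonical TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def annotate_snv(reference, cds_start, cds_end, pos, alt):
    """Return an amino acid change such as 'A2V' for a single-base substitution.

    Hypothetical helper for illustration only. All coordinates are 1-based
    and inclusive, and the coding sequence is assumed to lie on the forward
    strand without splicing. Returns None if the variant is outside the CDS.
    """
    if not (cds_start <= pos <= cds_end):
        return None

    offset = pos - cds_start          # 0-based offset of the variant within the CDS
    codon_index = offset // 3         # 0-based index of the affected codon
    frame = offset % 3                # position of the variant inside that codon

    codon_begin = (cds_start - 1) + codon_index * 3   # 0-based slice start
    ref_codon = reference[codon_begin:codon_begin + 3].upper()
    alt_codon = ref_codon[:frame] + alt.upper() + ref_codon[frame + 1:]

    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    return f"{ref_aa}{codon_index + 1}{alt_aa}"


# Toy reference: the CDS spans positions 3-14 and reads ATG GCT AAA TAG.
toy_reference = "GGATGGCTAAATAGCC"
print(annotate_snv(toy_reference, 3, 14, 7, "T"))   # GCT -> GTT
print(annotate_snv(toy_reference, 3, 14, 8, "C"))   # GCT -> GCC (synonymous)
print(annotate_snv(toy_reference, 3, 14, 9, "T"))   # AAA -> TAA (stop gained)
```

In this toy example, only two of the three nucleotide variants change the encoded protein, which mirrors the Marburgvirus result above, where most called variants carried no amino acid change. A full pipeline would also need to handle reverse-strand and multi-segment features, insertions and deletions, and multiple variants falling within the same codon.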

Download the presentation to learn about our new simple, fast, and accurate variant annotation pipeline.

Presenter

Nikhita Puthuveetil, MS

Senior Bioinformatician, Sequencing and Bioinformatics Center, ATCC

Nikhita Puthuveetil is a bioinformatician at ATCC who performs routine bioinformatics analysis on internal sequencing submissions, primarily SARS-CoV-2 samples as well as plasmid and bacterial samples. She also works with her team to aid in the development of the ATCC Genome Portal. She first joined ATCC as a bioinformatics intern in 2019, when she created an internal dashboard in R to track and manage sequencing at the Sequencing and Bioinformatics Center. She has an MS in Bioinformatics from Virginia Commonwealth University.

Explore our featured products

Discover the ATCC Genome Portal

The ATCC Genome Portal is a rapidly growing ISO 9001–compliant database of high-quality reference genomes from authenticated microbial strains in the ATCC collection. Through this cloud-based platform, you can easily access and download meticulously curated whole-genome sequences from your browser or our secure API. With high-quality, annotated data at your fingertips, you can confidently perform bioinformatics analyses and make insightful correlations.


SARS-CoV-2 Molecular Diagnostics Development

ATCC provides a variety of authenticated and clinically relevant materials for evaluating limit of detection, inclusivity, and cross-reactivity of novel SARS-CoV-2 molecular diagnostic assays.


Molecular Diagnostics Development

Develop and validate your diagnostic assays faster with relevant, ready-to-use reference materials.
