Abstract
Tracking COVID-19 cases has been one of the most important aspects of controlling the pandemic and of observing how SARS-CoV-2, the virus that causes COVID-19, spreads between populations. Case tracking helps epidemiologists understand why the virus spreads and is an important tool for containing it and reducing its spread. On a local level, tracking allows infected people to be quarantined as soon as they test positive. It can also show which areas are most susceptible to infection and where person-to-person contact needs to be limited. The first aim of this research is to apply the same method used to track SARS-CoV-2 to other potentially infectious viruses and bacteria. By extracting genetic material from samples of on-campus wastewater and from negative COVID-19 PCR tests, and then applying sequencing and analysis techniques to it, we are able to see which other viruses or bacteria are circulating around campus. Furthermore, this method of tracking can be used to make predictions about when a disease will spread around campus in the future. The "twindemic" refers to the simultaneous spread of COVID-19 and influenza A/B; observing what is found in negative COVID-19 tests can show the effects of the twindemic and the extent to which both viral infections were spreading around the university’s campus. Various software tools take the results of DNA sequencing (referred to as "reads") and return the taxonomic classification of each read, thereby revealing which organisms were originally in the sample. This classification is the final step of the process known as "sequencing", which involves processing the raw extracted genetic material to eventually obtain the sequence of DNA nucleotides belonging to the sample’s contents.
Kraken is a popular metagenomic classification tool that was developed by researchers at Johns Hopkins University’s Center for Computational Biology in 2014. As of September 2022, Kraken has been superseded by Kraken2, which offers a faster and more compact algorithm. Kraken works by looking up short DNA subsequences of length k, called "k-mers", in a database that maps each k-mer to the lowest common ancestor (LCA) of all organisms whose DNA contains it. It then travels down the taxonomic tree in search of the most specific taxon supported by the k-mers, eventually assigning that taxon to the input DNA sequence. Kraken inspired researchers at the University of California, Riverside, who developed a metagenomic classification tool called CLARK in 2015 to address Kraken’s shortcomings. CLARK works by first building an index from the k-mers of the potential target organisms. It then builds a dictionary that associates each k-mer with its potential targets. For each input sequence, CLARK’s algorithm searches the index for the k-mers contained in that sequence. Whenever an organism contains the k-mer being searched, it receives a "hit". The organism with the most hits at the end of the search is assigned to the input sequence. The second aim of this research is to compare the results obtained from using Kraken2 and CLARK to classify the collected samples. The results can be used to see which classification tool performs best in accuracy and other metrics. Comparing CLARK and Kraken2 can also show whether one tool is better than the other at detecting the presence of certain organisms. Comparing these widely used tools can help indicate whether there is a need to develop other software that overcomes any shortcomings they may have.
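The two k-mer strategies described above can be illustrated with a minimal sketch. It is a toy model only, not the actual implementation of Kraken2 or CLARK: it reads Kraken's k-mer lookup as a lowest-common-ancestor (LCA) lookup, and the reference sequences, the small taxonomy, the value of k, and all function names are hypothetical and chosen purely for illustration.

```python
from collections import Counter

# Hypothetical toy data: made-up sequences, NOT real genomic data.
REFERENCES = {
    "Influenza A": "ACGTACGGTA",
    "Influenza B": "ACGTTTGGCA",
    "SARS-CoV-2": "GGCATTCCAG",
}
# Hypothetical toy taxonomy: each taxon points to its parent; "root" has none.
PARENT = {
    "Influenza A": "Orthomyxoviridae",
    "Influenza B": "Orthomyxoviridae",
    "Orthomyxoviridae": "root",
    "SARS-CoV-2": "Coronaviridae",
    "Coronaviridae": "root",
}
K = 4  # illustrative only; real tools use much longer k-mers


def kmers(seq, k):
    """Return every overlapping substring of length k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_index(references, k):
    """Map each k-mer to the set of reference names that contain it."""
    index = {}
    for name, seq in references.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(name)
    return index


def lineage(taxon, parent):
    """Return the path from a taxon up to the root of the taxonomy."""
    path = [taxon]
    while taxon in parent:
        taxon = parent[taxon]
        path.append(taxon)
    return path


def lowest_common_ancestor(taxa, parent):
    """Kraken-style idea: the most specific taxon shared by all given taxa."""
    taxa = list(taxa)
    common = lineage(taxa[0], parent)  # ordered from specific to root
    for taxon in taxa[1:]:
        other = set(lineage(taxon, parent))
        common = [node for node in common if node in other]
    return common[0]


def classify_by_hits(read, index, k):
    """CLARK-style idea: each target gets one hit per read k-mer it contains."""
    hits = Counter()
    for kmer in kmers(read, k):
        for name in index.get(kmer, ()):
            hits[name] += 1
    if not hits:
        return None, hits  # no k-mer of the read is in the index
    best = max(sorted(hits), key=hits.get)  # alphabetical tie-break
    return best, hits


INDEX = build_index(REFERENCES, K)
```

In this toy setting, a k-mer shared by both influenza references resolves to their common family, while a k-mer shared across families resolves to the root; hit counting instead picks whichever reference matches the most k-mers of the read. Real classifiers add many refinements (for example, compact databases and confidence thresholds) that this sketch does not attempt to model.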