Matching Items (19)
Filtering by
- Genre: Academic theses

Description
Next-generation sequencing is a powerful tool for detecting genetic variation. However, it is also error-prone, with error rates that are much larger than mutation rates. This can make mutation detection difficult; and while increasing sequencing depth can often help, sequence-specific errors and other non-random biases cannot be detected by increased depth. The problem of accurate genotyping is exacerbated when there is not a reference genome or other auxiliary information available.
I explore several methods for sensitively detecting mutations in non-model organisms using an example Eucalyptus melliodora individual. I use the structure of the tree to find bounds on its somatic mutation rate and evaluate several algorithms for variant calling. I find that conventional methods are suitable if the genome of a close relative can be adapted to the study organism. However, with structured data, a likelihood framework that is aware of this structure is more accurate. I use the techniques developed here to evaluate a reference-free variant calling algorithm.
I also use this data to evaluate a k-mer based base quality score recalibrator (KBBQ), a tool I developed to recalibrate base quality scores attached to sequencing data. Base quality scores can help detect errors in sequencing reads, but are often inaccurate. The most popular method for correcting this issue requires a known set of variant sites, which is unavailable in most cases. I simulate data and show that errors in this set of variant sites can cause calibration errors. I then show that KBBQ accurately recalibrates base quality scores while requiring no reference or other information and performs as well as other methods.
Finally, I use the Eucalyptus data to investigate the impact of quality score calibration on the quality of output variant calls and show that improved base quality score calibration increases the sensitivity and reduces the false positive rate of a variant calling algorithm.
Contributors: Orr, Adam James (Author) / Cartwright, Reed (Thesis advisor) / Wilson, Melissa (Committee member) / Kusumi, Kenro (Committee member) / Taylor, Jesse (Committee member) / Pfeifer, Susanne (Committee member) / Arizona State University (Publisher)
Created: 2020

Description
Leprosy, or Hansen’s disease, is often relegated to antiquity, yet it remains a modern public health concern with around 200,000 new cases reported annually around the world (World Health Organization, 2023). Most leprosy cases in humans are caused by Mycobacterium leprae, but a small number of cases are now known to be caused by Mycobacterium lepromatosis, which is found mostly in Mexico and the Caribbean (Han et al., 2008; 2012). Recent work has improved our understanding of the pattern of genomic variation in M. leprae strains; however, certain regions remain understudied. Additionally, ongoing surveillance has identified pockets of hyperendemicity, which could act as hotspots for the spread of drug-resistant strains. Despite increased surveillance, there has been limited genomic research in regions like the Pacific and few surveys incorporating a broad species range in endemic areas. To address this, M. leprae genomes were isolated from clinical formalin-fixed, paraffin-embedded (FFPE) samples from the Pacific Islands and used in phylogenetic and phylogeographic analyses. Twenty-one novel M. leprae strains from these samples were sequenced, and dating analyses determined that M. leprae strains have been circulating in the Pacific since the original peopling of the region. Furthermore, recent radiation events among the Pacific Islands, as well as a rise in drug-resistant M. leprae strains, were identified and described. Additionally, a survey of animals in an endemic state in Brazil resulted in the first M. leprae genome isolated from a big cat. This body of research leverages genomic data to characterize novel diversity of M. leprae strains, identify drug-resistant strains in these regions, and determine how this pathogen spreads through space and time. The results of this work will aid in our understanding of the history of leprosy and improve public health responses to this disease.
Contributors: Crane, Adele Elizabeth (Author) / Stone, Anne (Thesis advisor) / Fox, Keolu (Committee member) / Varsani, Arvind (Committee member) / Wilson, Melissa (Committee member) / Arizona State University (Publisher)
Created: 2024

Description
Mutation is the source of heritable variation of genotype and phenotype, on which selection may act. The mutation rate is a fundamental parameter of living things that influences the rate at which evolution may occur, from viral pathogens to human crops and even to aging cells and the emergence of cancer. An understanding of the variables that affect mutation rates and their estimation is necessary to place mutation rate estimates in their proper contexts. To better understand mutation rate estimates, this research investigates the impact of temperature upon transcription error rate estimates; the impact of growing cells in liquid culture vs. on agar plates; the impact of many in vitro variables upon the estimation of deoxyribonucleic acid (DNA) mutation rates from a single sample; and the mutational hazard induced by expressing clustered regularly interspaced short palindromic repeat (CRISPR) proteins in yeast. This research finds that many of the variables tested did not significantly alter the estimation of mutation rates, strengthening the claims of previous mutation rate estimates across the tree of life by diverse experimental approaches. However, it is clear that sonication is a mutagen of DNA, a finding that is part of an effort which has reduced the sequencing error rate of circle-seq by over 1,000-fold. This research also demonstrates that growth in liquid culture modestly skews the mutation spectrum of MMR- Escherichia coli, though it does not significantly impact the overall mutation rate. Finally, this research demonstrates a modest mutational hazard of expressing Cas9 and similar CRISPR proteins in yeast cells at an un-targeted genomic locus, though it is possible the indel rate has been increased by an order of magnitude.
Contributors: Baehr, Stephan (Author) / Lynch, Michael (Thesis advisor) / Geiler-Samerotte, Kerry (Committee member) / Mangone, Marco (Committee member) / Wilson, Melissa (Committee member) / Arizona State University (Publisher)
Created: 2023

Description
Following injury, dying cells act as essential regulators of the damage response by promoting tissue repair and regeneration. Evidence of apoptotic signaling during regeneration has been found in diverse tissue contexts, and several mechanisms by which these signals dictate the collective tissue outcome have been identified. However, much less is understood about how tissues respond to other types of cell death, like necrosis. Necrosis is a catastrophic type of tissue death that can occur in diverse tissues and is central to many human injuries and inherited and congenital conditions. Characterized by the sudden loss of membrane integrity, necrosis often spreads to adjacent, healthy cells, severely compromising tissue health and requiring invasive medical procedures for treatment. A better understanding of how necrotic cells communicate with surrounding tissue is crucial to more effectively treat necrotic wounds. However, a lack of accessible genetic models to study the interactions between necrotic and healthy cells has impaired progress in the field. To address this, this work has established a novel genetic ablation system in the Drosophila wing imaginal disc to study the tissue response to necrosis.
Following necrosis, a strong tissue regeneration response occurs that relies on the apoptosis of cells at a distance from the wound, known as necrosis-induced apoptosis (NiA). Unlike other instances of damage-associated apoptosis, NiA cells do not secrete mitogenic factors. However, these cells promote regeneration by stimulating regenerative proliferation, while the inhibition of NiA cell activity results in a reduced capacity to regenerate, as assayed by adult wing size. Moreover, NiA cells may potentially survive apoptosis and instead utilize apoptotic factors to persist in the disc over time to repair the damaged tissue as necrosis-induced caspase-positive (NiCP) cells. NiCP cells appear to utilize a non-apoptotic function of the initiator caspase Dronc to promote regenerative proliferation, highlighting a potentially novel role for non-apoptotic caspase signaling during tissue regeneration.
Contributors: Klemm, Jacob William (Author) / Harris, Robin (Thesis advisor) / Wilson-Rawls, Jeanne (Committee member) / Wilson, Melissa (Committee member) / Newbern, Jason (Committee member) / Arizona State University (Publisher)
Created: 2024

Description
I studied the molecular mechanisms of ultraviolet radiation (UVR) mitigation in the terrestrial cyanobacterium Nostoc punctiforme ATCC 29133, which produces the indole-alkaloid sunscreen scytonemin and differentiates into motile filaments (hormogonia). While the early stages of scytonemin biosynthesis were known, the late stages were not. Gene deletion mutants were interrogated by metabolite analyses and confocal microscopy, demonstrating that the ebo gene cluster was not only required for scytonemin biosynthesis, but was involved in the export of scytonemin monomers to the periplasm. Further, the product of gene scyE was also exported to the periplasm, where it was responsible for terminal oxidative dimerization of the monomers. These results opened questions regarding the functional universality of the ebo cluster. To probe whether it could play a similar role in organisms other than scytonemin-producing cyanobacteria, I developed a bioinformatic pipeline (Functional Landscape And Neighbor Determining gEnomic Region Search; FLANDERS) and used it to scrutinize the neighboring regions of the ebo gene cluster in 90 different bacterial genomes for potentially informational features. Aside from the scytonemin operon and the edb cluster of Pseudomonas spp., responsible for nematode repellence, no known clusters were identified in genomic ebo neighbors, but many of the ebo-adjacent regions were enriched in signal peptides for export, indicating a general functional connection between the ebo cluster and biosynthetic compartmentalization. Lastly, I investigated the regulatory span of the two-component regulator of the scytonemin operon (scyTCR) using RNAseq of scyTCR deletion mutants under UV induction. Surprisingly, the knockouts had decreased expression levels in many of the genes involved in hormogonia differentiation and in a putative multigene regulatory element, hcyA-D. This suggested that UV could be a cue for developmental motility responses in Nostoc, which I could confirm phenotypically. In fact, UV-A simultaneously elicited hormogonia differentiation and scytonemin production throughout a genetically homogeneous population. I show through mutant analyses that the partner-switching mechanism coded for by hcyA-D acts as a hinge between the scytonemin-based and hormogonia-based responses. Collectively, this dissertation contributes to the understanding of microbial adaptive responses to environmental stressors at the genetic and regulatory level, highlighting their phenomenological and mechanistic complexity.
Contributors: Klicki, Kevin (Author) / Garcia-Pichel, Ferran (Thesis advisor) / Wilson, Melissa (Committee member) / Mukhopadhyay, Aindrila (Committee member) / Misra, Rajeev (Committee member) / Arizona State University (Publisher)
Created: 2021

Description
Human leukocyte antigen (HLA) is a group of proteins that the human immune system uses to detect pathogens. HLA is highly polymorphic, especially in the peptide-binding groove, which allows the binding of a diverse range of peptides, including peptides produced by pathogens. Hepatitis B virus (HBV) is a pathogen that can cause liver disease. Chronic HBV infection, if left untreated, can lead to hepatocellular carcinoma, the most common form of liver cancer. In this paper, the association of Class I and II HLA with HBV-mediated liver cancer in patients of East Asian and European ancestry was studied. Results showed that, in the initial combined ancestry analysis, some alleles from all HLA types are associated with HBV-mediated liver cancer. However, once stratified by population ancestry, most of the alleles are no longer significant but still associate with HBV-mediated liver cancer in the same directions. In contrast, HLA-DP is the only HLA with haplotypes that are significantly different before and after stratification by ancestry. Notably, DPA1*01:03-DPB1*04:01, a previously known protective haplotype in the Asian population, is associated negatively with HBV-mediated liver cancer in both East Asian and European populations. Additionally, DPA1*02:02-DPB1*05:01, a known risk haplotype in the Asian population, is associated positively with HBV-mediated liver cancer in patients of European ancestry. To understand how HLA-DP is associated with HBV-mediated liver cancer, the binding affinity of HLA-DP to all peptides generated from HBV coding sequences of genotypes A-H was predicted. It was speculated that an individual with HLA types that can bind strongly to HBV peptides will be more likely to clear viral infection, whereas an individual with HLA types that fail to bind strongly to HBV peptides will be less likely to clear viral infection, thus developing chronic infection. Results showed that DPA1*01:03-DPB1*04:01 binds strongly to HBV peptides (<50 nM) whereas DPA1*02:02-DPB1*05:01 does not bind strongly to any HBV peptides (>50 nM), consistent with the speculation that the binding affinity of HBV peptides to HLA will influence the association of HLA with HBV-mediated liver cancer.
Contributors: Yap, Yan Rou (Author) / Wilson, Melissa (Thesis advisor) / Lim, Efrem (Thesis advisor) / Buetow, Kenneth (Committee member) / Arizona State University (Publisher)
Created: 2021

Description
Hepatocellular carcinoma (HCC) is the third leading cause of cancer death worldwide and exhibits a male bias in occurrence and mortality. Previous studies have provided insight into the role of inherited genetic regulation of transcription in modulating sex differences in HCC etiology and mortality. This study uses pathway analysis to add insight into the biological processes that drive sex differences in HCC etiology as well as to provide an additional framework for future studies on sex-biased cancers. Gene expression data from normal, tumor-adjacent, and HCC liver tissue were used to calculate pathway scores using a tool called PathOlogist that takes into consideration not only the molecules in a biological pathway, but also the interaction type and directionality of the signaling pathways. Analysis of the pathway scores uncovered etiologically relevant pathways differentiating male and female HCC. In normal and tumor-adjacent liver tissue, males showed higher activity of pathways related to translation factors and signaling. Females did not show higher activity of any pathways compared to males in normal and tumor-adjacent liver tissue. This work suggests biological processes that underlie sex biases in HCC occurrence and mortality. Males and females differed in the activation of pathways related to apoptosis, cell cycle, signaling, and metabolism in HCC. These results identify clinically relevant pathways for future research and therapeutic targeting.
Contributors: Rehling, Thomas E (Author) / Buetow, Kenneth (Thesis advisor) / Wilson, Melissa (Committee member) / Maley, Carlo (Committee member) / Arizona State University (Publisher)
Created: 2021

Description
The family Cactaceae is extremely diverse and has a near global distribution, yet very little has been described regarding the community of viruses that infect or are associated with cacti. This research characterizes the diversity of viruses associated with Cactaceae plants and their evolutionary aspects. Five viruses belonging to the economically relevant plant virus family Geminiviridae were identified. These initially included two novel divergent geminiviruses, named Opuntia virus 1 (OpV1) and Opuntia virus 2 (OpV2), and Opuntia becurtovirus, a new strain within the genus Becurtovirus. These three viruses were also found in co-infection. In addition, two known geminiviruses, squash leaf curl virus (SLCV) and watermelon chlorotic stunt virus (WCSV), were identified infecting Cactaceae plants and other non-cactus plants in the USA and Mexico. Both SLCV and WCSV are known to cause severe disease in cultivated Cucurbitaceae plants in the USA and Middle East, respectively. This study shows that WCSV was introduced into the Americas two times, and it is the first identification of this virus in the USA, suggesting that it is likely more widespread in North America. These findings, along with the Opuntia becurtovirus, are probable spill-over events in agro-ecological interfaces. A novel circular DNA, possibly bipartite, plant-infecting virus that encodes proteins similar to those of geminiviruses was also identified in an Opuntia discolor plant in Brazil and named utkilio virus, but it is evolutionarily distinct, likely belonging to a new taxon. Viruses belonging to the ssDNA viral family Genomoviridae are also described; these have thus far been associated with fungal hosts, so it is likely that the ones identified in plants are associated with their phytobiome. Overall, the results of this project provide a molecular and biological characterization of novel geminiviruses and genomoviruses associated with cacti and demonstrate the impact of agro-ecological interfaces on the spread of viruses from or to native plants. The project also highlights the importance of viral metagenomics studies in exploring virus diversity and evolution, given the amount of virus diversity identified. This is important for conservation and management of cacti on a global scale, including the relevance of controlled movement of plants within countries.
Contributors: Salgado Fontenele, Rafaela (Author) / Varsani, Arvind (Thesis advisor) / Wilson, Melissa (Committee member) / Majure, Lucas (Committee member) / Van Doorslaer, Koenraad (Committee member) / Wojciechowski, Martin (Committee member) / Arizona State University (Publisher)
Created: 2021

Description
Parkinson’s disease (PD) is a progressive neurodegenerative disorder, diagnosed late in the disease by a series of motor deficits that manifest over years or decades. It is characterized by degeneration of mid-brain dopaminergic neurons with a high prevalence of dementia associated with the spread of pathology to cortical regions. Patients exhibiting symptoms have already undergone significant neuronal loss without chance for recovery. Analysis of disease specific changes in gene expression directly from human patients can uncover invaluable clues about a still unknown etiology, the potential of which grows exponentially as additional gene regulatory measures are questioned. Epigenetic mechanisms are emerging as important components of neurodegeneration, including PD; the extent to which methylation changes correlate with disease progression has not yet been reported. This collection of work aims to define multiple layers of PD that will work toward developing biomarkers that not only could improve diagnostic accuracy, but also push the boundaries of the disease detection timeline. I examined changes in gene expression, alternative splicing of those gene products, and the regulatory mechanism of DNA methylation in the Parkinson’s disease system, as well as the pathologically related Alzheimer’s disease (AD). I first used RNA sequencing (RNAseq) to evaluate differential gene expression and alternative splicing in the posterior cingulate cortex of patients with PD and PD with dementia (PDD). Next, I performed a longitudinal genome-wide methylation study surveying ~850K CpG methylation sites in whole blood from 189 PD patients and 191 control individuals obtained at both a baseline and at a follow-up visit after 2 years. I also considered how symptom management medications could affect the regulatory mechanism of DNA methylation. In the last chapter of this work, I intersected RNAseq and DNA methylation array datasets from whole blood patient samples for integrated differential analyses of both PD and AD. Changes in gene expression and DNA methylation reveal clear patterns of pathway dysregulation that can be seen across brain and blood, from one study to the next. I present a thorough survey of molecular changes occurring within the idiopathic Parkinson’s disease patient and propose candidate targets for potential molecular biomarkers.
Contributors: Henderson, Adrienne Rose (Author) / Huentelman, Matthew J (Thesis advisor) / Newbern, Jason (Thesis advisor) / Dunckley, Travis L (Committee member) / Jensen, Kendall (Committee member) / Wilson, Melissa (Committee member) / Arizona State University (Publisher)
Created: 2019

Description
In most diploid cells, autosomal genes are equally expressed from the paternal and maternal alleles, resulting in biallelic expression. However, as an exception, there exists a small number of genes that show a pattern of monoallelic or biased-allele expression based on the allele’s parent-of-origin. This phenomenon is termed genomic imprinting and is an evolutionary paradox. The best explanation for imprinting is David Haig's kinship theory, which hypothesizes that monoallelic gene expression is largely the result of evolutionary conflict between males and females over maternal involvement in their offspring. One previous RNAseq study has investigated the presence of parent-of-origin effects, or imprinting, in the parasitic jewel wasp Nasonia vitripennis (N. vitripennis) and its sister species Nasonia giraulti (N. giraulti) to test the predictions of kinship theory in a non-eusocial species for comparison to a eusocial one. In order to continue to tease apart the connection between social and eusocial Hymenoptera, this study proposed a similar RNAseq study that attempted to reproduce these results in unique samples of reciprocal F1 Nasonia hybrids. A pseudo N. giraulti reference genome was built, and differences were observed when aligning RNAseq reads to a N. vitripennis reference genome compared to aligning reads to the pseudo N. giraulti reference. In addition, no evidence for parent-of-origin or imprinting patterns in adult Nasonia was found. These results demonstrated a species-of-origin effect. Importantly, the study continued to build a repository of support with the aim of elucidating the mechanisms behind imprinting in an excellent epigenetic model species, which can also help with understanding the phenomenon of imprinting in complex human diseases.
Contributors: Underwood, Avery Elizabeth (Author) / Wilson, Melissa (Thesis advisor) / Buetow, Kenneth (Committee member) / Gile, Gillian (Committee member) / Arizona State University (Publisher)
Created: 2019