Methods for Detecting Mutations in Non-model Organisms

Description

Next-generation sequencing is a powerful tool for detecting genetic variation. However, it is also error-prone, with error rates that are much larger than mutation rates. This can make mutation detection difficult; and while increasing sequencing depth can often help, sequence-specific errors and other non-random biases cannot be detected by increased depth. The problem of accurate genotyping is exacerbated when there is not a reference genome or other auxiliary information available.

I explore several methods for sensitively detecting mutations in non-model organisms using an example Eucalyptus melliodora individual. I use the structure of the tree to find bounds on its somatic mutation rate and evaluate several algorithms for variant calling. I find that conventional methods are suitable if the genome of a close relative can be adapted to the study organism. However, with structured data, a likelihood framework that is aware of this structure is more accurate. I use the techniques developed here to evaluate a reference-free variant calling algorithm.

I also use this data to evaluate a k-mer based base quality score recalibrator (KBBQ), a tool I developed to recalibrate base quality scores attached to sequencing data. Base quality scores can help detect errors in sequencing reads, but are often inaccurate. The most popular method for correcting this issue requires a known set of variant sites, which is unavailable in most cases. I simulate data and show that errors in this set of variant sites can cause calibration errors. I then show that KBBQ accurately recalibrates base quality scores while requiring no reference or other information and performs as well as other methods.

Finally, I use the Eucalyptus data to investigate the impact of quality score calibration on the quality of output variant calls and show that improved base quality score calibration increases the sensitivity and reduces the false positive rate of a variant calling algorithm.

Contributors: Orr, Adam James (Author) / Cartwright, Reed (Thesis advisor) / Wilson, Melissa (Committee member) / Kusumi, Kenro (Committee member) / Taylor, Jesse (Committee member) / Pfeifer, Susanne (Committee member) / Arizona State University (Publisher)

Created: 2020

Pathway Analysis Reveals Sex Differences in Human Hepatocellular Carcinoma

Description

Hepatocellular carcinoma (HCC) is the third leading cause of cancer death worldwide and exhibits a male bias in occurrence and mortality. Previous studies have provided insight into the role of inherited genetic regulation of transcription in modulating sex differences in HCC etiology and mortality. This study uses pathway analysis to add insight into the biological processes that drive sex differences in HCC etiology, as well as to provide an additional framework for future studies on sex-biased cancers. Gene expression data from normal, tumor-adjacent, and HCC liver tissue were used to calculate pathway scores using a tool called PathOlogist, which takes into consideration not only the molecules in a biological pathway but also the interaction type and directionality of the signaling pathways. Analysis of the pathway scores uncovered etiologically relevant pathways differentiating male and female HCC. In normal and tumor-adjacent liver tissue, males showed higher activity of pathways related to translation factors and signaling. Females did not show higher activity of any pathways compared to males in normal and tumor-adjacent liver tissue. This work suggests biological processes that underlie sex biases in HCC occurrence and mortality. Males and females differed in the activation of pathways related to apoptosis, cell cycle, signaling, and metabolism in HCC. These results identify clinically relevant pathways for future research and therapeutic targeting.

Contributors: Rehling, Thomas E (Author) / Buetow, Kenneth (Thesis advisor) / Wilson, Melissa (Committee member) / Maley, Carlo (Committee member) / Arizona State University (Publisher)

Created: 2021

Pathways of Distinction Analysis of Liver Cancer Data: Genetic Differences Between Males and Females

Description

The Pathways of Distinction Analysis (PoDA) program calculates relationships between a given group of genes contained within a pathway and a disease state. It was used here to investigate liver cancer and to explore how genetic variability may contribute to the different rates of development of the disease in males and females. The goal of the study was to identify germline variation that differs by sex in hepatocellular carcinoma. Using the program, multiple pathways and genes were found to have significant differences in their relationship to liver cancer in males and females. In animal studies, the genes identified by PoDA have been shown to impact liver cancer, often with different results for males and females. While these genes are often the focus in animal models, they are absent from current genome-wide association study (GWAS) catalogs for humans. By working to bridge the results of animal studies and human studies, the results help to identify the causes of liver cancer and, more specifically, the reason the disease affects males at much higher rates. The differences in the pathways found to be significant for the two sexes indicate that germline variation may play sex-specific roles in the development of hepatocellular carcinoma. Additionally, these results reinforce the capacity of PoDA to identify genes that may be missed by more traditional GWAS methods. This study lays the groundwork for further investigations into the identified genes and pathways and how they behave differently in males and females.

Contributors: Olson, Erik Jon (Author) / Buetow, Kenneth (Thesis advisor) / Wilson, Melissa (Committee member) / Cartwright, Reed (Committee member) / Arizona State University (Publisher)

Created: 2021

Statistical Methods for Analysis of Genomic Data with Applications in Oncology

Description

This dissertation presents three novel algorithms with real-world applications to genomic oncology. While the methodologies presented here were all developed to overcome various challenges associated with the adoption of high-throughput genomic data in clinical oncology, they can be used in other domains as well. First, a network-informed feature ranking algorithm is presented, which shows a significant increase in the ability to select true predictive features from simulated data sets when compared to other state-of-the-art graphical feature ranking methods. The methodology also shows an increased ability to predict pathological complete response to preoperative chemotherapy from genomic sequencing data of breast cancer patients, utilizing domain knowledge from protein-protein interaction networks. Second, an algorithm is presented to classify microsatellite instability (MSI) status from next-generation sequencing (NGS) data; it overcomes population biases inherent in the use of a human reference genome developed primarily from European populations. The methodology significantly increases the accuracy of MSI status prediction in African and African American ancestries. Finally, a single-variable model is presented to capture the bimodality inherent in genomic data stemming from heterogeneous diseases. This model shows improvements over other parametric models in the estimation of receiver operating characteristic (ROC) curves for bimodal data. The model is used to estimate ROC curves for heterogeneous biomarkers in a dataset containing breast cancer and cancer-free specimens.

Contributors: Saul, Michelle (Author) / Dinu, Valentin (Thesis advisor) / Liu, Li (Committee member) / Wang, Junwen (Committee member) / Arizona State University (Publisher)

Created: 2021

Characterizing Glioblastoma Multiforme By Linking Molecular Profiles to Macro Phenotypes

Description

Glioblastoma multiforme (GBM) is an aggressive brain cancer without effective treatment options, leaving patient survival rates extremely low. HDAC1 knockdown was found to initiate an invasive phenotype in vivo, particularly within the BT145 human glioma stem cell (hGSC) line. Analysis of RNA sequencing (RNA-seq) gene expression and regulatory networks found that both CEBPβ, a known transcription factor (TF) involved in cellular invasion, and the STAT3 pathway, a notorious genetic component of GBM, were differentially expressed in BT145 hGSCs after HDAC1 knockdown. Furthermore, the overlap of genes regulated by CEBPβ and STAT3 indicates that the CEBPβ/STAT3 pathway may be involved in the observed BT145-specific invasive phenotype. The SYstems Genetics Network AnaLysis (SYGNAL) pipeline was applied to construct sex-specific gene regulatory networks from The Cancer Genome Atlas (TCGA) GBM patient expression data. Unique bicluster eigengenes were discovered separately for all, female, and male patients. By applying these bicluster eigengenes to a GBM cohort with multiparametric magnetic resonance imaging (mpMRI) localized biopsies, sex-specific associations between bicluster expression, mpMRI readout, and hallmarks of cancer were determined. Distinctive cancer functions were revealed transcriptionally through bicluster expression and connected to a unique mpMRI feature. Specifically, SPGRC mpMRI indicated a strong signal for both immune hallmarks (evading immune detection and tumor-promoting inflammation). At the same time, MD mpMRI displayed a tendency toward sustained angiogenesis, possibly signaling the formation of new blood vessels. Uncovering each mpMRI feature’s underlying biological processes enables improved GBM diagnosis and treatment utilizing an individualized, non-invasive approach.

Contributors: Lewis, Erika (Author) / Plaisier, Christopher L (Thesis advisor) / Nikkhah, Medhi (Committee member) / Hu, Leland (Committee member) / Arizona State University (Publisher)

Created: 2021

Enhancing Reductive Dechlorination through Electrokinetic Transport and Microbially Driven H2 Cycling in the Subsurface

Description

Water is a vital resource, and its protection is a priority worldwide. One widespread threat to water quality is contamination by chlorinated solvents. These dry-cleaning and degreasing agents entered the watershed through spills and improper disposal and now are detected in 4% of U.S. aquifers and 4.5-18% of U.S. drinking water sources. The health effects of these contaminants can be severe, as they are associated with damage to the nervous, liver, kidney, and reproductive systems, developmental issues, and possibly cancer. Chlorinated solvents must be removed or transformed to improve water quality and protect human and environmental health. One remedy, bioaugmentation, the subsurface addition of microbial cultures able to transform contaminants, has been implemented successfully at hundreds of sites since the 1990s. Bioaugmentation uses the bacteria Dehalococcoides to transform chlorinated solvents with hydrogen, H2, as the electron donor. At advection-limited sites, bioaugmentation can be combined with electrokinetics (EK-Bio) to enhance transport. However, challenges for successful bioremediation remain. In this work I addressed several knowledge gaps surrounding bioaugmentation and EK-Bio. I measured the H2-consuming capacity of soils, detailed the microbial metabolisms driving this demand, and evaluated how these findings relate to reductive dechlorination. I determined which reactions dominated at a contaminated site with mixed geochemistry treated with EK-Bio and compared it to traditional bioaugmentation. Lastly, I assessed the effect of EK-Bio on the microbial community at a field-scale site. Results showed the H2-consuming capacity of soils was greater than that predicted by initial measurements of inorganic electron acceptors and primarily driven by carbon-based microbial metabolisms.
Other work demonstrated that, given the benefits of some carbon-based metabolisms to microbial reductive dechlorination, high levels of H2 consumption in soils are not necessarily indicative of hostile conditions for Dehalococcoides. Bench-scale experiments of EK-Bio under mixed geochemical conditions showed EK-Bio outperformed traditional bioaugmentation by facilitating biotic and abiotic transformations. Finally, results of microbial community analysis at a field-scale implementation of EK-Bio showed that while there were significant changes in alpha and beta diversity, the impact of EK-Bio on native microbial communities was minimal.

Contributors: Altizer, Megan Leigh (Author) / Torres, César I (Thesis advisor) / Krajmalnik-Brown, Rosa (Thesis advisor) / Rittmann, Bruce E (Committee member) / Kavazanjian, Edward (Committee member) / Delgado, Anca G (Committee member) / Arizona State University (Publisher)

Created: 2020

Biochemical Networks Across Planets and Scales

Description

Biochemical reactions underlie all living processes. Their complex web of interactions is difficult to fully capture and quantify with simple mathematical objects. Applying network science to biology has advanced our understanding of the metabolisms of individual organisms and the organization of ecosystems, but has scarcely been applied to life at a planetary scale. To characterize planetary-scale biochemistry, I constructed biochemical networks using global databases of annotated genomes and metagenomes, and biochemical reactions. I uncover scaling laws governing biochemical diversity and network structure shared across levels of organization from individuals to ecosystems, to the biosphere as a whole. Comparing real biochemical reaction networks to random reaction networks reveals the observed biological scaling is not a product of chemistry alone, but instead emerges due to the particular structure of selected reactions commonly participating in living processes. I perform distinguishability tests across properties of individual and ecosystem-level biochemical networks to determine whether or not they share common structure, indicative of common generative mechanisms across levels. My results indicate there is no sharp transition in the organization of biochemistry across distinct levels of the biological hierarchy—a result that holds across different network projections.

Finally, I leverage these large biochemical datasets, in conjunction with planetary observations and computational tools, to provide a methodological foundation for the quantitative assessment of biology’s viability amongst other geospheres. Investigating a case study of alkaliphilic prokaryotes in the context of Enceladus, I find that the chemical compounds observed on Enceladus thus far would be insufficient to allow even these extremophiles to produce the compounds necessary to sustain a viable metabolism. The environmental precursors required by these organisms provide a reference for the compounds that should be prioritized for detection in future planetary exploration missions. The results of this framework have further consequences in the context of planetary protection, and hint that forward contamination may prove infeasible without meticulous intent. Taken together, these results point to a deeper level of organization in biochemical networks than what has been understood so far, and suggest the existence of common organizing principles operating across different levels of biology and planetary chemistry.

Contributors: Smith, Harrison Brodsky (Author) / Walker, Sara I (Thesis advisor) / Anbar, Ariel D (Committee member) / Line, Michael R (Committee member) / Okie, Jordan G. (Committee member) / Romaniello, Stephen J. (Committee member) / Arizona State University (Publisher)

Created: 2018
