Towards a systems biology understanding of metabolic syndrome

Description

This dissertation investigates the condition of skeletal muscle insulin resistance using bioinformatics and computational biology approaches. Drawing from several studies and numerous data sources, I have attempted to uncover molecular mechanisms at multiple levels. From detailed atomistic simulations of a single protein to data-mining approaches applied at the systems biology level, I provide new targets for the research community to explore. Furthermore, I present a new web resource that unifies various bioinformatics databases to enable discovery of relevant features in 3D protein structures.

Contributors: Mielke, Clinton (Author) / Mandarino, Lawrence (Committee member) / LaBaer, Joshua (Committee member) / Magee, D. Mitchell (Committee member) / Dinu, Valentin (Committee member) / Willis, Wayne (Committee member) / Arizona State University (Publisher)

Created: 2013

Differential Gene Expression in Type II Diabetes

Description

This research project investigated known and novel differential genetic variants and their associated molecular pathways involved in Type II diabetes mellitus for the purpose of improving diagnosis and treatment methods. The goal of this investigation was to 1) identify the genetic variants and SNPs in Type II diabetes to develop a gene regulatory pathway, and 2) utilize this pathway to determine suitable drug therapeutics for prevention and treatment. Using a Gene Set Enrichment Analysis (GSEA), a set of 1000 gene identifiers from a Mayo Clinic database was analyzed to determine the most significant genetic variants related to insulin signaling pathways involved in Type II diabetes. The following genes were identified: NRAS, KRAS, PIK3CA, PDE3B, TSC1, AKT3, SOS1, NEU1, PRKAA2, AMPK, and ACC. Through an extensive literature review and cross-analysis with the KEGG and Reactome pathway databases, novel SNPs located on these gene variants were identified and used to determine suitable drug therapeutics for treatment. Overall, understanding how genetic mutations affect target gene function related to Type II diabetes pathology is crucial to the development of effective diagnosis and treatment. This project provides new insight into the molecular basis of Type II diabetes, helping to untangle the regulatory complexity of the disease and to advance diagnosis and treatment.

Keywords: Type II Diabetes mellitus, Gene Set Enrichment Analysis, genetic variants, KEGG Insulin Pathway, gene-regulatory pathway

Contributors: Bucklin, Lindsay (Co-author) / Davis, Vanessa (Co-author) / Holechek, Susan (Thesis director) / Wang, Junwen (Committee member) / Nyarige, Verah (Committee member) / School of Human Evolution & Social Change (Contributor) / School of Life Sciences (Contributor) / Barrett, The Honors College (Contributor)

Created: 2019-05

Novel Bioinformatics Methods for Co-expression Analysis of Single Cell RNA Sequencing and Circular RNA Sequencing Time Series Data

Description

High-throughput transcriptome data analyses, such as those of single-cell ribonucleic acid sequencing (scRNA-seq) and circular ribonucleic acid (circRNA) data, have enabled significant breakthroughs, especially in cancer genomics. Analysis of transcriptome time series data is central to identifying the time point(s) at which drastic changes in gene transcription are associated with a transition from a homeostatic to a non-homeostatic cellular state (tipping points). In Chapter 2 of this dissertation, I present a novel cell-type-specific, co-expression-based tipping point detection method to identify target gene (TG) and transcription factor (TF) pairs whose differential co-expression across time points drives biological changes in different cell types, along with the time point at which these changes are observed. This method was applied to scRNA-seq data sets from a SARS-CoV-2 study (18 time points), a human cerebellum development study (9 time points), and a lung injury study (18 time points). Similarly, leveraging transcriptome data across treatment time points, I developed methodologies to identify treatment-induced and cell-type-specific differentially co-expressed pairs (DCEPs). In Part 1 of Chapter 3, I present a pipeline that uses a series of statistical tests to detect DCEPs. This method was applied to scRNA-seq data from patients with non-small cell lung cancer (NSCLC) sequenced across cancer treatment times. However, this pipeline does not account for correlations among multiple single cells from the same sample or among multiple samples from the same patient. In Part 2 of Chapter 3, I present a solution to this problem using a mixed-effect model. In Chapter 4, I summarize my work on the cross-species analysis of circRNA transcriptome time series data. I compared circRNA profiles in neonatal pig and mouse hearts, identified orthologous circRNAs, and discussed regulation mechanisms of cardiomyocyte proliferation and myocardial regeneration conserved between mouse and pig at different time points.

Contributors: Nyarige, Verah Mocheche (Author) / Liu, Li (Thesis advisor) / Wang, Junwen (Thesis advisor) / Dinu, Valentin (Committee member) / Arizona State University (Publisher)

Created: 2022

Understanding and Utilizing Protein Interactions in Diverse Environments

Description

Transient protein-protein and protein-molecule interactions fluctuate between associated and dissociated states. They are widespread in nature and mediate most biological processes. These interactions are complex and are strongly influenced by factors such as concentration, structure, and environment. Understanding and utilizing these types of interactions is useful from both a fundamental and a design perspective. In this dissertation, transient protein interactions are used as the sensing element of a biosensor for small molecule detection. This is done by using a transcription factor-small molecule pair that mediates the activation of a CRISPR/Cas12a complex. Activation of the Cas12a enzyme results in an amplified readout mechanism that is either fluorescence or paper based. This biosensor can successfully detect 9 different small molecules, including antibiotics, with a tuneable detection limit ranging from low µM to low nM. By combining protein and nucleic acid-based systems, this biosensor has the potential to report on almost any protein-molecule interaction, linking this to the intrinsic amplification that is possible when working with nucleic acid-based technologies. The second part of this dissertation focuses on understanding protein-molecule interactions at a more fundamental level and, in so doing, exploring the design rules required to generalize sensors like the ones described above. This is done by training a neural network algorithm with binding data from high-density peptide microarrays incubated with specific protein targets. Because the peptide sequences were chosen simply to evenly, though sparsely, represent all sequence space, the resulting network provides a comprehensive sequence/binding relationship for a given target protein. While past work had shown that this works well on the arrays, here I have explored how well the neural networks thus trained predict sequence-dependent binding in the context of protein-protein and peptide-protein interactions. Amino acid sequences, either free in solution or embedded in protein structure, will display somewhat different binding properties than sequences affixed to the surface of a high-density array. However, the neural network trained on array sequences was able both to identify binding regions between proteins and to predict surface plasmon resonance-based binding propensities for peptides with statistically significant levels of accuracy.

Contributors: Swingle, Kirstie Lynn (Author) / Woodbury, Neal W (Thesis advisor) / Green, Alexander A (Thesis advisor) / Stephanopoulos, Nicholas (Committee member) / Borges, Chad (Committee member) / Arizona State University (Publisher)

Created: 2022

Statistical Methods for Analysis of Genomic Data with Applications in Oncology

Description

This dissertation presents three novel algorithms with real-world applications to genomic oncology. While the methodologies presented here were all developed to overcome various challenges associated with the adoption of high-throughput genomic data in clinical oncology, they can be used in other domains as well. First, a network-informed feature ranking algorithm is presented, which shows a significant increase in the ability to select true predictive features from simulated data sets when compared to other state-of-the-art graphical feature ranking methods. The methodology also shows an increased ability to predict pathological complete response to preoperative chemotherapy from genomic sequencing data of breast cancer patients, utilizing domain knowledge from protein-protein interaction networks. Second, an algorithm is presented to classify microsatellite instability (MSI) status from next-generation sequencing (NGS) data; it overcomes population biases inherent in the use of a human reference genome developed primarily from European populations. The methodology significantly increases the accuracy of MSI status prediction in African and African American ancestries. Finally, a single-variable model is presented to capture the bimodality inherent in genomic data stemming from heterogeneous diseases. This model shows improvements over other parametric models in the measurements of receiver-operator characteristic (ROC) curves for bimodal data. The model is used to estimate ROC curves for heterogeneous biomarkers in a dataset containing breast cancer and cancer-free specimens.

Contributors: Saul, Michelle (Author) / Dinu, Valentin (Thesis advisor) / Liu, Li (Committee member) / Wang, Junwen (Committee member) / Arizona State University (Publisher)

Created: 2021

Needle in a Haystack: the search for immunogenic epitopes for TPD52

Description

Breast cancer is the leading cause of cancer-related deaths of women in the United States. Traditionally, breast cancer is treated predominantly by a combination of surgery, chemotherapy, and radiation therapy. However, due to the significant negative side effects associated with these traditional treatments, there have been substantial efforts to develop alternative therapies to treat cancer. One such alternative therapy is a peptide-based therapeutic cancer vaccine. Therapeutic cancer vaccines enhance an individual's immune response to a specific tumor. They do this through artificial activation of tumor-specific CTLs (cytotoxic T lymphocytes). However, in order to artificially activate tumor-specific CTLs, a patient must be treated with immunogenic epitopes derived from their specific cancer type. We have identified the tumor-associated antigen TPD52 as an ideal target for a therapeutic cancer vaccine. This designation was due to the overexpression of TPD52 in a variety of different cancer types. In order to start the development of a therapeutic cancer vaccine for TPD52-related cancers, we have devised a two-step strategy. First, we plan to create a list of potential TPD52 epitopes by using epitope binding and processing prediction tools. Second, we plan to attempt to experimentally identify MHC class I TPD52 epitopes in vitro. We identified 942 potential 9 and 10 amino acid epitopes for the HLAs A1, A2, A3, A11, A24, B07, B27, B35, and B44. These epitopes were predicted by using a combination of 3 binding prediction tools and 2 processing prediction tools. From these 942 potential epitopes, we selected the top 50 epitopes ranked by a combination of binding and processing scores. Due to the promiscuity of some predicted epitopes for multiple HLAs, we ordered 38 synthetic epitopes from the list of the top 50 epitopes. We also performed a frequency analysis of the TPD52 protein sequence and identified 3 high-volume regions of high epitope production. After the epitope predictions were completed, we proceeded to attempt to experimentally detect presented TPD52 epitopes. First, we successfully transduced parental K562 cells with TPD52. After transduction, we started the optimization process for the immunoprecipitation protocol. The optimization of the immunoprecipitation protocol proved to be more difficult than originally believed and was the main reason that we were unable to progress past the transduction of the parental cells. However, we believe that we have identified the issues and will be able to complete the experiment in the coming months.

Contributors: Wilson, Eric Andrew (Author) / Anderson, Karen (Thesis director) / Borges, Chad (Committee member) / School of Molecular Sciences (Contributor) / Barrett, The Honors College (Contributor)

Created: 2016-05
