Matching Items (3)
Filtering by

Clear all filters

171582-Thumbnail Image.png
Description
High throughput transcriptome data analysis like Single-cell Ribonucleic Acid sequencing (scRNA-seq) and Circular Ribonucleic Acid (circRNA) data have made significant breakthroughs, especially in cancer genomics. Analysis of transcriptome time series data is core in identifying time point(s) where drastic changes in gene transcription are associated with homeostatic to non-homeostatic cellular

High throughput transcriptome data analysis like Single-cell Ribonucleic Acid sequencing (scRNA-seq) and Circular Ribonucleic Acid (circRNA) data have made significant breakthroughs, especially in cancer genomics. Analysis of transcriptome time series data is core in identifying time point(s) where drastic changes in gene transcription are associated with homeostatic to non-homeostatic cellular transition (tipping points). In Chapter 2 of this dissertation, I present a novel cell-type specific and co-expression-based tipping point detection method to identify target gene (TG) versus transcription factor (TF) pairs whose differential co-expression across time points drive biological changes in different cell types and the time point when these changes are observed. This method was applied to scRNA-seq data sets from a SARS-CoV-2 study (18 time points), a human cerebellum development study (9 time points), and a lung injury study (18 time points). Similarly, leveraging transcriptome data across treatment time points, I developed methodologies to identify treatment-induced and cell-type specific differentially co-expressed pairs (DCEPs). In part one of Chapter 3, I presented a pipeline that used a series of statistical tests to detect DCEPs. This method was applied to scRNA-seq data of patients with non-small cell lung cancer (NSCLC) sequenced across cancer treatment times. However, this pipeline does not account for correlations among multiple single cells from the same sample and correlations among multiple samples from the same patient. In Part 2 of Chapter 3, I presented a solution to this problem using a mixed-effect model. In Chapter 4, I present a summary of my work that focused on the cross-species analysis of circRNA transcriptome time series data. I compared circRNA profiles in neonatal pig and mouse hearts, identified orthologous circRNAs, and discussed regulation mechanisms of cardiomyocyte proliferation and myocardial regeneration conserved between mouse and pig at different time points.
ContributorsNyarige, Verah Mocheche (Author) / Liu, Li (Thesis advisor) / Wang, Junwen (Thesis advisor) / Dinu, Valentin (Committee member) / Arizona State University (Publisher)
Created2022
171902-Thumbnail Image.png
Description
Beta-Amyloid(Aβ) plaques and tau protein tangles in the brain are now widely recognized as the defining hallmarks of Alzheimer’s disease (AD), followed by structural atrophy detectable on brain magnetic resonance imaging (MRI) scans. However, current methods to detect Aβ/tau pathology are either invasive (lumbar puncture) or quite costly and not

Beta-Amyloid(Aβ) plaques and tau protein tangles in the brain are now widely recognized as the defining hallmarks of Alzheimer’s disease (AD), followed by structural atrophy detectable on brain magnetic resonance imaging (MRI) scans. However, current methods to detect Aβ/tau pathology are either invasive (lumbar puncture) or quite costly and not widely available (positron emission tomography (PET)). And one of the particular neurodegenerative regions is the hippocampus to which the influence of Aβ/tau on has been one of the research projects focuses in the AD pathophysiological progress. In this dissertation, I proposed three novel machine learning and statistical models to examine subtle aspects of the hippocampal morphometry from MRI that are associated with Aβ /tau burden in the brain, measured using PET images. The first model is a novel unsupervised feature reduction model to generate a low-dimensional representation of hippocampal morphometry for each individual subject, which has superior performance in predicting Aβ/tau burden in the brain. The second one is an efficient federated group lasso model to identify the hippocampal subregions where atrophy is strongly associated with abnormal Aβ/Tau. The last one is a federated model for imaging genetics, which can identify genetic and transcriptomic influences on hippocampal morphometry. Finally, I stated the results of these three models that have been published or submitted to peer-reviewed conferences and journals.
ContributorsWu, Jianfeng (Author) / Wang, Yalin (Thesis advisor) / Li, Baoxin (Committee member) / Liang, Jianming (Committee member) / Wang, Junwen (Committee member) / Wu, Teresa (Committee member) / Arizona State University (Publisher)
Created2022
161916-Thumbnail Image.png
Description
This dissertation presents three novel algorithms with real-world applications to genomic oncology. While the methodologies presented here were all developed to overcome various challenges associated with the adoption of high throughput genomic data in clinical oncology, they can be used in other domains as well. First, a network informed feature

This dissertation presents three novel algorithms with real-world applications to genomic oncology. While the methodologies presented here were all developed to overcome various challenges associated with the adoption of high throughput genomic data in clinical oncology, they can be used in other domains as well. First, a network informed feature ranking algorithm is presented, which shows a significant increase in ability to select true predictive features from simulated data sets when compared to other state of the art graphical feature ranking methods. The methodology also shows an increased ability to predict pathological complete response to preoperative chemotherapy from genomic sequencing data of breast cancer patients utilizing domain knowledge from protein-protein interaction networks. Second, an algorithm that overcomes population biases inherent in the use of a human reference genome developed primarily from European populations is presented to classify microsatellite instability (MSI) status from next-generation-sequencing (NGS) data. The methodology significantly increases the accuracy of MSI status prediction in African and African American ancestries. Finally, a single variable model is presented to capture the bimodality inherent in genomic data stemming from heterogeneous diseases. This model shows improvements over other parametric models in the measurements of receiver-operator characteristic (ROC) curves for bimodal data. The model is used to estimate ROC curves for heterogeneous biomarkers in a dataset containing breast cancer and cancer-free specimen.
ContributorsSaul, Michelle (Author) / Dinu, Valentin (Thesis advisor) / Liu, Li (Committee member) / Wang, Junwen (Committee member) / Arizona State University (Publisher)
Created2021