Matching Items (61)
Description
Continuous advancements in biomedical research have resulted in the production of vast amounts of scientific data and literature discussing them. The ultimate goal of computational biology is to translate these large amounts of data into actual knowledge of the complex biological processes and accurate life science models. The ability to rapidly and effectively survey the literature is necessary for the creation of large scale models of the relationships among biomedical entities as well as hypothesis generation to guide biomedical research. To reduce the effort and time spent in performing these activities, an intelligent search system is required. Even though many systems aid in navigating through this wide collection of documents, the vastness and depth of this information overload can be overwhelming. An automated extraction system coupled with a cognitive search and navigation service over these document collections would not only save time and effort, but also facilitate discovery of the unknown information implicitly conveyed in the texts. This thesis presents the different approaches used for large scale biomedical named entity recognition, and the challenges faced in each. It also proposes BioEve: an integrative framework to fuse a faceted search with information extraction to provide a search service that addresses the user's desire for "completeness" of the query results, not just the top-ranked ones. This information extraction system enables discovery of important semantic relationships between entities such as genes, diseases, drugs, and cell lines and events from biomedical text on MEDLINE, which is the largest publicly available database of the world's biomedical journal literature. It is an innovative search and discovery service that makes it easier to search, navigate, and discover knowledge hidden in life sciences literature. To demonstrate the utility of this system, this thesis also details a prototype enterprise quality search and discovery service that helps researchers with a guided step-by-step query refinement, by suggesting concepts enriched in intermediate results, and thereby facilitating the "discover more as you search" paradigm.
Contributors: Kanwar, Pradeep (Author) / Davulcu, Hasan (Thesis advisor) / Dinu, Valentin (Committee member) / Li, Baoxin (Committee member) / Arizona State University (Publisher)
Created: 2010
Description

Dyslexia is a learning disability that negatively affects reading, writing, and spelling development at the word level in 5%-9% of children. The phenotype is variable and complex, involving several potential cognitive and physical concomitants such as sensory dysregulation and immunodeficiencies. The biological pathogenesis is not well-understood. Toward a better understanding of the biological drivers of dyslexia, we conducted the first joint exome and metabolome investigation in a pilot sample of 30 participants with dyslexia and 13 controls. In the metabolite analysis, eight metabolites of interest emerged (pyridoxine, kynurenic acid, citraconic acid, phosphocreatine, hippuric acid, xylitol, 2-deoxyuridine, and acetylcysteine). A metabolite-metabolite interaction analysis identified Krebs cycle intermediates that may be implicated in the development of dyslexia. Gene ontology analysis based on exome variants resulted in several pathways of interest, including the sensory perception of smell (olfactory) and immune system-related responses. In the joint exome and metabolite analysis, the olfactory transduction pathway emerged as the primary pathway of interest. Although the olfactory transduction and Krebs cycle pathways have not previously been described in the dyslexia literature, these pathways have been implicated in other neurodevelopmental disorders including autism spectrum disorder and obsessive-compulsive disorder, suggesting the possibility of these pathways playing a role in dyslexia as well. Immune system response pathways, on the other hand, have been implicated in both dyslexia and other neurodevelopmental disorders.

Contributors: Nandakumar, Rohit (Author) / Dinu, Valentin (Thesis director) / Peter, Beate (Committee member) / Barrett, The Honors College (Contributor) / College of Health Solutions (Contributor)
Created: 2022-05
Description
High throughput transcriptome data analysis like Single-cell Ribonucleic Acid sequencing (scRNA-seq) and Circular Ribonucleic Acid (circRNA) data have made significant breakthroughs, especially in cancer genomics. Analysis of transcriptome time series data is core in identifying time point(s) where drastic changes in gene transcription are associated with homeostatic to non-homeostatic cellular transition (tipping points). In Chapter 2 of this dissertation, I present a novel cell-type specific and co-expression-based tipping point detection method to identify target gene (TG) versus transcription factor (TF) pairs whose differential co-expression across time points drive biological changes in different cell types and the time point when these changes are observed. This method was applied to scRNA-seq data sets from a SARS-CoV-2 study (18 time points), a human cerebellum development study (9 time points), and a lung injury study (18 time points). Similarly, leveraging transcriptome data across treatment time points, I developed methodologies to identify treatment-induced and cell-type specific differentially co-expressed pairs (DCEPs). In part one of Chapter 3, I presented a pipeline that used a series of statistical tests to detect DCEPs. This method was applied to scRNA-seq data of patients with non-small cell lung cancer (NSCLC) sequenced across cancer treatment times. However, this pipeline does not account for correlations among multiple single cells from the same sample and correlations among multiple samples from the same patient. In Part 2 of Chapter 3, I presented a solution to this problem using a mixed-effect model. In Chapter 4, I present a summary of my work that focused on the cross-species analysis of circRNA transcriptome time series data. 
I compared circRNA profiles in neonatal pig and mouse hearts, identified orthologous circRNAs, and discussed regulation mechanisms of cardiomyocyte proliferation and myocardial regeneration conserved between mouse and pig at different time points.
Contributors: Nyarige, Verah Mocheche (Author) / Liu, Li (Thesis advisor) / Wang, Junwen (Thesis advisor) / Dinu, Valentin (Committee member) / Arizona State University (Publisher)
Created: 2022
Description
Advancements in high-throughput biotechnologies have generated large-scale multi-omics datasets encompassing diverse dimensions such as genomics, epigenomics, transcriptomics, proteomics, metabolomics, metagenomics, and phenomics. Traditionally, statistical and machine learning-based approaches utilize single-omics data sources to uncover molecular signatures, dissect complicated cellular mechanisms, and predict clinical results. However, to capture the multifaceted pathological mechanisms, integrative multi-omics analysis is needed that can provide a comprehensive picture of the disease. Here, I present three novel approaches to multi-omics integrative analysis. I introduce a single-cell integrative clustering method, which leverages multi-omics to enhance the resolution of cell subpopulations. Applied to a Cellular Indexing of Transcriptomes and Epitopes (CITE-Seq) dataset from human Acute Myeloid Leukemia (AML) and control samples, this approach unveiled nuanced cell populations that otherwise remain elusive. I then shift the focus to a computational framework to discover transcriptional regulatory trios in which a transcription factor binds to a regulatory element harboring a genetic variant and subsequently differentially regulates the transcription level of a target gene. Applied to whole-exome, whole-genome, and transcriptome data of multiple myeloma samples, this approach discovered synergetic cis-acting and trans-acting regulatory elements associated with tumorigenesis. The next part of this work introduces a novel methodology that leverages the transcriptome and surface protein data at the single-cell level produced by CITE-Seq to model the intracellular protein trafficking process. Applied to COVID-19 samples, this approach revealed dysregulated protein trafficking associated with the severity of the infection.
Contributors: Mudappathi, Rekha (Author) / Liu, Li (Thesis advisor) / Dinu, Valentin (Committee member) / Sun, Zhifu (Committee member) / Arizona State University (Publisher)
Created: 2023
Description
A genome wide association study (GWAS) of treatment outcomes for citalopram and escitalopram, two frontline SSRI treatments for Major Depressive Disorder, was conducted with 529 subjects on an imputed dataset. While no variants of genome-wide significance were identified, various potentially interesting variants were identified that warrant further exploration. These findings have the potential to elucidate novel mechanisms underlying drug response for SSRIs. This work will be continued further, with machine learning and deep learning analyses to perform non-linear analyses and employing a biologist or geneticist to provide more specialized knowledge for interpretation of results.
Contributors: Leiter-Weintraub, Ethan (Author) / Dinu, Valentin (Thesis director) / Scotch, Matthew (Committee member) / Barrett, The Honors College (Contributor) / Dean, W.P. Carey School of Business (Contributor) / College of Health Solutions (Contributor) / School of Life Sciences (Contributor)
Created: 2024-05
Description
Vitamin D is a nutrient that is obtained through the diet and vitamin D supplementation and created from exposure to Ultraviolet B (UVB) radiation. While many factors determine the serum 25-hydroxyvitamin D (25(OH)D) concentration in the body, little is known about how genetic variation in vitamin D-related genes influences serum 25(OH)D concentrations resulting from daily vitamin D intake and exposure to direct sunlight. Previous studies show that common genetic variants rs10741657 (CYP2R1), rs4588 (GC), rs228678 (GC), and rs4516035 (VDR) act as moderators and alter the effect of outdoor time and vitamin D intake on serum 25(OH)D concentrations. The objective of this study is to analyze the associations between serum 25(OH)D concentrations resulting from outdoor time and vitamin D intake, and genetic risk scores (GRS) established from previous studies involving single nucleotide polymorphisms (SNP) located on or near genes involving vitamin D synthesis, transport, activation, and degradation in 102 Hispanic and Non-Hispanic adults in San Diego County, California. This study is a secondary analysis of data from the Community of Mine study. Global Positioning System (GPS) data collected by the Qstarz GPS device worn by each participant were used to measure outdoor time, a proxy measurement for sun exposure time. Vitamin D intake was assessed using two 24-hour dietary recalls. Blood samples were measured for serum 25(OH)D concentrations. DNA was provided to assess each participant for the various genetic variants. Adjusted analyses of the GRS and serum 25(OH)D concentrations showed that individuals with high GRS (3-4) had lower serum 25(OH)D concentrations than individuals with low GRS (0-2) for both Nissen GRS and Rivera-Paredez GRS.
Contributors: Anderson, Heather Ray (Author) / Sears, Dorothy (Thesis advisor) / Alexon, Christy (Committee member) / Dinu, Valentin (Committee member) / Jankowska, Marta (Committee member) / Arizona State University (Publisher)
Created: 2022
Description
Circular RNAs (circRNAs) are a class of endogenous, non-coding RNAs that are formed when exons back-splice to each other and represent a new area of transcriptomics research. Numerous RNA sequencing (RNAseq) studies since 2012 have revealed that circRNAs are pervasively expressed in eukaryotes, especially in the mammalian brain. While their functional role and impact remain to be clarified, circRNAs have been found to regulate micro-RNAs (miRNAs) as well as parental gene transcription and may thus have key roles in transcriptional regulation. Although circRNAs have continued to gain attention, our understanding of their expression in a cell-, tissue-, and brain region-specific context remains limited. Further, computational algorithms produce varied results in terms of which circRNAs are detected. This thesis aims to advance current knowledge of circRNA expression in a region-specific context focusing on the human brain, as well as address computational challenges.

The overarching goal of my research unfolds over three aims: (i) evaluating circRNAs and their predicted impact on transcriptional regulatory networks in cell-specific RNAseq data; (ii) developing a novel solution for de novo detection of full length circRNAs as well as in silico validation of selected circRNA junctions using assembly; and (iii) application of these assembly based detection and validation workflows, and integrating existing tools, to systematically identify and characterize circRNAs in functionally distinct human brain regions. To this end, I have developed novel bioinformatics workflows that are applicable to non-polyA selected RNAseq datasets and can be used to characterize circRNA expression across various sample types and diseases. Further, I establish a reference dataset of circRNA expression profiles and regulatory networks in a brain region-specific manner. This resource along with existing databases such as circBase will be invaluable in advancing circRNA research as well as improving our understanding of their role in transcriptional regulation and various neurological conditions.
Contributors: Sekar, Shobana (Author) / Liang, Winnie S (Thesis advisor) / Dinu, Valentin (Thesis advisor) / Craig, David (Committee member) / Liu, Li (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
The ability to tolerate bouts of oxygen deprivation varies tremendously across the animal kingdom. Adult humans from different regions show large variation in tolerance to hypoxia; additionally, it is widely known that neonatal mammals, including humans, are much more tolerant to anoxia than their adult counterparts. Drosophila melanogaster are very anoxia-tolerant relative to mammals, with adults able to survive 12 h of anoxia, and represent a well-suited model for studying anoxia tolerance. Drosophila larvae live in rotting, fermenting media and as a result are more likely to experience environmental hypoxia; therefore, they could be expected to be more tolerant of anoxia than adults. However, adults have the capacity to survive anoxic exposure times ~8 times longer than larvae. This dissertation focuses on understanding the mechanisms responsible for variation in survival from anoxic exposure in the genetic model organism Drosophila melanogaster, with particular attention to the effects of developmental stage (larvae vs. adults) and within-population variation among individuals.

Vertebrate studies suggest that surviving anoxia requires the maintenance of ATP despite the loss of aerobic metabolism in a manner that prevents a disruption of ionic homeostasis. Instead, the abilities to maintain a hypometabolic state with low ATP and tolerate large disturbances in ionic status appear to contribute to the higher anoxia tolerance of adults. Furthermore, metabolomics experiments support this notion by showing that larvae had higher metabolic rates during the initial 30 min of anoxia and that protective metabolites were upregulated in adults but not larvae. Lastly, I investigated the genetic variation in anoxia tolerance using a genome wide association study (GWAS) to identify target genes associated with anoxia tolerance. Results from the GWAS also suggest mechanisms related to protection from ionic and oxidative stress, in addition to a protective role for immune function.
Contributors: Campbell, Jacob B (Author) / Harrison, Jon F. (Thesis advisor) / Gadau, Juergen (Committee member) / Call, Gerald B (Committee member) / Sweazea, Karen L (Committee member) / Rosenberg, Michael S. (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
Clinical Decision Support (CDS) is primarily associated with alerts, reminders, order entry, rule-based invocation, diagnostic aids, and on-demand information retrieval. While valuable, these foci have been in production use for decades, and do not provide a broader, interoperable means of plugging structured clinical knowledge into live electronic health record (EHR) ecosystems for purposes of orchestrating the user experiences of patients and clinicians. To date, the gap between knowledge representation and user-facing EHR integration has been considered an “implementation concern” requiring unscalable manual human efforts and governance coordination. Drafting a questionnaire engineered to meet the specifications of the HL7 CDS Knowledge Artifact specification, for example, carries no reasonable expectation that it may be imported and deployed into a live system without significant burdens. Dramatic reduction of the time and effort gap in the research and application cycle could be revolutionary. Doing so, however, requires both a floor-to-ceiling precoordination of functional boundaries in the knowledge management lifecycle, as well as formalization of the human processes by which this occurs.

This research introduces ARTAKA: Architecture for Real-Time Application of Knowledge Artifacts, as a concrete floor-to-ceiling technological blueprint for both provider health IT (HIT) and vendor organizations to incrementally introduce value into existing systems dynamically. This is made possible by service-ization of curated knowledge artifacts, then injected into a highly scalable backend infrastructure by automated orchestration through public marketplaces. Supplementary examples of client app integration are also provided. Compilation of knowledge into platform-specific form has been left flexible, insofar as implementations comply with ARTAKA’s Context Event Service (CES) communication and Health Services Platform (HSP) Marketplace service packaging standards.

Towards the goal of interoperable human processes, ARTAKA’s treatment of knowledge artifacts as a specialized form of software allows knowledge engineers to operate as a type of software engineering practice. Thus, nearly a century of software development processes, tools, policies, and lessons offer immediate benefit: in some cases, with remarkable parity. An analysis of experimentation is provided, with guidelines on how choice aspects of software development life cycles (SDLCs) apply to knowledge artifact development in an ARTAKA environment.

Portions of this culminating document have been further initiated with Standards Developing Organizations (SDOs) intended to ultimately produce normative standards, as have active relationships with other bodies.
Contributors: Lee, Preston Victor (Author) / Dinu, Valentin (Thesis advisor) / Sottara, Davide (Committee member) / Greenes, Robert (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
Amongst the most studied of the social insects, the honey bee has a prominent place due to its economic importance and influence on human societies. Honey bee colonies can have over 50,000 individuals, whose activities are coordinated by chemical signals called pheromones. Because these pheromones are secreted from various exocrine glands, the proper development and function of these glands are vital to colony dynamics. In this thesis, I present a study of the developmental ontogeny of the exocrine glands found in the head of the honey bee. In Chapter 2, I elucidate how the larval salivary gland transitions to an adult salivary gland through apoptosis and cell growth, differentiation and migration. I also explain the development of the hypopharyngeal and the mandibular gland using apoptotic markers and cytoskeletal markers like tubulin and actin. I explain the fundamental developmental plan for the formation of the glands and show that apoptosis plays an important role in the transformation toward an adult gland.
Contributors: Nath, Rachna (Author) / Gadau, Juergen (Thesis advisor) / Rawls, Alan (Committee member) / Harrison, Jon (Committee member) / Arizona State University (Publisher)
Created: 2018