Healthcare Leadership Strategies during the COVID-19 Pandemic

Description

COVID-19 has been challenging for nearly everyone in different ways. Healthcare organizations have had to quickly change policy, modify operations, reorganize facilities, and hire and train staff to overcome COVID-19-related challenges and continue providing care for patients, all while being mindful of the protection of their staff. Some healthcare organizations have responded particularly well, perhaps due to preparedness, planning, or exceptional leadership in times of crisis. To explore this, we invited seven healthcare system leaders from three different organizations in Arizona to talk about how they overcame challenges at the beginning of this pandemic with effective strategies and any leadership tips they had for the future. The interviews were then transcribed, coded qualitatively, and separated into themes and categories to analyze the answers to the questions asked. The results identified several strategies as crucial: open and honest communication, teamwork, rapidly developing and communicating policies, and widely adopting new work practices such as telemedicine, Zoom, and working from home. This report is designed to help and inspire other and future leaders to be better prepared to solve the challenges posed by future emergencies.

Contributors: Darira, Saigayatri (Author) / Doebbeling, Bradley (Thesis director) / Don, Rachael (Committee member) / Franczak, Michael (Committee member) / College of Health Solutions (Contributor) / Barrett, The Honors College (Contributor)

Created: 2021-05

Informatics Methods to Support Patient-Driven Granular Medical Record Sharing

Description

The traditional model of assessing and treating behavioral health (BH) and physical health (PH) in silos is inadequate for supporting whole-person health and wellness. The integration of BH and PH may result in better care quality, patient-provider experiences, and outcomes, as well as reduced costs. Cross-organizational health data sharing between BH and PH providers is critical to patients with BH conditions (BHCs). In the last few decades, many initiatives, including health information exchange organizations, have facilitated cross-organizational health data sharing. The current challenge is affording meaningful consent and ensuring patient privacy, two of the core requirements for advancing the adoption and use of health information technology (HIT) in the US. The Office of the National Coordinator for HIT (ONC) recommends that patients be given granular control beyond the “share all” or “share none” approach widely used currently in consent practices. But there is no consensus on which variables are relevant for promoting granularity in data sharing and satisfying patients’ privacy preferences. As a result, existing granular data sharing (GDS) studies use ad hoc and non-standardized approaches to implement or investigate patient data sharing preferences. Novel informatics methods were proposed and piloted to support patient-driven GDS and to validate the suitability and applicability of such methods in clinical environments. The hypotheses were: H1) the variables recommended by the ONC are relevant to support GDS; H2) there is diversity in medical record sharing preferences of individuals with BHCs; and H3) the most frequently used sensitive data taxonomy captures sensitive data sharing preferences of patients with BHCs.
The study hypotheses were validated by proposing an innovative standards-based GDS framework, validating the framework with the design and pilot testing of a clinical decision support system with 209 patients with BHCs, validating with patients the adequacy of the most frequently used sensitive data taxonomy, and systematically exploring data privacy views and data sharing perceptions of patients with BHCs. This research built the foundations for a new generation of data segmentation methods and tools that advance the vision of the ONC of creating standards-based, interoperable models to share sensitive health information in compliance with patients’ data privacy preferences.

Contributors: Karway, George K (Author) / Grando, Adela Maria (Thesis advisor) / Murcko, Anita C (Committee member) / Franczak, Michael (Committee member) / Arizona State University (Publisher)

Created: 2022

Evaluating the Heterogeneity of Logistic Regression Models to Predict Coronary Artery Disease Status

Description

Coronary artery disease (CAD) is one of the most commonly diagnosed heart diseases globally, affecting about 5% of adults over the age of twenty [1]. Lifestyle changes can lower the risk of developing CAD and are especially important for individuals with high genetic risk [1]. In this study, we sought to predict the likelihood of developing CAD using genetic, demographic, and clinical variables. Leveraging genetic and clinical data from the UK Biobank on over 500,000 individuals, we identified 500 individuals genetically similar to a target individual and separated them from another 500 genetically dissimilar individuals. This process was repeated for 10 target individuals as a proof of concept. Then, CAD-related variables, including age, relevant clinical factors, and polygenic risk score, were used to train models predicting CAD status for the 500 genetically similar and 500 genetically dissimilar groups, to determine which group predicts the likelihood of CAD more accurately. To compute genetic similarity to the target individuals, we used the Mahalanobis distance. To reduce the heterogeneity between sexes and races, the study was restricted to British male Caucasians. The models using the more similar individuals demonstrated better predictive performance. The area under the receiver operating characteristic curve (AUC) was significantly higher for the ‘similar’ than for the ‘dissimilar’ groups, indicating better predictive capability (AUC=0.67 vs. 0.65, respectively; p-value<0.05). These findings support the potential of precision prevention strategies: predictive models of disease for any one target individual should be built from individuals more similar to that target, even within an otherwise homogeneous group (e.g., British Caucasians). Although intuitive, this practice is not routine.
Further validation and exploration of additional predictors are warranted to enhance the predictive accuracy and applicability of the model.
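
As an illustration only, the similarity step described in this abstract could be sketched roughly as follows. The abstract states that the Mahalanobis distance was used, but not which genetic features it was computed on or how the two groups were drawn; the synthetic feature matrix, the function names, and the ranking rule (nearest 500 versus farthest 500) are all assumptions made for this sketch, not the author's implementation.

```python
import numpy as np


def mahalanobis_to_target(features, target):
    """Mahalanobis distance from every row of `features` to `target`."""
    cov = np.cov(features, rowvar=False)
    # The pseudo-inverse tolerates a singular covariance matrix.
    cov_inv = np.linalg.pinv(cov)
    diff = features - target
    # Row-wise quadratic form: d_i^2 = diff_i^T * cov_inv * diff_i
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(d2, 0.0))


def split_by_similarity(features, target_index, group_size):
    """Return indices of the nearest and farthest individuals (assumed rule)."""
    dist = mahalanobis_to_target(features, features[target_index])
    order = np.argsort(dist)
    order = order[order != target_index]  # the target is not its own neighbor
    return order[:group_size], order[-group_size:], dist


# Synthetic stand-in for per-individual genetic features (hypothetical data).
rng = np.random.default_rng(0)
features = rng.normal(size=(2000, 5))
similar, dissimilar, dist = split_by_similarity(
    features, target_index=0, group_size=500
)
```

Each group of indices could then be used to fit a separate logistic regression model on the CAD-related variables, with the two models compared by AUC, as the abstract describes.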

Contributors: Pandari, Sadhana (Author) / Ghassamzadeh, Hassan (Thesis director) / Scotch, Matthew (Committee member) / Barrett, The Honors College (Contributor) / College of Health Solutions (Contributor)

Created: 2024-05

Pharmacogenomics of Selective Serotonin Reuptake Inhibitor Treatment for Major Depressive Disorder: a Genome Wide Association Study

Description

A genome-wide association study (GWAS) of treatment outcomes for citalopram and escitalopram, two frontline SSRI treatments for Major Depressive Disorder, was conducted with 529 subjects on an imputed dataset. While no variants of genome-wide significance were identified, several potentially interesting variants were found that warrant further exploration. These findings have the potential to elucidate novel mechanisms underlying drug response for SSRIs. This work will be continued with machine learning and deep learning methods for non-linear analysis, and by engaging a biologist or geneticist to provide more specialized knowledge for interpreting the results.

Contributors: Leiter-Weintraub, Ethan (Author) / Dinu, Valentin (Thesis director) / Scotch, Matthew (Committee member) / Barrett, The Honors College (Contributor) / Dean, W.P. Carey School of Business (Contributor) / College of Health Solutions (Contributor) / School of Life Sciences (Contributor)

Created: 2024-05

Integrative analysis of genomic aberrations in cancer and xenograft Models

Description

No two cancers are alike. Cancer is a dynamic and heterogeneous disease; such heterogeneity arises among patients with the same cancer type, among cancer cells within the same individual’s tumor, and even among cells within the same sub-clone over time. The recent application of next-generation sequencing and precision medicine techniques is the driving force in uncovering the complexity of cancer and identifying the best clinical practice. The core concept of precision medicine is to move away from crowd-based, best-for-most treatment and take individual variability into account when optimizing prevention and treatment strategies. Next-generation sequencing sifts through all 3 billion letters of each patient’s DNA in a massively parallel fashion.

The deluge of next-generation sequencing data has shifted the bottleneck of cancer research from multiple “-omics” data collection to integrative analysis and data interpretation. In this dissertation, I attempt to address two distinct, but dependent, challenges. The first is to design specific computational algorithms and tools that can process and extract useful information from the raw data in an efficient, robust, and reproducible manner. The second is to develop high-level computational methods and data frameworks for integrating and interpreting these data. Specifically, Chapter 2 presents a tool called Snipea (SNv Integration, Prioritization, Ensemble, and Annotation) to further identify, prioritize, and annotate somatic SNVs (Single Nucleotide Variants) called from multiple variant callers. Chapter 3 describes a novel alignment-based algorithm to accurately and losslessly classify sequencing reads from xenograft models. Chapter 4 describes a direct and biologically motivated framework and associated methods for identifying putative aberrations causing survival differences in GBM patients by integrating whole-genome sequencing, exome sequencing, RNA sequencing, methylation array, and clinical data. Lastly, Chapter 5 explores longitudinal and intratumor heterogeneity studies to reveal the temporal and spatial context of tumor evolution. The long-term goal is to help patients with cancer, particularly those who are in front of us today. Genome-based analysis of the patient tumor can identify genomic alterations unique to each patient’s tumor that are candidate therapeutic targets to decrease therapy resistance and improve clinical outcome.

Contributors: Peng, Sen (Author) / Dinu, Valentin (Thesis advisor) / Scotch, Matthew (Committee member) / Wallstrom, Garrick (Committee member) / Arizona State University (Publisher)

Created: 2015

Health information extraction from social media

Description

Social media is becoming increasingly popular as a platform for sharing personal health-related information. This information can be utilized for public health monitoring tasks such as pharmacovigilance via the use of Natural Language Processing (NLP) techniques. One of the critical steps in information extraction pipelines is Named Entity Recognition (NER), where the mentions of entities such as diseases are located in text and their entity types are identified. However, the language in social media is highly informal, and user-expressed health-related concepts are often non-technical, descriptive, and challenging to extract. There has been limited progress in addressing these challenges, and advanced machine learning-based NLP techniques have been underutilized. This work explores the effectiveness of different machine learning techniques, particularly deep learning, in addressing the challenges associated with extraction of health-related concepts from social media. Deep learning has recently attracted a lot of attention in machine learning research and has shown remarkable success in several applications, particularly imaging and speech recognition. However, deep learning techniques are thus far relatively unexplored for biomedical text mining and, in particular, this is the first attempt at applying deep learning to health information extraction from social media.

This work presents ADRMine, which uses a Conditional Random Field (CRF) sequence tagger for extraction of complex health-related concepts. It utilizes a large volume of unlabeled user posts for automatic learning of embedding cluster features, a novel application of deep learning in modeling the similarity between tokens. ADRMine significantly improved medical NER performance compared to the baseline systems.

This work also presents DeepHealthMiner, a deep learning pipeline for health-related concept extraction. Most machine learning methods require sophisticated, task-specific manual feature design, which is a challenging step in processing the informal and noisy content of social media. DeepHealthMiner automatically learns classification features using neural networks and a large volume of unlabeled user posts. Using a relatively small labeled training set, DeepHealthMiner could accurately identify most of the concepts, including consumer expressions that were not observed in the training data or in the standard medical lexicons, outperforming the state-of-the-art baseline techniques.

Contributors: Nikfarjam, Azadeh (Author) / Gonzalez, Graciela (Thesis advisor) / Greenes, Robert (Committee member) / Scotch, Matthew (Committee member) / Arizona State University (Publisher)

Created: 2016

Informatics approaches for integrative analysis of disparate high-throughput genomic datasets in cancer

Description

The processes of a human somatic cell are very complex, with various genetic mechanisms governing its fate. Such cells undergo various genetic mutations, which translate to the genetic aberrations that we see in cancer. There are more than 100 types of cancer, each having many more subtypes with aberrations unique to each. In the past two decades, the widespread application of high-throughput genomic technologies, such as microarrays and next-generation sequencing, has led to the revelation of many such aberrations. Known types and subtypes can be readily identified using gene-expression profiling and, more importantly, high-throughput genomic datasets have helped identify novel subtypes with distinct signatures. Recent studies showing the use of gene-expression profiling in clinical decision making for breast cancer patients underscore the utility of high-throughput datasets. Beyond prognosis, understanding the underlying cellular processes is essential for effective cancer treatment. Various high-throughput techniques are now available to look at a particular aspect of a genetic mechanism in cancer tissue. To look at these mechanisms individually is akin to examining a broken watch by taking it apart, looking at each part individually, and finally making a list of all the faulty ones. Integrative approaches are needed to transform one-dimensional cancer signatures into multi-dimensional interaction and regulatory networks, consequently improving our understanding of cellular processes in cancer. Here, I attempt to (i) address ways to effectively identify high-quality variants when multiple assays on the same samples are available, through two novel tools, snpSniffer and NGSPE; and (ii) glean new biological insight into multiple myeloma through two novel integrative analysis approaches that make use of disparate high-throughput datasets.
While these methods focus on multiple myeloma datasets, the informatics approaches are applicable to all cancer datasets and will thus help advance cancer genomics.

Contributors: Yellapantula, Venkata (Author) / Dinu, Valentin (Thesis advisor) / Scotch, Matthew (Committee member) / Wallstrom, Garrick (Committee member) / Keats, Jonathan (Committee member) / Arizona State University (Publisher)

Created: 2014

Structural variant detection: a novel approach

Description

Genomic structural variation (SV) is defined as gross alterations in the genome, broadly classified as insertions/duplications, deletions, inversions, and translocations. DNA sequencing ushered structural variant discovery beyond laboratory detection techniques to high-resolution informatics approaches. Bioinformatics tools for computational discovery of SVs, however, are still missing variants in the complex cancer genome. This study aimed to define the genomic context leading to tool failure and to design a novel algorithm addressing this context. Methods: The study tested the widely held but unproven hypothesis that tools fail to detect variants that lie in repeat regions. The publicly available 1000 Genomes dataset with experimentally validated variants was tested with the SVDetect tool for the presence of true positive (TP) SVs versus false negative (FN) SVs, expecting that FNs would be overrepresented in repeat regions. Further, the novel algorithm, designed to informatically capture the biological etiology of translocations (non-allelic homologous recombination and 3-D placement of chromosomes in cells as context), was tested using a simulated dataset. Translocations were created in known translocation hotspots, and the novel-algorithm tool was compared with SVDetect and BreakDancer. Results: 53% of false negative (FN) deletions were within repeat structure compared to 81% of true positive (TP) deletions. Similarly, 33% of FN insertions versus 42% of TP, 26% of FN duplications versus 57% of TP, and 54% of FN novel sequences versus 62% of TP were within repeats. Repeat structure was not driving the tool's inability to detect variants and could not be used as context. The novel algorithm with a redefined context, when tested against SVDetect and BreakDancer, was able to detect 10/10 simulated translocations with a 30X coverage dataset and 100% allele frequency, while SVDetect captured 4/10 and BreakDancer detected 6/10.
For the 15X coverage dataset with 100% allele frequency, the novel algorithm was able to detect all ten translocations, albeit with fewer supporting reads. BreakDancer detected 4/10 and SVDetect detected 2/10. Conclusion: This study showed that the presence of repetitive elements in general within a structural variant did not influence the tool's ability to capture it. This context-based algorithm proved better than current tools even with half the genome coverage of the accepted protocol, and provides an important first step for novel translocation discovery in the cancer genome.

Contributors: Shetty, Sheetal (Author) / Dinu, Valentin (Thesis advisor) / Bussey, Kimberly (Committee member) / Scotch, Matthew (Committee member) / Wallstrom, Garrick (Committee member) / Arizona State University (Publisher)

Created: 2014

Biomedical Information Extraction Pipelines for Public Health in the Age of Deep Learning

Description

Unstructured texts containing biomedical information from sources such as electronic health records, scientific literature, discussion forums, and social media offer an opportunity to extract information for a wide range of applications in biomedical informatics. Building scalable and efficient pipelines for natural language processing and extraction of biomedical information plays an important role in the implementation and adoption of applications in areas such as public health. Advancements in machine learning and deep learning techniques have enabled rapid development of such pipelines. This dissertation presents entity extraction pipelines for two public health applications: virus phylogeography and pharmacovigilance. For virus phylogeography, geographical locations are extracted from biomedical scientific texts for metadata enrichment in the GenBank database containing 2.9 million virus nucleotide sequences. For pharmacovigilance, tools are developed to extract adverse drug reactions from social media posts to open avenues for post-market drug surveillance from non-traditional sources. Across these pipelines, high variance is observed in extraction performance among the entities of interest while using state-of-the-art neural network architectures. To explain the variation, linguistic measures are proposed to serve as indicators for entity extraction performance and to provide deeper insight into the domain complexity and the challenges associated with entity extraction. For both the phylogeography and pharmacovigilance pipelines presented in this work, the annotated datasets and applications are open source and freely available to the public to foster further research in public health.

Contributors: Magge, Arjun (Author) / Scotch, Matthew (Thesis advisor) / Gonzalez-Hernandez, Graciela (Thesis advisor) / Greenes, Robert (Committee member) / Arizona State University (Publisher)

Created: 2019

Knowledge-driven methods for geographic information extraction in the biomedical domain

Description

Accounting for over a third of all emerging and re-emerging infections, viruses represent a major public health threat, which researchers and epidemiologists across the world have been attempting to contain for decades. Recently, genomics-based surveillance of viruses through methods such as virus phylogeography has grown into a popular tool for infectious disease monitoring. When conducting such surveillance studies, researchers need to manually retrieve geographic metadata denoting the location of infected host (LOIH) of viruses from public sequence databases such as GenBank and any publication related to their study. The large volume of semi-structured and unstructured information that must be reviewed for this task, along with the ambiguity of geographic locations, makes it especially challenging. Prior work has demonstrated that the majority of GenBank records lack sufficient geographic granularity concerning the LOIH of viruses. As a result, reviewing full-text publications is often necessary for conducting in-depth analysis of virus migration, which can be a very time-consuming process. Moreover, integrating geographic metadata pertaining to the LOIH of viruses from different sources, including different fields in GenBank records as well as full-text publications, and normalizing the integrated metadata to unique identifiers for subsequent analysis are also challenging tasks, often requiring expert domain knowledge. Therefore, automated information extraction (IE) methods could help significantly accelerate this process, positively impacting public health research. However, very few research studies have attempted to use IE methods in this domain.

This work explores the use of novel knowledge-driven geographic IE heuristics for extracting, integrating, and normalizing the LOIH of viruses based on information available in GenBank and related publications; when evaluated on manually annotated test sets, the methods were found to have high accuracy and to be adequate for addressing this challenging problem. It also presents GeoBoost, a pioneering software system for georeferencing GenBank records, as well as a large-scale database containing over two million virus GenBank records georeferenced using the algorithms introduced here. The methods, database, and software developed here could help support diverse public health domains focusing on sequence-informed virus surveillance, thereby enhancing existing platforms for controlling and containing disease outbreaks.

Contributors: Tahsin, Tasnia (Author) / Gonzalez, Graciela (Thesis advisor) / Scotch, Matthew (Thesis advisor) / Runger, George C. (Committee member) / Arizona State University (Publisher)

Created: 2019
