Lee, Heewook

Risk Factors of Neurodegenerative Diseases and Cancer at the Population Level for the Application of Wastewater-Based Epidemiology

Description

Novel means are needed to diagnose neurodegenerative diseases (NDDs) and cancer, given delays in medical diagnosis and rising rates of disease incidence, prevalence, and mortality worldwide. Development of NDDs and cancer has been linked to environmental toxins. Ensuing epigenetic changes may serve as helpful biomarkers to diagnose amyotrophic lateral sclerosis (ALS), Parkinson’s Disease (PD), and Alzheimer’s Disease (AD) as well as various cancers sooner and more accurately. This dissertation tabulates and evaluates a spectrum of diagnostic matrices (i.e., soil, sewage sludge, blood) and markers of disease to inform disease surveillance. A literature search using Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) and Bradford Hill criteria implicated BMAA, formaldehyde, Mn, Hg, and Zn as environmental factors with strong association to ALS risk. Another PRISMA search identified epigenetic changes (e.g., DNA methylation) in NDD patients associated with environmental toxic exposures to air pollutants, heavy metals, and organic chemicals. Of the 180 environmental toxins hypothesized to be associated with AD, PD, or ALS, four heavy metals (As, Cd, Mn, and Hg) were common to these NDDs. Sources of these heavy metals and Pb, along with evidence and proxies of human exposure to them, were investigated here: namely, the metal industries and metal concentrations in topsoil, sewage sludge, and blood. Concentrations of Cd and Pb in sewage sludge were found to be significantly correlated with NDD prevalence rates in co-located populations (state-level) with odds ratios of 2.91 and 4.08, respectively. Markers of exposure and disease in urine and feces were also evaluated using PRISMA, finding 73 of 94 epigenetic biomarker panels to be valid for tracking primarily gastric and urinary cancers.
In all studies, geospatial analyses indicated a preference for study cohorts located in the U.S., Europe, and the northern hemisphere, leaving many populous regions underserved, particularly in the southern hemisphere. This dissertation draws attention to sewage sludge as a currently underutilized proxy matrix for assessing toxic human exposures and further identifies a spectrum of particularly attractive, non-invasive biomarkers for future diagnostic use to promote early detection, survivability, and quality of life of individuals at risk of NDDs and cancer.
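
For readers unfamiliar with the odds ratios reported above, the following is a minimal sketch of how an odds ratio is computed from a 2x2 table. The function name, the way the table is laid out, and the counts are all hypothetical and chosen only for illustration; the record does not state how the dissertation constructed its tables, and these numbers are not its data.

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds ratio for a 2x2 table: (a / b) / (c / d) = (a * d) / (b * c)."""
    if exposed_noncases == 0 or unexposed_cases == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)


# Hypothetical counts (NOT from the dissertation): units with high vs. low
# metal concentration, split by high vs. low disease prevalence.
example = odds_ratio(
    exposed_cases=30, exposed_noncases=10,
    unexposed_cases=15, unexposed_noncases=20,
)
```

An odds ratio above 1 indicates that the odds of the outcome are higher in the exposed group than in the unexposed group.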

Date Created

2023

Agent

Author (aut): Newell, Melanie Engstrom
Thesis advisor (ths): Halden, Rolf U.
Committee member: Mastroeni, Diego
Committee member: Lee, Heewook
Publisher (pbl): Arizona State University

BB-Player: a HMM Planning Strategy for Blackjack

Description

We propose a new strategy for blackjack, BB-Player, which leverages Hidden Markov Models (HMMs) in online planning to sample a normalized predicted deck distribution for a partially-informed distance heuristic. Viterbi learning is applied to the most-likely sampled future sequence in each game state to generate transition and emission matrices for this upcoming sequence. These are then iteratively updated with each observed game on a given deck. Ultimately, this process informs a heuristic to estimate the true symbolic distance left, which allows BB-Player to determine the action with the highest likelihood of winning (by opponent bust or blackjack) and not going bust. We benchmark this strategy against six common card counting strategies from three separate levels of difficulty and a randomized action strategy. On average, BB-Player is observed to beat card-counting strategies in win optimality, attaining a 30.00% expected win percentage, though it falls short of beating state-of-the-art methods.
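
As background for the HMM machinery mentioned above, the following is a minimal sketch of standard Viterbi decoding, which finds the most likely hidden-state sequence for a series of observations. It is not the BB-Player implementation: the two hidden states ("rich" and "poor" in high cards), the observation labels, and every probability below are hypothetical values chosen for illustration.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most likely hidden-state path, probability of that path)."""
    # best[t][s] = (probability of the best path ending in state s at step t,
    #               the state that path came from)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            column[s] = max(
                (prev[r][0] * trans_p[r][s] * emit_p[s][obs], r) for r in states
            )
        best.append(column)

    # Trace back from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path, best[-1][last][0]


# Hypothetical toy model: is the remaining deck rich or poor in high cards?
states = ("rich", "poor")
start_p = {"rich": 0.5, "poor": 0.5}
trans_p = {"rich": {"rich": 0.8, "poor": 0.2}, "poor": {"rich": 0.2, "poor": 0.8}}
emit_p = {"rich": {"high": 0.7, "low": 0.3}, "poor": {"high": 0.3, "low": 0.7}}

path, prob = viterbi(["high", "high", "low", "low"], states, start_p, trans_p, emit_p)
```

Viterbi learning, as named in the abstract, would repeatedly re-estimate the transition and emission tables from decoded paths like this one; that training loop is not shown here.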

Date Created

2023-05

Agent

Author (aut): Lakamsani, Sreeharsha
Thesis director: Ren, Yi
Committee member: Lee, Heewook
Contributor (ctb): Barrett, The Honors College
Contributor (ctb): School of Mathematical and Statistical Sciences
Contributor (ctb): Computer Science and Engineering Program

Alternative Promoter Usage using Transcription Start Sites in Patients with Alzheimer’s Disease

Description

Alzheimer’s Disease (AD) is one of the most common forms of dementia and a major cause of disability and dependency in older patients worldwide. Although much research has been done on gene expression and possible drivers of AD, there has not been enough investigation into transcription start site and alternative promoter usage in AD. With relatively small genomes, species have evolved mechanisms for diversifying their transcriptome, which is the set of messenger RNA (mRNA) transcripts produced in a given cell. While the most well-known mechanism of diversification is alternative splicing, another mechanism that has been less explored is alternative promoter (AP) usage, which generates different transcripts by selecting different transcription start sites (TSSs) upstream of a gene. More importantly, AP usage can bring about different coding sequences, which can in some cases lead to changes within the N-termini of the cognate proteins. Alternative promoter usage has the potential to regulate processes like alternative splicing, tissue specificity, regional specificity, and subcellular specificity of gene expression and gene activation during development. In this study, a customized pipeline for STRIPE-seq generated data was applied to AD and control data sets, and the first AD promoter atlas was generated. This atlas was used to generate a list of genes with differentially used transcription start regions (TSRs) and the biological pathways they are involved in. Finally, a consensus cluster set was created to investigate alternative promoter usage in AD patients, and alternative promoter usage was shown in Alzheimer’s Disease related genes such as APOE and MAPT.

Date Created

2022

Agent

Author (aut): Stampar, Mojca
Thesis advisor (ths): Lee, Heewook
Committee member: Raborn, Randolph T
Committee member: Mastroeni, Diego
Publisher (pbl): Arizona State University

An Application of Attention for the Prediction of TCR-Epitope Binding Affinity

Description

T-cells are an integral component of the immune system, enabling the body to distinguish between pathogens and the self. The primary mechanism that enables this is their T-cell receptors (TCRs), which bind to antigen epitopes foreign to the body. This detection mechanism allows the T-cell to determine when an immune response is necessary. The computational prediction of TCR-epitope binding is important to researchers for both medical applications and for furthering their understanding of the biological mechanisms that impact immunity. Models that have been developed for this purpose fail to account for the interrelationships between amino acids and demonstrate poor out-of-sample performance. Small changes to the amino acids in these protein sequences can drastically change their structure and function. In recent years, attention-based deep learning models have shown success in their ability to learn rich contextual representations of data. To capture the contextual biological relationships between the amino acids, a multi-head self-attention model was created to predict the binding affinity between given TCR and epitope sequences. By learning the structural nuances of the sequences, this model is able to improve upon existing model performance and grant insights into the underlying mechanisms that impact binding.
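
As background for the attention mechanism named above, the following is a minimal sketch of a single head of scaled dot-product self-attention in plain Python. It is not the thesis model: the toy 2-dimensional "residue embeddings" and the identity projection matrices are hypothetical placeholders, and a multi-head model would run several such heads with different learned projections and concatenate their outputs.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def self_attention(x, w_q, w_k, w_v):
    """One attention head: softmax(Q K^T / sqrt(d_k)) V."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scale = math.sqrt(len(k[0]))
    scores = [
        [sum(qi * ki for qi, ki in zip(q_row, k_row)) / scale for k_row in k]
        for q_row in q
    ]
    weights = [softmax(r) for r in scores]  # each row sums to 1
    return matmul(weights, v), weights


# Hypothetical input: three sequence positions with 2-dimensional embeddings.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder for learned projections
output, weights = self_attention(x, identity, identity, identity)
```

Each row of `weights` describes how strongly one position attends to every other position, which is the kind of pairwise relationship between amino acids the abstract refers to.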

Date Created

2021

Agent

Author (aut): Cai, Michael Ray
Thesis advisor (ths): Lee, Heewook
Committee member: Bang, Seojin
Committee member: Baral, Chitta
Publisher (pbl): Arizona State University

Filtering Noise in RNAseq Data of the HLA Genes

Description

Human leukocyte antigens (HLA) are encoded by a polymorphic set of genes where even a single base change can impact the function of the body’s immune response to foreign antigens [1]. Although many methods exist to type these alleles using whole-genome sequencing (WGS), few can use RNA sequencing (RNA-seq), with its inconsistent coverage, to show the functional expression of the alleles, and none of these allow for novel allele discovery. We present an approach using partially ordered graphs to project sequenced data onto the known alleles, allowing for accurate and efficient typing of the HLA genes with flexibility for discovering new alleles and tolerance for poor sequence quality. This graph-guided approach to assembling and typing the HLA genes from RNA-seq has applications throughout precision medicine, facilitating the prevention and treatment of autoimmune diseases where allele expression can change. It is also a necessary step for determining donors for organ transplants with the least likelihood of rejection. This novel approach of combining database matching with partially ordered graphs for assembling genetic sequences of RNA-seq data could be applied towards typing other alleles.

Date Created

2022-05

Agent

Author (aut): Mallett, Shayna
Thesis director: Lee, Heewook
Committee member: Wilson, Melissa
Contributor (ctb): Barrett, The Honors College
Contributor (ctb): Computer Science and Engineering Program

Target Detection Using Algorithmic Matter

Description

Over the years, advances in research have continued to decrease the size of computers from the size of a room to a small device that could fit in one’s palm. However, if an application does not require extensive computation power or accessories such as a screen, the corresponding machine could be microscopic, only a few nanometers in size. Researchers at MIT have successfully created Syncells, which are microscale robots with limited computation power and memory that can communicate locally to achieve complex collective tasks. In order to control these Syncells for a desired outcome, they must each run a simple distributed algorithm. As they are only capable of local communication, Syncells cannot receive commands from a control center, so their algorithms cannot be centralized. In this work, we created a distributed algorithm that each Syncell can execute so that the system of Syncells is able to find and converge to a specific target within the environment. The most direct applications of this problem are in medicine. Such a system could be used as a safer alternative to invasive surgery or could be used to treat internal bleeding or tumors. We tested and analyzed our algorithm through simulation and visualization in Python. Overall, our algorithm successfully caused the system of particles to converge on a specific target present within the environment.

Date Created

2021-05

Agent

Author (aut): Martin, Rebecca Clare
Committee member: Lee, Heewook
Contributor (ctb): Computer Science and Engineering Program
Contributor (ctb): School of Mathematical and Statistical Sciences
Contributor (ctb): Barrett, The Honors College

Prediction of Binding Affinity of T cell Receptor and Antigens using Deep Neural Networks

Description

Immunotherapy is an effective treatment for cancer that enables the patient's immune system to recognize tumor cells as pathogens. In order to design an individualized treatment, the T cell receptors (TCRs) that bind to a tumor's unique antigens need to be determined. To enable this, we created a convolutional neural network to predict the binding affinity between a given TCR and antigen.

Date Created

2020-12

Agent

Author (aut): Cai, Michael Ray
Thesis director: Lee, Heewook
Committee member: Meuth, Ryan
Contributor (ctb): Computer Science and Engineering Program
Contributor (ctb): Barrett, The Honors College
