Bayesian Inference and Information Learning for Switching Nonlinear Gene Regulatory Networks

Description

This dissertation centers on the development of Bayesian methods for learning different types of variation in switching nonlinear gene regulatory networks (GRNs). A new nonlinear and dynamic multivariate GRN model is introduced to account for different sources of variability in GRNs. The new model is aimed at more precisely capturing the complexity of GRN interactions through the introduction of time-varying kinetic order parameters, while allowing for variability in multiple model parameters. This model is used as the drift function in the development of several stochastic GRN models based on Langevin dynamics. Six models are introduced which capture intrinsic and extrinsic noise in GRNs, thereby providing a full characterization of a stochastic regulatory system. A Bayesian hierarchical approach is developed for learning the Langevin model which best describes the noise dynamics at each time step. The state trajectory, which consists of the gene expression values, and the indicator corresponding to the correct noise model are estimated via sequential Monte Carlo (SMC) with a high degree of accuracy. To address the problem of time-varying regulatory interactions, a Bayesian hierarchical model is introduced for learning variation in switching GRN architectures with unknown measurement noise covariance. The trajectory of the state and the indicator corresponding to the network configuration at each time point are estimated using SMC. This work is extended to a fully Bayesian hierarchical model to account for uncertainty in the process noise covariance associated with each network architecture. An SMC algorithm with local Gibbs sampling is developed to estimate the trajectory of the state and the indicator corresponding to the network configuration at each time point with a high degree of accuracy. The results demonstrate the efficacy of Bayesian methods for learning information in switching nonlinear GRNs.
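The joint estimation of a state trajectory and a switching indicator via SMC can be illustrated with a minimal bootstrap particle filter. This is only a sketch under stated assumptions, not the dissertation's algorithm: the scalar toy model (Euler-Maruyama steps of dx = (k_m - x) dt + sigma dW with two hypothetical configurations), all constants, and the function names `simulate` and `particle_filter` are invented for illustration. The noise covariances are treated as known here, so the hierarchical priors and the local Gibbs step described above are not represented.

```python
import math
import random

# Hypothetical toy constants (not taken from the dissertation).
LEVELS = (1.0, 4.0)   # assumed production level k_m for each network configuration m
DT = 0.1              # Euler-Maruyama step size
SIGMA = 0.5           # process (Langevin) noise intensity, assumed known
OBS_STD = 0.3         # measurement noise standard deviation, assumed known
STAY = 0.95           # probability that the configuration does not switch


def switch(mode, rng):
    """Markov switching of the configuration indicator."""
    return mode if rng.random() < STAY else 1 - mode


def step(x, mode, rng):
    """One Euler-Maruyama step of dx = (k_m - x) dt + sigma dW."""
    drift = (LEVELS[mode] - x) * DT
    return x + drift + SIGMA * math.sqrt(DT) * rng.gauss(0.0, 1.0)


def simulate(n_steps, rng):
    """Generate a synthetic state trajectory, indicator sequence, and observations."""
    x, mode = LEVELS[0], 0
    states, modes, obs = [], [], []
    for _ in range(n_steps):
        mode = switch(mode, rng)
        x = step(x, mode, rng)
        states.append(x)
        modes.append(mode)
        obs.append(x + OBS_STD * rng.gauss(0.0, 1.0))
    return states, modes, obs


def particle_filter(obs, n_particles, rng):
    """Bootstrap particle filter over the pair (state, indicator).

    Returns the filtered state means and the filtered probability of mode 1.
    """
    particles = [(LEVELS[0] + rng.gauss(0.0, 0.5), rng.randrange(2))
                 for _ in range(n_particles)]
    x_est, p_mode1 = [], []
    for y in obs:
        # Propagate: sample the indicator first, then the state given the indicator.
        moved = []
        for x, m in particles:
            m_new = switch(m, rng)
            moved.append((step(x, m_new, rng), m_new))
        # Weight by the Gaussian measurement likelihood (log domain for stability).
        log_w = [-0.5 * ((y - x) / OBS_STD) ** 2 for x, _ in moved]
        top = max(log_w)
        w = [math.exp(lw - top) for lw in log_w]
        total = sum(w)
        w = [wi / total for wi in w]
        # Filtered estimates from the weighted particles.
        x_est.append(sum(wi * x for wi, (x, _) in zip(w, moved)))
        p_mode1.append(sum(wi for wi, (_, m) in zip(w, moved) if m == 1))
        # Multinomial resampling.
        particles = rng.choices(moved, weights=w, k=n_particles)
    return x_est, p_mode1
```

Calling `simulate(100, random.Random(0))` and passing the resulting observations to `particle_filter` yields one state estimate and one configuration probability per time step. Sampling the indicator before the state keeps each particle internally consistent, which is the property a switching model relies on.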

Contributors: Vélez-Cruz, Nayely (Author) / Papandreou-Suppappola, Antonia (Thesis advisor) / Moraffah, Bahman (Committee member) / Tepedelenlioğlu, Cihan (Committee member) / Berisha, Visar (Committee member) / Arizona State University (Publisher)

Created: 2023

Semantic Information Extraction From Natural Language Using a Learning and Rule-Based Approach

Description

Open Information Extraction (OIE) is a subset of Natural Language Processing (NLP) that processes natural language into structured, machine-readable data. This thesis uses data in Resource Description Framework (RDF) triple format, which consists of a subject, predicate, and object. The extraction of RDF triples from natural language is an essential step towards importing data into web ontologies as part of the linked open data cloud on the Semantic Web. There have been a number of related techniques for extraction of triples from plain natural language text including but not limited to ClausIE, OLLIE, Reverb, and DeepEx. This proposed study aims to reduce the dependency on conventional machine learning models since they require training datasets, and the models are not easily customizable or explainable. By leveraging a context-free grammar (CFG) based model, this thesis aims to address some of these issues while minimizing the trade-offs on performance and accuracy. Furthermore, a deep dive is conducted to analyze the strengths and limitations of the proposed approach.

Contributors: Singh, Varun (Author) / Bansal, Srividya (Thesis advisor) / Bansal, Ajay (Committee member) / Mehlhase, Alexandra (Committee member) / Arizona State University (Publisher)

Created: 2023

Repeatability and Accuracy of a Widely-Available Voice-Based Stress Analysis Tool

Description

Stress, depression, and anxiety are prevailing mental health issues that affect individuals worldwide. As the search for effective solutions continues, advancements in technology have led to the development of digital tools for stress identification and management purposes. The Cigna StressWaves Test (CSWT) is a publicly available stress analysis toolkit that claims to use “clinical-grade” artificial intelligence (AI) technology to evaluate individual stress levels through speech. To investigate this claim, this research stands as an independent validation study involving 60 participants over the age of 18. The primary objective of the study is to assess the repeatability and efficacy of the CSWT as a stress measurement tool. Key results indicate the CSWT lacks test-retest reliability and convergent validity. This implies that the CSWT is not a repeatable tool and does not provide similar stress outcomes relative to an established measure of stress, the Perceived Stress Scale (PSS). These findings cast doubt on the accuracy and effectiveness of the CSWT as a stress assessment tool. The public availability of the CSWT and the claim that it is a “clinical-grade” tool highlight concerns regarding premature deployment of digital health tools for stress management.

Contributors: Yawer, Batul (Author) / Berisha, Visar (Thesis advisor) / Liss, Julie (Committee member) / Luo, Xin (Committee member) / Arizona State University (Publisher)

Created: 2023

Graph Regularized Linear Regression

Description

Linear-regression estimators have become widely accepted as a reliable statistical tool in predicting outcomes. Because linear regression is a long-established procedure, the properties of linear-regression estimators are well understood, and the estimators can be trained very quickly. Many estimators exist for modeling linear relationships, each having ideal conditions for optimal performance. The differences stem from the introduction of a bias into the parameter estimation through the use of various regularization strategies. One of the more popular ones is ridge regression, which uses ℓ2-penalization of the parameter vector. In this work, the proposed graph regularized linear estimator is pitted against the popular ridge regression when the parameter vector is known to be dense. When additional knowledge that parameters are smooth with respect to a graph is available, it can be used to improve the parameter estimates. To achieve this goal, an additional smoothing penalty is introduced into the traditional loss function of ridge regression. The mean squared error (MSE) is used as a performance metric, and the analysis is presented for fixed design matrices having a unit covariance matrix. The specific problem setup enables us to study the theoretical conditions where the graph regularized estimator outperforms the ridge estimator. The eigenvectors of the Laplacian matrix indicating the graph of connections between the various dimensions of the parameter vector form an integral part of the analysis. Experiments have been conducted on simulated data to compare the performance of the two estimators for Laplacian matrices of several types of graphs: complete, star, line and 4-regular. The experimental results indicate that the theory can possibly be extended to more general settings taking smoothness, a concept defined in this work, into consideration.
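A smoothing penalty added to the ridge loss can be sketched as follows. The abstract does not state the exact objective, so the form used here is an assumption: minimize ||y - Xb||^2 + lam * ||b||^2 + gamma * b^T L b, where L is the graph Laplacian, which gives the closed form b = (X^T X + lam * I + gamma * L)^{-1} X^T y. The function names, the penalty weights `lam` and `gamma`, and the small dense solver are all hypothetical choices made to keep the example dependency-free.

```python
def line_graph_laplacian(d):
    """Laplacian of a path (line) graph connecting coordinates 0-1-2-...-(d-1)."""
    lap = [[0.0] * d for _ in range(d)]
    for i in range(d - 1):
        lap[i][i] += 1.0
        lap[i + 1][i + 1] += 1.0
        lap[i][i + 1] -= 1.0
        lap[i + 1][i] -= 1.0
    return lap


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def graph_ridge(x_mat, y, lap, lam, gamma):
    """Assumed estimator: (X^T X + lam I + gamma L)^{-1} X^T y.

    Setting gamma = 0 recovers ordinary ridge regression.
    """
    d = len(x_mat[0])
    a = [[sum(row[i] * row[j] for row in x_mat)
          + (lam if i == j else 0.0)
          + gamma * lap[i][j]
          for j in range(d)]
         for i in range(d)]
    b = [sum(row[i] * yi for row, yi in zip(x_mat, y)) for i in range(d)]
    return solve(a, b)


def roughness(beta, lap):
    """Quadratic form beta^T L beta: sum of squared differences along graph edges."""
    d = len(beta)
    return sum(beta[i] * lap[i][j] * beta[j] for i in range(d) for j in range(d))
```

With an identity design matrix, which has a unit covariance, `graph_ridge` with `gamma=0` simply shrinks `y`, while a positive `gamma` additionally pulls neighboring coordinates toward each other and lowers `roughness`. Whether that helps or hurts the mean squared error depends on how smooth the true parameter vector is on the graph.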

Contributors: Sajja, Akarshan (Author) / Dasarathy, Gautam (Thesis advisor) / Berisha, Visar (Committee member) / Yang, Yingzhen (Committee member) / Arizona State University (Publisher)

Created: 2022

Modeling and Exploiting the Structure of Data via Meta-Features for Robust and Efficient Machine Learning

Description

In the standard pipeline for machine learning model development, several design decisions are made largely based on trial and error. Take the classification problem as an example. The starting point for classifier design is a dataset with samples from the classes of interest. From this, the algorithm developer must decide which features to extract, which hypothesis class to condition on, which hyperparameters to select, and how to train the model. The design process is iterative, with the developer trying different classifiers, feature sets, and hyperparameters and using cross-validation to pick the model with the lowest error. As there are no guidelines for when to stop searching, developers can continue "optimizing" the model to the point where they begin to "fit to the dataset". These problems are amplified in the active learning setting, where the initial dataset may be unlabeled and label acquisition is costly. The aim of this dissertation is to develop algorithms that provide ML developers with additional information about the complexity of the underlying problem to guide downstream model development. I introduce the concept of "meta-features": features extracted from a dataset that characterize the complexity of the underlying data generating process. In the context of classification, the complexity of the problem can be characterized by understanding two complementary meta-features: (a) the amount of overlap between classes, and (b) the geometry/topology of the decision boundary. Across three complementary works, I present a series of estimators for the meta-features that characterize overlap and geometry/topology of the decision boundary, and demonstrate how they can be used in algorithm development.

Contributors: Li, Weizhi (Author) / Berisha, Visar (Thesis advisor) / Dasarathy, Gautam (Thesis advisor) / Natesan Ramamurthy, Karthikeyan (Committee member) / Turaga, Pavan (Committee member) / Arizona State University (Publisher)

Created: 2022

Addressing the Challenges of Automated Speech and Language Analysis for the Assessment of Mental Health and Functional Competency

Description

Severe forms of mental illness, such as schizophrenia and bipolar disorder, are debilitating conditions that negatively impact an individual's quality of life. Additionally, they are often difficult and expensive to diagnose and manage, placing a large burden on society. Mental illness is typically diagnosed by the use of clinical interviews and a set of neuropsychiatric batteries; a key component of nearly all of these evaluations is some spoken language task. Clinicians have long used speech and language production as a proxy for neurological health, but most of these assessments are subjective in nature. Meanwhile, technological advancements in speech and natural language processing have grown exponentially over the past decade, increasing the capacity of computer models to assess particular aspects of speech and language. For this reason, many have seen an opportunity to leverage signal processing and machine learning applications to objectively assess clinical speech samples in order to automatically compute objective measures of neurological health. This document summarizes several contributions to expand upon this body of research. Mainly, there is still a large gap between the theoretical power of computational language models and their actual use in clinical applications. One of the largest concerns is the limited and inconsistent reliability of speech and language features used in models for assessing specific aspects of mental health; numerous methods may exist to measure the same or similar constructs and lead researchers to different conclusions in different studies. To address this, a novel measurement model based on a theoretical framework of speech production is used to motivate feature selection, while also performing a smoothing operation on features across several domains of interest. 
Then, these composite features are used to perform a much wider range of analyses than is typical of previous studies, looking at everything from diagnosis to functional competency assessments. Lastly, potential improvements to address practical implementation challenges associated with the use of speech and language technology in a real-world environment are investigated. The goal of this work is to demonstrate the ability of speech and language technology to aid clinical practitioners toward improvements in quality of life outcomes for their patients.

Contributors: Voleti, Rohit Nihar Uttam (Author) / Berisha, Visar (Thesis advisor) / Liss, Julie M (Thesis advisor) / Turaga, Pavan (Committee member) / Spanias, Andreas (Committee member) / Arizona State University (Publisher)

Created: 2022

Bilingual Subtypes and Individual Bilingual Experiences Using Latent Variable Modeling; Latent Profile Analysis and Fuzzy Set Qualitative Comparative Analysis with the Language and Social Background Questionnaire.

Description

The bilingual experience is an often-studied multivariate phenomenon with a heterogeneous population that is often described using subtypes of bilingualism. “Bilingualism” as well as its subtypes lack consistent definitions and often share overlapping features, requiring researchers to measure a number of aspects of the bilingual experience. Different variables have been operationalized to quantify the language proficiencies, use, and histories of bilinguals, but the combination of these variables and their contributions to these subtypes often vary between studies on bilingualism. Research supports that these variables have an influence not only on bilingual classification, but also on non-linguistic outcomes including perceptions of self-worth and bicultural identification. To date, there is a lack of research comparing the quantification of these bilingual subtypes and these non-linguistic outcomes, despite research supporting the need to address both. Person-centered approaches such as latent profile analysis (LPA) and fuzzy set qualitative comparative analysis (fsQCA) have been applied to describe other multivariate constructs with heterogeneous populations, but these applications have yet to be used with bilingualism. The present study integrates models of bilingualism with these analytic methods in order to quantitatively identify latent profiles of bilinguals, describe the sets of conditions that define these subtypes, and characterize the subjective experiences that differentiate these subtypes. The first study uses an existing data set of participants who completed the Language and Social Background Questionnaire (LSBQ) and performs LPA and fsQCA, identifying latent profiles and the sets of conditions that define these subtypes. The following studies use a second set of bilinguals who also completed the LSBQ as well as a supplementary questionnaire, characterizing their identification with biculturalism and their feelings of self-worth.
The analyses are repeated with these data to describe the profiles within these data and the subjective experiences in common. Finally, all analyses are repeated with the combined datasets to develop a final model of bilingual subtypes, describing the differences in language use and history within each subtype. Results demonstrate that latent models can be used to consistently characterize bilingual subtypes, while also providing additional information about the relationship between individual bilingual history and attitudes towards cultural identification.

Contributors: McGee, Samuel (Author) / Azuma, Tamiko (Thesis advisor) / Gray, Shelley (Committee member) / Roscoe, Rod (Committee member) / Grimm, Kevin (Committee member) / Arizona State University (Publisher)

Created: 2022

Efficacy of the Cognitive Apprenticeship Approach for Teaching Behavior Analysis

Description

Behavior challenges impact children and educational professionals on a daily basis; however, it is difficult for educators to obtain high quality training in behavior management. The purpose of this study was to compare cognitive apprenticeship and group work, two teaching methods, to determine which provides better knowledge and implementation outcomes for educators taking a course on behavior analysis. Seventeen educational professionals currently working with students who display challenging behavior were randomly assigned to the cognitive apprenticeship or group work conditions. The difference between the conditions is the introduction of a coach in the cognitive apprenticeship condition. The coach guides learners through the process of understanding and using behavior analysis throughout the course by providing feedback, scaffolding, and encouraging reflection and exploration. Participants completed pre-, post-, and post-posttests that measured their knowledge of behavior analysis and how well they implemented the skills taught in the course. Additionally, they completed weekly quizzes and reported how often they used the skills in real-life situations. Overall group differences across time points for knowledge and implementation scores were analyzed using a repeated measures analysis of variance (ANOVA). There were significant differences across time for both scores but not condition or time by condition. A covariance pattern model was used to determine if self-efficacy, self-confidence, previous behavior knowledge, or overall quiz performance predicted the variance in knowledge and implementation scores on the pre-, post-, and post-posttests across conditions. Time was the only significant predictor of knowledge scores, while time, condition and self-efficacy significantly predicted the variance in implementation scores. 
Additionally, one-way ANOVAs were used to find condition-based differences in quiz scores and practical skill use, neither of which was significant. Finally, a linear regression was used to determine whether quiz performance predicted the use of skills in real-world settings, which it did not. The course's impact on learning, skill use, and student behavior as well as future applications are discussed.

Contributors: Sacchetta, Melissa (Author) / Gray, Shelley (Thesis advisor) / Braden, B. Blair (Committee member) / McNeish, Daniel (Committee member) / Zuiker, Steve (Committee member) / Arizona State University (Publisher)

Created: 2022

Correlational Analysis Between Speech and Gait in Parkinson's Disease

Description

Parkinson’s disease is one of the most complicated and abundant neurodegenerative diseases in the world. Previous analysis of Parkinson’s disease has identified both speech and gait deficits throughout progression of the disease. There has been minimal research looking into the correlation between the speech and gait deficits in those diagnosed with Parkinson’s. There is a strong indication that the two are correlated, given the similar pathology and origins of both deficits. This exploratory study aims to establish correlation between the gait and speech deficits in those diagnosed with Parkinson’s disease. Using previously identified motor and speech measurements and tasks, I conducted a correlational study of individuals with Parkinson’s disease at baseline. There were correlations between multiple speech and gait variability outcomes. The expected correlations ranged from average harmonics-to-noise ratio values against anticipatory postural adjustments-lateral peak distance to average shimmer values against anticipatory postural adjustments-lateral peak distance. There were also unexpected outcomes that ranged from F2 variability against the average number of steps in a turn to intensity variability against step duration variability. I also analyzed the speech changes over 1 year as a secondary outcome of the study. Finally, I found that averages and variabilities increased over 1 year regarding speech primary outcomes. This study serves as a basis for further treatment that may be able to simultaneously treat both speech and gait deficits in those diagnosed with Parkinson’s. The exploratory study also indicates multiple targets for further investigation to better understand cohesive and compensatory mechanisms.

Contributors: Belnavis, Alexander Salvador (Author) / Peterson, Daniel (Thesis advisor) / Daliri, Ayoub (Committee member) / Berisha, Visar (Committee member) / Arizona State University (Publisher)

Created: 2022

A Tunable Loss Function for Robust, Rigorous, and Reliable Machine Learning

Description

In the era of big data, more and more decisions and recommendations are being made by machine learning (ML) systems and algorithms. Despite their many successes, there have been notable deficiencies in the robustness, rigor, and reliability of these ML systems, which have had detrimental societal impacts. In the next generation of ML, these significant challenges must be addressed through careful algorithmic design, and it is crucial that practitioners and meta-algorithms have the necessary tools to construct ML models that align with human values and interests. In an effort to help address these problems, this dissertation studies a tunable loss function called α-loss for the ML setting of classification. The α-loss is a hyperparameterized loss function originating from information theory that continuously interpolates between the exponential (α = 1/2), log (α = 1), and 0-1 (α = ∞) losses, hence providing a holistic perspective of several classical loss functions in ML. Furthermore, the α-loss exhibits unique operating characteristics depending on the value (and different regimes) of α; notably, for α > 1, α-loss robustly trains models when noisy training data is present. Thus, the α-loss can provide robustness to ML systems for classification tasks, and this has bearing in many applications, e.g., social media, finance, academia, and medicine; indeed, results are presented where α-loss produces more robust logistic regression models for COVID-19 survey data with gains over state-of-the-art algorithmic approaches.
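The interpolation between the three classical losses can be checked numerically. The sketch below assumes the commonly published form of the loss, (α / (α - 1)) * (1 - p^(1 - 1/α)), where p is the probability the model assigns to the correct label; the abstract itself does not give the formula, and the function names `alpha_loss` and `sigmoid` are illustrative. With a sigmoid model and margin z, α = 1/2 gives exp(-z), the limit α → 1 gives the log loss, and α = ∞ gives 1 - p, a soft version of the 0-1 loss.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function mapping a margin to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def alpha_loss(p_true: float, alpha: float) -> float:
    """Assumed form of the alpha-loss for probability p_true on the correct label.

    alpha = 1 and alpha = infinity are handled as the limiting cases.
    """
    if not 0.0 < p_true <= 1.0:
        raise ValueError("p_true must lie in (0, 1]")
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    if math.isinf(alpha):
        return 1.0 - p_true                  # soft 0-1 loss
    if alpha == 1.0:
        return -math.log(p_true)             # log loss (limit as alpha -> 1)
    return alpha / (alpha - 1.0) * (1.0 - p_true ** (1.0 - 1.0 / alpha))
```

For a confidently wrong prediction (p close to 0), the loss for α > 1 stays bounded by α / (α - 1), whereas the log loss grows without bound. That bounded penalty is one intuition for why larger α is less sensitive to mislabeled training examples.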

Contributors: Sypherd, Tyler (Author) / Sankar, Lalitha (Thesis advisor) / Berisha, Visar (Committee member) / Dasarathy, Gautam (Committee member) / Kosut, Oliver (Committee member) / Arizona State University (Publisher)

Created: 2022