Applications of nonlinear systems of ordinary differential equations and Volterra integral equations to infectious disease epidemiology

Description

In the field of infectious disease epidemiology, the assessment of model robustness outcomes plays a significant role in the identification, reformulation, and evaluation of preparedness strategies aimed at limiting the impact of catastrophic events (pandemics or the deliberate release of biological agents), in the management of disease prevention strategies, and in the identification and evaluation of control or mitigation measures. The research in this dissertation compares and assesses the role of exponentially distributed waiting times versus generalized non-exponential parametric waiting-time distributions for infectious periods on the quantitative and qualitative outcomes generated by Susceptible-Infectious-Removed (SIR) models. Specifically, Gamma distributed infectious periods are considered in the three research projects, following the applications found in (Bailey 1964, Anderson 1980, Wearing 2005, Feng 2007, Feng 2007, Yan 2008, Lloyd 2009, Vergu 2010).

i) The first project focuses on the influence of input model parameters, such as the transmission rate and the mean and variance of Gamma distributed infectious periods, on disease prevalence, the peak epidemic size and its timing, the final epidemic size, the epidemic duration, and the basic reproduction number. Global uncertainty and sensitivity analyses are carried out using a deterministic Susceptible-Infectious-Recovered (SIR) model. The quantitative effect and qualitative relation between input model parameters and outcome variables are established using Latin Hypercube Sampling (LHS) together with the partial rank correlation coefficient (PRCC) and Spearman rank correlation coefficient (RCC) sensitivity indices. We learnt that, for relatively low (R0 close to one) to high (mean of R0 equal to 15) transmissibility, the variance of the Gamma distribution for the infectious period, an input parameter of the deterministic age-of-infection SIR model, is key (statistically significant) to the predictability of epidemiological variables such as the epidemic duration and the peak size and timing of the prevalence of infectious individuals. For predicting these variables it is therefore preferable to use a nonlinear system of Volterra integral equations rather than a nonlinear system of ordinary differential equations. The predictability of the final epidemic size and the basic reproduction number is unaffected by (independent of) the variance of the Gamma distribution for the infectious period, so for these variables the choice of nonlinear system used to describe the SIR model (VIEs or ODEs) is irrelevant. For practical purposes, however, a nonlinear system of ordinary differential equations is preferred, because it lowers the complexity and the number of operations in the numerical methods. The main contribution lies in the development of a model-based decision tool that helps determine when SIR models given in terms of Volterra integral equations are equivalent to, or better suited than, SIR models that only consider exponentially distributed infectious periods.

ii) The second project addresses the question of whether there is sufficient evidence to conclude that two empirical distributions for a single epidemiological outcome, one generated using a stochastic SIR model under exponentially distributed infectious periods and the other under non-exponentially distributed infectious periods, are statistically dissimilar. The stochastic formulations are modeled via a continuous-time Markov chain model. The statistical hypothesis test is conducted using the non-parametric Kolmogorov-Smirnov test. We found evidence that, for low to moderate transmissibility, all empirical distribution pairs (generated from exponential and non-exponential distributions) for each of the epidemiological quantities considered are statistically dissimilar. The research in this project helps determine whether weakening the exponential distribution assumption must be considered when estimating the probability of events defined from the empirical distribution of specific random variables.

iii) The third project involves the assessment of the effect of exponentially distributed infectious periods on estimates of input parameters and the associated outcome variable predictions. Quantities unaffected by the use of an exponentially distributed infectious period in low transmissibility scenarios include the prevalence peak time, final epidemic size, epidemic duration, and basic reproduction number; in high transmissibility scenarios, only the prevalence peak time and final epidemic size are unaffected. An application is developed to determine from incidence data whether there is sufficient statistical evidence to conclude that the infectious period distribution should not be modeled by an exponential distribution. A method is also developed for estimating explicitly specified non-exponential parametric probability density functions for the infectious period from epidemiological data.

The methodologies presented in this dissertation may be applicable to models where waiting times are used to model transitions between stages, a process that is common in the study of life-history dynamics of many ecological systems.
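The contrast between exponential and Gamma distributed infectious periods in the first project can be illustrated with a small numerical sketch. The code below is not the dissertation's method: instead of solving Volterra integral equations, it uses the standard linear chain trick, which represents a Gamma distribution with integer shape `k` (an Erlang distribution) as `k` consecutive exponential stages. The function name, the parameter values, and the fixed-step Runge-Kutta integrator are all assumptions made for illustration.

```python
def simulate_sir_erlang(beta, mean_infectious, k, i0=1e-4, dt=0.05, t_end=400.0):
    """Integrate an SIR model whose infectious period is Erlang(k) distributed.

    k = 1 is the usual exponential (ODE) model; larger k keeps the same mean
    but shrinks the variance to mean_infectious**2 / k.
    Population fractions are used, so S + I + R = 1.
    """
    stage_rate = k / mean_infectious  # rate of leaving each of the k stages

    def deriv(y):
        s, stages = y[0], y[1:]
        new_infections = beta * s * sum(stages)
        d = [-new_infections]
        inflow = new_infections
        for x in stages:
            outflow = stage_rate * x
            d.append(inflow - outflow)
            inflow = outflow  # what leaves one stage enters the next
        return d

    def shifted(y, slope, h):
        return [a + h * b for a, b in zip(y, slope)]

    # State: [S, I_1, ..., I_k]; all initial infectives start in stage 1.
    y = [1.0 - i0, i0] + [0.0] * (k - 1)
    peak, t_peak = i0, 0.0
    for n in range(int(round(t_end / dt))):
        # Classical fourth-order Runge-Kutta step.
        k1 = deriv(y)
        k2 = deriv(shifted(y, k1, dt / 2))
        k3 = deriv(shifted(y, k2, dt / 2))
        k4 = deriv(shifted(y, k3, dt))
        y = [a + dt / 6 * (b + 2 * c + 2 * d + e)
             for a, b, c, d, e in zip(y, k1, k2, k3, k4)]
        prevalence = sum(y[1:])
        if prevalence > peak:
            peak, t_peak = prevalence, (n + 1) * dt
    return {"final_size": 1.0 - y[0],
            "peak_prevalence": peak,
            "peak_time": t_peak}


if __name__ == "__main__":
    # Hypothetical values: R0 = beta * mean_infectious = 2.5 in both runs.
    for shape in (1, 4):
        print(shape, simulate_sir_erlang(beta=0.5, mean_infectious=5.0, k=shape))
```

Running both shapes with the same mean and transmission rate gives a rough feel for the abstract's claim: the final size barely moves, while the height of the prevalence peak depends on the variance of the infectious period.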

Contributors: Morales Butler, Emmanuel J (Author) / Castillo-Chavez, Carlos (Thesis advisor) / Aparicio, Juan P (Thesis advisor) / Camacho, Erika T (Committee member) / Kang, Yun (Committee member) / Arizona State University (Publisher)

Created: 2014

Machine Learning for the Design of Screening Tests: General Principles and Applications in Criminology and Digital Medicine

Description

This dissertation explores applications of machine learning methods in service of the design of screening tests, which are ubiquitous in applications from social work, to criminology, to healthcare. In the first part, a novel Bayesian decision theory framework is presented for designing tree-based adaptive tests. On an application to youth delinquency in Honduras, the method produces a 15-item instrument that is almost as accurate as a full-length 150+ item test. The framework includes specific considerations for the context in which the test will be administered, and provides uncertainty quantification around the trade-offs of shortening lengthy tests. In the second part, classification complexity is explored via theoretical and empirical results from statistical learning theory, information theory, and empirical data complexity measures. A simulation study that explicitly controls two key aspects of classification complexity is performed to relate the theoretical and empirical approaches. Throughout, a unified language and notation that formalizes classification complexity is developed; this same notation is used in subsequent chapters to discuss classification complexity in the context of a speech-based screening test. In the final part, the relative merits of task and feature engineering when designing a speech-based cognitive screening test are explored. Through an extensive classification analysis on a clinical speech dataset from patients with normal cognition and Alzheimer’s disease, the speech elicitation task is shown to have a large impact on test accuracy; carefully performed task and feature engineering are required for best results. A new framework for objectively quantifying speech elicitation tasks is introduced, and two methods are proposed for automatically extracting insights into the aspects of the speech elicitation task that are driving classification performance. 
The dissertation closes with recommendations for how to evaluate the obtained insights and use them to guide future design of speech-based screening tests.

Contributors: Krantsevich, Chelsea (Author) / Hahn, P. Richard (Thesis advisor) / Berisha, Visar (Committee member) / Lopes, Hedibert (Committee member) / Renaut, Rosemary (Committee member) / Zheng, Yi (Committee member) / Arizona State University (Publisher)

Created: 2023

Graph Regularized Linear Regression

Description

Linear-regression estimators have become widely accepted as a reliable statistical tool for predicting outcomes. Because linear regression is a long-established procedure, the properties of linear-regression estimators are well understood, and the estimators can be trained very quickly. Many estimators exist for modeling linear relationships, each with its own ideal conditions for optimal performance. The differences stem from the bias introduced into the parameter estimation by various regularization strategies. One of the more popular is ridge regression, which uses ℓ2-penalization of the parameter vector. In this work, the proposed graph regularized linear estimator is compared against ridge regression when the parameter vector is known to be dense. When additional knowledge that the parameters are smooth with respect to a graph is available, it can be used to improve the parameter estimates. To achieve this goal, an additional smoothing penalty is introduced into the traditional loss function of ridge regression. The mean squared error (MSE) is used as the performance metric, and the analysis is presented for fixed design matrices having a unit covariance matrix. This specific problem setup makes it possible to study the theoretical conditions under which the graph regularized estimator outperforms the ridge estimator. The eigenvectors of the Laplacian matrix, which encodes the graph of connections between the dimensions of the parameter vector, form an integral part of the analysis. Experiments on simulated data compare the performance of the two estimators for Laplacian matrices of several types of graphs: complete, star, line, and 4-regular. The experimental results indicate that the theory can possibly be extended to more general settings by taking smoothness, a concept defined in this work, into consideration.
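As a rough illustration of adding a graph smoothing penalty to the ridge loss, the sketch below assumes the penalty takes the common quadratic form `gamma * b' L b`, where `L` is the graph Laplacian. The abstract does not state the exact penalty, so this form, the function names, and the parameters `alpha` and `gamma` are assumptions. Under that assumption the estimator solves `(X'X + alpha*I + gamma*L) b = X'y`, and setting `gamma = 0` recovers ridge regression.

```python
def line_graph_laplacian(p):
    """Laplacian of a path (line) graph on p nodes: degree matrix minus adjacency."""
    lap = [[0.0] * p for _ in range(p)]
    for i in range(p - 1):
        lap[i][i] += 1.0
        lap[i + 1][i + 1] += 1.0
        lap[i][i + 1] -= 1.0
        lap[i + 1][i] -= 1.0
    return lap


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def graph_regularized_fit(x_rows, y, lap, alpha, gamma):
    """Minimise ||y - X b||^2 + alpha * ||b||^2 + gamma * b' L b (assumed loss)."""
    n, p = len(x_rows), len(x_rows[0])
    # Normal-equation matrix X'X + alpha * I + gamma * L.
    a = [[sum(x_rows[r][i] * x_rows[r][j] for r in range(n))
          + (alpha if i == j else 0.0)
          + gamma * lap[i][j]
          for j in range(p)]
         for i in range(p)]
    rhs = [sum(x_rows[r][i] * y[r] for r in range(n)) for i in range(p)]
    return solve_linear_system(a, rhs)


if __name__ == "__main__":
    identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    lap = line_graph_laplacian(4)
    y = [1.0, 2.0, 3.0, 6.0]
    print(graph_regularized_fit(identity, y, lap, alpha=1.0, gamma=0.0))  # ridge only
    print(graph_regularized_fit(identity, y, lap, alpha=0.1, gamma=2.0))  # with smoothing
```

The hand-written solver keeps the sketch free of dependencies; in practice a linear algebra library would be the natural choice. A large `gamma` pulls neighbouring coefficients on the graph toward each other, which helps only when the true parameters really are smooth with respect to that graph.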

Contributors: Sajja, Akarshan (Author) / Dasarathy, Gautam (Thesis advisor) / Berisha, Visar (Committee member) / Yang, Yingzhen (Committee member) / Arizona State University (Publisher)

Created: 2022

Numerical computation of Wishart eigenvalue distributions for multistatic radar detection

Description

Eigenvalues of the Gram matrix formed from received data frequently appear in sufficient detection statistics for multi-channel detection with Generalized Likelihood Ratio Tests (GLRT) and Bayesian tests. In a frequently presented model for passive radar, the null hypothesis is that the channels are independent and contain only complex white Gaussian noise, and the alternative hypothesis is that the channels contain a common rank-one signal in the mean. In this model the GLRT statistic is the largest eigenvalue $\lambda_1$ of the Gram matrix formed from the data, and this Gram matrix has a Wishart distribution. Although exact expressions for the distribution of $\lambda_1$ are known under both hypotheses, numerically calculating values of these distribution functions presents difficulties when the dimension of the data vectors is large. This dissertation presents tractable methods for computing the distribution of $\lambda_1$ under both the null and alternative hypotheses by expanding known expressions for the distribution of $\lambda_1$ as inner products of orthogonal polynomials. These newly presented expressions allow detection thresholds and receiver operating characteristic curves to be computed to arbitrary precision in floating point arithmetic. This represents a significant advancement over the state of the art in a problem that could previously only be addressed by Monte Carlo methods.
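For context, the Monte Carlo baseline that the abstract contrasts against can be sketched for the simplest case of two channels, where the largest eigenvalue of the 2×2 Hermitian Gram matrix has a closed form. This is not the dissertation's orthogonal-polynomial method. The function names, the unit noise variance, the unnormalized Gram matrix, and the simple empirical-quantile rule are assumptions made for illustration.

```python
import math
import random


def largest_eig_2x2(g11, g22, g12):
    """Largest eigenvalue of the Hermitian matrix [[g11, g12], [conj(g12), g22]]."""
    half_trace = 0.5 * (g11 + g22)
    return half_trace + math.sqrt((0.5 * (g11 - g22)) ** 2 + abs(g12) ** 2)


def sample_null_statistic(n_samples, rng):
    """Draw lambda_1 when both channels hold independent complex white Gaussian noise."""
    g11 = g22 = 0.0
    g12 = 0j
    std = math.sqrt(0.5)  # per-component std so each complex sample has unit variance
    for _ in range(n_samples):
        x1 = complex(rng.gauss(0.0, std), rng.gauss(0.0, std))
        x2 = complex(rng.gauss(0.0, std), rng.gauss(0.0, std))
        g11 += abs(x1) ** 2
        g22 += abs(x2) ** 2
        g12 += x1 * x2.conjugate()
    return largest_eig_2x2(g11, g22, g12)


def monte_carlo_threshold(n_samples, p_false_alarm, trials=2000, seed=0):
    """Empirical threshold exceeded by roughly a p_false_alarm share of null draws."""
    rng = random.Random(seed)
    draws = sorted(sample_null_statistic(n_samples, rng) for _ in range(trials))
    index = min(trials - 1, int((1.0 - p_false_alarm) * trials))
    return draws[index]


if __name__ == "__main__":
    for pfa in (0.1, 0.01):
        print(pfa, monte_carlo_threshold(n_samples=32, p_false_alarm=pfa))
```

The limitation is visible in the quantile rule: a threshold for a false-alarm probability of 1e-6 would need far more than a million null draws to be estimated with any confidence, which is the regime where exact numerical evaluation of the distribution becomes valuable.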

Contributors: Jones, Scott, Ph.D (Author) / Cochran, Douglas (Thesis advisor) / Berisha, Visar (Committee member) / Bliss, Daniel (Committee member) / Kosut, Oliver (Committee member) / Richmond, Christ (Committee member) / Arizona State University (Publisher)

Created: 2019

Bayesian Inference and Information Learning for Switching Nonlinear Gene Regulatory Networks

Description

This dissertation centers on the development of Bayesian methods for learning different types of variation in switching nonlinear gene regulatory networks (GRNs). A new nonlinear and dynamic multivariate GRN model is introduced to account for different sources of variability in GRNs. The new model is aimed at more precisely capturing the complexity of GRN interactions through the introduction of time-varying kinetic order parameters, while allowing for variability in multiple model parameters. This model is used as the drift function in the development of several stochastic GRN models based on Langevin dynamics. Six models are introduced which capture intrinsic and extrinsic noise in GRNs, thereby providing a full characterization of a stochastic regulatory system. A Bayesian hierarchical approach is developed for learning the Langevin model which best describes the noise dynamics at each time step. The trajectory of the state, which consists of the gene expression values, and the indicator corresponding to the correct noise model are estimated via sequential Monte Carlo (SMC) with a high degree of accuracy. To address the problem of time-varying regulatory interactions, a Bayesian hierarchical model is introduced for learning variation in switching GRN architectures with unknown measurement noise covariance. The trajectory of the state and the indicator corresponding to the network configuration at each time point are estimated using SMC. This work is extended to a fully Bayesian hierarchical model to account for uncertainty in the process noise covariance associated with each network architecture. An SMC algorithm with local Gibbs sampling is developed to estimate the trajectory of the state and the indicator corresponding to the network configuration at each time point with a high degree of accuracy. The results demonstrate the efficacy of Bayesian methods for learning information in switching nonlinear GRNs.
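The basic SMC idea of tracking a hidden state trajectory can be sketched with a plain bootstrap particle filter on a one-dimensional toy model. This is a deliberately reduced illustration and not the dissertation's algorithm: it has no model or network indicator, no hierarchical priors, and no Gibbs step. The `drift` function, the noise levels, and all names below are hypothetical stand-ins.

```python
import math
import random


def drift(x):
    """Hypothetical saturating drift, standing in for a nonlinear regulatory function."""
    return 0.5 * x + 2.0 * math.tanh(x)


def simulate(steps, q_std, r_std, rng):
    """Generate a hidden state path and noisy direct observations of it."""
    x, states, observations = 0.0, [], []
    for _ in range(steps):
        x = drift(x) + rng.gauss(0.0, q_std)               # process noise
        states.append(x)
        observations.append(x + rng.gauss(0.0, r_std))     # measurement noise
    return states, observations


def bootstrap_filter(observations, q_std, r_std, n_particles, rng):
    """Return the filtered posterior-mean estimate of the state at each time step."""
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    estimates = []
    for y in observations:
        # Propagate every particle through the state equation.
        particles = [drift(p) + rng.gauss(0.0, q_std) for p in particles]
        # Weight by the Gaussian measurement likelihood (computed in log space).
        log_w = [-0.5 * ((y - p) / r_std) ** 2 for p in particles]
        top = max(log_w)
        weights = [math.exp(lw - top) for lw in log_w]
        total = sum(weights)
        estimates.append(sum(w * p for w, p in zip(weights, particles)) / total)
        # Multinomial resampling to counter weight degeneracy.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


def rmse(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)) / len(a))


if __name__ == "__main__":
    rng = random.Random(1)
    states, observations = simulate(200, q_std=0.5, r_std=1.0, rng=rng)
    estimates = bootstrap_filter(observations, 0.5, 1.0, n_particles=500, rng=rng)
    print("raw observation RMSE:", rmse(observations, states))
    print("particle filter RMSE:", rmse(estimates, states))
```

Extending a filter like this to switching models typically means carrying a discrete indicator alongside each particle's state, so that the weights also express which model or network configuration best explains the data at each time step.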

Contributors: Vélez-Cruz, Nayely (Author) / Papandreou-Suppappola, Antonia (Thesis advisor) / Moraffah, Bahman (Committee member) / Tepedelenlioğlu, Cihan (Committee member) / Berisha, Visar (Committee member) / Arizona State University (Publisher)

Created: 2023
