Matching Items (14)
Filtering by

Clear all filters

152382-Thumbnail Image.png
Description
A P-value based method is proposed for statistical monitoring of various types of profiles in phase II. The performance of the proposed method is evaluated by the average run length criterion under various shifts in the intercept, slope and error standard deviation of the model. In our proposed approach, P-values

A P-value based method is proposed for statistical monitoring of various types of profiles in phase II. The performance of the proposed method is evaluated by the average run length criterion under various shifts in the intercept, slope and error standard deviation of the model. In our proposed approach, P-values are computed at each level within a sample. If at least one of the P-values is less than a pre-specified significance level, the chart signals out-of-control. The primary advantage of our approach is that only one control chart is required to monitor several parameters simultaneously: the intercept, slope(s), and the error standard deviation. A comprehensive comparison of the proposed method and the existing KMW-Shewhart method for monitoring linear profiles is conducted. In addition, the effect that the number of observations within a sample has on the performance of the proposed method is investigated. The proposed method was also compared to the T^2 method discussed in Kang and Albin (2000) for multivariate, polynomial, and nonlinear profiles. A simulation study shows that overall the proposed P-value method performs satisfactorily for different profile types.
ContributorsAdibi, Azadeh (Author) / Montgomery, Douglas C. (Thesis advisor) / Borror, Connie (Thesis advisor) / Li, Jing (Committee member) / Zhang, Muhong (Committee member) / Arizona State University (Publisher)
Created2013
152768-Thumbnail Image.png
Description
In a healthcare setting, the Sterile Processing Department (SPD) provides ancillary services to the Operating Room (OR), Emergency Room, Labor & Delivery, and off-site clinics. SPD's function is to reprocess reusable surgical instruments and return them to their home departments. The management of surgical instruments and medical devices can impact

In a healthcare setting, the Sterile Processing Department (SPD) provides ancillary services to the Operating Room (OR), Emergency Room, Labor & Delivery, and off-site clinics. SPD's function is to reprocess reusable surgical instruments and return them to their home departments. The management of surgical instruments and medical devices can impact patient safety and hospital revenue. Any time instrumentation or devices are not available or are not fit for use, patient safety and revenue can be negatively impacted. One step of the instrument reprocessing cycle is sterilization. Steam sterilization is the sterilization method used for the majority of surgical instruments and is preferred to immediate use steam sterilization (IUSS) because terminally sterilized items can be stored until needed. IUSS Items must be used promptly and cannot be stored for later use. IUSS is intended for emergency situations and not as regular course of action. Unfortunately, IUSS is used to compensate for inadequate inventory levels, scheduling conflicts, and miscommunications. If IUSS is viewed as an adverse event, then monitoring IUSS incidences can help healthcare organizations meet patient safety goals and financial goals along with aiding in process improvement efforts. This work recommends statistical process control methods to IUSS incidents and illustrates the use of control charts for IUSS occurrences through a case study and analysis of the control charts for data from a health care provider. Furthermore, this work considers the application of data mining methods to IUSS occurrences and presents a representative example of data mining to the IUSS occurrences. This extends the application of statistical process control and data mining in healthcare applications.
ContributorsWeart, Gail (Author) / Runger, George C. (Thesis advisor) / Li, Jing (Committee member) / Shunk, Dan (Committee member) / Arizona State University (Publisher)
Created2014
156932-Thumbnail Image.png
Description
Transfer learning is a sub-field of statistical modeling and machine learning. It refers to methods that integrate the knowledge of other domains (called source domains) and the data of the target domain in a mathematically rigorous and intelligent way, to develop a better model for the target domain than a

Transfer learning is a sub-field of statistical modeling and machine learning. It refers to methods that integrate the knowledge of other domains (called source domains) and the data of the target domain in a mathematically rigorous and intelligent way, to develop a better model for the target domain than a model using the data of the target domain alone. While transfer learning is a promising approach in various application domains, my dissertation research focuses on the particular application in health care, including telemonitoring of Parkinson’s Disease (PD) and radiomics for glioblastoma.

The first topic is a Mixed Effects Transfer Learning (METL) model that can flexibly incorporate mixed effects and a general-form covariance matrix to better account for similarity and heterogeneity across subjects. I further develop computationally efficient procedures to handle unknown parameters and large covariance structures. Domain relations, such as domain similarity and domain covariance structure, are automatically quantified in the estimation steps. I demonstrate METL in an application of smartphone-based telemonitoring of PD.

The second topic focuses on an MRI-based transfer learning algorithm for non-invasive surgical guidance of glioblastoma patients. Limited biopsy samples per patient create a challenge to build a patient-specific model for glioblastoma. A transfer learning framework helps to leverage other patient’s knowledge for building a better predictive model. When modeling a target patient, not every patient’s information is helpful. Deciding the subset of other patients from which to transfer information to the modeling of the target patient is an important task to build an accurate predictive model. I define the subset of “transferrable” patients as those who have a positive rCBV-cell density correlation, because a positive correlation is confirmed by imaging theory and the its respective literature.

The last topic is a Privacy-Preserving Positive Transfer Learning (P3TL) model. Although negative transfer has been recognized as an important issue by the transfer learning research community, there is a lack of theoretical studies in evaluating the risk of negative transfer for a transfer learning method and identifying what causes the negative transfer. My work addresses this issue. Driven by the theoretical insights, I extend Bayesian Parameter Transfer (BPT) to a new method, i.e., P3TL. The unique features of P3TL include intelligent selection of patients to transfer in order to avoid negative transfer and maintain patient privacy. These features make P3TL an excellent model for telemonitoring of PD using an At-Home Testing Device.
ContributorsYoon, Hyunsoo (Author) / Li, Jing (Thesis advisor) / Wu, Teresa (Committee member) / Yan, Hao (Committee member) / Hu, Leland S. (Committee member) / Arizona State University (Publisher)
Created2018
154099-Thumbnail Image.png
Description
Transfer learning refers to statistical machine learning methods that integrate the knowledge of one domain (source domain) and the data of another domain (target domain) in an appropriate way, in order to develop a model for the target domain that is better than a model using the data of the

Transfer learning refers to statistical machine learning methods that integrate the knowledge of one domain (source domain) and the data of another domain (target domain) in an appropriate way, in order to develop a model for the target domain that is better than a model using the data of the target domain alone. Transfer learning emerged because classic machine learning, when used to model different domains, has to take on one of two mechanical approaches. That is, it will either assume the data distributions of the different domains to be the same and thereby developing one model that fits all, or develop one model for each domain independently. Transfer learning, on the other hand, aims to mitigate the limitations of the two approaches by accounting for both the similarity and specificity of related domains. The objective of my dissertation research is to develop new transfer learning methods and demonstrate the utility of the methods in real-world applications. Specifically, in my methodological development, I focus on two different transfer learning scenarios: spatial transfer learning across different domains and temporal transfer learning along time in the same domain. Furthermore, I apply the proposed spatial transfer learning approach to modeling of degenerate biological systems.Degeneracy is a well-known characteristic, widely-existing in many biological systems, and contributes to the heterogeneity, complexity, and robustness of biological systems. In particular, I study the application of one degenerate biological system which is to use transcription factor (TF) binding sites to predict gene expression across multiple cell lines. Also, I apply the proposed temporal transfer learning approach to change detection of dynamic network data. Change detection is a classic research area in Statistical Process Control (SPC), but change detection in network data has been limited studied. I integrate the temporal transfer learning method called the Network State Space Model (NSSM) and SPC and formulate the problem of change detection from dynamic networks into a covariance monitoring problem. I demonstrate the performance of the NSSM in change detection of dynamic social networks.
ContributorsZou, Na (Author) / Li, Jing (Thesis advisor) / Baydogan, Mustafa (Committee member) / Borror, Connie (Committee member) / Montgomery, Douglas C. (Committee member) / Wu, Teresa (Committee member) / Arizona State University (Publisher)
Created2015
154578-Thumbnail Image.png
Description
Buildings consume nearly 50% of the total energy in the United States, which drives the need to develop high-fidelity models for building energy systems. Extensive methods and techniques have been developed, studied, and applied to building energy simulation and forecasting, while most of work have focused on developing dedicated modeling

Buildings consume nearly 50% of the total energy in the United States, which drives the need to develop high-fidelity models for building energy systems. Extensive methods and techniques have been developed, studied, and applied to building energy simulation and forecasting, while most of work have focused on developing dedicated modeling approach for generic buildings. In this study, an integrated computationally efficient and high-fidelity building energy modeling framework is proposed, with the concentration on developing a generalized modeling approach for various types of buildings. First, a number of data-driven simulation models are reviewed and assessed on various types of computationally expensive simulation problems. Motivated by the conclusion that no model outperforms others if amortized over diverse problems, a meta-learning based recommendation system for data-driven simulation modeling is proposed. To test the feasibility of the proposed framework on the building energy system, an extended application of the recommendation system for short-term building energy forecasting is deployed on various buildings. Finally, Kalman filter-based data fusion technique is incorporated into the building recommendation system for on-line energy forecasting. Data fusion enables model calibration to update the state estimation in real-time, which filters out the noise and renders more accurate energy forecast. The framework is composed of two modules: off-line model recommendation module and on-line model calibration module. Specifically, the off-line model recommendation module includes 6 widely used data-driven simulation models, which are ranked by meta-learning recommendation system for off-line energy modeling on a given building scenario. Only a selective set of building physical and operational characteristic features is needed to complete the recommendation task. The on-line calibration module effectively addresses system uncertainties, where data fusion on off-line model is applied based on system identification and Kalman filtering methods. The developed data-driven modeling framework is validated on various genres of buildings, and the experimental results demonstrate desired performance on building energy forecasting in terms of accuracy and computational efficiency. The framework could be easily implemented into building energy model predictive control (MPC), demand response (DR) analysis and real-time operation decision support systems.
ContributorsCui, Can (Author) / Wu, Teresa (Thesis advisor) / Weir, Jeffery D. (Thesis advisor) / Li, Jing (Committee member) / Fowler, John (Committee member) / Hu, Mengqi (Committee member) / Arizona State University (Publisher)
Created2016
187808-Thumbnail Image.png
Description
This dissertation covers several topics in machine learning and causal inference. First, the question of “feature selection,” a common byproduct of regularized machine learning methods, is investigated theoretically in the context of treatment effect estimation. This involves a detailed review and extension of frameworks for estimating causal effects and in-depth

This dissertation covers several topics in machine learning and causal inference. First, the question of “feature selection,” a common byproduct of regularized machine learning methods, is investigated theoretically in the context of treatment effect estimation. This involves a detailed review and extension of frameworks for estimating causal effects and in-depth theoretical study. Next, various computational approaches to estimating causal effects with machine learning methods are compared with these theoretical desiderata in mind. Several improvements to current methods for causal machine learning are identified and compelling angles for further study are pinpointed. Finally, a common method used for “explaining” predictions of machine learning algorithms, SHAP, is evaluated critically through a statistical lens.
ContributorsHerren, Andrew (Author) / Hahn, P Richard (Thesis advisor) / Kao, Ming-Hung (Committee member) / Lopes, Hedibert (Committee member) / McCulloch, Robert (Committee member) / Zhou, Shuang (Committee member) / Arizona State University (Publisher)
Created2023
158093-Thumbnail Image.png
Description
Model-based clustering is a sub-field of statistical modeling and machine learning. The mixture models use the probability to describe the degree of the data point belonging to the cluster, and the probability is updated iteratively during the clustering. While mixture models have demonstrated the superior performance in handling noisy data

Model-based clustering is a sub-field of statistical modeling and machine learning. The mixture models use the probability to describe the degree of the data point belonging to the cluster, and the probability is updated iteratively during the clustering. While mixture models have demonstrated the superior performance in handling noisy data in many fields, there exist some challenges for high dimensional dataset. It is noted that among a large number of features, some may not indeed contribute to delineate the cluster profiles. The inclusion of these “noisy” features will confuse the model to identify the real structure of the clusters and cost more computational time. Recognizing the issue, in this dissertation, I propose a new feature selection algorithm for continuous dataset first and then extend to mixed datatype. Finally, I conduct uncertainty quantification for the feature selection results as the third topic.

The first topic is an embedded feature selection algorithm termed Expectation-Selection-Maximization (ESM) model that can automatically select features while optimizing the parameters for Gaussian Mixture Model. I introduce a relevancy index (RI) revealing the contribution of the feature in the clustering process to assist feature selection. I demonstrate the efficacy of the ESM by studying two synthetic datasets, four benchmark datasets, and an Alzheimer’s Disease dataset.

The second topic focuses on extending the application of ESM algorithm to handle mixed datatypes. The Gaussian mixture model is generalized to Generalized Model of Mixture (GMoM), which can not only handle continuous features, but also binary and nominal features.

The last topic is about Uncertainty Quantification (UQ) of the feature selection. A new algorithm termed ESOM is proposed, which takes the variance information into consideration while conducting feature selection. Also, a set of outliers are generated in the feature selection process to infer the uncertainty in the input data. Finally, the selected features and detected outlier instances are evaluated by visualization comparison.
ContributorsFu, Yinlin (Author) / Wu, Teresa (Thesis advisor) / Mirchandani, Pitu (Committee member) / Li, Jing (Committee member) / Pedrielli, Giulia (Committee member) / Arizona State University (Publisher)
Created2020
158850-Thumbnail Image.png
Description
Spatial regression is one of the central topics in spatial statistics. Based on the goals, interpretation or prediction, spatial regression models can be classified into two categories, linear mixed regression models and nonlinear regression models. This dissertation explored these models and their real world applications. New methods and models were

Spatial regression is one of the central topics in spatial statistics. Based on the goals, interpretation or prediction, spatial regression models can be classified into two categories, linear mixed regression models and nonlinear regression models. This dissertation explored these models and their real world applications. New methods and models were proposed to overcome the challenges in practice. There are three major parts in the dissertation.

In the first part, nonlinear regression models were embedded into a multistage workflow to predict the spatial abundance of reef fish species in the Gulf of Mexico. There were two challenges, zero-inflated data and out of sample prediction. The methods and models in the workflow could effectively handle the zero-inflated sampling data without strong assumptions. Three strategies were proposed to solve the out of sample prediction problem. The results and discussions showed that the nonlinear prediction had the advantages of high accuracy, low bias and well-performed in multi-resolution.

In the second part, a two-stage spatial regression model was proposed for analyzing soil carbon stock (SOC) data. In the first stage, there was a spatial linear mixed model that captured the linear and stationary effects. In the second stage, a generalized additive model was used to explain the nonlinear and nonstationary effects. The results illustrated that the two-stage model had good interpretability in understanding the effect of covariates, meanwhile, it kept high prediction accuracy which is competitive to the popular machine learning models, like, random forest, xgboost and support vector machine.

A new nonlinear regression model, Gaussian process BART (Bayesian additive regression tree), was proposed in the third part. Combining advantages in both BART and Gaussian process, the model could capture the nonlinear effects of both observed and latent covariates. To develop the model, first, the traditional BART was generalized to accommodate correlated errors. Then, the failure of likelihood based Markov chain Monte Carlo (MCMC) in parameter estimating was discussed. Based on the idea of analysis of variation, back comparing and tuning range, were proposed to tackle this failure. Finally, effectiveness of the new model was examined by experiments on both simulation and real data.
ContributorsLu, Xuetao (Author) / McCulloch, Robert (Thesis advisor) / Hahn, Paul (Committee member) / Lan, Shiwei (Committee member) / Zhou, Shuang (Committee member) / Saul, Steven (Committee member) / Arizona State University (Publisher)
Created2020
161801-Thumbnail Image.png
Description
High-dimensional data is omnipresent in modern industrial systems. An imaging sensor in a manufacturing plant a can take images of millions of pixels or a sensor may collect months of data at very granular time steps. Dimensionality reduction techniques are commonly used for dealing with such data. In addition, outliers

High-dimensional data is omnipresent in modern industrial systems. An imaging sensor in a manufacturing plant a can take images of millions of pixels or a sensor may collect months of data at very granular time steps. Dimensionality reduction techniques are commonly used for dealing with such data. In addition, outliers typically exist in such data, which may be of direct or indirect interest given the nature of the problem that is being solved. Current research does not address the interdependent nature of dimensionality reduction and outliers. Some works ignore the existence of outliers altogether—which discredits the robustness of these methods in real life—while others provide suboptimal, often band-aid solutions. In this dissertation, I propose novel methods to achieve outlier-awareness in various dimensionality reduction methods. The problem is considered from many different angles depend- ing on the dimensionality reduction technique used (e.g., deep autoencoder, tensors), the nature of the application (e.g., manufacturing, transportation) and the outlier structure (e.g., sparse point anomalies, novelties).
ContributorsSergin, Nurettin Dorukhan (Author) / Yan, Hao (Thesis advisor) / Li, Jing (Committee member) / Wu, Teresa (Committee member) / Tsung, Fugee (Committee member) / Arizona State University (Publisher)
Created2021
171927-Thumbnail Image.png
Description
Tracking disease cases is an essential task in public health; however, tracking the number of cases of a disease may be difficult not every infection can be recorded by public health authorities. Notably, this may happen with whole country measles case reports, even such countries with robust registration systems.

Tracking disease cases is an essential task in public health; however, tracking the number of cases of a disease may be difficult not every infection can be recorded by public health authorities. Notably, this may happen with whole country measles case reports, even such countries with robust registration systems. Eilertson et al. (2019) propose using a state-space model combined with maximum likelihood methods for estimating measles transmission. A Bayesian approach that uses particle Markov Chain Monte Carlo (pMCMC) is proposed to estimate the parameters of the non-linear state-space model developed in Eilertson et al. (2019) and similar previous studies. This dissertation illustrates the performance of this approach by calculating posterior estimates of the model parameters and predictions of the unobserved states in simulations and case studies. Also, Iteration Filtering (IF2) is used as a support method to verify the Bayesian estimation and to inform the selection of prior distributions. In the second half of the thesis, a birth-death process is proposed to model the unobserved population size of a disease vector. This model studies the effect of a disease vector population size on a second affected population. The second population follows a non-homogenous Poisson process when conditioned on the vector process with a transition rate given by a scaled version of the vector population. The observation model also measures a potential threshold event when the host species population size surpasses a certain level yielding a higher transmission rate. A maximum likelihood procedure is developed for this model, which combines particle filtering with the Minorize-Maximization (MM) algorithm and extends the work of Crawford et al. (2014).
ContributorsMartinez Rivera, Wilmer Osvaldo (Author) / Fricks, John (Thesis advisor) / Reiser, Mark (Committee member) / Zhou, Shuang (Committee member) / Cheng, Dan (Committee member) / Lan, Shiwei (Committee member) / Arizona State University (Publisher)
Created2022