Matching Items (346)
Description
Team-based care has been linked to key outcomes associated with the Quadruple Aim, including improving the health of populations, improving patient and provider experience, and lowering healthcare costs. Less is understood about the connection between team-based care and the patient experience. Emerging evidence connects team-based care with patient activation, a component of the patient experience. Use of the Electronic Health Record (EHR) and machine learning has significant potential to overcome previous barriers in how teams are studied and to better understand their impact on critical care delivery outcomes, such as patient activation. This research program included a systematic review of the literature analyzing the relationship between team-based care and patient satisfaction, a proxy for the patient experience. Overall, the review found a positive relationship between team-based care and patient satisfaction, with 57% of studies reporting improved patient satisfaction after team-based care was implemented. Secondary findings included a relationship between team composition and patient satisfaction, with larger teams (three or more disciplines) associated with improved satisfaction. A methodological paper was then prepared describing the process by which primary care teams were identified within EHR data using a common definition of team-based care supported by prominent team theorists. This novel approach provides a roadmap for health services researchers to leverage EHR data to study the impact teams may have on critical patient outcomes in real-world practice environments. The final study utilized a large EHR data set (n = 316,542) from an urban health system to examine the relationship between team composition and patient activation, measured with the Patient Activation Measure (PAM). Results from mixed-level model analysis were compared to a machine learning analysis that used multinomial logistic regression to calculate propensity scores for the multiple levels of team composition. After controlling for confounding variables in both analyses, more diverse, multidisciplinary teams were associated with improved patient activation scores. Implications of this research program include the feasibility of identifying teams within the EHR and of using big data analytics with machine learning to measure the impact of teams on real-world patient-related outcomes.
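As a concrete illustration of the propensity-score step, the following is a minimal sketch in Python, assuming a hypothetical EHR extract (ehr_cohort.csv), placeholder confounder columns, and a categorical team_composition variable; it is not the dissertation's actual pipeline:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

df = pd.read_csv("ehr_cohort.csv")                  # hypothetical EHR extract
confounders = ["age", "sex", "comorbidity_index"]   # assumed, numerically encoded

# Multinomial model of team composition (e.g., 1, 2, or 3+ disciplines).
model = LogisticRegression(max_iter=1000)
model.fit(df[confounders], df["team_composition"])

# Propensity score = predicted probability of the team type each patient
# actually received; these scores can then weight the PAM outcome model.
probs = model.predict_proba(df[confounders])
cols = {c: i for i, c in enumerate(model.classes_)}
idx = df["team_composition"].map(cols).to_numpy()
df["propensity"] = probs[np.arange(len(df)), idx]
```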
Contributors: Will, Kristen Kaye (Author) / Lamb, Gerri (Thesis advisor) / Delaney, Connie (Committee member) / Todd, Michael (Committee member) / Arizona State University (Publisher)
Created: 2021
Description
Student retention is a critical metric for many universities whose intention is to support student success. The goal of this thesis is to create retention models utilizing machine learning (ML) techniques. The factors explored in this research include only those known during the admissions process. These models have two goals: first, to correctly predict as many non-returning students as possible while minimizing the number of students who are falsely predicted as non-returning; second, to identify important features in student retention and provide a practical explanation for a student's decision not to persist. The models are then used to provide outreach to students who need more support. The findings of this research indicate that the current top-performing model is AdaBoost, which predicts non-returning students with an accuracy of 54 percent.
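A minimal sketch of this kind of AdaBoost retention model is shown below; the file name and admissions-time features are hypothetical placeholders, not the thesis's actual variables:

```python
import pandas as pd
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

df = pd.read_csv("admissions.csv")          # hypothetical admissions-time data
X = df[["hs_gpa", "test_score", "first_gen", "in_state"]]   # assumed features
y = df["returned"]                          # 1 = retained, 0 = non-returning

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = AdaBoostClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# The thesis's two stated goals map to recall on the non-returning class
# (catch leavers) and precision on that class (limit false alarms).
print(classification_report(y_te, clf.predict(X_te)))
```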
Contributors: Wade, Alexis N (Author) / Gel, Esma (Thesis advisor) / Yan, Hao (Thesis advisor) / Pavlic, Theodore (Committee member) / Arizona State University (Publisher)
Created: 2021
Description

This project aims to incorporate sentiment analysis into traditional stock analysis to enhance stock rating predictions by drawing on Internet opinion about individual stocks. Headlines from eight major news publications and conversations from Yahoo! Finance's “Conversations” feature were parsed through the Valence Aware Dictionary and sEntiment Reasoner (VADER) natural language processing package to determine numerical polarities representing positivity or negativity for a given stock ticker. These generated polarities were paired with stock metrics typically observed by stock analysts to form the feature set for a logistic regression machine learning model. The model was trained on roughly 1,500 major stocks to determine a binary classification between a “Buy” or “Not Buy” rating for each stock, and the results of the model were inserted into the back end of the Agora Web UI, which emulates search-engine behavior specifically for stocks listed on the NYSE and NASDAQ. The model reported an accuracy of 82.5%, and for most major stocks the model's prediction correlated with stock analysts' ratings. Given the volatility of the stock market and the propensity for hive-mind behavior in online forums, the performance of the logistic regression model would benefit from incorporating historical stock data and more sources of opinion to balance any subjectivity in the model.
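A condensed sketch of the described pipeline might look as follows, with hypothetical file names and stock metrics standing in for the project's actual feature set:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

headlines = pd.read_csv("headlines.csv")           # columns: ticker, text
headlines["polarity"] = headlines["text"].map(
    lambda t: analyzer.polarity_scores(t)["compound"])   # -1 .. +1 per text
sentiment = headlines.groupby("ticker")["polarity"].mean()

stocks = pd.read_csv("fundamentals.csv", index_col="ticker")
X = stocks[["pe_ratio", "eps", "dividend_yield"]].join(sentiment)  # assumed metrics
y = stocks["buy"]                                  # 1 = "Buy", 0 = "Not Buy"

model = LogisticRegression(max_iter=1000).fit(X.fillna(0), y)
print(model.predict_proba(X.fillna(0))[:5])        # buy probabilities per ticker
```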

Contributors: Ramaraju, Venkat (Author) / Rao, Jayanth (Co-author) / Bansal, Ajay (Thesis director) / Smith, James (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)
Created: 2021-12
Contributors: Hilliker, Jacob (Author) / Li, Baoxin (Thesis director) / Libman, Jeffrey (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)
Created: 2021-12
Description
Movement data are essential to understanding how geographic context influences movement patterns in urban areas. Owing to the growth of ubiquitous data collection platforms such as smartphones, fitness trackers, and health monitoring apps, researchers can now collect movement data at increasingly fine spatial and temporal resolution. Despite the surge in volumes of fine-grained movement data, there is a gap in the availability of quantitative and analytical tools to extract actionable insights from such big datasets and tease out the role of context in movement pattern analysis. As cities aim to be safer and healthier, policymakers require methods that use high-frequency movement data to generate efficient urban planning strategies and make targeted infrastructure investment decisions without compromising residents' safety. The objective of this Ph.D. dissertation is to develop quantitative methods that combine big spatial-temporal data from crowdsourced platforms with geographic context to analyze movement patterns over space and time. Knowledge about the role of context can help in assessing why changes in movement patterns occur and how those changes are affected by the immediate natural and built environment. In this dissertation I contribute to the rapidly expanding body of quantitative movement pattern analysis research by 1) developing a bias-correction framework that improves the representativeness of crowdsourced movement data by modeling bias with training data and geographical variables, 2) understanding spatial-temporal changes in movement patterns across different periods, and how context influences those changes, by generating hourly and monthly change maps of bicycle ridership patterns, and 3) quantifying the variation in accuracy and generalizability of transportation mode detection models using GPS (Global Positioning System) data upon adding geographic context. Using statistical models, supervised classification algorithms, and functional data analysis approaches, I develop modeling frameworks that address each of the research objectives. The results are presented as street-level maps and predictive models that are reproducible in nature. The methods developed in this dissertation can serve as analytical tools for policymakers to plan infrastructure changes and facilitate data collection efforts that represent movement patterns for all ages and abilities.
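As an illustration of the third objective, the sketch below compares mode-detection accuracy with and without contextual features; the data file and feature names are assumptions for illustration, not the dissertation's exact feature set:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

trips = pd.read_csv("gps_segments.csv")             # hypothetical trip segments
motion = ["mean_speed", "speed_var", "mean_accel"]  # GPS-derived features
context = ["dist_to_bike_lane", "road_class", "land_use_code"]  # integer-encoded

y = trips["mode"]                                   # walk / bike / car / bus
for features, label in [(motion, "motion only"),
                        (motion + context, "motion + context")]:
    clf = RandomForestClassifier(n_estimators=300, random_state=0)
    acc = cross_val_score(clf, trips[features], y, cv=5).mean()
    print(f"{label}: {acc:.3f}")                    # accuracy change from context
```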
Contributors: Roy, Avipsa (Author) / Nelson, Trisalyn A. (Thesis advisor) / Kedron, Peter J. (Committee member) / Li, Wenwen (Committee member) / Arizona State University (Publisher)
Created: 2021
Description
High-dimensional data is omnipresent in modern industrial systems. An imaging sensor in a manufacturing plant can take images of millions of pixels, or a sensor may collect months of data at very granular time steps. Dimensionality reduction techniques are commonly used for dealing with such data. In addition, outliers typically exist in such data, and they may be of direct or indirect interest given the nature of the problem being solved. Current research does not address the interdependent nature of dimensionality reduction and outliers. Some works ignore the existence of outliers altogether, which undermines the robustness of these methods in real life, while others provide suboptimal, often band-aid solutions. In this dissertation, I propose novel methods to achieve outlier-awareness in various dimensionality reduction methods. The problem is considered from many different angles depending on the dimensionality reduction technique used (e.g., deep autoencoder, tensors), the nature of the application (e.g., manufacturing, transportation), and the outlier structure (e.g., sparse point anomalies, novelties).
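As one generic example of coupling dimensionality reduction with outlier scoring (not the dissertation's specific method), a deep autoencoder can flag points it reconstructs poorly; the data file below is a hypothetical placeholder:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

X = np.load("sensor_images.npy").astype("float32")   # hypothetical image stack
X = X.reshape(len(X), -1) / 255.0                    # flatten and normalize
d = X.shape[1]

autoencoder = tf.keras.Sequential([
    layers.Dense(128, activation="relu", input_shape=(d,)),
    layers.Dense(16, activation="relu"),             # low-dimensional bottleneck
    layers.Dense(128, activation="relu"),
    layers.Dense(d, activation="sigmoid"),
])
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(X, X, epochs=20, batch_size=64, verbose=0)

# Points reconstructed poorly are candidate outliers. Note the interdependence
# the dissertation targets: outliers in the training data also distort the fit.
errors = np.mean((X - autoencoder.predict(X)) ** 2, axis=1)
outliers = errors > np.quantile(errors, 0.99)
```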
Contributors: Sergin, Nurettin Dorukhan (Author) / Yan, Hao (Thesis advisor) / Li, Jing (Committee member) / Wu, Teresa (Committee member) / Tsung, Fugee (Committee member) / Arizona State University (Publisher)
Created: 2021
Description
In many real-world machine learning classification applications, well-labeled training data can be difficult, expensive, or even impossible to obtain. In such situations, it is sometimes possible to label a small subset of data as belonging to the class of interest, though it is impractical to manually label all data not of interest. The result is a small set of positive labeled data and a large set of unknown and unlabeled data. This is known as the Positive and Unlabeled learning (PU learning) problem, a type of semi-supervised learning. In this dissertation, the PU learning problem is rigorously defined, several common assumptions are described, and a literature review of the field is provided. A new family of effective PU learning algorithms, the MLR (Modified Logistic Regression) family, is described. Theoretical and experimental justification for these algorithms is provided, demonstrating their success and flexibility. Extensive experimentation and empirical evidence compare several new and existing PU learning evaluation estimation metrics in a wide variety of scenarios. The surprisingly clear advantage of a simple recall estimate as the best estimate of overall PU classifier performance is described. Finally, an application of PU learning to the field of solar fault detection, an area not previously explored in the field, demonstrates the advantage and potential of PU learning in new application domains.
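For orientation, the sketch below implements a classic PU learning baseline (Elkan & Noto, 2008) rather than the dissertation's MLR family, which modifies logistic regression itself and is not reproduced here; the data files are hypothetical:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X = np.load("features.npy")        # hypothetical feature matrix
s = np.load("labeled.npy")         # 1 = labeled positive, 0 = unlabeled

X_tr, X_ho, s_tr, s_ho = train_test_split(X, s, stratify=s, random_state=0)

# Step 1: "non-traditional" classifier, labeled vs. unlabeled.
g = LogisticRegression(max_iter=1000).fit(X_tr, s_tr)

# Step 2: estimate c = P(labeled | positive) on held-out labeled points.
c = g.predict_proba(X_ho[s_ho == 1])[:, 1].mean()

# Step 3: correct the scores to estimate P(y = 1 | x) for all data.
p_positive = g.predict_proba(X)[:, 1] / c
print(p_positive[:10])
```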
Contributors: Jaskie, Kristen P (Author) / Spanias, Andreas (Thesis advisor) / Blain-Christen, Jennifer (Committee member) / Tepedelenlioğlu, Cihan (Committee member) / Thiagarajan, Jayaraman (Committee member) / Arizona State University (Publisher)
Created: 2021
Description
Image-based deep learning (DL) models are employed to enable the detection of critical heat flux (CHF) from pool boiling experimental images. Most machine learning approaches for pool boiling to date focus on a single dataset collected under a particular heater surface, working fluid, and set of operating conditions. For new datasets collected under different conditions, a significant effort in re-training the model or developing a new model is required, under the assumption that the new dataset has a sufficient amount of labeled data. This research explores supervised, semi-supervised, and unsupervised machine learning strategies formulated to adapt to two scenarios. The first is when the new dataset has limited labeled data available. This scenario is addressed in Chapter 2 of this thesis, where Convolutional Neural Networks (CNNs) and transfer learning (TL) are used to tackle such situations. The second scenario is when the new dataset has no labeled data available at all. For such cases, Chapter 3 presents a methodology in which one of the state-of-the-art Generative Adversarial Networks (GANs), Fixed-Point GAN, is deployed alongside a regular CNN model to tackle the problem. To the best of my knowledge, the approaches presented in Chapters 2 and 3 are the first within the heat transfer community to utilize TL and GANs for the boiling heat transfer problem, and they are a step toward obtaining a one-for-all general model.
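A minimal sketch of the Chapter 2 strategy, fine-tuning a pretrained CNN on a small labeled boiling-image set, is given below; the backbone choice and directory layout are assumptions for illustration, not the thesis's actual configuration:

```python
import tensorflow as tf

# Hypothetical two-class folders: boiling_images/pre_chf, boiling_images/post_chf
train = tf.keras.utils.image_dataset_from_directory(
    "boiling_images/", image_size=(224, 224), batch_size=32)

base = tf.keras.applications.ResNet50(
    include_top=False, weights="imagenet", pooling="avg")
base.trainable = False                       # freeze the pretrained features

model = tf.keras.Sequential([
    tf.keras.layers.Lambda(tf.keras.applications.resnet50.preprocess_input),
    base,
    tf.keras.layers.Dense(1, activation="sigmoid"),   # CHF vs. pre-CHF
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(train, epochs=5)                   # only the new head is trained
```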
Contributors: Al-Hindawi, Firas Al (Author) / Wu, Teresa TW (Thesis advisor) / Yoon, Hyunsoo HY (Thesis advisor) / Hu, Han HH (Committee member) / Iquebal, Ashif AI (Committee member) / Arizona State University (Publisher)
Created: 2021
Description
This research presents advances in time-synchronized phasor (i.e., synchrophasor) estimation and in imaging with very-low-frequency electric fields. Phasor measurement units measure and track dynamic systems, often power systems, using synchrophasor estimation algorithms. Two improvements to subspace-based synchrophasor estimation algorithms are shown. The first is a dynamic thresholding method for accurately determining the signal subspace when using the estimation of signal parameters via rotational invariance techniques (ESPRIT) algorithm. This improvement facilitates accurate ESPRIT-based frequency estimates of both the nominal system frequency and the frequencies of interfering signals such as harmonics or out-of-band interference. Proper frequency estimation of all signals present in measurement data allows accurate least squares estimates of synchrophasors at the nominal system frequency. By including the effects of clutter signals in the synchrophasor estimate, interference from clutter signals can be excluded. The result is near-flat estimation error during nominal system frequency changes, harmonic distortion, and out-of-band interference. The second improvement reduces the computational burden of the ESPRIT frequency estimation step by showing that an optimized eigenvalue decomposition of the measurement data can be used instead of a singular value decomposition. This research also explores a deep-learning-based inversion method for imaging objects with a uniform electric field and a 2D planar D-dot array. Using electric fields as an illumination source has seen multiple applications, ranging from medical imaging to mineral deposit detection. It is shown that a planar D-dot array and a deep neural network can reconstruct the electrical properties of randomized objects. A 16,000-sample dataset of objects, each comprising a three-by-three grid of randomized dielectric constants, was generated to train a deep neural network to predict these dielectric constants from measured field distortions. Increasingly complex imaging environments are simulated, ranging from objects in free space to objects placed in a physical cage designed to produce uniform electric fields. Finally, this research relaxes the uniform-field constraint, showing that the volume of an opaque container can be imaged with a copper tube antenna and a 1x4 array of D-dot sensors. Real-world experimental results show that it is possible to image buckets of water (targets) within a plastic shed. These experiments explore the detectability of targets as a function of target placement within the shed.
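The core ESPRIT frequency-estimation step can be sketched in a few lines. The example below is a generic least-squares ESPRIT on synthetic data and omits the dissertation's dynamic subspace thresholding, though it does use an eigendecomposition of the sample covariance rather than an SVD of the data matrix:

```python
import numpy as np

def esprit_freqs(x, p, m, fs):
    """Least-squares ESPRIT: estimate p sinusoid frequencies (Hz) from x."""
    # Hankel data matrix of m-sample windows advancing one sample at a time.
    H = np.lib.stride_tricks.sliding_window_view(x, m).T      # m x (N-m+1)
    # Eigendecomposition of the sample covariance (cheaper than an SVD of H).
    R = H @ H.conj().T / H.shape[1]
    w, V = np.linalg.eigh(R)
    Us = V[:, np.argsort(w)[-p:]]        # signal subspace: p dominant modes
    # Rotational invariance: Us[1:] = Us[:-1] @ Phi, eig(Phi) = exp(j*omega).
    Phi, *_ = np.linalg.lstsq(Us[:-1], Us[1:], rcond=None)
    return np.angle(np.linalg.eigvals(Phi)) * fs / (2 * np.pi)

# Synthetic test: off-nominal fundamental at 60.2 Hz plus a third harmonic.
fs = 1000.0
t = np.arange(2000) / fs
x = np.exp(2j * np.pi * 60.2 * t) + 0.3 * np.exp(2j * np.pi * 180.0 * t)
print(np.sort(esprit_freqs(x, p=2, m=64, fs=fs)))    # ~[60.2, 180.0]
```

On this noiseless two-tone signal, the estimates recover both frequencies to within numerical precision; the dissertation's contribution addresses choosing p robustly when noise and clutter are present.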
Contributors: Drummond, Zachary (Author) / Allee, David R (Thesis advisor) / Claytor, Kevin E (Committee member) / Papandreou-Suppappola, Antonia (Committee member) / Aberle, James (Committee member) / Arizona State University (Publisher)
Created: 2021
Description
To optimize solar cell performance, it is necessary to properly design the doping profile in the absorber layer of the solar cell. For CdTe solar cells, Cu is used to provide p-type doping. Hence, an estimator that, given the diffusion parameter set (time and temperature) and the doping concentration at the junction, gives the junction depth of the absorber layer is essential in the design process of CdTe solar cells (and other cell technologies). In this work that is called the forward (direct) estimation process. The backward (inverse) problem is then the one in which, given the junction depth and the desired Cu doping concentration at the CdTe/CdS heterointerface, the estimator gives the time and/or temperature needed to achieve the desired doping profile. This is called the backward (inverse) estimation process. Such estimators, both forward and backward, do not exist in the literature for solar cell technology. To train the Machine Learning (ML) estimator, it is necessary to first generate a large set of data using the PVRD-FASP Solver, which has been validated via comparison with experimental values. Note that this big dataset needs to be generated only once. Next, machine learning, deep learning, and artificial intelligence techniques are used to extract the actual Cu doping profiles that result from the diffusion, annealing, and cool-down steps in the fabrication sequence of CdTe solar cells. Two deep learning neural network models are used: (1) a Multilayer Perceptron Artificial Neural Network (MLPANN) model using the Keras Application Programming Interface (API) with a TensorFlow backend, and (2) a Radial Basis Function Network (RBFN) model, each predicting the Cu doping profiles for different temperatures and durations of the annealing process. Excellent agreement between the simulated results obtained with the PVRD-FASP Solver and the predicted values is obtained. It is important to mention that generating the Cu doping profiles from the initial conditions with the PVRD-FASP Solver takes a significant amount of time, because solving the drift-diffusion-reaction model is mathematically a stiff problem and leads to numerical instabilities if the time steps are not small enough, which in turn affects the time needed to complete one simulation run. Generating the same profiles with the trained ML models is almost instantaneous, so they can serve as an excellent simulation tool to guide future fabrication of optimal doping profiles in CdTe solar cells.
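A minimal sketch of the first surrogate model, a Keras MLP mapping (time, temperature) to a sampled Cu doping profile, is shown below; the array shapes and file names are illustrative assumptions, and the training targets stand in for profiles generated once with the PVRD-FASP Solver:

```python
import numpy as np
import tensorflow as tf

X = np.load("anneal_params.npy")     # shape (n, 2): time [s], temperature [K]
Y = np.load("cu_profiles.npy")       # shape (n, k): Cu density at k depth points,
                                     # generated once with the PVRD-FASP Solver

norm = tf.keras.layers.Normalization()
norm.adapt(X)                        # learn input means and variances

model = tf.keras.Sequential([
    norm,
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(Y.shape[1]),        # one output per depth sample
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, Y, epochs=200, batch_size=32, validation_split=0.1, verbose=0)

# Forward estimate for an unseen recipe: near-instant vs. a full stiff solve.
profile = model.predict(np.array([[1800.0, 623.0]]))   # 30 min at ~350 °C
```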
Contributors: Salman, Ghaith (Author) / Vasileska, Dragica (Thesis advisor) / Goodnick, Stephen M. (Thesis advisor) / Ringhofer, Christian (Committee member) / Banerjee, Ayan (Committee member) / Arizona State University (Publisher)
Created: 2021