Matching Items (5)
Description
Random Forests is a statistical learning method that has been proposed for propensity score estimation models involving complex interactions among the covariates, nonlinear relationships, or both. In this dissertation I conducted a simulation study to examine the effects of three Random Forests model specifications in propensity score analysis. The results suggested that, depending on the nature of the data, optimal specification of (1) the decision rules used to select the covariate and its split value in a Classification Tree, (2) the number of covariates randomly sampled for selection, and (3) the method of estimating Random Forests propensity scores could potentially produce an unbiased average treatment effect estimate after adjustment using propensity score weighting by the odds. Compared to logistic regression estimation using the true propensity score model, Random Forests had the additional advantage of producing unbiased estimated standard errors and correct statistical inference for the average treatment effect. The relationship between balance on the covariates' means and the bias of the average treatment effect estimate was examined both within and between conditions of the simulation. Within conditions, across repeated samples, there was no noticeable correlation between the covariates' mean differences and the magnitude of bias of the average treatment effect estimate for the covariates that were imbalanced before adjustment. Between conditions, small mean differences of covariates after propensity score adjustment were not sensitive enough to identify the optimal Random Forests model specification for propensity score analysis.
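The weighting-by-the-odds adjustment mentioned above can be sketched as follows. This is a minimal illustration, not the dissertation's code: it assumes the usual formulation in which treated units receive weight 1 and control units receive weight e / (1 - e), where e is the estimated propensity score. The function names and the toy numbers in the usage note are hypothetical, and the propensity scores are taken as given (for example, predicted treatment probabilities from an already fitted Random Forests model).

```python
def odds_weights(treated, propensity):
    """Weighting by the odds: weight 1 for treated units,
    e / (1 - e) for control units (assumed formulation)."""
    weights = []
    for t, e in zip(treated, propensity):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 if t else e / (1.0 - e))
    return weights


def weighted_effect(outcome, treated, propensity):
    """Difference between the weighted mean outcomes of the
    treated group and the control group."""
    w = odds_weights(treated, propensity)
    sums = {True: [0.0, 0.0], False: [0.0, 0.0]}  # [sum of w*y, sum of w]
    for y, t, wi in zip(outcome, treated, w):
        sums[bool(t)][0] += wi * y
        sums[bool(t)][1] += wi
    treated_mean = sums[True][0] / sums[True][1]
    control_mean = sums[False][0] / sums[False][1]
    return treated_mean - control_mean
```

For instance, `weighted_effect([5, 7, 2, 4], [1, 1, 0, 0], [0.5, 0.5, 0.5, 0.8])` gives the control unit with e = 0.8 four times the weight of the control unit with e = 0.5, so controls that resemble treated units count more in the comparison.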
Contributors: Cham, Hei Ning (Author) / Tein, Jenn-Yun (Thesis advisor) / Enders, Stephen G (Thesis advisor) / Enders, Craig K. (Committee member) / Mackinnon, David P (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
With the rapid development of mobile sensing technologies such as GPS, RFID, and smartphone sensors, capturing position data in the form of trajectories has become easy. Moving object trajectory analysis is a growing area of interest owing to its applications in domains such as marketing, security, and traffic monitoring and management. To better understand movement behaviors from raw mobility data, this doctoral work provides analytic models for analyzing trajectory data. As a first contribution, a model is developed to detect changes in trajectories over time. If the taxis moving in a city are viewed as sensors that provide real-time information about the city's traffic, a change in their trajectories over time can reveal that the road network has changed. To detect changes, trajectories are modeled with a Hidden Markov Model (HMM). A modified training algorithm for HMM parameter estimation, called m-BaumWelch, is used to develop likelihood estimates under assumed changes, which are then used to detect changes in trajectory data over time. Data from vehicles are used to test the change detection method. Secondly, sequential pattern mining is used to develop a model to detect changes in frequent patterns occurring in trajectory data. The aim is to answer two questions: Are the frequent patterns still frequent in the new data? If they are, has the time interval distribution in the pattern changed? Two approaches to change detection are considered: a frequency-based approach and a distribution-based approach. The methods are illustrated with vehicle trajectory data. Finally, a model is developed for clustering and outlier detection in semantic trajectories. One challenge in clustering semantic trajectories is that both numeric and categorical attributes are present. Another is that trajectories can be of different lengths and can have missing values. A tree-based ensemble is used to address these problems. The approach is extended to outlier detection in semantic trajectories.
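The likelihood-based change detection in the first contribution rests on evaluating how probable a trajectory is under a given HMM. The abstract does not specify the m-BaumWelch algorithm, so the sketch below shows only the standard scaled forward algorithm, which computes that log-likelihood for fixed parameters. It assumes a discrete HMM whose observations are integer-coded symbols (for example, road segment identifiers); the function and parameter names are hypothetical.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of an integer-coded observation sequence under a
    discrete HMM, using the forward algorithm with per-step scaling.

    start[i]    : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][k]  : probability of emitting symbol k in state i
    """
    n_states = len(start)
    log_lik = 0.0
    alpha = []
    for t, symbol in enumerate(obs):
        if t == 0:
            alpha = [start[i] * emit[i][symbol] for i in range(n_states)]
        else:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][symbol]
                for j in range(n_states)
            ]
        scale = sum(alpha)  # P(current symbol | earlier symbols)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik
```

One way to use such a function is to score the same new trajectory under a baseline model and under a model fitted with an assumed change, and compare the two log-likelihoods; how the dissertation forms and thresholds that comparison is not stated in the abstract.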
Contributors: Kondaveeti, Anirudh (Author) / Runger, George C. (Thesis advisor) / Mirchandani, Pitu (Committee member) / Pan, Rong (Committee member) / Maciejewski, Ross (Committee member) / Arizona State University (Publisher)
Created: 2012
Description
Distributed renewable energy generators now contribute a significant amount of energy to the grid. Consequently, reliability adequacy of such energy generators will depend on accurate forecasts of the energy they produce. Power outputs of solar PV systems depend on the stochastic variation of environmental factors (solar irradiance, ambient temperature, and wind speed) and on random mechanical failures and repairs. Monte Carlo simulation, which is typically used to model such problems, becomes too computationally intensive, leading to simplifying state-space assumptions. Multi-state models for power system reliability offer greater flexibility in describing system state evolution and an accurate representation of probability. In this study, Universal Generating Functions (UGF) were used to solve such combinatorial problems. Eight grid-connected solar PV systems with a combined capacity of about 5 MW, located in a hot-dry climate (Arizona), were analyzed, and an accuracy of 98% was achieved when validated with real-time data. An analytics framework is provided to grid operators and utilities to effectively forecast the energy produced by distributed energy assets and, in turn, to develop strategies for effective Demand Response as the share of renewable distributed energy assets in the grid increases. The second part of this thesis extends the environmental modelling approach to develop an aging test to be run in conjunction with an accelerated test of solar PV modules. Accelerated Lifetime Testing procedures are used in the industry to determine the dominant failure modes that a product undergoes in the field and to predict the product's lifetime. UV is one of the ten stressors that a PV module undergoes in the field. UV exposure causes browning of modules, leading to a drop in short circuit current. This thesis presents an environmental modelling approach for the hot-dry climate and extends it to develop an aging test methodology. This, along with the accelerated tests, would help achieve the goal of correlating field failures with accelerated tests and obtaining an acceleration factor. This knowledge would help predict PV module degradation in the field to within 30% of the actual value and help determine the PV module lifetime accurately.
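The Universal Generating Function idea used in the first part can be illustrated with a small sketch. It assumes each generating unit is described by a discrete distribution over output levels (a "u-function", stored here as a dictionary mapping output to probability), that units are statistically independent, and that their outputs add. The function names and the example state values are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def compose(u1, u2, op=lambda a, b: a + b):
    """Combine two u-functions of independent units. Every pair of states
    is merged with `op` (addition by default) and the probabilities multiply."""
    combined = defaultdict(float)
    for g1, p1 in u1.items():
        for g2, p2 in u2.items():
            combined[op(g1, g2)] += p1 * p2
    return dict(combined)


def expected_output(u):
    """Mean output of the system described by u-function `u`."""
    return sum(g * p for g, p in u.items())


def prob_at_least(u, demand):
    """Probability that the system output meets or exceeds `demand`."""
    return sum(p for g, p in u.items() if g >= demand)


# Two hypothetical PV units with outputs in kW and state probabilities.
unit_a = {0: 0.1, 50: 0.3, 100: 0.6}
unit_b = {0: 0.2, 100: 0.8}
system = compose(unit_a, unit_b)
```

Because states with the same combined output are merged as they are generated, the composite distribution stays compact when further units are folded in one at a time, which is what makes this representation attractive compared with enumerating or simulating every joint state.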
Contributors: Kadloor, Nikhil (Author) / Kuitche, Joseph (Thesis advisor) / Pan, Rong (Thesis advisor) / Wu, Teresa (Committee member) / Arizona State University (Publisher)
Created: 2017
Description
For this thesis a Monte Carlo simulation was conducted to investigate the robustness of three latent interaction modeling approaches (constrained product indicator, generalized appended product indicator (GAPI), and latent moderated structural equations (LMS)) under high degrees of nonnormality of the exogenous indicators, conditions that have not been investigated in previous literature. Results showed that the constrained product indicator and LMS approaches yielded biased estimates of the interaction effect when the exogenous indicators were highly nonnormal. When the violation of normality was not severe (symmetric with excess kurtosis < 1), the LMS approach with ML estimation yielded the most precise latent interaction effect estimates. The LMS approach with ML estimation also had the highest statistical power among the three approaches, given that the actual Type-I error rates of the Wald and likelihood ratio tests of the interaction effect were acceptable. In highly nonnormal conditions, only the GAPI approach with ML estimation yielded unbiased latent interaction effect estimates, with acceptable actual Type-I error rates for both the Wald test and the likelihood ratio test of the interaction effect. No support for the use of the Satorra-Bentler or Yuan-Bentler ML corrections was found across all three methods.
Contributors: Cham, Hei Ning (Author) / West, Stephen G. (Thesis advisor) / Aiken, Leona S. (Committee member) / Enders, Craig K. (Committee member) / Arizona State University (Publisher)
Created: 2010
Description
The main objective of this research is to develop reliability assessment methodologies to quantify the effect of various environmental factors on photovoltaic (PV) module performance degradation. The manufacturers of these modules typically provide a warranty of about 25 years for 20% power degradation from the initial specified power rating. Accelerated Life Testing (ALT) plays an important role in quantifying the reliability of such PV modules. However, several obstacles need to be tackled to conduct such experiments, since not enough historical field data has been available. Even if some time-series performance data for maximum output power (Pmax) is available, it may not be useful for developing failure/degradation mode-specific accelerated tests. This is because, to study a specific failure mode, it is essential to use a failure mode-specific performance variable (such as short circuit current, open circuit voltage, or fill factor) that is directly affected by the failure mode, instead of overall power, which would be affected by one or more of the performance variables. To address several of these issues, this research is divided into three phases. The first phase deals with developing models to study climate-specific failure modes using failure mode-specific parameters instead of power degradation. The limited field data collected after a long time (say, 18-21 years) is used to model the degradation rate, and the developed model is then calibrated to account for several unknown environmental effects using the available qualification testing data. The second phase discusses the cumulative damage modeling method used to quantify the effects of various environmental variables on the overall power production of the photovoltaic module. This cumulative degradation modeling approach is mainly used to model the power degradation path and to quantify the effects of multiple high-frequency environmental inputs (such as temperature and humidity measured every minute or hour) with very sparse response data (power measurements taken quarterly or annually). The third phase deals with an optimal planning and inference framework using the Iterative-Accelerated Life Testing (I-ALT) methodology. All the proposed methodologies are demonstrated and validated using appropriate case studies.
Contributors: Bala Subramaniyan, Arun (Author) / Pan, Rong (Thesis advisor) / Tamizhmani, Govindasamy (Thesis advisor) / Montgomery, Douglas C. (Committee member) / Wu, Teresa (Committee member) / Kuitche, Joseph (Committee member) / Arizona State University (Publisher)
Created: 2020