Matching Items (334)

Description
The objective of this dissertation is to empirically analyze the results of the retail location decision-making process and how chain networks evolve given their value platform. It employs one of the largest cross-sectional databases of retailers ever assembled, including 50 US retail chains and over 70,000 store locations. Three closely related articles, which develop new theory explaining location deployment and behaviors of retailers, are presented. The first article, "Regionalism in US Retailing," presents a comprehensive spatial analysis of the domestic patterns of retailers. Geographic Information Systems (GIS) and statistical analyses are used to examine the degree to which the chains are deployed regionally versus nationally. Regional bias is found to be associated with store counts, small-market deployment, and the location of the founding store, but not with the age of the chain. Chains that started in smaller markets deploy more stores in other small markets, and vice versa for chains that started in larger markets. The second article, "The Location Types of US Retailers," is an inductive analysis of the types of locations chosen by the retailers. Retail locations are classified into types using cluster analysis on situational and trade-area data at the geographical scale of the individual stores. A total of twelve distinct location types were identified. A second cluster analysis groups together the chains with the most similar location profiles. Retailers within the same retail business often chose similar types of locations and were placed in the same clusters. Retailers generally restrict their deployment to one of three overall strategies: metropolitan, large retail areas, or market-size variety. The third article, "Modeling Retail Chain Expansion and Maturity through Wave Analysis: Theory and Application to Walmart and Target," presents a theory of retail chain expansion and maturity whereby retailers expand in waves with alternating periods of faster and slower growth. Walmart diffused gradually from Arkansas, and Target grew from the coasts inward. They were similar, however, in that after expanding into an area they reached a point of saturation and opened fewer stores, then moved on to other areas, only to revisit the earlier areas for new stores.
Contributors: Joseph, Lawrence (Author) / Kuby, Michael (Thesis advisor) / Matthews, Richard (Committee member) / Ó Huallacháin, Breandán (Committee member) / Kumar, Ajith (Committee member) / Arizona State University (Publisher)
Created: 2013
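
A rough sketch of the kind of cluster analysis of store locations described above: hypothetical trade-area and situational features for made-up stores are standardized and grouped with k-means into twelve location types. The feature names, values, and the choice of k-means are illustrative assumptions, not the dissertation's actual data or procedure.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
# hypothetical situational / trade-area features for 1,000 store locations
stores = np.column_stack([
    rng.lognormal(11, 1.0, 1000),    # trade-area population
    rng.normal(55000, 15000, 1000),  # median household income
    rng.poisson(12, 1000),           # nearby competing stores
    rng.uniform(0, 40, 1000),        # distance to metro center (km)
])

X = StandardScaler().fit_transform(stores)          # put features on a common scale
location_types = KMeans(n_clusters=12, n_init=10, random_state=0).fit(X)
print(np.bincount(location_types.labels_))          # store count per location type
```

A second-stage clustering of chains, as in the abstract, would then group retailers by how their stores are distributed across these location types.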

Description
Many longitudinal studies, especially in clinical trials, suffer from missing data issues. Most estimation procedures assume that the missing values are ignorable or missing at random (MAR). However, this assumption is an unrealistic simplification and is implausible in many cases. For example, suppose an investigator is examining the effect of a treatment on depression. Subjects are scheduled with doctors on a regular basis and asked questions about recent emotional situations. Patients who are experiencing severe depression are more likely to miss an appointment, leaving the data missing for that particular visit. Data that are not missing at random may produce biased results if the missingness mechanism is not taken into account; in such cases the missingness mechanism is related to the unobserved responses. Data are said to be non-ignorably missing if the probabilities of missingness depend on quantities that might not be included in the model. Classical pattern-mixture models for non-ignorable missing values are widely used for longitudinal data analysis because they do not require explicit specification of the missingness mechanism: the data are stratified according to a variety of missing-data patterns and a model is specified for each stratum. However, this usually results in under-identifiability because of the need to estimate many stratum-specific parameters, even though the eventual interest is usually in the marginal parameters. Pattern-mixture models also have the drawback that a large sample is usually required. In this thesis, two studies are presented. The first study is motivated by an open problem in pattern-mixture models. Simulation studies in this part show that information in the missing data indicators can be well summarized by a simple continuous latent structure, indicating that a large number of missing data patterns may be accounted for by a simple latent factor. The simulation findings from the first study lead to a novel model, the continuous latent factor model (CLFM). The second study develops the CLFM, which is used to model the joint distribution of missing values and longitudinal outcomes. The proposed CLFM is feasible even for small-sample applications. The detailed estimation theory, including estimating techniques from both frequentist and Bayesian perspectives, is presented. Model performance and evaluation are studied through designed simulations and three applications. Simulation and application settings range from a correctly specified missing data mechanism to a misspecified mechanism and include different sample sizes from longitudinal studies. Among the three applications, an AIDS study includes non-ignorable missing values; the Peabody Picture Vocabulary Test data give no indication of the missing data mechanism and are used for a sensitivity analysis; and the Growth of Language and Early Literacy Skills in Preschoolers with Developmental Speech and Language Impairment study has complete data and is used for a robustness analysis. The CLFM is shown to provide more precise estimates, particularly for intercept- and slope-related parameters, compared with Roy's latent class model and the classical linear mixed model. This advantage is more pronounced when the sample size is small, where Roy's model has difficulty with estimation convergence. The proposed CLFM is also robust when missing data are ignorable, as demonstrated through the study on Growth of Language and Early Literacy Skills in Preschoolers.
Contributors: Zhang, Jun (Author) / Reiser, Mark R. (Thesis advisor) / Barber, Jarrett (Thesis advisor) / Kao, Ming-Hung (Committee member) / Wilson, Jeffrey (Committee member) / St Louis, Robert D. (Committee member) / Arizona State University (Publisher)
Created: 2013
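
As a minimal sketch of the first study's finding that many missing-data patterns can be summarized by a simple continuous latent structure, the snippet below simulates dropout indicators driven by a single latent propensity and checks how much of their variation one principal component captures. The data-generating values and the use of PCA as the summary are illustrative assumptions, not the CLFM estimation procedure.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
n_subjects, n_visits = 300, 6
u = rng.normal(size=n_subjects)   # latent dropout propensity per subject
# probability of missing a visit rises with the latent propensity and with time
logits = u[:, None] + 0.4 * np.arange(n_visits)[None, :] - 2.0
missing = (rng.uniform(size=(n_subjects, n_visits)) < 1 / (1 + np.exp(-logits))).astype(float)

pca = PCA(n_components=2).fit(missing)
print(pca.explained_variance_ratio_)  # a dominant first component suggests one latent factor suffices
```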

Description
Product reliability is now a top concern of manufacturers, and customers prefer products that perform well over a long period. Because most products can last for years or even decades, accelerated life testing (ALT) is used to estimate product lifetime. Much research has been done in the ALT area, and optimal design for ALT is a major topic. This dissertation consists of three main studies. First, a methodology for finding optimal designs for ALT with right censoring and interval censoring is developed; it employs the proportional hazards (PH) model and the generalized linear model (GLM) to simplify the computational process. A sensitivity study is also given to show how the parameters affect the designs. Second, an extended version of the I-optimal design for ALT is discussed, and a dual-objective design criterion is defined and illustrated with several examples. Several graphical tools are also developed to evaluate different candidate designs. Finally, when more than one model is available, different model-checking designs are discussed.
Contributors: Yang, Tao (Author) / Pan, Rong (Thesis advisor) / Montgomery, Douglas C. (Committee member) / Borror, Connie (Committee member) / Rigdon, Steve (Committee member) / Arizona State University (Publisher)
Created: 2013
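
To give a flavor of what an optimal ALT design problem with right censoring looks like, the sketch below compares unit allocations across two stress levels by a D-optimality criterion under a simplified exponential lifetime model with a log-linear failure rate. The planning values, stress levels, and censoring time are hypothetical, and the dissertation's PH/GLM formulation is not reproduced here.

```python
import numpy as np

def expected_information(x, n, beta, tau):
    """Expected Fisher information from n units tested at stress x, assuming
    exponential lifetimes with log-linear rate and Type-I censoring at tau."""
    lam = np.exp(beta[0] + beta[1] * x)
    p_fail = 1.0 - np.exp(-lam * tau)      # probability a unit fails before censoring
    z = np.array([1.0, x])
    return n * p_fail * np.outer(z, z)

beta_guess = (-6.0, 2.5)            # hypothetical planning values for the rate model
tau = 20.0                          # censoring time (illustrative units)
stress_low, stress_high = 1.0, 2.0
total_units = 100

best = None
for n_low in range(10, 100, 10):
    info = (expected_information(stress_low, n_low, beta_guess, tau)
            + expected_information(stress_high, total_units - n_low, beta_guess, tau))
    d_value = np.linalg.det(info)   # D-optimality: maximize the information determinant
    if best is None or d_value > best[0]:
        best = (d_value, n_low, total_units - n_low)

print("preferred allocation (low stress, high stress):", best[1], best[2])
```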

Description
This work presents two complementary studies that propose heuristic methods to capture characteristics of data using the ensemble learning method of random forest. The first study is motivated by the problem in education of determining teacher effectiveness in student achievement. Value-added models (VAMs), constructed as linear mixed models, use students’ test scores as outcome variables and teachers’ contributions as random effects to ascribe changes in student performance to the teachers who have taught them. The VAM teacher score is the empirical best linear unbiased predictor (EBLUP). This approach is limited by the adequacy of the assumed model specification with respect to the unknown underlying model. In that regard, this study proposes alternative ways to rank teacher effects that are not dependent on a given model by introducing two variable importance measures (VIMs), the node-proportion and the covariate-proportion. These VIMs are novel because they take into account the final configuration of the terminal nodes in the constitutive trees in a random forest. In a simulation study, under a variety of conditions, true rankings of teacher effects are compared with estimated rankings obtained using three sources: the newly proposed VIMs, existing VIMs, and EBLUPs from the assumed linear model specification. The newly proposed VIMs outperform all others in various scenarios where the model is misspecified. The second study develops two novel interaction measures. These measures could be used within but are not restricted to the VAM framework. The distribution-based measure is constructed to identify interactions in a general setting where a model specification is not assumed in advance. In turn, the mean-based measure is built to estimate interactions when the model specification is assumed to be linear. Both measures are unique in their construction; they take into account not only the outcome values, but also the internal structure of the trees in a random forest. In a separate simulation study, under a variety of conditions, the proposed measures are found to identify and estimate second-order interactions.
Contributors: Valdivia, Arturo (Author) / Eubank, Randall (Thesis advisor) / Young, Dennis (Committee member) / Reiser, Mark R. (Committee member) / Kao, Ming-Hung (Committee member) / Broatch, Jennifer (Committee member) / Arizona State University (Publisher)
Created: 2013
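
The node-proportion and covariate-proportion VIMs proposed here are novel and not available in standard libraries, so the sketch below only illustrates the general workflow with a conventional permutation-based importance from a random forest on synthetic data, the kind of existing VIM the study compares against. The data-generating model is an illustrative assumption.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(7)
n = 500
X = rng.normal(size=(n, 5))
# outcome driven mainly by the first two covariates
y = 2.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.5, size=n)

forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
result = permutation_importance(forest, X, y, n_repeats=10, random_state=0)
print(result.importances_mean)   # the first two covariates should rank highest
```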

Description
Statistics is taught at every level of education, yet teachers often have to assume their students have no knowledge of statistics and start from scratch each time they set out to teach it. The motivation for this experimental study comes from interest in exploring educational applications of augmented reality (AR) delivered via mobile technology that could potentially provide rich, contextualized learning for understanding concepts related to statistics education. This study examined the effects of AR experiences for learning basic statistical concepts. Using a 3 x 2 research design, this study compared the learning gains of 252 undergraduate and graduate students on a pre- and posttest given before and after interacting with one of three types of augmented reality experiences: a high AR experience (interacting with three-dimensional images coupled with movement through a physical space), a low AR experience (interacting with three-dimensional images without movement), or no AR experience (two-dimensional images without movement). Two levels of collaboration (pairs and no pairs) were also included. Additionally, student perceptions of collaboration opportunities and engagement were compared across the six treatment conditions. Other demographic information collected included the students' previous statistics experience, as well as their comfort level in using mobile devices. The moderating variables included prior knowledge (high, average, and low) as measured by the student's pretest score. Taking prior knowledge into account, students with low prior knowledge assigned to either the high or low AR experience had statistically significantly higher learning gains than those assigned to the no AR experience. On the other hand, the results showed no statistically significant difference between students assigned to work individually versus in pairs. Students assigned to both the high and low AR experiences perceived a statistically significantly higher level of engagement than their no AR counterparts. Students with low prior knowledge benefited the most from the high AR condition in learning gains. Overall, the AR application did well at providing a hands-on experience of working with statistical data. Further research on AR and its relationship to spatial cognition, situated learning, higher-order skill development, performance support, and other classroom applications for learning is still needed.
Contributors: Conley, Quincy (Author) / Atkinson, Robert K. (Thesis advisor) / Nguyen, Frank (Committee member) / Nelson, Brian C. (Committee member) / Arizona State University (Publisher)
Created: 2013
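
A minimal sketch of how learning gains from a 3 x 2 between-subjects design like this might be analyzed with a two-way ANOVA. The cell means, variances, and the 42-per-cell allocation (chosen only so the total matches the 252 participants mentioned) are invented for illustration and do not reflect the study's data or results.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(3)
rows = []
for ar in ("high_AR", "low_AR", "no_AR"):
    for collab in ("pairs", "individual"):
        base = {"high_AR": 12.0, "low_AR": 11.0, "no_AR": 8.0}[ar]  # hypothetical mean gains
        for gain in rng.normal(base, 4.0, size=42):
            rows.append({"ar": ar, "collab": collab, "gain": gain})
df = pd.DataFrame(rows)

model = smf.ols("gain ~ C(ar) * C(collab)", data=df).fit()
print(anova_lm(model, typ=2))   # main effects of AR condition and collaboration, plus interaction
```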

Description
Schennach (2007) has shown that the Empirical Likelihood (EL) estimator may not be asymptotically normal when a misspecified model is estimated. This problem occurs because the empirical probabilities of individual observations are restricted to be positive. I find that even the EL estimator computed without the restriction can fail to be asymptotically normal for misspecified models if the sample moments weighted by unrestricted empirical probabilities do not have finite population moments. As a remedy for this problem, I propose a group of alternative estimators which I refer to as modified EL (MEL) estimators. For correctly specified models, these estimators have the same higher-order asymptotic properties as the EL estimator. The MEL estimators are obtained by the Generalized Method of Moments (GMM) applied to an exactly identified model. The simulation results provide promising evidence for these estimators. In the second chapter, I introduce an alternative group of estimators to the Generalized Empirical Likelihood (GEL) family. The new group is constructed by employing demeaned moment functions in the objective function while using the original moment functions in the constraints. This construction modifies the higher-order properties of the estimators. I refer to these new estimators as Demeaned Generalized Empirical Likelihood (DGEL) estimators. Although Newey and Smith (2004) show that the EL estimator in the GEL family has fewer sources of bias and is higher-order efficient after bias correction, within the DGEL group it is the demeaned exponential tilting (DET) estimator that has those superior properties. In addition, if data are symmetrically distributed, every estimator in the DGEL family shares the same higher-order properties as the best member.
Contributors: Xiang, Jin (Author) / Ahn, Seung (Thesis advisor) / Wahal, Sunil (Thesis advisor) / Bharath, Sreedhar (Committee member) / Mehra, Rajnish (Committee member) / Tserlukevich, Yuri (Committee member) / Arizona State University (Publisher)
Created: 2013
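
Since the MEL estimators are described as GMM applied to an exactly identified model, here is a bare-bones sketch of exactly identified moment matching with SciPy: as many moment conditions as parameters, so the sample moment conditions are solved directly. The normal-moment conditions and data are illustrative and do not implement the MEL construction itself.

```python
import numpy as np
from scipy.optimize import root

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=500)

def moment_conditions(theta, data):
    mu, sigma2 = theta
    g1 = data - mu                   # first moment condition: E[x - mu] = 0
    g2 = (data - mu) ** 2 - sigma2   # second moment condition: E[(x - mu)^2 - sigma2] = 0
    return np.array([g1.mean(), g2.mean()])

sol = root(moment_conditions, x0=[0.0, 1.0], args=(x,))
print(sol.x)   # estimates of (mu, sigma2)
```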

Description
Dimensionality assessment is an important component of evaluating item response data. Existing approaches to evaluating common assumptions of unidimensionality, such as DIMTEST (Nandakumar & Stout, 1993; Stout, 1987; Stout, Froelich, & Gao, 2001), have been shown to work well under large-scale assessment conditions (e.g., large sample sizes and item pools; see e.g., Froelich & Habing, 2007). It remains to be seen how such procedures perform in the context of small-scale assessments characterized by relatively small sample sizes and/or short tests. The fact that some procedures come with minimum allowable values for characteristics of the data, such as the number of items, may even render them unusable for some small-scale assessments. Other measures designed to assess dimensionality do not come with such limitations and, as such, may perform better under conditions that do not lend themselves to evaluation via statistics that rely on asymptotic theory. The current work aimed to evaluate the performance of one such metric, the standardized generalized dimensionality discrepancy measure (SGDDM; Levy & Svetina, 2011; Levy, Xu, Yel, & Svetina, 2012), under both large- and small-scale testing conditions. A Monte Carlo study was conducted to compare the performance of DIMTEST and the SGDDM statistic in terms of evaluating assumptions of unidimensionality in item response data under a variety of conditions, with an emphasis on the examination of these procedures in small-scale assessments. Similar to previous research, increases in either test length or sample size resulted in increased power. The DIMTEST procedure appeared to be a conservative test of the null hypothesis of unidimensionality. The SGDDM statistic exhibited rejection rates near the nominal rate of .05 under unidimensional conditions, though the reliability of these results may have been less than optimal due to high sampling variability resulting from a relatively limited number of replications. Power values were at or near 1.0 for many of the multidimensional conditions. It was only when the sample size was reduced to N = 100 that the two approaches diverged in performance. Results suggested that both procedures may be appropriate for sample sizes as low as N = 250 and tests as short as J = 12 (SGDDM) or J = 19 (DIMTEST). When used as a diagnostic tool, SGDDM may be appropriate with as few as N = 100 cases combined with J = 12 items. The study was somewhat limited in that it did not include any complex factorial designs, nor were the strength of item discrimination parameters or correlation between factors manipulated. It is recommended that further research be conducted with the inclusion of these factors, as well as an increase in the number of replications when using the SGDDM procedure.
Contributors: Reichenberg, Ray E. (Author) / Levy, Roy (Thesis advisor) / Thompson, Marilyn S. (Thesis advisor) / Green, Samuel B. (Committee member) / Arizona State University (Publisher)
Created: 2013
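
One point from the abstract above worth quantifying is that the reliability of an estimated rejection rate depends directly on the number of Monte Carlo replications. The sketch below computes the usual binomial standard error of a rejection-rate estimate at a nominal .05 rate for a few replication counts; the counts are arbitrary examples.

```python
import numpy as np

def mc_standard_error(rate, n_replications):
    """Monte Carlo standard error of an estimated rejection rate."""
    return np.sqrt(rate * (1 - rate) / n_replications)

for reps in (100, 500, 1000, 5000):
    print(f"{reps} replications -> SE of a .05 rejection rate: {mc_standard_error(0.05, reps):.4f}")
```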

Description
The lack of food safety in a grower's produce presents the grower with two risks: (1) that an item will need to be recalled from the market, incurring substantial costs and damaging brand equity, and (2) that the entire market for the commodity becomes impaired as consumers perceive all such produce as risky to eat. Nowhere is this more prevalent than in the leafy green industry, where recalls are relatively frequent and a massive E. coli outbreak rocked the industry in 2006. The purpose of this thesis is to examine insurance policies that protect growers from these risks. To that end, current recall insurance policies are discussed, and actuarially fair premiums for catastrophic revenue insurance policies are priced through a contingent claims framework. The results suggest that spinach industry revenue can be insured for $0.02 per carton. Given the current costs of leafy green industry food safety initiatives, growers may be willing to pay for such an insurance policy.
Contributors: Pagaran, Jeremy (Author) / Manfredo, Mark R. (Thesis advisor) / Richards, Timothy J. (Thesis advisor) / Nganje, William (Committee member) / Arizona State University (Publisher)
Created: 2013
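
A rough sketch of contingent-claims pricing of a revenue guarantee by Monte Carlo, valuing the indemnity as a put option on lognormally distributed revenue. Every input below (revenue level, volatility, guarantee, interest rate, horizon) is a made-up placeholder, and the output is not the thesis's $0.02-per-carton figure.

```python
import numpy as np

rng = np.random.default_rng(11)
n_sims = 100_000

# hypothetical inputs, all illustrative
revenue_now = 10.00   # current expected revenue per carton
sigma = 0.25          # revenue volatility
guarantee = 9.00      # guaranteed revenue per carton
r = 0.03              # risk-free rate
horizon = 1.0         # coverage horizon in years

# simulate lognormal revenue at the end of the horizon (risk-neutral drift)
z = rng.standard_normal(n_sims)
revenue = revenue_now * np.exp((r - 0.5 * sigma**2) * horizon + sigma * np.sqrt(horizon) * z)

payout = np.maximum(guarantee - revenue, 0.0)    # indemnity when revenue falls short
premium = np.exp(-r * horizon) * payout.mean()   # actuarially fair premium per carton
print(round(premium, 4))
```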

Description
I show that firms' ability to adjust variable capital in response to productivity shocks has important implications for the interpretation of the widely documented investment-cash flow sensitivities. The variable capital adjustment is sufficient for firms to capture small variations in profitability, but when the revision in profitability is relatively large, limited substitutability between the factors of production may call for fixed capital investment. Hence, firms with lower substitutability are more likely to invest in both factors together and have larger sensitivities of fixed capital investment to cash flow. By building a frictionless capital markets model that allows firms to optimize over fixed capital and inventories as substitutable factors, I establish the significance of the substitutability channel in explaining cross-sectional differences in cash flow sensitivities. Moreover, incorporating variable capital into firms' investment decisions helps explain the sharp decrease in cash flow sensitivities over the past decades. Empirical evidence confirms the model's predictions.
Contributors: Kim, Kirak (Author) / Bates, Thomas (Thesis advisor) / Babenko, Ilona (Thesis advisor) / Hertzel, Michael (Committee member) / Tserlukevich, Yuri (Committee member) / Arizona State University (Publisher)
Created: 2013
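
For readers unfamiliar with the empirical object here: an investment-cash flow sensitivity is typically the coefficient on cash flow in a regression of investment on cash flow and a Tobin's q proxy. The sketch below runs that regression on simulated firm data; the variable construction and coefficients are illustrative assumptions, not the paper's specification.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
n = 2000
q = rng.normal(1.5, 0.5, n)             # Tobin's q proxy
cash_flow = rng.normal(0.10, 0.05, n)   # cash flow scaled by assets
investment = 0.02 + 0.05 * q + 0.30 * cash_flow + rng.normal(0, 0.02, n)
df = pd.DataFrame({"investment": investment, "q": q, "cash_flow": cash_flow})

# the coefficient on cash_flow is the investment-cash flow sensitivity
print(smf.ols("investment ~ q + cash_flow", data=df).fit().params)
```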

Description
Parallel Monte Carlo applications require the pseudorandom numbers used on each processor to be independent in a probabilistic sense. The TestU01 software package is the standard testing suite for detecting stream dependence and other properties that make certain pseudorandom generators ineffective in parallel (as well as serial) settings. TestU01 employs two basic schemes for testing parallel generated streams. The first applies serial tests to the individual streams and then tests the resulting P-values for uniformity. The second turns all the parallel generated streams into one long vector and then applies serial tests to the resulting concatenated stream. Various forms of stream dependence can be missed by each approach because neither one fully addresses the multivariate nature of the accumulated data when generators are run in parallel. This dissertation identifies these potential faults in the parallel testing methodologies of TestU01 and investigates two different methods to better detect inter-stream dependencies: correlation motivated multivariate tests and vector time series based tests. These methods have been implemented in an extension to TestU01 built in C++ and the unique aspects of this extension are discussed. A variety of different generation scenarios are then examined using the TestU01 suite in concert with the extension. This enhanced software package is found to better detect certain forms of inter-stream dependencies than the original TestU01 suites of tests.
Contributors: Ismay, Chester (Author) / Eubank, Randall (Thesis advisor) / Young, Dennis (Committee member) / Kao, Ming-Hung (Committee member) / Lanchier, Nicolas (Committee member) / Reiser, Mark R. (Committee member) / Arizona State University (Publisher)
Created: 2013
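
A simplified Python stand-in for the first TestU01-style scheme described above: apply a serial uniformity test to each parallel stream separately, then test the resulting per-stream p-values themselves for uniformity. NumPy generators and the Kolmogorov-Smirnov test are illustrative substitutes for the actual TestU01 batteries.

```python
import numpy as np
from scipy import stats

streams = [np.random.default_rng(seed) for seed in range(64)]  # 64 "parallel" generators

# first level: a serial test of uniformity applied to each stream on its own
p_values = [stats.kstest(rng.uniform(size=10_000), "uniform").pvalue for rng in streams]

# second level: under independence, the per-stream p-values should themselves be uniform
print(stats.kstest(p_values, "uniform"))
```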