Matching Items (8)

Spatiotemporal data mining, analysis, and visualization of human activity data

Description

This dissertation addresses the research challenge of developing efficient new methods for discovering useful patterns and knowledge in large volumes of electronically collected spatiotemporal activity data. I propose to analyze three types of such spatiotemporal activity data in a methodological framework that integrates spatial analysis, data mining, machine learning, and geovisualization techniques. The three types of data were collected through different approaches: (1) crowdsourced geo-tagged digital photos, representing people's travel activity, were retrieved from the website Panoramio.com through information retrieval techniques; (2) the same techniques were used to crawl crowdsourced GPS trajectory data, and related metadata on contributors' daily activities, from the website OpenStreetMap.org; and (3) preschool children's daily activities and interactions, tagged with time and geographical location, were collected with a novel TabletPC-based behavioral coding system. The proposed methodology is applied to these data to (1) automatically recommend optimal multi-day and multi-stay travel itineraries for travelers based on attractions discovered from geo-tagged photos, (2) automatically detect movement types of unknown moving objects from GPS trajectories, and (3) explore dynamic social and socio-spatial patterns of preschool children's behavior from both geographic and social perspectives.

Date Created
  • 2012

Three essays on innovation: optimal licensing strategies, new variety adoption, and consumer preference in a peer network

Description

It is well understood that innovation drives productivity growth in agriculture. Innovation, however, is a process that involves activities distributed throughout the supply chain. In this dissertation I investigate three topics that are at the core of the distribution and diffusion of innovation: optimal licensing of university-based inventions, new variety adoption among farmers, and consumers’ choice of new products within a social network environment.

University researchers assume an important role in innovation, particularly as a result of the Bayh-Dole Act, which allowed universities to license inventions funded by federal research dollars to private industry. Aligning the incentives to innovate at the university level with the incentives to adopt downstream, I show that non-exclusive licensing is preferred under both fixed-fee and royalty licensing. Finding support for non-exclusive licensing is important because it provides evidence that the concept underlying the Bayh-Dole Act has economic merit, namely that the goals of university-based researchers are consistent with those of society, and taxpayers, in general.

After licensing, new products enter the diffusion process. Using a case study of smallholders in Mozambique, I observe substantial geographic clustering of new-variety adoption decisions. Controlling for other potential factors, I find that information diffusion through space is largely responsible for variation in adoption. As predicted by a social learning model, spatial effects are based not on geographic distance but on neighbor relationships that follow from information exchange. My findings are consistent with those of others who find information to be the primary barrier to adoption, and they imply that adoption can be accelerated by improving information exchange among farmers.

Ultimately, innovation is only useful when adopted by end consumers. Consumers’ choices of new products are determined by many factors such as personal preferences, the attributes of the products, and, more importantly, peer recommendations. My experimental data show that peers are indeed important, but that “weak ties”, or information from friends-of-friends, matter more than information from close friends. Further, others regarded as experts in the subject matter exert the strongest influence on peer choices.

Date Created
  • 2015

A taxonomy of parallel vector spatial analysis algorithms

Description

Nearly 25 years ago, parallel computing techniques were first applied to vector spatial analysis methods. This initial research was driven by the desire to reduce computing times in order to support scaling to larger problem sets. Since this initial work, rapid technological advancement has driven the availability of High Performance Computing (HPC) resources, in the form of multi-core desktop computers, distributed geographic information processing systems (e.g. computational grids), and single-site HPC clusters. In step with increases in computational resources, significant advancement in the capability to capture and store large quantities of spatially enabled data has been realized. Scalable algorithms, a key component in utilizing vast data quantities in HPC environments, have failed to keep pace. The National Science Foundation has identified scalable algorithms in codified frameworks as an essential, and currently lacking, research product. Fulfillment of this goal is challenging given the lack of a codified theoretical framework mapping atomic numeric operations from the spatial analysis stack to parallel programming paradigms, the diversity in vernacular utilized by research groups, the propensity for implementations to couple tightly to underlying hardware, and the general difficulty in realizing scalable parallel algorithms. This dissertation develops a taxonomy of parallel vector spatial analysis algorithms in which each class, termed a computational dwarf, is defined by root mathematical operation and communication pattern. Six computational dwarfs are identified, three drawn directly from an existing parallel computing taxonomy and three created to capture characteristics unique to spatial analysis algorithms. The taxonomy provides a high-level classification decoupled from low-level implementation details such as hardware, communication protocols, implementation language, decomposition method, or file input and output. By taking a high-level approach, implementation specifics are broadly proposed, breadth of coverage is achieved, and extensibility is ensured. The taxonomy both informs and is informed by five case studies implemented across multiple, divergent hardware environments. A major contribution of this dissertation is a theoretical framework to support the future development of concrete parallel vector spatial analysis frameworks through the identification of computational dwarfs and, by extension, successful implementation strategies.

Date Created
  • 2015

Spatializing partisan gerrymandering forensics: local measures and spatial specifications

Description

Gerrymandering is a central problem for many representative democracies. Formally, gerrymandering is the manipulation of spatial boundaries to provide political advantage to a particular group (Warf, 2006). The term often refers to political district design, where the boundaries of political districts are “unnaturally” manipulated by redistricting officials to generate durable advantages for one group or party. Since free and fair elections are possibly the most critical part of representative democracy, it is important that the current wave of reform have scientifically validated tools. This dissertation supports that reform by developing a general inferential technique to “localize” inferential bias measures, generating a new type of district-level score. The new method relies on the statistical intuition behind jackknife methods to construct relative local indicators. I find that existing statewide indicators of partisan bias can be localized using this technique, providing an estimate of how strongly a district impacts statewide partisan bias over an entire decade. When compared to measures of shape compactness (a common gerrymandering detection statistic), I find that weirdly shaped districts have no consistent relationship with impact in many states under the 2000 and 2010 redistricting plans. To ensure that this work is valid, I examine existing seats-votes modeling strategies and develop a novel method for constructing seats-votes curves. I find that, while the empirical structure of electoral swing shows significant spatial dependence (even in the face of spatial heterogeneity), existing seats-votes specifications are more robust to spatial dependence than anticipated. Centrally, this dissertation contributes to the much larger social aim of resisting electoral manipulation: that individuals and organizations suffer no undue burden on political access from partisan gerrymandering.
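
The jackknife intuition mentioned in this abstract can be illustrated with a minimal leave-one-out sketch. It is not the dissertation's method: the statewide score used here (the mean-median gap of district vote shares), the function names, and the sample vote shares are all assumptions chosen for illustration. The idea shown is only that a district's "impact" can be read as the change in a statewide statistic when that district is dropped.

```python
from statistics import mean, median


def mean_median_gap(shares):
    """Illustrative statewide score (an assumption, not the author's measure):
    mean district vote share minus median district vote share."""
    return mean(shares) - median(shares)


def leave_one_out_impacts(shares, statistic=mean_median_gap):
    """Return, for each district, the full-sample statistic minus the
    statistic recomputed with that district removed."""
    full = statistic(shares)
    impacts = []
    for i in range(len(shares)):
        rest = shares[:i] + shares[i + 1:]  # drop district i
        impacts.append(full - statistic(rest))
    return impacts


# Hypothetical party vote shares in five districts.
shares = [0.30, 0.45, 0.50, 0.55, 0.90]
impacts = leave_one_out_impacts(shares)
most_influential = max(range(len(shares)), key=lambda i: abs(impacts[i]))
```

In this toy example the heavily packed district (vote share 0.90) changes the statewide score the most when removed, which is the kind of district-level signal a localized measure is meant to surface.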

Date Created
  • 2017

Spatial optimization approaches for solving the continuous Weber and multi-Weber problems

Description

Facility location models are usually employed to assist decision processes in urban and regional planning. The focus of this research is extensions of a classic location problem, the Weber problem, to address continuously distributed demand as well as multiple facilities. Addressing continuous demand and multiple facilities poses major challenges. Given advances in geographic information systems (GIS), computational science and associated technologies, spatial optimization provides a possibility for improved problem solution. Essential here is how to represent facilities and demand in geographic space. On the one hand, spatial abstraction as discrete points is generally assumed because it simplifies model formulation and reduces computational complexity. However, errors in derived solutions are likely not negligible, especially when demand varies continuously across a region. On the other hand, although mathematical functions describing continuous distributions can be employed, such theoretical surfaces are generally approximated in practice using finite spatial samples due to a lack of complete information. To this end, the dissertation first investigates the implications of continuous surface approximation and explicitly shows errors in solutions obtained from fitted demand surfaces through empirical applications. The dissertation then presents a method to improve spatial representation of continuous demand. This is based on infill asymptotic theory, which indicates that errors in fitted surfaces tend to zero as the number of sample points increases to infinity. The implication for facility location modeling is that a solution to the discrete problem with greater demand point density will approach the theoretical optimum for the continuous counterpart. Therefore, in this research discrete points are used to represent continuous demand to explore this theoretical convergence, an approach that is less restrictive and less problem-altering than existing alternatives. The proposed continuous representation method is further extended to develop heuristics to solve the continuous Weber and multi-Weber problems, where one or more facilities can be sited anywhere in continuous space to best serve continuously distributed demand. Two spatial optimization approaches are proposed, one for each of the two extensions of the Weber problem. The special characteristic of these approaches is that they integrate optimization techniques and GIS functionality. Empirical results highlight the advantages of the developed approaches and the importance of solution integration within GIS.
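
As a rough illustration of representing continuous demand with discrete points, the sketch below solves a single-facility Weber problem with the classical Weiszfeld iteration on a grid of sample points. This is a standard textbook technique swapped in for illustration, not the heuristics developed in the dissertation; the function names, the uniform demand over the unit square, and the tolerance values are assumptions.

```python
import math


def weber_cost(site, points, weights):
    """Weighted sum of Euclidean distances from a candidate site to demand points."""
    return sum(w * math.hypot(site[0] - px, site[1] - py)
               for (px, py), w in zip(points, weights))


def weber_point(points, weights, tol=1e-9, max_iter=1000):
    """Weiszfeld iteration for the weighted geometric median (Weber point)."""
    total = sum(weights)
    # Start from the weighted centroid.
    x = sum(w * px for (px, _), w in zip(points, weights)) / total
    y = sum(w * py for (_, py), w in zip(points, weights)) / total
    for _ in range(max_iter):
        num_x = num_y = denom = 0.0
        for (px, py), w in zip(points, weights):
            d = math.hypot(x - px, y - py)
            if d < 1e-12:
                # Simplification: stop if the iterate lands on a demand point.
                return px, py
            num_x += w * px / d
            num_y += w * py / d
            denom += w / d
        new_x, new_y = num_x / denom, num_y / denom
        if math.hypot(new_x - x, new_y - y) < tol:
            return new_x, new_y
        x, y = new_x, new_y
    return x, y


def grid_demand(n):
    """Approximate uniform demand on the unit square by n*n cell centres,
    each weighted by its cell area (an assumed demand surface)."""
    step = 1.0 / n
    pts = [((i + 0.5) * step, (j + 0.5) * step)
           for i in range(n) for j in range(n)]
    return pts, [step * step] * (n * n)


# Denser grids give a finer discrete stand-in for the continuous demand surface.
pts, wts = grid_demand(10)
site = weber_point(pts, wts)
```

For non-uniform demand, the same routine could be rerun with increasing `n` to observe how the discrete solution settles as point density grows, which is the convergence behaviour the abstract describes.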

Date Created
  • 2012

Transportation cordon pricing in the San Francisco Bay Area: analyzing equity implications for low-income commuters

Description

Cordon pricing strategies attempt to charge motorists for the marginal social costs of driving in heavily congested areas, lure them out of their vehicles and into other modes, and thereby reduce vehicle miles traveled and congestion-related externalities. These strategies are gaining policymakers' attention worldwide. The benefits and costs of such strategies can potentially lead to a disproportionate and inequitable burden on lower-income commuters, particularly those with poor accessibility to alternative modes of transportation. Strategies designed to mitigate the impacts of cordon pricing for disadvantaged travelers, such as discounts and exemptions, can reduce the effectiveness of the pricing strategy. Transit improvements funded by pricing fee revenues are another mitigation strategy, but can be wasteful and inefficient if not properly targeted toward those most disadvantaged and in need. This research examines these considerations and explores the implications for transportation planners working to balance goals of system effectiveness, efficiency, and equity. First, a theoretical conceptual model for analyzing the justice implications of cordon pricing is presented. Next, the Mobility Access and Pricing Study, a cordon pricing strategy examined by the San Francisco County Transportation Authority, is analyzed utilizing a neighborhood-level accessibility-based approach. The fee-payment impacts for low-income transportation-disadvantaged commuters within the San Francisco Bay Area are examined, utilizing Geographic Information Systems coupled with data from the Longitudinal Employer-Household Dynamics program of the US Census Bureau. This research questions whether the recommended blanket 50% discount for low-income travelers would unnecessarily reduce the overall efficiency and effectiveness of the cordon pricing system. It is proposed that reinvestment of revenue in transportation-improvement projects targeted at those most disproportionately impacted by tolling fees (low-income, automobile-dependent, peak-period commuters in areas with poor access to alternative modes) would be a more suitable mitigation strategy. This would not only help maintain the efficiency and effectiveness of the cordon pricing system, but would also better address income, modal and spatial equity issues. The results of this study demonstrate how the spatial distribution of the toll-payment impacts may burden low-income residents in quite different ways, thereby warranting the inclusion of such analysis in transportation planning and practice.

Date Created
  • 2013

A new era of spatial interaction: potential and pitfalls

Description

As urban populations become increasingly dense, massive amounts of new 'big' data that characterize human activity are being made available. These data may be characterized as having a large volume of observations, being produced in real-time or near real-time, and including a diverse variety of information. In particular, spatial interaction (SI) data - a collection of human interactions across a set of origin and destination locations - present unique challenges for distilling big data into insight. Therefore, this dissertation identifies some of the potential and pitfalls associated with new sources of big SI data. It also evaluates methods for modeling SI to investigate the relationships that drive SI processes in order to focus on human behavior rather than data description.

A critical review of the existing SI modeling paradigms is first presented, which also highlights features of big data that are particular to SI data. Next, a simulation experiment is carried out to evaluate three different statistical modeling frameworks for SI data that are supported by different underlying conceptual frameworks. Then, two approaches are taken to identify the potential and pitfalls associated with two newer sources of data from New York City - bike-share cycling trips and taxi trips. The first approach builds a model of commuting behavior using a traditional census data set and then compares the results for the same model when it is applied to these newer data sources. The second approach examines how the increased temporal resolution of big SI data may be incorporated into SI models.

Several important results are obtained through this research. First, it is demonstrated that different SI models account for different types of spatial effects and that the Competing Destination framework seems to be the most robust for capturing spatial structure effects. Second, newer sources of big SI data are shown to be very useful for complementing traditional sources of data, though they are not sufficient substitutes. Finally, it is demonstrated that the increased temporal resolution of new data sources may usher in a new era of SI modeling that allows us to better understand the dynamics of human behavior.

Date Created
  • 2017

Relationships between on-road FFCO₂ emission and socio-economics/urban form factors

Description

Fossil fuel CO₂ (FFCO₂) emissions are recognized as the dominant source of the greenhouse gases driving climate change (Enting et al., 1995; Conway et al., 1994; Francey et al., 1995; Bousquet et al., 1999). Transportation is a major component of FFCO₂ emissions, especially in urban areas. An improved understanding of on-road FFCO₂ emissions at high spatial resolution is essential to both carbon science and mitigation policy. Though considerable research has been accomplished within a few high-income portions of the planet such as the United States and Western Europe, little work has attempted to comprehensively quantify high-resolution on-road FFCO₂ emissions globally. Key questions for such a global quantification are: (1) What are the driving factors for on-road FFCO₂ emissions? (2) How robust are the relationships? and (3) How do on-road FFCO₂ emissions vary with urban form at fine spatial scales?

This study used urban form/socio-economic data combined with self-reported on-road FFCO₂ emissions for a sample of global cities to estimate relationships within a multivariate regression framework based on an adjusted STIRPAT model. The robustness of the whole-city on-road FFCO₂ regression model was evaluated by introducing artificial error, conducting cross-validation, and assessing relationship sensitivity under various model specifications. Results indicated that fuel economy, vehicle ownership, road density and population density were statistically significant factors that correlate with on-road FFCO₂ emissions. Of these four variables, fuel economy and vehicle ownership had the most robust relationships.

A second regression model was constructed to examine the relationship between global on-road FFCO₂ emissions and urban form factors (described by population density, road density, and distance to activity centers) at sub-city spatial scales (1 km²). Results showed that: (1) Road density is the most significant (p < 2.66e-37) predictor of on-road FFCO₂ emissions at the 1 km² spatial scale; (2) The correlation between population density and on-road FFCO₂ emissions for interstates/freeways varies little by city type. For arterials, on-road FFCO₂ emissions show a stronger relationship to population density in clustered cities (slope = 0.24) than in dispersed cities (slope = 0.13); (3) The distance to activity centers has a significant positive relationship with on-road FFCO₂ emissions for the interstate and freeway road types, but an insignificant relationship for the arterial road type.

Date Created
  • 2018