A critical review of the existing SI modeling paradigms is first presented, which also highlights features of big data that are particular to SI data. Next, a simulation experiment is carried out to evaluate three statistical modeling frameworks for SI data, each supported by a different underlying conceptual framework. Then, two approaches are taken to identify the potential and pitfalls associated with two newer sources of data from New York City: bike-share cycling trips and taxi trips. The first approach builds a model of commuting behavior using a traditional census data set and then compares the results for the same model when it is applied to these newer data sources. The second approach examines how the increased temporal resolution of big SI data may be incorporated into SI models.
Several important results are obtained through this research. First, it is demonstrated that different SI models account for different types of spatial effects and that the Competing Destination framework seems to be the most robust for capturing spatial structure effects. Second, newer sources of big SI data are shown to be very useful for complementing traditional sources of data, though they are not sufficient substitutes. Finally, it is demonstrated that the increased temporal resolution of new data sources may usher in a new era of SI modeling that allows us to better understand the dynamics of human behavior.
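As a rough illustration of the kind of model being compared, the sketch below assumes that SI stands for spatial interaction and uses a commonly cited unconstrained gravity form with a competing-destinations accessibility term. The function names, parameter names, and default exponents are hypothetical and are not taken from the dissertation; the abstract does not state the exact specification used.

```python
def accessibility(dest_sizes, dist, i, j, sigma=-1.0):
    """Accessibility of destination j to all other destinations, as seen from origin i.

    Assumed form: sum over k (k != i, k != j) of size_k * d_jk ** sigma.
    """
    return sum(
        dest_sizes[k] * dist[j][k] ** sigma
        for k in range(len(dest_sizes))
        if k != i and k != j
    )


def predicted_flow(origin_sizes, dest_sizes, dist, i, j,
                   k=1.0, alpha=1.0, gamma=1.0, beta=-1.0,
                   delta=0.0, sigma=-1.0):
    """Predicted flow from i to j under an unconstrained power-law gravity form.

    With delta = 0 the accessibility term drops out and this is a plain
    gravity model; a nonzero delta adds the competing-destinations effect.
    All exponents here are illustrative placeholders, not estimates.
    """
    a_ij = accessibility(dest_sizes, dist, i, j, sigma)
    return (k
            * origin_sizes[i] ** alpha
            * dest_sizes[j] ** gamma
            * dist[i][j] ** beta
            * a_ij ** delta)


# Toy example with three locations (sizes and distances are made up).
sizes = [10.0, 20.0, 30.0]
dist = [[0.0, 4.0, 5.0],
        [4.0, 0.0, 2.0],
        [5.0, 2.0, 0.0]]

plain = predicted_flow(sizes, sizes, dist, 0, 1)                  # gravity only
competing = predicted_flow(sizes, sizes, dist, 0, 1, delta=-1.0)  # with accessibility
```

In this toy setup, a negative `delta` lowers the predicted flow to a destination that sits close to other large destinations, which is one way a model can account for spatial structure beyond distance alone.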
Factors that explain human mobility and active transportation include built environment and infrastructure features, though few studies incorporate specific geographic detail into examinations of mobility. Little is understood, for example, about the specific paths people take in urban areas or the influence of neighborhoods on their activity. Detailed analysis of human activity has been limited by the sampling strategies employed by conventional data sources. New crowdsourced datasets, or data gathered from smartphone applications, present an opportunity to examine factors that influence human activity in ways that have not been possible before; they typically contain more detail and are gathered more frequently than conventional sources. Questions remain, however, about the utility and representativeness of crowdsourced data. The overarching aim of this dissertation research is to identify how crowdsourced data can be used to better understand human mobility. Bicycling activity is used as a case study because smartphone apps aimed at collecting bicycle routes are readily available and bicycling is understudied in comparison to other modes. The research contributes to the knowledge base on crowdsourced data and human mobility in three ways. First, it examines how conventional (e.g., counts, travel surveys) and crowdsourced data correspond in representing bicycling activity. Results identify where the data correspond and where they differ significantly, which has implications for using crowdsourced data in planning and policy decisions. Second, it examines the factors that influence cycling activity generated by smartphone cycling apps. The best predictors of activity were median weekly rent, percentage of residential land, and the number of people using two or more modes to commute in an area.
Finally, the third part of the dissertation seeks to understand the impact of bicycle lanes and bicycle ridership on residential housing prices. Results confirmed that bicycle lanes in the neighborhood of a home positively influence sale prices, though ridership was only marginally related to house price. This research demonstrates that crowdsourced data can inform us about smaller geographic areas and provide detail on where people bicycle, who uses bicycles, and the impact of the built environment on bicycling activity.
To determine whether disruption of the MMR pathway results in reduced conservation of methylated adenines and an increased tolerance for mutations that cause the loss or gain of GATC sites, we used whole-genome sequencing to survey individual clones isolated from experimentally evolving wild-type and MMR-deficient (mutL-; conferring a 150x increase in mutation rate) populations of E. coli. Initial analysis revealed a lack of mutations affecting methylation sites (GATC tetranucleotides) in wild-type clones. However, the inherently low mutation rate of the wild-type background renders this result inconclusive owing to a lack of statistical power and reveals the need for a more direct measure of changes in methylation status. Thus, as a first step toward comparative methylomics, we benchmarked four different methylation-calling pipelines on three biological replicates of the wild-type progenitor strain for our evolved populations.
While it is understood that these methylated sites play a role in the MMR pathway, the full extent of their effect on the genome is not understood. Thus, the goal of this thesis was to better understand the forces that maintain the genome, specifically concerning m6A within the GATC motif.
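To make the idea of a mutation causing the loss or gain of a GATC site concrete, here is a minimal sketch that checks how a single-base substitution changes the set of GATC positions in a sequence. The function names and toy sequences are hypothetical, and the sketch treats the sequence as linear (it ignores wraparound on a circular chromosome); it is not the analysis pipeline used in the thesis. Because GATC is its own reverse complement, scanning one strand is sufficient.

```python
MOTIF = "GATC"


def gatc_sites(seq):
    """Return the set of 0-based start positions of every GATC in seq."""
    seq = seq.upper()
    return {
        i for i in range(len(seq) - len(MOTIF) + 1)
        if seq[i:i + len(MOTIF)] == MOTIF
    }


def site_changes(seq, pos, alt):
    """Apply a single-base substitution at 0-based pos and report site changes.

    Returns (lost, gained): sorted start positions of GATC sites that are
    present only before, or only after, the substitution.
    """
    mutated = seq[:pos] + alt + seq[pos + 1:]
    before, after = gatc_sites(seq), gatc_sites(mutated)
    return sorted(before - after), sorted(after - before)


# Toy sequences, made up for illustration.
lost, gained = site_changes("AAGATCTT", 3, "G")    # A->G inside the motif
lost2, gained2 = site_changes("AAGACCTT", 4, "T")  # C->T creates a motif
```

Here the first substitution destroys the site starting at position 2, and the second creates one at the same position.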