Genre: Academic theses

Discovering and Mitigating Social Data Bias

Description

Exabytes of data are created online every day. Nowhere is this deluge more apparent than on social media. Naturally, finding ways to leverage this unprecedented source of human information is an active area of research. Social media platforms have become laboratories for conducting experiments about people at scales thought unimaginable only a few years ago.

Researchers and practitioners use social media to extract actionable patterns, such as where aid should be distributed in a crisis. However, the validity of these patterns relies on having a representative dataset. As this dissertation shows, the data collected from social media is seldom representative of the activity of the site itself, and even less so of human activity. This means that the results of many studies are limited by the quality of the data they collect.

The finding that social media data is biased motivates the main challenge addressed by this thesis. I introduce three sets of methodologies to correct for bias. First, I design methods to deal with data collection bias. I offer a methodology that finds bias within a social media dataset by comparing the collected data stream with other sources. The dissertation also outlines a data collection strategy, including a crawling strategy, that minimizes the amount of bias in the resulting dataset. Second, I introduce a methodology to identify bots and shills within a social media dataset. This directly addresses the concern that the users of a social media site are not representative. Applying these methodologies allows the population under study on a social media site to better match that of the real world. Finally, the dissertation discusses perceptual biases, explains how they affect analysis, and introduces computational approaches to mitigate them.

The results of the dissertation allow for the discovery and removal of different levels of bias within a social media dataset. This has important implications for social media mining, namely that the behavioral patterns and insights extracted from social media will be more representative of the populations under study.

Contributors: Morstatter, Fred (Author) / Liu, Huan (Thesis advisor) / Kambhampati, Subbarao (Committee member) / Maciejewski, Ross (Committee member) / Carley, Kathleen M. (Committee member) / Arizona State University (Publisher)

Created: 2017

Action, Prediction, or Attention: Does the “Egocentric Temporal Order Bias” Support a Constructive Model of Perception?

Description

Temporal-order judgments can require integration of self-generated action-events and external sensory information. A previous study found that participants are biased to perceive their own action-events as occurring prior to simultaneous external events. This phenomenon, named the “Egocentric Temporal Order Bias”, or ETO bias, was demonstrated as a 67% probability that participants report self-generated events as occurring prior to simultaneous externally-determined events. These results were interpreted as supporting a feed-forward, constructive model of perception. However, the empirical data could support many potential mechanisms. The present study tests whether the ETO bias is driven by attentional differences, feed-forward predictability, or action. The findings indicate that participants exhibit a bias due to both feed-forward predictability and action, and a Bayesian analysis supports that these effects are quantitatively unique. Therefore, the results indicate that the ETO bias is largely driven by one’s own action, over and above feed-forward predictability.

Contributors: Tang, Tim (Author) / Mcbeath, Michael K (Thesis advisor) / Brewer, Gene A. (Committee member) / Sanabria, Federico (Committee member) / Arizona State University (Publisher)

Created: 2020