RAProp: ranking tweets by exploiting the tweet/user/web ecosystem

Description

The increasing popularity of Twitter renders improved trustworthiness and relevance assessment of tweets much more important for search. However, given the limitations on the size of tweets, it is hard to extract measures for ranking from the tweet's content alone. I propose a method of ranking tweets by generating a reputation score for each tweet that is based not just on content, but also on additional information from the Twitter ecosystem that consists of users, tweets, and the web pages that tweets link to. This information is obtained by modeling the Twitter ecosystem as a three-layer graph. The reputation score is used to power two novel methods of ranking tweets by propagating the reputation over an agreement graph based on tweets' content similarity. Additionally, I show how the agreement graph helps counter tweet spam. An evaluation of my method on 16 million tweets from the TREC 2011 Microblog Dataset shows that it doubles the precision over baseline Twitter Search and achieves higher precision than the current state-of-the-art method. I present a detailed internal empirical evaluation of RAProp in comparison to several alternative approaches proposed by me, as well as an external evaluation in comparison to the current state-of-the-art method.
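The idea of propagating reputation over an agreement graph can be illustrated with a minimal sketch. This is not the thesis's actual algorithm: the Jaccard word-overlap measure, the `threshold` parameter, the single propagation step, and all function names are assumptions chosen only to show the general shape of the technique.

```python
def tokens(text):
    """Lowercased word set; a stand-in for a real tweet tokenizer."""
    return set(text.lower().split())


def agreement(a, b):
    """Jaccard similarity of two tweets' word sets (an assumed agreement measure)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def propagate(tweets, reputation, threshold=0.2):
    """One propagation step over the agreement graph.

    Each tweet keeps its own reputation and gains the agreement-weighted
    reputation of every other tweet whose similarity reaches `threshold`.
    """
    scores = []
    for i, tweet in enumerate(tweets):
        score = reputation[i]
        for j, other in enumerate(tweets):
            if i == j:
                continue
            weight = agreement(tweet, other)
            if weight >= threshold:
                score += weight * reputation[j]
        scores.append(score)
    return scores


def rank(tweets, reputation, threshold=0.2):
    """Return tweets ordered by propagated score, highest first."""
    scores = propagate(tweets, reputation, threshold)
    order = sorted(range(len(tweets)), key=lambda i: scores[i], reverse=True)
    return [tweets[i] for i in order]
```

In this sketch, a tweet that no other tweet agrees with receives no boost, which is one plausible reading of how an agreement graph could push isolated spam down the ranking even when its initial reputation is high.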

Contributors: Ravikumar, Srijith (Author) / Kambhampati, Subbarao (Thesis advisor) / Davulcu, Hasan (Committee member) / Liu, Huan (Committee member) / Arizona State University (Publisher)

Created: 2013

Event analytics on social media: challenges and solutions

Description

Social media platforms such as Twitter, Facebook, and blogs have emerged as valuable (in fact, the de facto) virtual town halls for people to discover, report, share, and communicate with others about various types of events. These events range from widely known events such as the U.S. presidential debate to smaller-scale, local events such as a local Halloween block party. During these events, we often witness a large amount of commentary contributed by crowds on social media. This burst of social media responses surges with the "second-screen" behavior and greatly enriches the user experience when interacting with the event and people's awareness of an event. Monitoring and analyzing this rich and continuous flow of user-generated content can yield unprecedentedly valuable information about the event, since these responses usually offer far richer and more powerful views of the event than mainstream news can provide. Despite these benefits, social media also tends to be noisy, chaotic, and overwhelming, posing challenges to users in seeking and distilling high-quality content from that noise.

In this dissertation, I explore ways to leverage social media as a source of information and analyze events based on their social media responses collectively. I develop, implement, and evaluate EventRadar, an event analysis toolbox which is able to identify, enrich, and characterize events using the massive amounts of social media responses. EventRadar contains three automated, scalable tools to handle three core event analysis tasks: Event Characterization, Event Recognition, and Event Enrichment. More specifically, I develop ET-LDA, a Bayesian model, and SocSent, a matrix factorization framework, for handling the Event Characterization task, i.e., characterizing an event in terms of its topics and its audience's response behavior (via ET-LDA), and the sentiments regarding its topics (via SocSent). I also develop DeMa, an unsupervised event detection algorithm for handling the Event Recognition task, i.e., detecting trending events from a stream of noisy social media posts. Last, I develop CrowdX, a spatial crowdsourcing system for handling the Event Enrichment task, i.e., gathering additional first-hand information (e.g., photos) from the field to enrich the given event's context.

Enabled by EventRadar, it is more feasible to uncover patterns that have not been explored previously and to re-validate existing social theories with new evidence. As a result, I am able to gain deep insights into how people respond to the events that they are engaged in. The results reveal several key insights into people's various responding behaviors over the event's timeline, such as the finding that the topical context of people's tweets does not always correlate with the timeline of the event. In addition, I also explore the factors that affect a person's engagement with real-world events on Twitter and find that people engage in an event because they are interested in the topics pertaining to that event; and while engaging, their engagement is largely affected by their friends' behavior.

Contributors: Hu, Yuheng (Author) / Kambhampati, Subbarao (Thesis advisor) / Horvitz, Eric (Committee member) / Krumm, John (Committee member) / Liu, Huan (Committee member) / Sundaram, Hari (Committee member) / Arizona State University (Publisher)

Created: 2014

Modeling Fantasy Baseball Player Popularity Using Twitter Activity

Description

Social media is used by people every day to discuss the nuances of their lives. Major League Baseball (MLB) is a popular sport in the United States, and as such has generated a great deal of activity on Twitter. As fantasy baseball continues to grow in popularity, so does the research into better algorithms for picking players. Most of the research done in this area focuses on improving the prediction of a player's individual performance. However, the crowd-sourcing power afforded by social media may enable more informed predictions about players' performances. Most amateur gamblers choose players based on popularity and personal preference. While some of these trends (particularly the long-term ones) are captured by ranking systems, this research focused on predicting the daily spikes in popularity (and therefore price or draft order) by comparing the number of mentions a player received on Twitter with that player's previous mention counts. In doing so, it was demonstrated that improved fantasy baseball predictions can be made by leveraging social media data.
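Comparing a player's current mention count with previous counts can be sketched as a simple ratio against a trailing average. The thesis's actual features and model are not given in this abstract, so the 7-day window, the ratio itself, and the function names below are assumptions for illustration only.

```python
from statistics import mean


def spike_ratio(daily_mentions, window=7):
    """Latest day's mentions divided by the mean of the preceding days.

    `daily_mentions` is a chronological list of per-day mention counts;
    at most `window` earlier days form the baseline (an assumed choice).
    """
    if len(daily_mentions) < 2:
        raise ValueError("need at least two days of counts")
    history = daily_mentions[-(window + 1):-1]
    baseline = mean(history)
    if baseline == 0:
        # No earlier mentions: any activity today counts as an unbounded spike.
        return float("inf") if daily_mentions[-1] > 0 else 1.0
    return daily_mentions[-1] / baseline


def rank_by_spike(mentions_by_player, window=7):
    """Order player names by spike ratio, largest spike first."""
    return sorted(
        mentions_by_player,
        key=lambda player: spike_ratio(mentions_by_player[player], window),
        reverse=True,
    )
```

A relative measure of this kind favors a little-discussed player whose mentions triple over a star whose already high count rises slightly, which matches the abstract's focus on daily spikes over long-term popularity.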

Contributors: Ruskin, Lewis John (Author) / Liu, Huan (Thesis director) / Montgomery, Douglas (Committee member) / Morstatter, Fred (Committee member) / Industrial, Systems (Contributor, Contributor) / Barrett, The Honors College (Contributor)

Created: 2017-05

Categorizing and Discovering Social Bots

Description

Bots tamper with social media networks by artificially inflating the popularity of certain topics. In this paper, we define what a bot is, we detail different motivations for bots, we describe previous work in bot detection and observation, and then we perform bot detection of our own. For our bot detection, we are interested in bots on Twitter that tweet Arabic extremist-like phrases. A testing dataset is collected using the honeypot method, and five different heuristics are measured for their effectiveness in detecting bots. The model underperformed, but we have laid the groundwork for a vastly untapped focus in bot detection: extremist ideal diffusion through bots.

Contributors: Karlsrud, Mark C. (Author) / Liu, Huan (Thesis director) / Morstatter, Fred (Committee member) / Barrett, The Honors College (Contributor) / Computing and Informatics Program (Contributor) / Computer Science and Engineering Program (Contributor) / School of Mathematical and Statistical Sciences (Contributor)

Created: 2015-05

Analysis of Twitter's Effect on Stock Prices

Description

Twitter has become a very popular social media site that is used daily by many people and organizations. This paper focuses on the financial aspect of Twitter and presents a process for mining data about specific companies' stock prices. This was done by writing a program to collect tweets about the stocks of the thirty companies in the Dow Jones.

Contributors: Larson, Grant Elliott (Author) / Davulcu, Hasan (Thesis director) / Ye, Jieping (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)

Created: 2014-05

Analysis of BoostOR: A Twitter Bot Detection Classification Algorithm

Description

The prevalence of bots, or automated accounts, on social media is a well-known problem. Some of the ways bots harm social media users include, but are not limited to, spreading misinformation, influencing topic discussions, and dispersing harmful links. Bots have affected the field of disaster relief on social media as well. These bots cause problems such as preventing rescuers from determining credible calls for help, spreading fake news and other malicious content, and generating large amounts of content which burdens rescuers attempting to provide aid in the aftermath of disasters. To address these problems, this research seeks to detect bots participating in disaster event related discussions and increase the recall, or number of bots removed from the network, of Twitter bot detection methods. The removal of these bots will also prevent human users from accidentally interacting with these bot accounts and being manipulated by them. To accomplish this goal, an existing bot detection classification algorithm known as BoostOR was employed. BoostOR is an ensemble learning algorithm originally modeled to increase bot detection recall in a dataset and it has the possibility to solve the social media bot dilemma where there may be several different types of bots in the data. BoostOR was first introduced as an adjustment to existing ensemble classifiers to increase recall. However, after testing the BoostOR algorithm on unobserved datasets, results showed that BoostOR does not perform as expected. This study attempts to improve the BoostOR algorithm by comparing it with a baseline classification algorithm, AdaBoost, and then discussing the intentional differences between the two. Additionally, this study presents the main factors which contribute to the shortcomings of the BoostOR algorithm and proposes a solution to improve it. These recommendations should ensure that the BoostOR algorithm can be applied to new and unobserved datasets in the future.
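Because the study centers on recall, a short sketch of how recall and precision are computed from binary bot labels may help. This is the standard definition, not BoostOR itself; the label convention (1 = bot, 0 = human) and the function names are assumptions made for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Return (true positives, false positives, false negatives) for label 1 = bot."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, fn


def recall(y_true, y_pred):
    """Fraction of actual bots that the classifier flagged."""
    tp, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(y_true, y_pred):
    """Fraction of flagged accounts that are actually bots."""
    tp, fp, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0
```

Reporting precision alongside recall matters when comparing classifiers, since flagging every account as a bot yields perfect recall while also flagging every human user.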

Contributors: Davis, Matthew William (Author) / Liu, Huan (Thesis director) / Nazer, Tahora H. (Committee member) / Computer Science and Engineering Program (Contributor, Contributor) / Department of Information Systems (Contributor) / Barrett, The Honors College (Contributor)

Created: 2018-12

Analysis of the Aftereffects of Terror Attacks on Social Media

Description

Social media has become a direct and effective means of transmitting personal opinions into cyberspace. The use of certain keywords and their connotations in tweets portrays a meaning that goes beyond the screen and affects behavior. During terror attacks or worldwide crises, people turn to social media as a means of managing their anxiety, a mechanism of Terror Management Theory (TMT). These opinions have distinct impacts on the emotions that people express both online and offline through both positive and negative sentiments. This paper focuses on using sentiment analysis on Twitter hashtags during five major terrorist attacks that created a significant response on social media, which collectively show the effects that 140-character tweets have on perceptions in social media. Analyzing the sentiments of tweets after terror attacks allows for the visualization of the effect of keywords and the possibility of manipulation through emotional contagion. Through sentiment analysis, positive, negative, and neutral emotions were portrayed in the tweets. The keywords detected also portray characteristics of terror attacks, which would allow for future analysis and predictions with regard to propagating a specific emotion on social media during future crises.

Contributors: Harikumar, Swathikrishna (Author) / Davulcu, Hasan (Thesis director) / Bodford, Jessica (Committee member) / Computer Science and Engineering Program (Contributor) / Department of Information Systems (Contributor) / Barrett, The Honors College (Contributor)

Created: 2016-12

Section 230 Reform: A Mirror into the Divisive Socio-Political Landscape in America

Description

Over the past couple of years, the focus on the prevalence of hate speech and misinformation on the internet has increased. Lawmakers feel that repealing or reforming Section 230 of the Communications Decency Act is the way to go, considering that the law has been used to protect companies from any liability in the past. In this podcast series, I will be explaining what Section 230 is, how it affects us, and what changes are being proposed. In doing so, I wish to shed light on how the problems of the internet are not solely in the hands of social media giants and a 26-word-long law, but also in the hands of all the users who make up our global community.

Contributors: Avi, Pratyush (Author) / Schmidt, Peter (Thesis director) / Voorhees, Matthew (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2021-05

Twitter Patterns in the Politics of Social Mobilization: #BlackLivesMatter Case Study

Description

The role of technology in shaping modern society has become increasingly important in the context of current democratic politics, especially when examined through the lens of social media. Twitter is a prominent social media platform used as a political medium, contributing to political movements such as #OccupyWallStreet, #MeToo, and #BlackLivesMatter. Using the #BlackLivesMatter movement as an illustrative case to establish patterns in Twitter usage, this thesis aims to answer the question “to what extent is Twitter an accurate representation of ‘real life’ in terms of performative activism and user engagement?” The discussion of Twitter is contextualized by research on Twitter’s use in politics, both as a mobilizing force and in its potential to divide and mislead. Using intervals of time between 2014 and 2020, Twitter data containing #BlackLivesMatter is collected and analyzed. The discussion of findings centers on the role of performative activism in social mobilization on Twitter. The analysis shows patterns in the data that indicate performative activism can skew the real picture of civic engagement, which can impact the way in which public opinion affects future public policy and mobilization.

Contributors: Tutelman, Laura (Author) / Voorhees, Matthew (Thesis director) / Kawski, Matthias (Committee member) / School of Mathematical and Statistical Sciences (Contributor) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2021-05

PayPal - Social Injustice Index

Description

Social injustice issues are familiar, yet very difficult to define. This is because they are difficult to predict and tough to understand. Injustice issues negatively affect communities because they directly violate human rights and they span a wide range of areas. For instance, injustice issues can relate to unfair labor practices, racism, gender bias, politics, etc. This leaves numerous individuals wondering how they can make sense of social injustice issues and perhaps make efforts to stop them from occurring in the future. In an attempt to understand the rather complicated nature of social injustice, this thesis takes a data-driven approach to define a social injustice index for a specific country, India. The thesis is an attempt to quantify and track social injustice through social media to see the current social climate. This was accomplished by developing a web scraper to collect hate speech data from Twitter. The tweets collected were then classified by their level of hate and presented on a choropleth map of India. Ultimately, a user viewing the ‘India Social Injustice Index’ map should be able to simply view an index score for a desired state in India through a single click. This thesis hopes to make it simple for any user viewing the social injustice map to make better sense of injustice issues.
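Turning classified tweets into a per-state index score can be sketched as a simple aggregation. The abstract does not say how the index is computed, so the integer hate levels (0 to `max_level`), the 0-100 scaling, and the function name below are hypothetical choices for illustration only.

```python
from collections import defaultdict


def injustice_index(classified_tweets, max_level=2):
    """Map each state to a 0-100 score: mean hate level scaled by `max_level`.

    `classified_tweets` is an iterable of (state, level) pairs, where `level`
    is an integer from 0 (no hate) to `max_level` (an assumed scale).
    """
    totals = defaultdict(lambda: [0, 0])  # state -> [sum of levels, tweet count]
    for state, level in classified_tweets:
        totals[state][0] += level
        totals[state][1] += 1
    return {
        state: 100.0 * level_sum / (count * max_level)
        for state, (level_sum, count) in totals.items()
    }
```

A dictionary keyed by state name is a convenient shape for this output because choropleth mapping tools typically join a region identifier to a single numeric value.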

Contributors: Deosthali, Shefali (Author) / Chavez-Echeagaray, Maria Elena (Thesis director) / Mathews, Nicolle (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)

Created: 2022-05
