Categorizing and Discovering Social Bots

Description

Bots tamper with social media networks by artificially inflating the popularity of certain topics. In this paper, we define what a bot is, detail different motivations for bots, describe previous work in bot detection and observation, and then perform bot detection of our own. For our bot detection, we are interested in bots on Twitter that tweet Arabic extremist-like phrases. A testing dataset is collected using the honeypot method, and five different heuristics are measured for their effectiveness in detecting bots. The model underperformed, but we have laid the groundwork for a largely untapped focus in bot detection: the diffusion of extremist ideals through bots.

Contributors: Karlsrud, Mark C. (Author) / Liu, Huan (Thesis director) / Morstatter, Fred (Committee member) / Barrett, The Honors College (Contributor) / Computing and Informatics Program (Contributor) / Computer Science and Engineering Program (Contributor) / School of Mathematical and Statistical Sciences (Contributor)

Created: 2015-05

Predicting Trends on Twitter with Time Series Analysis

Description

Twitter, the microblogging platform, has grown in prominence to the point that the topics that trend on the network are often the subject of the news and other traditional media. By predicting trends on Twitter, it could be possible to predict the next major topic of interest to the public. With this motivation, this paper develops a model for trends leveraging previous work with k-nearest-neighbors and dynamic time warping. The development of this model provides insight into the length and features of trends, and the model successfully generalizes to identify 74.3% of trends in the time period of interest. The model developed in this work helps explain why particular words trend on Twitter.
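For readers unfamiliar with the two techniques named in this abstract, the following is a minimal sketch of how k-nearest-neighbors can be combined with a dynamic time warping (DTW) distance to label a time series. It is not the thesis's model: the function names, the absolute-difference cost, the value of `k`, and the toy series are all assumptions made for illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences.

    Uses |x - y| as the local cost and allows the usual three moves
    (match, insertion, deletion). Runs in O(len(a) * len(b)) time.
    """
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] aligned with an earlier b element too
                cost[i][j - 1],      # b[j-1] aligned with an earlier a element too
                cost[i - 1][j - 1],  # one-to-one step
            )
    return cost[n][m]


def knn_label(query, labeled_series, k=3):
    """Majority label among the k series closest to `query` under DTW."""
    ranked = sorted(labeled_series, key=lambda item: dtw_distance(query, item[0]))
    votes = [label for _, label in ranked[:k]]
    return max(set(votes), key=votes.count)


# Hypothetical hourly mention counts, invented for this example only.
toy_series = [
    ([0, 1, 5, 9], "trend"),
    ([0, 2, 6, 9], "trend"),
    ([1, 1, 1, 1], "flat"),
    ([2, 2, 2, 2], "flat"),
]
```

DTW lets two series that rise at slightly different times still count as similar, which is why it is often paired with nearest-neighbor methods for time series.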

Contributors: Marshall, Grant A (Author) / Liu, Huan (Thesis director) / Morstatter, Fred (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor)

Created: 2015-05

Comparison of sentiment analysis systems and an application in signed link prediction

Description

Social media sites are platforms on which individuals discuss a wide range of topics and share a huge amount of information about themselves and their interests. Much of this information is encoded in the unstructured text that users post on these types of sites. A considerable amount of work has been done on sentiment analysis on these sites to infer users' opinions and preferences. However, there is a gap: it may be difficult to infer how a user feels about particular pages or topics for which they have not conveyed their sentiment in an observable form. Collaborative filtering is a common method used to solve this problem with user data, but it has only infrequently been used with sentiment information to make inferences about users' preferences. In this paper we extend previous work on leveraging sentiment in collaborative filtering, specifically to approximate user sentiment and subsequently their vote for candidates in an online election. Sentiment is shown to be an effective tool for making these types of predictions in the absence of other, more explicit user preference information. In addition, we present an evaluation of sentiment analysis methods and tools that are used in state-of-the-art sentiment analysis systems in order to understand which of these methods to leverage in our experiments.

Contributors: Baird, James Daniel (Author) / Liu, Huan (Thesis director) / Wang, Suhang (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2018-05

An Assessment of the Performance of Machine Learning Techniques When Applied to Trajectory Optimization

Description

Prior research has confirmed that supervised learning is an effective alternative to computationally costly numerical analysis. Motivated by NASA's use of abort scenario matrices to aid in mission operations and planning, this paper applies supervised learning to trajectory optimization in an effort to assess the accuracy of a less time-consuming method of producing the magnitude of delta-v vectors required to abort from various points along a Near Rectilinear Halo Orbit. Although the utility of the study is limited, the delta-v predictions made by a Gaussian regression model are fairly accurate and are produced after a relatively short computation time, paving the way for more concentrated studies of this nature in the future.

Contributors: Smallwood, Sarah Lynn (Author) / Peet, Matthew (Thesis director) / Liu, Huan (Committee member) / Mechanical and Aerospace Engineering Program (Contributor) / School of Earth and Space Exploration (Contributor) / Barrett, The Honors College (Contributor)

Created: 2018-05

Analysis of BoostOR: A Twitter Bot Detection Classification Algorithm

Description

The prevalence of bots, or automated accounts, on social media is a well-known problem. Some of the ways bots harm social media users include, but are not limited to, spreading misinformation, influencing topic discussions, and dispersing harmful links. Bots have affected the field of disaster relief on social media as well. These bots cause problems such as preventing rescuers from determining credible calls for help, spreading fake news and other malicious content, and generating large amounts of content which burdens rescuers attempting to provide aid in the aftermath of disasters. To address these problems, this research seeks to detect bots participating in disaster event related discussions and increase the recall, or number of bots removed from the network, of Twitter bot detection methods. The removal of these bots will also prevent human users from accidentally interacting with these bot accounts and being manipulated by them. To accomplish this goal, an existing bot detection classification algorithm known as BoostOR was employed. BoostOR is an ensemble learning algorithm originally modeled to increase bot detection recall in a dataset and it has the possibility to solve the social media bot dilemma where there may be several different types of bots in the data. BoostOR was first introduced as an adjustment to existing ensemble classifiers to increase recall. However, after testing the BoostOR algorithm on unobserved datasets, results showed that BoostOR does not perform as expected. This study attempts to improve the BoostOR algorithm by comparing it with a baseline classification algorithm, AdaBoost, and then discussing the intentional differences between the two. Additionally, this study presents the main factors which contribute to the shortcomings of the BoostOR algorithm and proposes a solution to improve it. These recommendations should ensure that the BoostOR algorithm can be applied to new and unobserved datasets in the future.

Contributors: Davis, Matthew William (Author) / Liu, Huan (Thesis director) / Nazer, Tahora H. (Committee member) / Computer Science and Engineering Program (Contributor) / Department of Information Systems (Contributor) / Barrett, The Honors College (Contributor)

Created: 2018-12

Using Machine Learning Models to Detect Fake News, Bots, and Rumors on Social Media

Description

In this paper, I introduce the fake news problem and detail how it has been exacerbated through social media. I explore current practices for fake news detection using natural language processing and current benchmarks in ranking the efficacy of various language models. Using a Twitter-specific benchmark, I attempt to reproduce the scores of six language models demonstrating their effectiveness in seven tweet classification tasks. I explain the successes and challenges in reproducing these results and provide analysis for the future implications of fake news research.

Contributors: Chang, Ariz Bay (Author) / Liu, Huan (Thesis director) / Tahir, Anique (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2021-05

Twitch Streamer-Game Recommender System

Description

Abstract
Matrix factorization techniques have been proven to be more effective in recommender systems than standard user-based or item-based methods. Using this knowledge, Funk SVD and SVD++ are compared by the accuracy of their predictions on Twitch streamer data.

Introduction
As watching video games becomes more popular, viewers are increasingly drawn to Twitch.tv, an online platform where guests watch streamers play video games and interact with them. A streamer is a person who broadcasts themselves playing a video game, or doing some other activity, for an audience (the guests of the website). The site lets the guest first select the game or category to view and then displays the currently active streamers for the guest to select and watch. Twitch records the games that a streamer plays along with the amount of time that the streamer spends streaming each game. This is how the score is generated for a streamer's game. These three terms form the streamer-game-score (user-item-rating) tuples that we use to train our models.
Our problem is similar to that of the Netflix Prize; however, instead of suggesting a movie to a user, the goal is to suggest a game to a user. We built a model to predict the score that a streamer will have for a game. The score field in our data is fundamentally different from a movie rating on Netflix because a user influences a game's score by actively streaming it, not by giving it a score based on opinion. The dataset used is the Twitch.tv dataset provided by Isaac Jones [1]. The only data used in training the models is in the form of the streamer-game-score (user-item-rating) tuples. The question is whether these data points, with their limited information, can give an accurate prediction of a streamer's score for a game. SVD and SVD++ are the bases of the models being trained and tested. The Surprise library (scikit-surprise) for Python 3 is used to implement the models.
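As a rough illustration of how a Funk SVD style model can be fit to streamer-game-score tuples, here is a minimal sketch in plain Python using stochastic gradient descent. It is not the thesis's implementation (which uses the Surprise library) and does not call that library; the function names, hyperparameters, and the `toy_scores` data are assumptions invented for this example.

```python
import random


def funk_svd(ratings, n_factors=2, n_epochs=300, lr=0.01, reg=0.02, seed=0):
    """Fit a biased matrix factorization by SGD and return a predict function.

    `ratings` is a list of (user, item, rating) tuples. The prediction is
    global_mean + user_bias + item_bias + dot(user_factors, item_factors).
    """
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    mu = sum(r for _, _, r in ratings) / len(ratings)
    bu = {u: 0.0 for u in users}
    bi = {i: 0.0 for i in items}
    # Small random initial factors so the SGD updates can break symmetry.
    p = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    q = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in items}

    def predict(u, i):
        # Unknown users or items fall back to the global mean plus known bias.
        dot = sum(a * b for a, b in zip(p.get(u, []), q.get(i, [])))
        return mu + bu.get(u, 0.0) + bi.get(i, 0.0) + dot

    for _ in range(n_epochs):
        for u, i, r in ratings:
            err = r - predict(u, i)
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            for f in range(n_factors):
                pu, qi = p[u][f], q[i][f]
                p[u][f] += lr * (err * qi - reg * pu)
                q[i][f] += lr * (err * pu - reg * qi)
    return predict


def rmse(predict, ratings):
    """Root mean squared error of `predict` over (user, item, rating) tuples."""
    sq = [(r - predict(u, i)) ** 2 for u, i, r in ratings]
    return (sum(sq) / len(sq)) ** 0.5


# Hypothetical streamer-game-score tuples, invented for illustration.
toy_scores = [
    ("s1", "g1", 5.0), ("s1", "g2", 4.0), ("s1", "g3", 1.0),
    ("s2", "g1", 4.0), ("s2", "g2", 5.0), ("s2", "g3", 1.0),
    ("s3", "g1", 1.0), ("s3", "g3", 5.0),
]
model = funk_svd(toy_scores)
```

Calling `model("s3", "g2")` then gives a predicted score for a streamer-game pair that does not appear in the training tuples. SVD++ extends this form with implicit-feedback terms, which this sketch omits.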

Contributors: Aitken, Connor Dalton (Author) / Liu, Huan (Thesis director) / Jones, Isaac (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2019-05
