Filtering by
- Creators: Barrett, The Honors College
Sentiment analysis, a notable method in text mining, can be used to extract sentiment from people’s opinions. It provides valuable insight into the public’s attitude toward a topic, which creates more opportunities for deeper analysis and prediction.
This thesis investigates the public’s sentiment towards Bitcoin by analyzing 10 million Bitcoin-related tweets and assigning sentiment scores to them, then using sentiment fluctuation as a factor to predict future cryptocurrency price fluctuation. Price prediction is achieved with a machine learning model called a Recurrent Neural Network, which automatically learns patterns and generates subsequent results using its memory of earlier inputs. The analysis reveals a slight connection between sentiment and cryptocurrency price, and the Neural Network model showed a strong connection between sentiment score and future price prediction.
As online media, including social media platforms, become the primary, go-to resource for traditional communication, news and information are more present and accessible to consumers than ever before. This research analyzes Twitter data on the ongoing Russian-Ukrainian War to understand the significance of social media during this period in comparison to previous conflicts. The relationship between social media and political conflict is examined through Twitter user analysis and sentiment analysis. This case study conducts sentiment analysis on a random sample of tweets from a given dataset, followed by user analysis and classification methods. The study explores the implications for understanding public opinion on the conflict, the strengths and limitations of Twitter as a data source, and the next steps for future research. Highlighting the implications of the research findings will allow consumers and political stakeholders to make more informed decisions in the future.
This project aims to incorporate sentiment analysis into traditional stock analysis to enhance stock rating predictions by drawing on opinions about various stocks from the Internet. Headlines from eight major news publications and conversations from Yahoo! Finance’s “Conversations” feature were parsed through the Valence Aware Dictionary and sEntiment Reasoner (VADER) natural language processing package to determine numerical polarities representing positivity or negativity for a given stock ticker. These generated polarities were paired with stock metrics typically observed by stock analysts to form the feature set for a Logistic Regression machine learning model. The model was trained on roughly 1500 major stocks to produce a binary classification of “Buy” or “Not Buy” for each stock, and the results of the model were inserted into the back end of the Agora Web UI, which emulates search engine behavior specifically for stocks listed on the NYSE and NASDAQ. The model reported an accuracy of 82.5%, and for most major stocks the model’s prediction correlated with stock analysts’ ratings. Given the volatility of the stock market and the propensity for hive-mind behavior in online forums, the performance of the Logistic Regression model would benefit from incorporating historical stock data and more sources of opinion to balance any subjectivity in the model.
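The pipeline described above (text polarity plus a stock metric fed into a logistic regression) can be sketched in a few lines. This is a minimal sketch under loud assumptions: the tiny word list is a hypothetical stand-in for VADER and does not reproduce its lexicon or rules, the single numeric metric and the four training rows are invented for illustration, and the classifier is a plain gradient-descent logistic regression rather than whatever library the project used.

```python
import math

# Hypothetical toy lexicon; NOT the VADER lexicon.
TOY_LEXICON = {
    "beats": 1.0, "surges": 1.0, "strong": 0.5,
    "misses": -1.0, "plunges": -1.0, "weak": -0.5,
}


def polarity(text):
    """Mean lexicon score of the known words in a headline (0.0 if none)."""
    scores = [TOY_LEXICON[w] for w in text.lower().split() if w in TOY_LEXICON]
    return sum(scores) / len(scores) if scores else 0.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for row, label in zip(rows, labels):
            err = sigmoid(sum(w * v for w, v in zip(weights, row)) + bias) - label
            for j in range(n_features):
                grad_w[j] += err * row[j]
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def rate(weights, bias, row):
    """Map the predicted probability to the binary rating."""
    z = sum(w * v for w, v in zip(weights, row)) + bias
    return "Buy" if sigmoid(z) >= 0.5 else "Not Buy"


# Each row: [headline polarity, one illustrative stock metric]; label 1 = "Buy".
training_rows = [
    [polarity("Acme beats estimates"), 0.1],
    [polarity("Globex posts strong quarter"), 0.3],
    [polarity("Initech misses guidance"), -0.1],
    [polarity("Umbrella outlook weak"), 0.0],
]
training_labels = [1, 1, 0, 0]
weights, bias = train_logistic(training_rows, training_labels)
```

Calling `rate(weights, bias, [polarity("Hooli surges"), 0.0])` then yields a rating for an unseen headline. With only a handful of rows this shows the data flow rather than any realistic accuracy.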
As I researched and conducted initial analysis for this project, I quickly ran into a few roadblocks that led me to pivot away from certain ideas and adapt my initial plans to fit what was actually being done in the current marketing environment. In reality, most businesses are not willing to take the risk of explicitly giving customers real metrics for their products and services. As a result, my thesis evolved into finding other ways that companies use logical appeals to represent their products, and into comparatively analyzing how these companies choose to represent themselves on a social media platform.
Since it doesn’t hurt to try using extracted feature values to improve a model (if things don’t work out, one can always fall back on the original features), the question arises: how could the results of feature extraction on values such as sentiment affect a model’s ability to predict the movement of the stock market? This paper attempts to shed some light on the answer by deriving TextBlob sentiment values from Twitter data and using Granger causality tests along with logistic and linear regression to test whether there exists a correlation or causal relationship between the stock market and features extracted from public sentiment.
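To make the Granger causality idea concrete, the sketch below computes the single-lag F statistic by hand: it compares a model that predicts y from its own previous value against one that also uses the previous value of x. This is an illustrative stdlib-only sketch, not the paper's procedure; the series are synthetic, the function names are hypothetical, and a real analysis would typically use several lags and a statistics package that also reports p-values.

```python
import random


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def rss(rows, targets):
    """Residual sum of squares of an OLS fit obtained from the normal equations."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))


def granger_f_lag1(x, y):
    """F statistic for the hypothesis that x helps predict y, using one lag."""
    targets = y[1:]
    restricted = [[1.0, y[t - 1]] for t in range(1, len(y))]
    unrestricted = [[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))]
    rss_r, rss_u = rss(restricted, targets), rss(unrestricted, targets)
    dof = len(targets) - 3  # observations minus unrestricted parameters
    return (rss_r - rss_u) / (rss_u / dof)


# Synthetic data: "sentiment" leads "market" by one step, plus small noise.
rng = random.Random(0)
sentiment = [rng.gauss(0, 1) for _ in range(200)]
market = [0.0] + [0.8 * sentiment[t - 1] + 0.1 * rng.gauss(0, 1)
                  for t in range(1, 200)]
f_sentiment_to_market = granger_f_lag1(sentiment, market)
f_market_to_sentiment = granger_f_lag1(market, sentiment)
```

A large F statistic in one direction and a small one in the other suggests that past sentiment carries predictive information about the market beyond the market's own history. That is predictive precedence, which is weaker than causation in the everyday sense.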