Matching Items (7)
Description
With recent advances in computational power, algorithmic trading has become one of the primary strategies for trading on the stock market. To understand why and how these strategies have been effective, this project examines the complete process of creating tools and applications to analyze and predict stock prices in order to perform low-frequency trading. The project is composed of three main components. The first component is integrating several public resources to acquire, process, and store financial trading data needed by the other components. The Alpha Vantage API, a free service, provides an accurate and comprehensive dataset of features for each stock ticker requested. The second component is researching, prototyping, and implementing various trading algorithms in code. We began by focusing on the Mean Reversion algorithm as a proof of concept for developing meaningful trading strategies and identifying patterns within our datasets. To augment our market prediction power ("alpha"), we implemented a Long Short-Term Memory (LSTM) recurrent neural network. Neural networks are an incredibly effective but often complex tool used frequently in data science when traditional methods are found lacking. The last component is to optimize, analyze, compare, and contrast all of the algorithms and identify key features in order to assess the overall effectiveness of each. We were able to identify conclusively which aspects of each algorithm provided better alpha and to create an entire pipeline that automates this process for live trading. An additional reason for automation is to provide an educational framework so that anyone interested in quantitative finance can leverage this project to gain further insight.
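The abstract does not include the thesis's actual implementation; as a rough illustration of the mean-reversion idea it describes, the sketch below emits a trading signal from a rolling z-score. All names, the window size, and the entry threshold are hypothetical choices for illustration, not values from the thesis.

```python
from collections import deque

def mean_reversion_signal(prices, window=20, z_entry=1.0):
    """Return a buy/sell/hold signal per price using a rolling z-score.

    Mean reversion assumes prices oscillate around a moving average:
    buy when the price is unusually far below it, sell when far above.
    """
    recent = deque(maxlen=window)
    signals = []
    for p in prices:
        recent.append(p)
        if len(recent) < window:
            signals.append("hold")  # not enough history yet
            continue
        mean = sum(recent) / window
        var = sum((x - mean) ** 2 for x in recent) / window
        std = var ** 0.5
        z = (p - mean) / std if std > 0 else 0.0
        if z < -z_entry:
            signals.append("buy")   # price far below its recent mean
        elif z > z_entry:
            signals.append("sell")  # price far above its recent mean
        else:
            signals.append("hold")
    return signals
```

A price that jumps well clear of a flat history triggers a signal, e.g. `mean_reversion_signal([100.0]*20 + [110.0])` ends in `"sell"`.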
ContributorsYurowkin, Alexander (Co-author) / Kumar, Rohit (Co-author) / Welfert, Bruno (Thesis director) / Li, Baoxin (Committee member) / Economics Program in CLAS (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Barrett, The Honors College (Contributor)
Created2019-05
Description
Uninformed people frequently kill snakes without knowing whether they are venomous or harmless, fearing for their safety. To prevent unnecessary killings and to encourage people to be safe around venomous snakes, proper identification is important. This work seeks to preserve wild native Arizona snakes and promote a general interest in them by using a bag-of-features approach for classifying native Arizona snakes in images as venomous or non-venomous. The image category classifier was implemented in MATLAB and trained on a set of 245 images of native Arizona snakes (171 non-venomous, 74 venomous). To test this approach, 10-fold cross-validation was performed, yielding an average accuracy of 0.7772. While this approach is functional, the results would need to improve, ideally with a higher average accuracy, to be reliable. In false positives, the extracted features may have been associated with color or pattern, which can be similar between venomous and non-venomous snakes due to mimicry. Polymorphic traits, color morphs, regional variation, and juveniles that exhibit different colors can cause false negatives and misclassification. Future work includes image pre-processing such as improving brightness and contrast or converting to grayscale, interactively specifying or generating regions of interest for feature detection, and reducing the false negative rate while improving the true positive rate. Further study with a larger, balanced image set is needed to evaluate performance. This work may serve as a tool for herpetologists, assisting in field research and in classifying large image sets.
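The thesis used MATLAB's image category classifier; as a language-neutral sketch of the core bag-of-features step it relies on (quantizing an image's local descriptors against a learned visual vocabulary), the following assumes a codebook has already been produced, e.g. by k-means over training descriptors. Function and variable names are illustrative, not from the thesis.

```python
import numpy as np

def bag_of_features_histogram(descriptors, codebook):
    """Quantize local feature descriptors into a normalized histogram.

    descriptors: (n, d) array of local features extracted from one image
    codebook:    (k, d) array of visual words (e.g. k-means centroids)
    Returns a length-k histogram: the image's bag-of-features vector,
    which a classifier (e.g. an SVM) can then label venomous or not.
    """
    # Squared Euclidean distance from every descriptor to every visual word
    dists = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = dists.argmin(axis=1)  # nearest visual word per descriptor
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()      # normalize so images of any size compare
```

For example, four descriptors split evenly between two visual words yield the histogram `[0.5, 0.5]`.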
ContributorsIp, Melissa A (Author) / Li, Baoxin (Thesis director) / Chandakkar, Parag (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created2017-05
Description

Standardization is sorely lacking in the field of musical machine learning. This thesis project endeavors to contribute to this standardization by training three machine learning models on the same dataset and comparing them using the same metrics. The music-specific metrics utilized provide more relevant information for diagnosing the shortcomings of each model.

ContributorsHilliker, Jacob (Author) / Li, Baoxin (Thesis director) / Libman, Jeffrey (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)
Created2021-12
Description
Background: As social media platforms continue to grow, the constantly increasing amount of freely available, user-generated data they receive becomes of great importance. One apparent use of this content is public health surveillance, such as increasing our understanding of substance abuse. In this study, Facebook was used to monitor nicotine addiction through the public support groups users can join to aid their quitting process. Objective: The main objective of this project was to gain a better understanding of the mechanisms of nicotine addiction online and to provide content analysis of Facebook posts obtained from "quit smoking" support groups. Methods: Using the Facebook Application Programming Interface (API) for Python, a sample of 9,970 posts was collected in October 2015, along with each user's name and the number of likes and comments their post received. The crawled posts were then manually classified by one annotator into one of three categories: positive, negative, or neutral, where positive posts describe current quits, negative posts discuss relapsing, and neutral posts are those not used to train the classifiers (posts where users had yet to attempt a quit, ads, random questions, etc.). For this project, the performance of two machine learning algorithms on a corpus of manually labeled Facebook posts was compared. The classification goal was to test the plausibility of a natural language processing classifier that could distinguish relapse (labeled negative) from quitting success (labeled positive) posts within a set of smoking-related posts. Results: Of the 9,970 manually labeled posts, 6,254 (62.7%) were labeled positive, 1,249 (12.5%) negative, and 2,467 (24.8%) neutral. Since neutral posts are irrelevant to the classification task, 7,503 posts were used to train the classifiers: 83.4% positive and 16.6% negative. The SVM classifier was 84.1% accurate and 84.1% precise, with a recall of 1 and an F-score of 0.914. The MNB classifier was 82.8% accurate and 82.8% precise, with a recall of 1 and an F-score of 0.906. Conclusions: The Facebook surveillance results give a small peek into the behavior of those looking to quit smoking. Ultimately, what makes Facebook a great tool for public health surveillance is its extremely large and diverse user base and its easily obtainable information. This, together with the fact that so many people are willing to use Facebook support groups to aid their quitting process, demonstrates that the platform can teach us a great deal about quitting and smoking behavior.
ContributorsMolina, Daniel Antonio (Author) / Li, Baoxin (Thesis director) / Tian, Qiongjie (Committee member) / School of Mathematical and Statistical Sciences (Contributor) / Barrett, The Honors College (Contributor)
Created2016-05
Description

Cornhole, traditionally seen as tailgate entertainment, has rapidly risen in popularity since the launch of the American Cornhole League (ACL) in 2016. However, the sport lacks robust quality control over large tournaments, since many matches are scored and refereed by the players themselves. In the past, entire competition brackets have had to be scrapped and replayed because scores were not handled correctly. The sport needs a supplementary scoring solution that can provide quality control and accuracy in large events where there aren't enough referees present to score every game. Drawing from the ACL regulations as well as personal experience and testimony from ACL Pro players, a list of requirements was generated for a potential automatic scoring system. A market analysis of existing scoring solutions was then conducted, finding no solution on the market that can automatically score a cornhole game. Using the problem requirements and previous attempts to solve the scoring problem, a list of concepts was generated and evaluated to determine which scoring system design to develop. After determining that the chosen concept was the best approach, the problem requirements and cornhole rules were further refined into a set of physical assumptions and constraints about the game itself, which informed the choice, structure, and implementation of the algorithms that score the bags. The prototype was then tested on its own, and areas of improvement were found. Lastly, based on the test results and what was learned from the engineering process, a roadmap was laid out for developing the automatic scoring system into a full, market-ready product.

ContributorsGillespie, Reagan (Author) / Sugar, Thomas (Thesis director) / Li, Baoxin (Committee member) / Barrett, The Honors College (Contributor) / Engineering Programs (Contributor) / Dean, W.P. Carey School of Business (Contributor)
Created2023-05
Description

Speedsolving, the art of solving twisty puzzles like the Rubik's Cube as fast as possible, has recently benefited from the arrival of smartcubes, which have special hardware for tracking the cube's face turns and transmitting them via Bluetooth. However, due to their embedded electronics, existing smartcubes cannot be used in competition, limiting their utility in personal speedcubing practice. This thesis proposes a sound-based design for tracking the face turns of a standard, non-smart speedcube, consisting of an audio-processing receiver in software and a small physical speaker configured as a transmitter. Special attention has been given to ensuring that installing the transmitter requires only a reversible centercap replacement on the original cube. This allows the cube to benefit from smartcube features during practice while still maintaining compliance with competition regulations. Within a controlled test environment, the software receiver perfectly detected a variety of transmitted move sequences. Furthermore, all components required for the physical transmitter were demonstrated to fit within the centercap of a Gans 356 speedcube.
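The abstract does not publish the thesis's actual audio encoding, so the sketch below illustrates one simple way a tone-per-move scheme could work: the transmitter emits a short pure tone per face turn, and the receiver recovers the move from the dominant frequency in the spectrum. The sample rate, tone duration, and move-to-frequency mapping are all invented for illustration.

```python
import numpy as np

SAMPLE_RATE = 8000   # Hz; hypothetical receiver sampling rate
TONE_SECONDS = 0.1   # duration of each move's tone
# Hypothetical face-to-frequency mapping (not from the thesis)
MOVE_TONES = {"R": 800.0, "L": 1000.0, "U": 1200.0,
              "D": 1400.0, "F": 1600.0, "B": 1800.0}

def encode_move(move):
    """Transmitter side: emit a short pure tone for one face turn."""
    t = np.arange(int(SAMPLE_RATE * TONE_SECONDS)) / SAMPLE_RATE
    return np.sin(2 * np.pi * MOVE_TONES[move] * t)

def decode_move(samples):
    """Receiver side: find the dominant frequency, map it back to a move."""
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / SAMPLE_RATE)
    peak = freqs[spectrum.argmax()]
    # Choose the move whose assigned tone is closest to the detected peak
    return min(MOVE_TONES, key=lambda m: abs(MOVE_TONES[m] - peak))
```

A real receiver would additionally need to segment the audio stream into tone bursts and tolerate noise; this round trip only shows the frequency-keying idea, e.g. `decode_move(encode_move("R"))` returns `"R"`.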

ContributorsHale, Joseph (Author) / Heinrichs, Robert (Thesis director) / Li, Baoxin (Committee member) / Barrett, The Honors College (Contributor) / Software Engineering (Contributor) / School of International Letters and Cultures (Contributor)
Created2022-05
Description
The scientific manuscript review stage is a key part of the modern scientific process. It involves rigorous evaluation of new papers by peers to assess the significance of their contributions in a particular area of study and to ensure that papers meet high standards. This process helps maintain the quality and credibility of research. However, some reviews can be toxic or overly discouraging, causing unintentional psychological harm (such as anxiety or depression) to paper authors and detracting from the constructive tone of the review space. This Thesis/Creative Project was completed alongside a capstone project that aims to address this issue. The goal is to fine-tune a Large Language Model (LLM) that can first accurately identify toxic sentences within a paper review and then revise them in a way that maintains the criticism but delivers it in a friendlier, more encouraging tone. To make this LLM usable, it requires a Graphical User Interface (GUI) so that end users (such as editors, associate editors, and reviewers) can easily interact with it, updating the wording of a review effectively while maintaining scientific integrity. While the GUI provides a user-friendly interface for interacting with the LLM, running an LLM application in a web-based framework poses technical challenges. LLMs are computationally expensive: they require significant GPU RAM, which can be a limiting factor, especially in a web-based framework with limited resources. One potential solution is model quantization, which reduces the memory footprint of the model; however, quantization can introduce model drift, as the model's performance may degrade, and this must be measured to ensure the model continues to produce accurate results.
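To make the GPU-memory argument concrete, a back-of-the-envelope sketch of why quantization helps: weight storage scales linearly with bits per weight. The 7-billion-parameter model size is a hypothetical example (the abstract does not name a model), and the estimate counts weights only, ignoring activations and the KV cache.

```python
def model_memory_gb(n_params, bits_per_weight):
    """Approximate weight-storage footprint of a model, in gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9

# A hypothetical 7-billion-parameter model:
fp16_gb = model_memory_gb(7e9, 16)  # 14.0 GB in half precision
int4_gb = model_memory_gb(7e9, 4)   # 3.5 GB after 4-bit quantization
```

The 4x reduction is what makes serving feasible on resource-limited web backends, at the cost of the potential accuracy drift the abstract notes must be measured.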
ContributorsRamalingame, Hari (Author) / Banerjee, Imon (Thesis director) / Li, Baoxin (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)
Created2024-05