Filtering by
- All Subjects: Machine Learning
- Creators: Computer Science and Engineering Program
- Member of: Theses and Dissertations
- Status: Published
In 2018, Google researchers published the BERT (Bidirectional Encoder Representations from Transformers) model, which has since served as a starting point for hundreds of NLP (Natural Language Processing) experiments and derivative models. BERT was pre-trained on masked-language modelling and next-sentence prediction, but its capabilities extend to more common NLP tasks, such as language inference and text classification. Naralytics is a company that seeks to use natural language to sort the users who write text into multiple categories, which is a modified version of classification. However, the texts that Naralytics seeks to draw from exceed the maximum input length of 512 tokens that BERT supports. This report therefore discusses research into several BERT derivatives that seek to address this problem, and then implements a solution that addresses the multiple concerns attached to this kind of model.
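To make the 512-token limit concrete, the following is a minimal sketch of one common workaround, overlapping sliding windows over the token IDs. It is not necessarily the approach the report implements. The function name `chunk_token_ids`, the `stride` default, and the assumption that two positions are reserved for special tokens (such as `[CLS]` and `[SEP]`) are illustrative choices, not details from the abstract.

```python
from typing import List


def chunk_token_ids(
    token_ids: List[int],
    max_len: int = 512,    # BERT's maximum input length, per the abstract
    stride: int = 128,     # assumed overlap between consecutive windows
    num_special: int = 2,  # assumed room for special tokens like [CLS]/[SEP]
) -> List[List[int]]:
    """Split a long token sequence into overlapping windows that fit the model."""
    window = max_len - num_special
    if window <= 0:
        raise ValueError("max_len must be larger than num_special")
    if not 0 <= stride < window:
        raise ValueError("stride must satisfy 0 <= stride < window")
    if not token_ids:
        return []

    step = window - stride  # how far each new window advances
    chunks: List[List[int]] = []
    start = 0
    while True:
        chunks.append(token_ids[start:start + window])
        if start + window >= len(token_ids):
            break  # this window already reaches the end of the text
        start += step
    return chunks


# Hypothetical document of 1200 token IDs, for illustration only.
example_chunks = chunk_token_ids(list(range(1200)))
print(len(example_chunks), [len(c) for c in example_chunks])
```

Each window would then be classified separately and the per-window predictions combined (for example by averaging). How to combine them is a design decision the abstract does not specify.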
Historically, the predominant strategy for evaluating baseball pitchers has been through statistics created directly from the offensive production against the pitcher, such as ERA. Such statistics are inherently relative to the abilities and competition level of the opposing offense and the field defense, which the pitcher has no control over, making it difficult to compare pitchers across leagues. In this paper, I use cutting-edge pitch-tracking data to develop a pitch evaluation model that is intrinsic to the attributes of the pitches themselves and is not influenced directly by the outcome of each individual pitch. I train four different classifiers to predict the probability of each pitch belonging to different subsets of outcomes, then multiply the probability of each outcome by that outcome’s average run value to arrive at an expected run value for the pitch. I compare the performance of each classifier to a baseline, examine the most impactful features, and compare the top pitchers identified by the model to those identified by a different baseball statistics resource, ultimately concluding that three of the four classification models are productive and that the overall intrinsic evaluation model accurately identifies the sport's top performers.
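The expected-run-value step described above is a probability-weighted sum, which the following minimal sketch illustrates. The outcome labels and all numbers are hypothetical placeholders, not values from the thesis, and the function name `expected_run_value` is an assumption made for illustration. The thesis combines the outputs of four classifiers; here the per-outcome probabilities are simply taken as given.

```python
from typing import Mapping


def expected_run_value(
    outcome_probs: Mapping[str, float],
    run_values: Mapping[str, float],
) -> float:
    """Return the sum over outcomes of P(outcome) * average run value."""
    total_prob = sum(outcome_probs.values())
    if abs(total_prob - 1.0) > 1e-6:
        raise ValueError(f"probabilities must sum to 1, got {total_prob}")
    missing = set(outcome_probs) - set(run_values)
    if missing:
        raise KeyError(f"no run value for outcomes: {sorted(missing)}")
    return sum(p * run_values[outcome] for outcome, p in outcome_probs.items())


# Hypothetical predicted probabilities for a single pitch.
pitch_probs = {"ball": 0.35, "strike": 0.45, "in_play": 0.20}
# Hypothetical average run values per outcome (negative favors the pitcher).
avg_run_values = {"ball": 0.05, "strike": -0.06, "in_play": 0.03}

pitch_value = expected_run_value(pitch_probs, avg_run_values)
print(round(pitch_value, 4))
```

Averaging this per-pitch value over all of a pitcher's pitches would give one possible pitcher-level score. Whether the thesis aggregates in this way is not stated in the abstract.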
This thesis project focuses on the creation and assessment of the "Simple Stocks" app, a straightforward investment tool developed for people who are new to investing and find the complexities of the stock market hard to understand. We identified a significant gap in the availability of easy-to-understand resources and information for beginner investors, which led us to design an app that provides clear and simple data, professional advice from financial analysts, and a machine learning feature that predicts stock trends. The "Simple Stocks" app also incorporates a voting feature, allowing users to see what other investors think about specific stocks. This functionality not only helps users make informed decisions but also encourages a sense of community, as users can learn from each other's experiences and opinions. By creating a supportive environment, the app promotes a more approachable and enjoyable experience for those who are new to investing. Following the successful release of the "Simple Stocks" app on the App Store, our current objectives include expanding the user base and exploring ways to generate income. One possible approach is to collaborate with other companies and establish an advertising-based revenue model, which would benefit both parties by attracting more users and increasing profits.
Quantum computing is an exciting area of research that harnesses quantum-mechanical phenomena such as superposition, interference, and entanglement to solve complex computing problems. One real-world application of quantum computing is machine learning. In this thesis, I explore how the choice of circuit ansatz and optimizer affects the performance of a variational quantum classifier tasked with binary classification.