Filtering by
- Creators: School of Mathematical and Statistical Sciences
- Resource Type: Text
(LC-MS/MS) is used to identify and quantify peptides and proteins. LC-MS/MS produces mass spectra, which must be searched by one or more engines, which employ
algorithms to match spectra to theoretical spectra derived from a reference database.
These engines identify and characterize proteins and their component peptides. By
training a convolutional neural network on a dataset of over 6 million MS/MS spectra
derived from human proteins, we aim to create a tool that can quickly and effectively
identify spectra as peptides prior to database searching. This can significantly reduce search space and thus run time for database searches, thereby accelerating LCMS/MS-based proteomics data acquisition. Additionally, by training neural networks
on labels derived from the search results of three different database search engines, we
aim to examine and compare which features are best identified by individual search
engines, a neural network, or a combination of these.
With the rapid increase of technological capabilities, particularly in processing power and speed, the usage of machine learning is becoming increasingly widespread, especially in fields where real-time assessment of complex data is extremely valuable. This surge in popularity of machine learning gives rise to an abundance of potential research and projects on further broadening applications of artificial intelligence. From these opportunities comes the purpose of this thesis. Our work seeks to meaningfully increase our understanding of current capabilities of machine learning and the problems they can solve. One extremely popular application of machine learning is in data prediction, as machines are capable of finding trends that humans often miss. Our effort to this end was to examine the CVE dataset and attempt to predict future entries with Random Forests. The second area of interest lies within the great promise being demonstrated by neural networks in the field of autonomous driving. We sought to understand the research being put out by the most prominent bodies within this field and to implement a model on one of the largest standing datasets, Berkeley DeepDrive 100k. This thesis describes our efforts to build, train, and optimize a Random Forest model on the CVE dataset and a convolutional neural network on the Berkeley DeepDrive 100k dataset. We document these efforts with the goal of growing our knowledge on (and usage of) machine learning in these topics.
For my Honors Thesis, I decided to create an Artificial Intelligence Project to predict Fantasy NFL Football Points of players and team's defense. I created a Tensorflow Keras AI Regression model and created a Flask API that holds the AI model, and a Django Try-It Page for the user to use the model. These services are hosted on ASU's AWS service. In my Flask API, it actively gathers data from Pro-Football-Reference, then calculates the fantasy points. Let’s say the current year is 2022, then the model analyzes each player and trains on all data from available from 2000 to 2020 data, tests the data on 2021 data, and predicts for 2022 year. The Django Website asks the user to input the current year, then the user clicks the submit button runs the AI model, and the process explained earlier. Next, the user enters the player's name for the point prediction and the website predicts the last 5 rows with 4 being the previous fantasy points and the 5th row being the prediction.