Matching Items (2)

Description

Created predictive models in R to identify the significant variables that help predict whether an applicant will default on a loan, using a data set of almost 900,000 loan applicants.

Contributors: Mazza, Rachel Marie (Author) / Schneider, Laurence (Thesis director) / Sha, Xiqing (Committee member) / School of Accountancy (Contributor) / Department of Information Systems (Contributor) / Barrett, The Honors College (Contributor)
Created: 2021-05
Description

Sports analytics refers to the implementation of data science and analytics techniques within the sports industry. Several sports analysts and team managers have utilized analytical tools to boost overall team and player performance, often through the analysis of historical data. One of the most common techniques employed in sports analytics is data mining, the extensive practice of analyzing data in order to extract and deliver insights and findings. Data mining projects are frequently guided by the six-step Cross Industry Standard Process for Data Mining (CRISP-DM) framework. One sport that has extensively used data science and analytics, and data mining specifically, is Formula One (F1). Given the sport’s reliance on technology, race engineers working for F1 constructors often develop statistical models that analyze historical race performance to derive insight into drivers’ success. For the purposes of this project, the perspective of a race engineer working for the F1 constructor McLaren was considered. As the constructor is seeking to gain a competitive advantage for the upcoming F1 season, race performance data from previous seasons was collected and analyzed as part of a larger data mining project utilizing the CRISP-DM framework. Statistical models, such as linear regression and random forest, were developed to predict the number of points scored by McLaren racers and to identify the variables that contributed most strongly to those points. The final results point to achieving specific target lap times as the most important variable in determining the number of points gained, although specific race locations also appear favorable to McLaren’s success. These results will in turn be used to develop race strategies for the upcoming season so that McLaren competes efficiently against its competitors.

Contributors: Imam, Amir (Author) / Simon, Alan (Thesis director) / Sha, Xiqing (Committee member) / Barrett, The Honors College (Contributor) / Department of Information Systems (Contributor)
Created: 2023-05