Matching Items (2)
Description
The findings of this project show that, through the use of principal component analysis (PCA) and K-Means clustering, NBA players can be algorithmically classified into distinct clusters, each representing a player archetype. Individual player data for the 2018-2019 regular season were collected for 150 players, including standard per-game statistics, such as rebounds, assists, and field goals, and advanced statistics, such as usage percentage, win shares, and value over replacement player. The analysis was carried out in the statistical programming language R using the RStudio integrated development environment. The principal component analysis was computed first to produce a set of five principal components, which explain roughly 82.20% of the total variance in the player data. These five principal components were then used as the features on which the players were clustered by the K-Means algorithm implemented in R. It was determined that eight clusters would best represent the groupings of the players, and eight clusters were created, each containing a distinct set of players. Each cluster was analyzed based on the players in it, and a player archetype was established to define it. The reasoning behind each archetype is explained, detailing why the players were clustered together and which data features most influenced the clustering results. With the exception of two clusters, the archetypes were found to be independent of player position. The clustering results can be expanded in the future to include a larger sample of players, and they can be used to make inferences regarding NBA roster construction. The clustering can highlight key weaknesses in rosters and show which combinations of player archetypes lead to team success.
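The pipeline described in this abstract (standardize the statistics, keep five principal components, then run K-Means with eight clusters) can be illustrated with a minimal sketch. This is not the author's code: the thesis used R, whereas the sketch below is Python with NumPy, the data are random placeholders rather than real player statistics, and the function names, the choice of 12 input columns, the standardization step, and the random initialization are all assumptions made for illustration.

```python
import numpy as np

def pca_scores(X, n_components):
    """Return PCA scores and the variance share of each kept component."""
    # Standardize columns so no single statistic dominates (an assumption).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of Vt are the principal directions of the standardized data.
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    return Z @ Vt[:n_components].T, explained[:n_components]

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with random initial centers."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(points, centers)
        # Keep the old center if a cluster ends up empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(points, centers), centers

# Placeholder data: 150 "players" x 12 hypothetical statistics.
X = np.random.default_rng(42).normal(size=(150, 12))
scores, explained = pca_scores(X, n_components=5)
labels, centers = kmeans(scores, k=8)
print("variance explained by 5 components:", explained.sum())
print("players per cluster:", np.bincount(labels, minlength=8))
```

With real player data, `explained.sum()` is the quantity that corresponds to the roughly 82.20% figure reported above, and the rows sharing a label form one candidate archetype.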
Contributors: Elam, Mason Matthew (Author) / Armbruster, Dieter (Thesis director) / Gel, Esma (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created: 2019-05
Description

In the U.S., the annual NCAA college basketball tournament, known as March Madness, draws millions of people trying to predict who will win. There is just one problem: no one has ever created a perfect bracket. Using a player-based rating system that updates throughout the season, a predictive model can be built to identify the teams with the best chance of winning the championship and to show which players had the most impact on an individual team.

Contributors: Kearney, Matthew (Author) / Schneider, Laurence (Thesis director) / McIntosh, Daniel (Committee member) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor)
Created: 2023-05