A Study of Win Expectancy Estimators in Major League Baseball

Iversen, Joshua Allen

In recent years, advanced metrics have come to dominate Major League Baseball. One such metric, the Pythagorean Win-Loss Formula, is commonly used by fans, reporters, analysts, and teams alike to estimate a team’s expected winning percentage from its runs scored and runs allowed. However, the method is not perfect and leaves notable room for improvement. One weakness is that a single blowout, a game in which one team significantly outscores its opponent, can drastically affect the estimate.

We hypothesize that meaningless runs scored in blowouts harm the predictive power of Pythagorean Win-Loss and similar win expectancy statistics such as the Linear Formula for Baseball and BaseRuns. We developed a win probability-based cutoff approach that records the score of each game at the point a chosen win probability threshold is passed, effectively removing those meaningless runs from a team’s season-long runs scored and runs allowed totals. These truncated totals were then inserted into the Pythagorean Win-Loss and Linear Formulas and tested against the base models.

The preliminary results show that, while certain runs are more meaningful than others depending on the situation in which they are scored, the base models predicted future record more accurately than our truncated versions. For now, there is not enough evidence to either confirm or reject our hypothesis. In this paper, we suggest several potential strategies for improving the results.

Finally, we address how these results speak to the importance of responsibility and restraint when using advanced statistics in reporting.
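
As a rough illustration of the truncation idea described above, the following sketch compares a Pythagorean estimate computed from full run totals with one computed from truncated totals. It is not the implementation used in the study. The exponent of 2 is the classic form of the Pythagorean formula and may differ from the exponent used in the paper. The `Game` record, its `cutoff_scored` and `cutoff_allowed` fields, and the sample scores are hypothetical, and the win probability model that would produce the cutoff scores is assumed to exist elsewhere.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class Game:
    # Final score from the team's point of view.
    runs_scored: int
    runs_allowed: int
    # Hypothetical fields: the score at the moment a win probability
    # threshold was first passed, or None if it was never passed.
    cutoff_scored: Optional[int] = None
    cutoff_allowed: Optional[int] = None


def pythagorean_pct(runs_scored: float, runs_allowed: float,
                    exponent: float = 2.0) -> float:
    """Expected winning percentage: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    if rs + ra == 0:
        raise ValueError("runs scored and runs allowed cannot both be zero")
    return rs / (rs + ra)


def season_totals(games: Iterable[Game], truncate: bool) -> Tuple[int, int]:
    """Sum runs scored and allowed, optionally using the cutoff scores."""
    scored = allowed = 0
    for g in games:
        use_cutoff = truncate and g.cutoff_scored is not None \
            and g.cutoff_allowed is not None
        scored += g.cutoff_scored if use_cutoff else g.runs_scored
        allowed += g.cutoff_allowed if use_cutoff else g.runs_allowed
    return scored, allowed


# Made-up two-game sample: a 15-2 blowout that passed the threshold
# at 7-1, and a 3-4 loss that never passed it.
games = [Game(15, 2, cutoff_scored=7, cutoff_allowed=1), Game(3, 4)]

full = pythagorean_pct(*season_totals(games, truncate=False))
truncated = pythagorean_pct(*season_totals(games, truncate=True))
```

In this made-up sample the blowout alone moves the full-total estimate well above the truncated one, which is the sensitivity the hypothesis targets. Whether truncation improves prediction is a separate question that the sketch does not address.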
