In recent years, advanced metrics have come to dominate Major League Baseball. One such metric, the Pythagorean Win-Loss Formula, is commonly used by fans, reporters, analysts and teams alike to estimate a team’s expected winning percentage from its runs scored and runs allowed. However, the method is not perfect and leaves notable room for improvement. One weakness is that it can be drastically affected by a single blowout, a game in which one team significantly outscores its opponent.
We hypothesize that meaningless runs scored in blowouts harm the predictive power of Pythagorean Win-Loss and similar win expectancy statistics such as the Linear Formula for Baseball and BaseRuns. We developed a win probability-based cutoff approach that recorded the score of each game at the point a certain win probability threshold was passed, effectively removing those meaningless runs from a team’s season-long runs scored and runs allowed totals. These truncated totals were then inserted into the Pythagorean Win-Loss and Linear Formulas and tested against the base models.
The preliminary results show that, while certain runs are more meaningful than others depending on the situation in which they are scored, the base models predicted future record more accurately than our truncated versions. For now, there is not enough evidence to either confirm or reject our hypothesis. In this paper, we suggest several strategies that could improve the results.
Finally, we address how these results speak to the importance of responsibility and restraint when using advanced statistics in reporting.
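To make the cutoff idea concrete, here is a minimal sketch in Python. It is not the thesis's implementation. The exponent of 2 is the classic form of the Pythagorean formula, while the `GameState` record, the `truncated_score` helper, the 0.95 threshold and the symmetric cutoff for both teams are all assumptions introduced only for illustration; the abstract does not state which threshold or win probability model was used.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


def pythagorean_win_pct(runs_scored: float, runs_allowed: float,
                        exponent: float = 2.0) -> float:
    """Expected winning percentage from season run totals.

    Uses the classic form RS^x / (RS^x + RA^x) with x = 2 by default.
    """
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("run totals must be non-negative")
    if runs_scored == 0 and runs_allowed == 0:
        return 0.5  # no information either way
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


@dataclass
class GameState:
    """Hypothetical snapshot of a game from one team's point of view."""
    runs_for: int
    runs_against: int
    win_prob: float  # that team's win probability at this moment


def truncated_score(states: Iterable[GameState],
                    threshold: float = 0.95) -> Tuple[int, int]:
    """Return the score at the first moment the game looks decided.

    The game counts as decided once either side's win probability
    reaches the (assumed) threshold. If that never happens, the final
    score is returned unchanged.
    """
    last = None
    for state in states:
        last = state
        if state.win_prob >= threshold or state.win_prob <= 1 - threshold:
            return state.runs_for, state.runs_against
    if last is None:
        raise ValueError("a game needs at least one state")
    return last.runs_for, last.runs_against


# Illustrative, made-up game: a 12-1 blowout that was decided at 5-0.
blowout = [
    GameState(0, 0, 0.50),
    GameState(5, 0, 0.96),
    GameState(12, 1, 0.99),
]
kept_for, kept_against = truncated_score(blowout)
```

In this made-up game the truncated score is 5-0 rather than 12-1, so seven runs scored and one run allowed would be left out of the season totals. Summing truncated scores over a season and passing the totals to `pythagorean_win_pct` gives the truncated estimate that can be compared with the estimate from the full totals.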
The process of answering these questions began by compiling a list of 166 journalists who could provide valuable insight into the current state of sports journalism. The list specifically targeted journalists who either were currently working as beat reporters or had spent extensive time as one, because a crucial aspect of the study was exploring the role of analytics in day-to-day coverage. Of those 166 journalists, 93 made themselves available through either Twitter direct message or email. Once contacted, 47 of those journalists responded, eventually leading to 27 phone interviews and 7 email interviews.
Each interview began with the journalist establishing a baseline for what they thought the role of analytics should be in the coverage of their respective sport. From there, the conversation often followed a linear path as journalists talked about the career experiences that led them to that conclusion, the moments that had shifted their overall opinions of analytics, their best approaches for using analytics in both articles and interviews, their favorite and least favorite analytical measures, the gaps that remain in analytics, and the future of the industry as a whole.
Each interview was transcribed, and a number of compelling themes emerged. The themes were organized into three groups (past, present and future), where they were expanded on to best display the concepts illustrated in this thesis. The themes explored include how journalists use coaches and players to validate statistics, what strategies work best when including analytics in conversations with athletes, how to find story ideas through analytics, and the issues plaguing the analytics community. Once the themes had been identified, the percentage of journalists who had indicated agreement with each theme was calculated. Each theme investigated is therefore represented both statistically and by a quote from a journalist addressing the idea.
Across 34 interviews with some of the country’s most established and well-respected voices, many of the pressing issues facing analytics in sports journalism today were explored, including the melding of analytical and narrative writing, how best to use analytics when asking questions, and the “holy grail” of analytical data. A host of interesting strategies and ideas emerged as journalists examined how the industry reached its current point, which practices are currently most effective, and where the industry is headed. The perspective gained from this thesis gives insight into many of the lesser-discussed elements of journalism, imparting a deeper understanding of the challenges that lie ahead for sports journalism through an examination of how far the industry has come. While analytics and their usage in sports journalism remain difficult to fully encapsulate, this thesis hopefully gives a better look at their complex and ever-evolving relationship.
To showcase this, it was necessary to document how baseball prospers from numbers and numbers prosper from baseball. The relationship between the two is mutualistic. A comprehensive historical look at how data and statistics in baseball have matured was also a critical portion of the paper. Batting average went from a radical new measure that threatened the status quo to a fiercely cherished statistic that was itself unseated by advanced analytics, which shows that the creation of new measures and the destruction of old ones has been incessant. Innovators like Pete Palmer, Dick Cramer and Bill James played a large role in this process in the 1980s. Computers aided their effort and, when paired with the Internet, extended the ability to crunch data to an even larger sector of the population. The unveiling of Statcast at the start of the 2015 season showed just how much potential there is for measuring previously unquantifiable baseball acts.
There will always be people who lament the presence of data and statistics in baseball. Despite this, the history of that evolution indicates that baseball and numbers will remain intertwined, likely to an even greater extent than before, as technology and new philosophies become increasingly integrated into front offices and clubhouses.