Filtering by
- All Subjects: Analytics
- Creators: Wilson, Jeffrey
- Creators: Davulcu, Hasan
- Status: Published
This research explores why so few published algorithms enter production and why even fewer end up generating sustained value. The dissertation proposes a ‘Design for Deployment’ (DFD) framework for building machine learning analytics that can be deployed to generate sustained value. The framework emphasizes and elaborates the often neglected but immensely important latter steps of an analytics process: ‘Evaluation’ and ‘Deployment’. A representative evaluation framework is proposed that incorporates the temporal shifts and dynamism of real-world scenarios. Additionally, the recommended infrastructure allows analytics projects to pivot rapidly when a particular venture does not materialize. The deployment needs and apprehensions of industry are identified, and the gaps are addressed through a 4-step process for sustainable deployment. Lastly, the need for analytics as a functional area (like finance and IT) is identified to maximize the return on machine-learning deployment.
The framework and process are demonstrated in semiconductor manufacturing, a highly complex process involving hundreds of optical, electrical, chemical, mechanical, thermal, electrochemical and software processes, which makes it a highly dynamic, non-stationary system. Because manufacturing requires 24/7 uptime, high reliability and fail-safe operation are a must. Moreover, ever-growing volumes mean that the system must be highly scalable. Lastly, due to the high cost of change, any proposed change must offer a sustained value proposition. Hence the context is ideal for exploring the issues involved. The enterprise use cases are used to demonstrate the robustness of the framework in addressing challenges encountered in the end-to-end process of productizing machine learning analytics in dynamic real-world scenarios.
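The abstract above does not spell out how its evaluation framework handles temporal shift. One common way to reflect it is walk-forward evaluation, where every test window lies strictly after its training window. The sketch below illustrates that general idea only and is not the dissertation's procedure; the function name `walk_forward_splits` and its parameters are hypothetical.

```python
def walk_forward_splits(n_samples, train_size, test_size, step=None):
    """Yield (train_indices, test_indices) pairs for time-ordered data.

    Hypothetical helper for illustration. Each test window starts right
    after its training window ends, so a model is never scored on data
    older than what it was trained on.
    """
    if step is None:
        step = test_size  # by default, slide forward one test window at a time
    start = 0
    while start + train_size + test_size <= n_samples:
        train = list(range(start, start + train_size))
        test = list(range(start + train_size, start + train_size + test_size))
        yield train, test
        start += step


# Example: 10 time-ordered samples, train on 4, test on the next 2.
splits = list(walk_forward_splits(n_samples=10, train_size=4, test_size=2))
```

Scoring a model separately on each window, rather than on one pooled random split, makes any drift in performance over time visible.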
We attempted to apply a novel approach to stock market prediction. The logistic regression machine learning algorithm (Joseph Berkson) was applied to news article headlines, represented as bags of words (tri-gram and single-gram), in an attempt to predict stock price trends based on the Dow Jones Industrial Average. The results showed that the tri-gram representation led to a trend accuracy of 49%, a 1 percentage point increase over the single-gram representation’s accuracy of 48%.
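As a rough illustration of the pipeline described above, the sketch below builds single-gram and tri-gram bags from headlines and fits a logistic regression by plain stochastic gradient descent. It is a minimal sketch, not the authors' implementation: the headlines and labels are made up, and the tokenization, learning rate, and epoch count are assumptions.

```python
import math
from collections import Counter


def ngrams(headline, n):
    """Return the word n-grams of a headline (lowercased, whitespace-split)."""
    words = headline.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def bag_of_ngrams(headline, n):
    """Bag-of-words representation: maps each n-gram to its count."""
    return Counter(ngrams(headline, n))


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(weights, bias, bag):
    """Linear score of a bag under sparse weights."""
    return bias + sum(weights.get(f, 0.0) * c for f, c in bag.items())


def train_logistic(bags, labels, epochs=200, lr=0.5):
    """Fit logistic regression weights with stochastic gradient descent."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for bag, y in zip(bags, labels):
            err = sigmoid(score(weights, bias, bag)) - y  # log-loss gradient
            bias -= lr * err
            for f, c in bag.items():
                weights[f] = weights.get(f, 0.0) - lr * err * c
    return weights, bias


def predict(weights, bias, bag):
    """Predict 1 (trend up) or 0 (trend down)."""
    return 1 if sigmoid(score(weights, bias, bag)) >= 0.5 else 0


# Made-up toy data for illustration only: 1 = index rose, 0 = index fell.
headlines = [
    "stocks rally on strong earnings",
    "markets fall on weak earnings",
    "stocks rally as jobs grow",
    "markets fall as jobs shrink",
]
labels = [1, 0, 1, 0]

unigram_bags = [bag_of_ngrams(h, 1) for h in headlines]
trigram_bags = [bag_of_ngrams(h, 3) for h in headlines]
weights, bias = train_logistic(unigram_bags, labels)
fitted = [predict(weights, bias, b) for b in unigram_bags]
```

Swapping `unigram_bags` for `trigram_bags` in the training call changes only the feature representation, which is the comparison the abstract reports. A real evaluation would measure accuracy on held-out headlines, not on the training set as this toy example does.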
College athletics are a multi-billion-dollar industry featuring hard-working student-athletes competing at a high level for national championships across a variety of sports. Across the college sports landscape, coaches and players are always seeking an edge that gives them a competitive advantage over their opponents. While this may sound nefarious, the vast amounts of data about these games and student-athletes can be used to glean insights about the sports themselves and help student-athletes be more successful. Data analytics can make sense of the available data through models and other tools that predict how student-athletes and their teams will do in the future based on how they have performed in the past. Colleges and universities across the country compete in a vast array of sports that differ widely in popularity. As a result of these differences, the sports with the most data available are the more popular college sports, such as football, men’s and women’s basketball, baseball and softball. Arizona State University, as a member of the Pac-12 conference, has a storied athletic tradition and decades of history in all of these sports, providing a large amount of data that can be used to analyze student-athlete success and help predict future success. However, data is also available from numerous other college athletic programs, which could provide a much larger sample and help predict with greater accuracy why certain teams and student-athletes are more successful than others. The explosion of analytics across the sports world has resulted in a new focus on using statistical techniques to improve all aspects of different sports. Sports science has influenced medical departments, and model-building has been used to determine optimal in-game strategy and to predict the outcomes of future games based on team strength.
It is this latter approach that has become the focus of this paper, with football being used as a subject due to its vast popularity and massive supply of easily accessible data.