sports, banking, and other disciplines. We use predictive analytics and modeling to
determine the impact of certain factors that increase the probability of a successful
fourth down conversion in the Power 5 conferences. The logistic regression models
predict the likelihood of going for fourth down with a 64% or more probability based on
2015-17 data obtained from ESPN’s college football API. Offense type, though important,
is not directly measurable and was incorporated as a random effect. We found that
distance to go, play type, field position, and week of the season were the leading
covariates for prediction. On average, our model performed as much as 14% better than
coaches in 2018.
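As a rough illustration of how a fitted logistic regression turns play covariates into a conversion probability, the sketch below scores a single fourth-down situation. The coefficient values, the feature names, and the `go_for_it` helper are hypothetical placeholders rather than the authors' fitted model, and the random effect for offense type is left out. The 0.64 cutoff is only one possible reading of the 64% figure mentioned above.

```python
import math

# Hypothetical coefficients for illustration only; not values from the study.
COEFFICIENTS = {
    "intercept": 1.5,
    "distance_to_go": -0.25,   # yards needed for a first down
    "is_pass": -0.30,          # play type: 1 for pass, 0 for run
    "yards_to_goal": -0.01,    # field position
    "week": 0.02,              # week of the season
}


def conversion_probability(play):
    """Return the logistic-model probability that a fourth-down attempt converts."""
    linear = COEFFICIENTS["intercept"]
    for name, value in play.items():
        linear += COEFFICIENTS[name] * value
    # The logistic (sigmoid) function maps the linear score to a value in (0, 1).
    return 1.0 / (1.0 + math.exp(-linear))


def go_for_it(play, threshold=0.64):
    """Recommend attempting the conversion when the probability meets the threshold."""
    return conversion_probability(play) >= threshold


# Two made-up situations: fourth-and-1 run versus fourth-and-10 pass.
short_yardage = {"distance_to_go": 1, "is_pass": 0, "yards_to_goal": 40, "week": 5}
long_yardage = {"distance_to_go": 10, "is_pass": 1, "yards_to_goal": 40, "week": 5}
```

With these placeholder coefficients, a longer distance to go lowers the linear score and therefore the predicted probability, which is the kind of covariate effect the abstract describes.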
This was achieved by first using Offline Explorer, an application that downloads websites, to gather job postings from Dice.com returned by searches on a pre-defined list of technical skills. The downloaded postings were then parsed to extract and clean the required data, and the cleaned data was loaded into a database. Next, the companies were matched with their corresponding industries using their NAICS (North American Industry Classification System) codes. The job descriptions were then analyzed, and a group of soft skills was chosen based on the results of Word2Vec (a family of models for creating word embeddings). A master table was created by combining all of the tables in the database and was then filtered to exclude postings that required too much experience. Lastly, the web app was built with Node.js as the back end. The web app lets users choose their desired criteria and navigate through the postings that meet them.
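The filtering and criteria-matching steps described above might look something like the following minimal sketch. The field names (`industry`, `skills`, `min_years_experience`), the sample rows, and the three-year cutoff are all hypothetical, since the abstract does not give the actual schema or threshold. The project's back end was Node.js; Python is used here purely for illustration.

```python
# Hypothetical master-table rows; field names and values are invented for illustration.
MASTER_TABLE = [
    {"title": "Data Analyst", "industry": "Finance and Insurance",
     "skills": {"sql", "python"}, "min_years_experience": 1},
    {"title": "Senior Data Engineer", "industry": "Information",
     "skills": {"python", "spark"}, "min_years_experience": 8},
    {"title": "BI Developer", "industry": "Information",
     "skills": {"sql", "tableau"}, "min_years_experience": 2},
]


def filter_postings(rows, max_years=3, industry=None, required_skills=()):
    """Keep postings within the experience cutoff that match the user's criteria."""
    wanted = set(required_skills)
    matches = []
    for row in rows:
        if row["min_years_experience"] > max_years:
            continue  # drop postings that ask for too much experience
        if industry is not None and row["industry"] != industry:
            continue  # drop postings outside the chosen industry
        if not wanted <= row["skills"]:
            continue  # posting must list every requested skill
        matches.append(row)
    return matches


# Example query: entry-level postings that mention SQL.
sql_jobs = filter_postings(MASTER_TABLE, required_skills=["sql"])
```

Keeping the experience cutoff as a parameter rather than baking it into the table lets the same function serve both the one-time master-table filter and the per-user queries.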
The goal of this project is to develop a deeper understanding of how machine learning pertains to the business world and how business professionals can capitalize on its capabilities. It explores the end-to-end process of integrating a machine learning model into a business and the tradeoffs and obstacles to consider. This topic is especially pertinent today, as big data grows and the use of machine learning and artificial intelligence expands across industries and functional roles. My approach was to build on a project I championed as a Microsoft intern, in which I facilitated the integration of a machine learning forecasting model into the business firsthand. I supplement my findings from that experience with research on machine learning as a disruptive technology. This paper does not delve into the technical aspects of coding a machine learning model; instead, it provides a holistic overview of developing the model from a business perspective. My findings show that, while the advantages of machine learning are large and widespread, the lack of visibility and transparency into the underlying algorithms, the need for large amounts of data, and the overall complexity of creating accurate models are all tradeoffs to weigh when deciding whether machine learning is suitable for a given objective. These results are intended to help business professionals understand the capabilities and obstacles of integrating machine learning into their business operations.
Created predictive models in R to identify the significant variables that indicate whether a borrower will default on a loan, using a data set of almost 900,000 loan applicants.
Through research, interviews, and analysis, our paper provides the local community with a resource that offers a comprehensive collection of insight into the Mirabella at ASU Life Plan Community and the projected impact it will have on the City of Tempe and Arizona State University.
Sports analytics refers to the application of data science and analytics techniques within the sports industry. Sports analysts and team managers have used analytical tools to boost overall team and player performance, often through the analysis of historical data. One of the most common techniques employed in sports analytics is data mining, the practice of analyzing data in order to extract and deliver insights and findings. Data mining projects are frequently guided by the six-step Cross Industry Standard Process for Data Mining (CRISP-DM) framework. One sport that has made extensive use of data science and analytics, and of data mining specifically, is Formula One (F1). Given the sport’s reliance on technology, race engineers working for F1 constructors often develop statistical models that analyze historical race performance to derive insight into drivers’ success. For the purposes of this project, the perspective of a race engineer working for the F1 constructor McLaren was considered. Because the constructor is seeking a competitive advantage for the upcoming F1 season, race performance data from previous seasons was collected and analyzed as part of a larger data mining project using the CRISP-DM framework. Statistical models, such as linear regression and random forest, were developed to predict the number of points scored by McLaren racers and to identify the variables that contributed most strongly to those points. The final results point to lap time as the most important variable in determining the number of points gained, suggesting specific lap times to aim for, although McLaren also appears more likely to succeed at specific race locations. These results will in turn be used to develop race strategies for the upcoming season to help McLaren compete efficiently against its rivals.
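To make the modeling step concrete, here is a minimal sketch of a single-predictor linear regression that relates lap time to points scored, fitted by ordinary least squares. It covers only the linear regression idea; the random forest model is not shown. The function names and the data values are synthetic placeholders chosen for illustration, not results or variables from the project.

```python
def fit_simple_linear_regression(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict points for a given lap time using the fitted line."""
    return intercept + slope * x


# Synthetic placeholder data, not real race results:
# average lap time in seconds and points scored in that race.
lap_times = [90.0, 91.0, 92.0, 93.0]
points = [12.0, 9.0, 6.0, 3.0]

slope, intercept = fit_simple_linear_regression(lap_times, points)
```

A negative slope on data like this would indicate that slower laps go with fewer points, which is the kind of relationship a race engineer could translate into a target lap time.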