Filtering by
- All Subjects: Data Analytics
- Creators: Department of Information Systems
sports, banking, and other disciplines. We use predictive analytics and modeling to
determine the impact of certain factors that increase the probability of a successful
fourth down conversion in the Power 5 conferences. The logistic regression models
predict the likelihood of going for fourth down with a 64% or more probability based on
2015-17 data obtained from ESPN’s college football API. Offense type though important
but non-measurable was incorporated as a random effect. We found that distance to go,
play type, field position, and week of the season were key leading covariates in
predictability. On average, our model performed as much as 14% better than coaches
in 2018.
This was achieved by first using offline explorer, an application that can download websites, to gather job postings from Dice.com that were searched by a pre-defined list of technical skills. Next came the parsing of the downloaded postings to extract and clean the data that was required and filling a database with that cleaned data. Then the companies were matched up with their corresponding industries. This was done using their NAICS (North American Industry Classification System) codes. The descriptions were then analyzed, and a group of soft skills was chosen based on the results of Word2Vec (a group of models that assists in creating word embeddings). A master table was then created by combining all of the tables in the database. The master table was then filtered down to exclude posts that required too much experience. Lastly, the web app was created using node.js as the back-end. This web app allows the user to choose their desired criteria and navigate through the postings that meet their criteria.
The goal of this project is to develop a deeper understanding of how machine learning pertains to the business world and how business professionals can capitalize on its capabilities. It explores the end-to-end process of integrating a machine and the tradeoffs and obstacles to consider. This topic is extremely pertinent today as the advent of big data increases and the use of machine learning and artificial intelligence is expanding across industries and functional roles. The approach I took was to expand on a project I championed as a Microsoft intern where I facilitated the integration of a forecasting machine learning model firsthand into the business. I supplement my findings from the experience with research on machine learning as a disruptive technology. This paper will not delve into the technical aspects of coding a machine model, but rather provide a holistic overview of developing the model from a business perspective. My findings show that, while the advantages of machine learning are large and widespread, a lack of visibility and transparency into the algorithms behind machine learning, the necessity for large amounts of data, and the overall complexity of creating accurate models are all tradeoffs to consider when deciding whether or not machine learning is suitable for a certain objective. The results of this paper are important in order to increase the understanding of any business professional on the capabilities and obstacles of integrating machine learning into their business operations.
Created predictive models using R to determine significant variables that help determine whether someone will default on their loans using a data set of almost 900,000 loan applicants.
This thesis includes three separate documents: a) a comprehensive document detailing the methods and analysis of the creative factors tied to series success, b) an hour long pilot script based on this data, and c) an industry-standard pitch deck for a TV show created with data insights. In a larger sense, the aim of this study is to take the first steps in remedying information asymmetry between streaming services and content creators. If streaming services were more transparent with their data and communicated to their creators what has been proven to work in the past, showrunners and staff writers could have a new tool to increase the competitiveness of their series and aid in show renewal each year.