Matching Items (194)

Predicting Mechanical Failure of Vacuum Pumps Using Accelerometer Data

Description

The objective of this paper is to find and describe trends in fast-Fourier-transformed accelerometer data that can be used to predict the mechanical failure of large vacuum pumps used in industrial settings, such as the provision of drinking water. Using three-dimensional plots of the data, this paper suggests how a model can be developed to predict the mechanical failure of vacuum pumps.
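
As a rough illustration of the transform the abstract describes, here is a minimal sketch of taking the FFT of one accelerometer channel; the sampling rate, signal components, and frequencies are invented stand-ins, not values from the paper.

```python
import numpy as np

# Hypothetical parameters: 10 kHz sampling of one accelerometer axis.
fs = 10_000                      # sampling rate in Hz (illustrative)
t = np.arange(0, 1.0, 1 / fs)    # one second of samples

# Stand-in signal: a 60 Hz shaft-rotation component plus a fault
# harmonic and broadband noise (real data would come from the pump).
signal = (np.sin(2 * np.pi * 60 * t)
          + 0.3 * np.sin(2 * np.pi * 180 * t)
          + 0.1 * np.random.randn(t.size))

# One-sided FFT magnitude spectrum: peaks that grow or migrate across
# successive measurement windows are the kind of trend the paper seeks.
spectrum = np.abs(np.fft.rfft(signal)) / t.size
freqs = np.fft.rfftfreq(t.size, d=1 / fs)

# Report the dominant frequency component (skipping the DC bin).
peak = freqs[np.argmax(spectrum[1:]) + 1]
print(f"dominant component near {peak:.1f} Hz")
```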

Date Created
2019-05

Exploring the Relation Between NAV and Price of ETFs in Financial Markets

Description

Exchange-traded funds (ETFs) are in many ways similar to more traditional closed-end mutual funds, although they differ in a crucial way. ETFs rely on a creation and redemption feature to achieve their functionality, and this mechanism is designed to minimize the deviations that occur between an ETF's listed price and the net asset value of its underlying assets. While this does cause ETF deviations to be generally lower than those of their mutual fund counterparts, as our paper explores, the process does not eliminate these deviations completely. This article builds off an earlier paper by Engle and Sarkar (2006) that investigates these properties of premiums (discounts) of ETFs from their fair market value, and looks to see whether these premia have changed in the last 10 years. Our paper then diverges from the original and takes a deeper look specifically into the standard deviations of these premia.

Our findings show that over 70% of an ETF's standard deviation of premia can be explained through a linear combination of two variables: a categorical variable (Domestic [US], Developed, Emerging) and a discrete variable (time difference from the US). This paper also finds that more traditional metrics such as market cap and ETF price volatility, and even third-party market indicators such as the economic freedom index and the investment freedom index, are insignificant predictors of an ETF's standard deviation of premia when combined with the categorical variable. These findings differ somewhat from the existing literature, which indicates that these factors should be significant predictors of an ETF's standard deviation of premia.
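
A hedged sketch of the reported two-variable fit, using the statsmodels formula API with a categorical market factor and a time-difference regressor; the toy rows and column names are assumptions, not the authors' dataset.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Toy stand-in for the ETF panel: one row per fund, with the standard
# deviation of its premium, a market category, and the time-zone
# difference of its underlying market from the US (in hours).
df = pd.DataFrame({
    "premium_sd": [0.05, 0.07, 0.40, 0.55, 0.90, 1.10],
    "market":     ["Domestic", "Domestic", "Developed",
                   "Developed", "Emerging", "Emerging"],
    "time_diff":  [0, 0, 6, 6, 9, 12],
})

# Linear combination of the categorical and discrete variables,
# mirroring the form of the model described above.
model = smf.ols("premium_sd ~ C(market) + time_diff", data=df).fit()
print(model.params)
print(f"R-squared: {model.rsquared:.3f}")   # the paper reports > 0.70
```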

Date Created
2019-05

The Use of Simulation in a Foundry Setting

Description

Woodland/Alloy Casting, Inc. is an aluminum foundry known for providing high-quality molds to customers in industries such as aviation, electrical, defense, and nuclear power. However, as the company has grown over the past three years, it has begun to struggle with on-time delivery of its orders. Woodland prides itself on its high-grade process, which includes core processing, molding, cleaning, and heat treatment. Each mold has to flow through every part of this system flawlessly, and throughout the process significant bottlenecks occur that limit the number of molds leaving the system. To combat this issue, this project uses a simulation of the foundry to test how best to schedule work to optimize the use of resources. Simulation can be an effective tool for testing improvements in systems where making changes to the physical system is too expensive. ARENA is a simulation tool that allows for manipulation of resources and processes while also allowing both random and selected schedules to be run through the foundry's production process. By using an ARENA simulation to test different scheduling techniques, the risk of missing production runs is minimized during the experimental period, so many different options can be tested to see how they affect the production line. In this project, several feasible scheduling techniques are compared in simulation to determine which schedules allow the highest number of molds to be completed.
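
The paper's model is built in ARENA, which is proprietary; as a rough open-source analogue, here is a minimal discrete-event sketch with the simpy library. The stage capacities, processing times, and release interval are invented for illustration, not taken from the thesis.

```python
import random
import simpy

# Invented capacities and mean processing times (hours) for the four
# stages named above: core, molding, cleaning, heat treat.
STAGES = [("core", 2, 1.0), ("mold", 1, 2.0),
          ("clean", 2, 0.5), ("heat_treat", 1, 3.0)]

completed = 0

def mold_job(env, resources):
    """One mold flowing through every stage in sequence."""
    global completed
    for name, _, mean_time in STAGES:
        with resources[name].request() as req:
            yield req                                   # queue at the stage
            yield env.timeout(random.expovariate(1 / mean_time))
    completed += 1

def release_schedule(env, resources, n_jobs, interval):
    """Release jobs at a fixed interval; vary this to test schedules."""
    for _ in range(n_jobs):
        env.process(mold_job(env, resources))
        yield env.timeout(interval)

env = simpy.Environment()
resources = {name: simpy.Resource(env, capacity=cap)
             for name, cap, _ in STAGES}
env.process(release_schedule(env, resources, n_jobs=50, interval=1.5))
env.run(until=120)                     # simulate a 120-hour horizon
print(f"molds completed: {completed}")
```

Comparing schedules then amounts to rerunning with different `interval` values or release orders and counting completed molds, which is the experiment the abstract describes.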

Date Created
2019-05

Utilizing Machine Learning Methods to Model Cryptocurrency

Description

Cryptocurrencies have become one of the most fascinating forms of currency due to their fluctuating values and lack of centralization. This project attempts to use machine learning to effectively model in-sample data for Bitcoin and Ethereum using rule induction methods. The dataset is cleaned by removing entries with missing data, and a new column is created to measure the price difference, allowing a more accurate analysis of the change in price. Eight relevant variables are selected using cross-validation: the total number of bitcoins, the total size of the blockchain, the hash rate, mining difficulty, revenue from mining, transaction fees, the cost of transactions, and the estimated transaction volume. The in-sample data is modeled using a simple tree fit, first with one variable and then with all eight. Using all eight variables, the in-sample model and data have a correlation of 0.6822657. The in-sample model is improved by first applying bootstrap aggregation (also known as bagging) to fit 400 decision trees to the in-sample data using one variable, and then applying the random forests technique to the data using all eight variables. This results in a correlation between the model and data of 0.9443413. The random forests technique is then applied to an Ethereum dataset, resulting in a correlation of 0.6904798. Finally, an out-of-sample model is created for Bitcoin and Ethereum using random forests, against a benchmark correlation of 0.03 for financial data. The correlation between the training model and the testing data was 0.06957639 for Bitcoin and -0.171125 for Ethereum. In conclusion, it is confirmed that cryptocurrencies can have accurate in-sample models when the random forests method is applied to a dataset. Out-of-sample modeling is more difficult, but in some cases performs better than is typical for financial data. It should also be noted that cryptocurrency data has properties similar to other financial datasets, suggesting future potential for modeling cryptocurrency within the financial world.
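
A hedged sketch of the random-forest step with scikit-learn; the eight feature names follow the abstract, but the data here is synthetic, so the printed correlation will not match the thesis numbers.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for the cleaned blockchain dataset; the eight
# variable names follow the abstract, the values are random.
n = 500
features = ["total_bitcoins", "blockchain_size", "hash_rate",
            "difficulty", "miner_revenue", "transaction_fees",
            "cost_per_transaction", "est_transaction_volume"]
X = pd.DataFrame(rng.standard_normal((n, 8)), columns=features)

# Derived response, analogous to the price-difference column.
y = 0.5 * X["hash_rate"] + 0.3 * X["difficulty"] + 0.2 * rng.standard_normal(n)

# The abstract first bags 400 one-variable trees, then applies random
# forests over all eight variables; this is the eight-variable fit.
rf = RandomForestRegressor(n_estimators=400, random_state=0).fit(X, y)

# In-sample correlation between model and data, the reported statistic.
corr = np.corrcoef(rf.predict(X), y)[0, 1]
print(f"in-sample correlation: {corr:.4f}")
```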

Date Created
2018-05

Regression Analysis on Colony Collapse Disorder in the United States

Description

In the last decade, the population of honey bees across the globe has declined sharply, leaving scientists and beekeepers to wonder why. Among all nations, the United States has seen some of the greatest declines over the last 10-plus years. Without a definite explanation, the term Colony Collapse Disorder (CCD) was coined to describe the sudden and sharp decline of honey bee colonies that beekeepers were experiencing. Colony collapses have been rising above expected averages over the years, and losses during the winter season are even more severe than what is normally acceptable. Possible explanations point toward meteorological variables, diseases, and even pesticide usage. Despite the cause of CCD being unknown, thousands of beekeepers have reported their losses, and in the most recent years even the numbers of infected colonies and colonies under certain stressors. Using the data reported to the United States Department of Agriculture (USDA), as well as weather data collected by the National Oceanic and Atmospheric Administration (NOAA) and the National Centers for Environmental Information (NCEI), regression analysis was used to find relationships between stressors in honey bee colonies, meteorological variables, and colony collapses during the winter months. The regression analysis focused on the winter season, or quarter 4 of the year, which includes the months of October, November, and December. In the model, the response variable was the percentage of colonies lost in quarter 4. Through the model, it was concluded that certain weather thresholds and the percentage increase of colonies under certain stressors were related to colony loss.
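
A minimal sketch of the kind of quarter-4 regression described; the column names and toy values are assumptions standing in for the merged USDA/NOAA/NCEI data.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Toy stand-in for the merged quarter-4 data: one row per state-year,
# with invented stressor and weather columns.
df = pd.DataFrame({
    "pct_lost":    [12.1, 18.4, 9.7, 22.3, 15.0, 19.8],   # response
    "pct_varroa":  [25.0, 41.2, 18.9, 47.5, 33.1, 44.0],  # stressor share
    "mean_temp_f": [48.2, 39.5, 52.0, 35.1, 44.7, 37.9],  # weather
    "precip_in":   [2.1, 3.4, 1.8, 4.0, 2.9, 3.7],
})

# Response: percentage of colonies lost in quarter 4 (Oct-Dec);
# predictors: a colony stressor plus meteorological variables.
model = smf.ols("pct_lost ~ pct_varroa + mean_temp_f + precip_in",
                data=df).fit()
print(model.params)
```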

Date Created
2018-05

Assessing the Economic Prosperity of Persons with Disabilities in American Cities

Description

We seek a comprehensive measurement for the economic prosperity of persons with disabilities. We survey the current literature and identify the major economic indicators used to describe the socioeconomic standing of persons with disabilities. We then develop a methodology for constructing a statistically valid composite index of these indicators, and build this index using data from the 2014 American Community Survey. Finally, we provide context for further use and development of the index and describe an example application of the index in practice.
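
The abstract does not spell out its index construction, so as one common approach, here is a sketch that standardizes each indicator and averages the z-scores; the indicator names and city values are invented, and the paper's actual methodology may differ.

```python
import pandas as pd

# Invented indicator columns; the paper draws its indicators from the
# 2014 American Community Survey.
df = pd.DataFrame({
    "employment_rate": [0.34, 0.41, 0.29, 0.38],
    "median_earnings": [21000, 26500, 19800, 24000],
    "poverty_rate":    [0.28, 0.21, 0.33, 0.24],   # lower is better
}, index=["City A", "City B", "City C", "City D"])

# Flip indicators where lower values mean more prosperity.
df["poverty_rate"] = -df["poverty_rate"]

# Standardize each indicator to mean 0, variance 1, then average the
# z-scores into a single composite score per city.
z = (df - df.mean()) / df.std(ddof=0)
index = z.mean(axis=1)
print(index.sort_values(ascending=False))
```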

Date Created
2017-05

Analytics of the Prospect Draft in Major League Baseball

Description

Our research encompassed the prospect draft in baseball and looked at what types of players teams drafted to maximize value. We wanted to know which position returned the best value to the team that drafted it, and which level, college or high school, is safer to draft players from. We looked at draft data from 2006-2010 for the first ten rounds of players selected; because there is a monetary cap only on players drafted in the first ten rounds, we restricted our data to these players. Once we set the parameters, we compiled a spreadsheet of these players with both their signing bonuses and their wins above replacement (WAR). This allowed us to see how much a team was spending per win at the major league level. After the data was compiled, we made pivot tables and graphs to visually represent the data and better understand the numbers. We found that the worst position MLB teams could draft was high school second baseman, who returned the lowest WAR of any players we looked at. In general, though, high school players were more costly to sign and had lower WARs than their college counterparts, making them, on average, a worse pick value-wise. The best position to pick was college shortstop: they had the trifecta of the best signability of all players, one of the highest WARs, and one of the lowest signing bonuses, three of the main factors you want in a draft pick, and they ranked near the top in all three categories. This research can help give guidelines to major league teams as they select players in the draft. While there will always be exceptions to trends, by following the enclosed research teams can minimize risk in the draft.
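
A sketch of the pivot-table step with pandas; the positions, signing bonuses (in $ millions), and WAR values here are invented stand-ins for the compiled 2006-2010 spreadsheet.

```python
import pandas as pd

# Toy stand-in for the compiled draft spreadsheet (rounds 1-10).
df = pd.DataFrame({
    "position":      ["SS", "SS", "2B", "2B", "OF", "OF"],
    "level":         ["College", "HS", "College", "HS", "College", "HS"],
    "signing_bonus": [1.2, 2.0, 0.9, 1.6, 1.1, 1.8],
    "war":           [6.5, 3.1, 2.4, 0.2, 4.0, 2.2],
})

# Dollars (millions) spent per win above replacement.
df["cost_per_war"] = df["signing_bonus"] / df["war"]

# Average value by position and level (college vs. high school),
# the comparison underlying the conclusions above.
pivot = df.pivot_table(values=["war", "signing_bonus", "cost_per_war"],
                       index="position", columns="level", aggfunc="mean")
print(pivot.round(2))
```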

Date Created
2017-05

Relationship Between College Baseball Conferences and Average Offensive Production of Major League Baseball Players

Description

Beginning with the publication of Moneyball by Michael Lewis in 2003, the use of sabermetrics, the application of statistical analysis to baseball records, has exploded in major league front offices. Executives Billy Beane, Paul DePodesta, and Theo Epstein are notable figures who have successfully incorporated sabermetrics into their teams' philosophies, resulting in playoff appearances and championship success. The competitive market of baseball, once dominated by the collusion of owners, now promotes innovative thought to analytically develop competitive advantages. The tiered economic payrolls of Major League Baseball (MLB) have created an environment in which large-market teams are capable of "buying" championships through the acquisition of the best available talent in free agency, and small-market teams are pushed to "build" championships through the drafting and systematic farming of high school and college players. The use of sabermetrics promotes both models of success, buying and building, by determining a player's productivity without bias. The objective of this paper is to develop a regression-based predictive model that Major League Baseball teams can use to forecast the MLB career average offensive performance of college baseball players from specific conferences. The development of this model required multiple tasks: I. Data was obtained from The Baseball Cube, a baseball records database providing both college and MLB data. II. The data was modified to adjust for year-to-year formatting, a missing variable for seasons played, and the presence of missing values, and to correct league identifiers. III. Multiple offensive productivity models capable of handling the obtained dataset and the regression forecasting technique were evaluated. IV. SAS software was used to create the regression models and analyze the residuals for any irregularities or normality violations. The results of this paper find that there is a relationship between Division 1 collegiate baseball conferences and average career offensive productivity in Major League Baseball, with the SEC having the most accurate reflection of performance.
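
The paper builds its models in SAS; as an equivalent sketch in Python with statsmodels, here is a regression of a career offensive metric on conference. The conference labels, the choice of OPS as the metric, and the toy values are assumptions for illustration.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Toy stand-in for the cleaned Baseball Cube extract (steps I-II);
# OPS values and conference labels are invented for illustration.
df = pd.DataFrame({
    "conference": ["SEC", "SEC", "ACC", "ACC", "Pac-12", "Pac-12",
                   "Big Ten", "Big Ten"],
    "mlb_career_ops": [0.780, 0.742, 0.731, 0.705, 0.718, 0.699,
                       0.690, 0.676],
})

# Regress MLB career average offensive production on the Division 1
# conference the player came from (analogue of steps III-IV).
model = smf.ols("mlb_career_ops ~ C(conference)", data=df).fit()
print(model.params)          # conference effects relative to the baseline
print(model.resid.round(3))  # residuals, checked for normality in SAS
```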

Date Created
2017-05

Statistical Properties of Coherent Structures in Two Dimensional Turbulence

Description

Coherent vortices are ubiquitous structures in natural flows that affect the mixing and transport of substances, momentum, and energy. Being able to detect these coherent structures is important for pollutant mitigation, ecological conservation, and many other applications. In recent years, mathematical criteria and algorithms have been developed to extract coherent structures in turbulent flows. In this study, we apply these tools to extract important coherent structures and analyze their statistical properties as well as their implications for the kinematics and dynamics of the flow. Such information will aid the representation of small-scale nonlinear processes that large-scale models of natural processes may not be able to resolve.
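
The abstract does not name its extraction criteria, so as one widely used example, here is a sketch of the Okubo-Weiss criterion on a synthetic two-dimensional velocity field: vortex cores are regions where rotation dominates strain (W < 0). The field and grid are invented for illustration.

```python
import numpy as np

# Synthetic doubly periodic velocity field (Taylor-Green vortices).
n = 128
x = np.linspace(0, 2 * np.pi, n, endpoint=False)
X, Y = np.meshgrid(x, x, indexing="ij")
u = np.sin(X) * np.cos(Y)
v = -np.cos(X) * np.sin(Y)

# Spectral derivatives on the periodic grid.
k = np.fft.fftfreq(n, d=2 * np.pi / n) * 2 * np.pi

def ddx(f):
    return np.real(np.fft.ifft(1j * k[:, None] * np.fft.fft(f, axis=0), axis=0))

def ddy(f):
    return np.real(np.fft.ifft(1j * k[None, :] * np.fft.fft(f, axis=1), axis=1))

# Okubo-Weiss parameter: strain squared minus vorticity squared.
s_n = ddx(u) - ddy(v)      # normal strain
s_s = ddx(v) + ddy(u)      # shear strain
omega = ddx(v) - ddy(u)    # vorticity
W = s_n**2 + s_s**2 - omega**2

# Coherent vortices occupy the rotation-dominated region W < 0.
print(f"fraction of domain inside coherent vortices: {(W < 0).mean():.2f}")
```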

Date Created
2018-05

Jump Dynamics

Description

There are multiple mathematical models for the alignment of individuals moving within a group. In a first class of models, individuals tend to relax their velocity toward the average velocity of other nearby neighbors; these models are motivated by the flocking behavior exhibited by birds. A second class of models has been introduced to describe rapid changes of individual velocity, referred to as jumps, which better describe the behavior of smaller agents (e.g., locusts, ants). In this second class of models, individuals randomly choose to align with another nearby individual, matching velocities. There are several open questions concerning these two types of behavior: which behavior is the most efficient for creating a flock (i.e., for converging toward the same velocity)? Will flocking still emerge as the number of individuals approaches infinity? Analysis of these models shows that, in the homogeneous case where all individuals are capable of interacting with each other, the variance of the velocities in both the jump model and the relaxation model decays to 0 exponentially for any nonzero number of individuals. This implies that the individuals in the system converge to an absorbing state where all individuals share the same velocity; therefore, individuals converge to a flock even as the number of individuals approaches infinity. Further analysis focused on the case where interactions between individuals are determined by an adjacency matrix. The second eigenvalue of the Laplacian of this adjacency matrix (denoted λ2) provides a lower bound on the rate of decay of the variance. When λ2 is nonzero, the system is said to converge to a flock almost surely. Furthermore, when the adjacency matrix is generated by a random graph, such that connections between individuals are formed with probability p (where 0 < p < 1), the system converges to a flock almost surely when p > 1/N. λ2 is a good estimator of the rate of convergence of the system, in comparison to the value of p used to generate the adjacency matrix.
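
A sketch of the role of λ2, assuming an Erdős-Rényi random graph and the relaxation (velocity-averaging) dynamics described above; N, p, the time step, and the horizon are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random graph on N individuals: each pair connects with probability p.
N, p = 50, 0.1                        # here p > 1/N, so a flock is expected
A = (rng.random((N, N)) < p).astype(float)
A = np.triu(A, 1)
A = A + A.T                           # symmetric, no self-loops

# Graph Laplacian and its second-smallest eigenvalue (lambda_2), the
# lower bound on the decay rate of the velocity variance.
L = np.diag(A.sum(axis=1)) - A
lam2 = np.linalg.eigvalsh(L)[1]       # eigvalsh returns ascending order
print(f"lambda_2 = {lam2:.3f}")       # nonzero => converges to a flock

# Relaxation dynamics dv/dt = -L v: each individual drifts toward the
# average velocity of its neighbors; variance should decay to zero.
v = rng.standard_normal(N)
dt = 0.01
for _ in range(2000):
    v = v - dt * (L @ v)
print(f"final velocity variance: {v.var():.2e}")
```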

Date Created
2018-05