An exploration of statistical modelling methods on simulation data case study: biomechanical predator-prey simulations

Description

Modern, advanced statistical tools from data mining and machine learning have become commonplace in molecular biology in large part because of the “big data” demands of various kinds of “-omics” (e.g., genomics, transcriptomics, metabolomics, etc.). However, in other fields of biology where empirical data sets are conventionally smaller, more traditional statistical methods of inference are still very effective and widely used. Nevertheless, with the decrease in cost of high-performance computing, these fields are starting to employ simulation models to generate insights into questions that have been elusive in the laboratory and field. Although these computational models allow for exquisite control over large numbers of parameters, they also generate data at a qualitatively different scale than most experts in these fields are accustomed to. Thus, more sophisticated methods from big-data statistics have an opportunity to better facilitate the often-forgotten area of bioinformatics that might be called “in-silicomics”.

As a case study, this thesis develops methods for the analysis of large amounts of data generated from a simulated ecosystem designed to understand how mammalian biomechanics interact with environmental complexity to modulate the outcomes of predator–prey interactions. These simulations investigate how other biomechanical parameters relating to the agility of animals in predator–prey pairs are better predictors of pursuit outcomes. Traditional modelling techniques such as forward, backward, and stepwise variable selection are initially used to study these data, but the number of parameters and potentially relevant interaction effects render these methods impractical. Consequently, new modelling techniques such as LASSO regularization are used and compared to the traditional techniques in terms of accuracy and computational complexity. Finally, the splitting rules and instances in the leaves of classification trees provide the basis for future simulation with an economical number of additional runs. In general, this thesis shows the increased utility of these sophisticated statistical techniques with simulated ecological data compared to the approaches traditionally used in these fields. These techniques combined with methods from industrial Design of Experiments will help ecologists extract novel insights from simulations that combine habitat complexity, population structure, and biomechanics.
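The abstract does not say how the LASSO models were fitted. As a rough illustration of why LASSO regularization can stand in for stepwise variable selection, the sketch below uses plain coordinate descent with soft-thresholding on centered data. The function names, the toy data, and the penalty value are hypothetical and do not come from the thesis.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; values inside the band become 0.0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, penalty, sweeps=100):
    """Minimize (1/(2n)) * ||y - X b||^2 + penalty * ||b||_1 with no intercept.

    X is a list of rows. Columns and y are assumed to be centered, and no
    column may be all zeros (that would make the per-column scale zero).
    """
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            rho = 0.0    # column j times the residual that ignores column j
            scale = 0.0  # sum of squares of column j
            for row, target in zip(X, y):
                fitted_without_j = sum(row[k] * coef[k] for k in range(p) if k != j)
                rho += row[j] * (target - fitted_without_j)
                scale += row[j] ** 2
            coef[j] = soft_threshold(rho / n, penalty) / (scale / n)
    return coef


# Hypothetical toy data: y depends only on the first column.
X_demo = [(-2, 1), (-1, -1), (0, 0), (1, -1), (2, 1)]
y_demo = [-4, -2, 0, 2, 4]
coef_demo = lasso_coordinate_descent(X_demo, y_demo, penalty=0.5)
```

On this toy data the coefficient of the irrelevant second column is set exactly to zero, which is the selection behaviour that makes the penalty useful when the number of candidate terms is large.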

Contributors: Seto, Christian (Author) / Pavlic, Theodore (Thesis advisor) / Li, Jing (Committee member) / Yan, Hao (Committee member) / Arizona State University (Publisher)

Created: 2018

Optimized Line Calling Strategies in Ultimate Frisbee

Description

Ultimate Frisbee, or "Ultimate," is a fast-growing field sport that is played competitively at universities across the country. Many mid-tier college teams aim to win as many games as possible; however, they also need to grow their programs by training and retaining new players. The purpose of this project was to create a prototype statistical tool that maximizes a player line-up's probability of scoring the next point while keeping playing time as equal as possible across experienced and novice players. Game, player, and team data were collected for 25 different games played over the course of 4 tournaments during Fall 2017 and early Spring 2018 using the UltiAnalytics iPad application. "Amount of Top 1/3 Players" was the measure of equal playing time, and "Line Efficiency" and "Line Interaction" represented a line's probability of scoring. After running a logistic regression, Line Efficiency was found to be a more accurate predictor of scoring outcome than Line Interaction. An "Equal PT Measure vs. Line Efficiency" graph was then created, and the plot showed which lines were optimal given the user's preferences at that point in time. Possible next steps include testing the model and refining it as needed.

Contributors: Spence, Andrea Nicole (Author) / McCarville, Daniel R. (Thesis director) / Pavlic, Theodore (Committee member) / Industrial, Systems and Operations Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2018-05

Modeling Fantasy Baseball Player Popularity Using Twitter Activity

Description

Social media is used by people every day to discuss the nuances of their lives. Major League Baseball (MLB) is a popular sport in the United States, and as such has generated a great deal of activity on Twitter. As fantasy baseball continues to grow in popularity, so does the research into better algorithms for picking players. Most of the research done in this area focuses on improving the prediction of a player's individual performance. However, the crowd-sourcing power afforded by social media may enable more informed predictions about players' performances. Most amateur gamblers choose players by popularity and personal preference. While some of these trends (particularly the long-term ones) are captured by ranking systems, this research focused on predicting the daily spikes in popularity (and therefore price or draft order) by comparing the number of mentions that a player received on Twitter with that player's previous mentions. In doing so, it was demonstrated that improved fantasy baseball predictions can be made by leveraging social media data.
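The abstract does not specify how current mentions were compared with previous ones. One simple possibility, shown here only as an assumption, is the ratio of the latest day's mention count to the mean of a trailing window. The function name and the default seven-day window are hypothetical.

```python
def mention_spike_ratio(daily_mentions, baseline_days=7):
    """Ratio of the latest day's mentions to the mean of the preceding days.

    daily_mentions is a chronological list of counts for one player.
    Only the last `baseline_days` days before the latest day form the baseline.
    """
    if len(daily_mentions) < 2:
        raise ValueError("need at least one previous day to form a baseline")
    history = daily_mentions[:-1][-baseline_days:]
    baseline = sum(history) / len(history)
    if baseline == 0:
        # No prior mentions: any activity at all is an unbounded spike.
        return float("inf") if daily_mentions[-1] > 0 else 1.0
    return daily_mentions[-1] / baseline
```

A ratio well above 1 would flag a player whose popularity is spiking relative to recent days; the cut-off for "well above" would have to be tuned on real data.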

Contributors: Ruskin, Lewis John (Author) / Liu, Huan (Thesis director) / Montgomery, Douglas (Committee member) / Morstatter, Fred (Committee member) / Industrial, Systems (Contributor, Contributor) / Barrett, The Honors College (Contributor)

Created: 2017-05

SoundSwarm: An Interactive Exploration of 3-Dimensional and Behavioral Modeled Sound

Description

This paper outlines the development of a software application that explores the plausibility and potential of interacting with three-dimensional sound sources within a virtual environment. The intention of the software application is to allow a user to become engaged with a collection of sound sources that can be perceived both graphically and audibly within a spatial, three-dimensional context. The three-dimensional sound perception is driven primarily by a binaural implementation of a higher-order ambisonics framework, while graphics and other data are processed by openFrameworks, an interactive media framework for C++. Within the application, sound sources have been given behavioral functions, such as flocking or orbit patterns, that animate their positions within the environment. The author summarizes the design process and rationale for creating such a system and the chosen approach to implementing the software application. The paper also provides background on approaches to spatial audio, gesture, and virtual reality embodiment, and discusses future possibilities for the existing project.

Contributors: Burnett, Garrett (Author) / Paine, Garth (Thesis director) / Pavlic, Theodore (Committee member) / School of Humanities, Arts, and Cultural Studies (Contributor) / School of Arts, Media and Engineering (Contributor) / Barrett, The Honors College (Contributor)

Created: 2016-05

Developing an Educational Manufacturing Simulation

Description

Simulation games are widely used in engineering education, especially for industrial engineering and operations management. A well-made simulation game helps students achieve learning objectives with minimal additional teaching by an instructor. Many simulation games exist for engineering education, but newer technologies now exist that improve the overall experience of developing and using these games. Although current solutions teach concepts adequately, poorly maintained platforms distract from the key learning objectives, detracting from the value of the activities. A backend framework was created to facilitate an educational, competitive, participatory simulation of a manufacturing system that is intended to be easy to maintain, deploy, and expand.

Contributors: Chandler, Robert Keith (Author) / Clough, Michael (Thesis director) / Pavlic, Theodore (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2018-12

Mechanisms for quorum sensing in Temnothorax

Description

Temnothorax ants are a model species for studying collective decision-making. When presented with multiple nest sites, they are able to collectively select the best one and move the colony there. When a scout encounters a nest site, she will spend some time exploring it. In theory she should explore the site for long enough to determine both its quality and an estimate of the number of ants there. This ensures that she selects a good nest site and that there are enough scouts who know about the new nest site to aid her in relocating the colony. It also helps to ensure that the colony reaches a consensus rather than dividing between nest sites. When a nest site reaches a certain threshold of ants, a quorum has been reached and the colony is committed to that nest site. If a scout visits a good nest site where a quorum has not been reached, she will lead a tandem run to bring another scout there so that the recruited scout can learn the way and later aid in recruitment. At a site where a quorum has been reached, scouts will instead perform transports to carry ants and brood there from the old nest. One piece that is missing in all of this is the mechanism. How is a quorum sensed? One hypothesis is that the encounter rate (average number of encounters with nest mates per second) that an ant experiences at a nest site allows her to estimate the population at that site and determine whether a quorum has been reached. In this study, encounter rate and entrance time were both shown to play a role in whether an ant decided to lead a tandem run or perform a transport. Encounter rate was shown to have a significant impact on how much time an ant spent at a nest site before making her decision, and encounter rates significantly increased as migrations progressed. It was also shown that individual ants did not differ from each other in their encounter rates, visit lengths, or entrance times preceding their first transports or tandem runs, studied across four different migrations. Ants were found to spend longer on certain types of encounters, but excluding certain types of encounters from the encounter rate was not found to change the correlations that were observed. It was also found that as the colony performed more migrations, it became significantly faster at moving to the new nest.
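The encounter-rate hypothesis described above can be sketched as a threshold rule. This is only an illustration: the function names and the threshold parameter `quorum_rate` are hypothetical, the abstract reports no threshold value, and the sketch ignores entrance time, which the study also found to play a role.

```python
def encounter_rate(encounter_count, visit_seconds):
    """Average number of nest-mate encounters per second during one visit."""
    if visit_seconds <= 0:
        raise ValueError("visit_seconds must be positive")
    return encounter_count / visit_seconds


def recruitment_behavior(encounter_count, visit_seconds, quorum_rate):
    """Choose a behavior from the encounter rate under the threshold hypothesis.

    quorum_rate is a hypothetical threshold in encounters per second.
    At or above it the scout treats the quorum as met and transports;
    below it she leads a tandem run.
    """
    rate = encounter_rate(encounter_count, visit_seconds)
    return "transport" if rate >= quorum_rate else "tandem run"
```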

Contributors: Johnson, Christal Marie (Author) / Pratt, Stephen (Thesis director) / Pavlic, Theodore (Committee member) / Shaffer, Zachary (Committee member) / Barrett, The Honors College (Contributor) / School of Life Sciences (Contributor)

Created: 2013-05

Optimization of Incoming Inspection

Description

The first step in process improvement is to scope the problem, and the next is to measure the current process; if data are not readily available and cannot be manually collected, however, a measurement system must be implemented. General Dynamics Mission Systems (GDMS) is a lean company that is always seeking to improve. One of its current bottlenecks is the incoming inspection department. This department is responsible for finding defects on purchased parts and is critical to the high-reliability products produced by GDMS. To stay competitive and hold its market share, the company decided to optimize incoming inspection. This proved difficult because no data were being collected. Early steps in many process improvement methodologies, such as Define, Measure, Analyze, Improve and Control (DMAIC), include data collection; however, no measurement system was in place, so no data were available for improvement. The solution to this problem was to design and implement a Management Information System (MIS) that tracks a variety of data. This will provide the company with data that can be used for analysis and improvement. The first stage of the MIS was developed in Microsoft Excel with Visual Basic for Applications because of the low cost and overall effectiveness of the software. Excel allows updates to be made quickly and allows GDMS to collect data immediately. Stage two would move the MIS to more practical software, such as Access or MySQL. This thesis focuses only on stage one of the MIS; GDMS will proceed with stage two.

Contributors: Diaz, Angel (Author) / McCarville, Daniel R. (Thesis director) / Pavlic, Theodore (Committee member) / Industrial, Systems (Contributor) / Barrett, The Honors College (Contributor)

Created: 2017-05

Statistical Analysis of Power Differences between Experimental Design Software Packages

Description

Based on findings of previous studies, there was speculation that two well-known experimental design software packages, JMP and Design Expert, produced varying power outputs given the same design and user inputs. For context and scope, another popular experimental design software package, Minitab® Statistical Software version 17, was added to the comparison. The study compared multiple test cases run on the three software packages, focusing on 2^k and 3^k factorial designs and adjusting the standard deviation effect size, number of categorical factors, levels, number of factors, and replicates. All six cases were run on all three programs, and runs were attempted at one, two, and three replicates each. There was an issue at the one-replicate stage, however: Minitab does not allow full factorial designs with only one replicate, and Design Expert will not provide power outputs for only one replicate unless there are three or more factors. From the analysis of these results, it was concluded that the differences between JMP 13 and Design Expert 10 were well within the margin of error and likely caused by rounding. The differences between JMP 13, Design Expert 10, and Minitab 17, on the other hand, indicated a fundamental difference in the way Minitab addressed power calculation compared to the latest versions of JMP and Design Expert. This was found to be likely caused by Minitab’s use of dummy variable coding as its default instead of the orthogonal coding default of the other two. Although dummy variable and orthogonal coding for factorial designs do not show a difference in results, the methods affect the overall power calculations. All three programs can be adjusted to use either method of coding, but the exact instructions for doing so are difficult to find; a follow-up guide on changing the coding for factorial variables would therefore help address this issue.
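The abstract does not describe how each package computes power. As background on the two coding schemes it names, the sketch below fits a single two-level factor by least squares under dummy (0/1) coding and under orthogonal (-1/+1) coding. The response values are made up for illustration. The fitted means are identical, but the coefficient is the full difference between level means under dummy coding and half of it under orthogonal coding, so an effect size stated per coefficient refers to different quantities. Whether this is the mechanism behind the differences reported above is an assumption, not a finding of the study.

```python
def least_squares_slope(x, y):
    """Slope of the simple least-squares line of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical responses: low-level mean 10, high-level mean 14.
responses = [9.0, 11.0, 13.0, 15.0]
dummy_coding = [0, 0, 1, 1]         # low = 0, high = 1
orthogonal_coding = [-1, -1, 1, 1]  # low = -1, high = +1

dummy_slope = least_squares_slope(dummy_coding, responses)
orthogonal_slope = least_squares_slope(orthogonal_coding, responses)
```

Here `dummy_slope` equals the difference between the two level means and `orthogonal_slope` equals half of that difference.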

Contributors: Armstrong, Julia Robin (Author) / McCarville, Daniel R. (Thesis director) / Montgomery, Douglas (Committee member) / Industrial, Systems (Contributor, Contributor) / Barrett, The Honors College (Contributor)

Created: 2017-05

A Strategy for Improved Traffic Flow

Description

Commuting is a significant cost in time and in travel expenses for working individuals and a major contributor to emissions in the United States. This project focuses on increasing the efficiency of an intersection through the use of "light metering." Light metering involves a series of lights leading up to an intersection that force cars to stop farther from the final intersection in smaller queues instead of congregating in one large queue before it. The simulation software package AnyLogic was used to model a simple two-lane intersection with and without light metering. It was found that light metering almost eliminates start-up delay by preventing a long queue from forming in front of the modeled intersection. Shorter queue lengths and reduced start-up delays prevent cycle failure and significantly reduce the overall delay for the intersection. However, a few of the cars decelerate and accelerate frequently before each light meter. This solution significantly reduces the traffic density before the intersection and the overall delay, but it does not appear to be a better alternative in terms of emissions because of the increase in acceleration. Further research would need to quantify the difference in emissions between this model and a standard intersection.

Contributors: Glavin, Erin (Author) / Pavlic, Theodore (Thesis director) / Sefair, Jorge (Committee member) / Industrial, Systems and Operations Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2018-05

Graph Analysis of Arctic Ice

Description

Polar ice masses can be valuable indicators of trends in global climate. In an effort to better understand the dynamics of Arctic ice, this project analyzes sea ice concentration anomaly data collected over gridded regions (cells) and builds graphs based upon high correlations between cells. These graphs offer the opportunity to use metrics such as clustering coefficients and connected components to isolate representative trends in ice masses. Based upon this analysis, the structure of sea ice graphs differs at a statistically significant level from random graphs, and several regions show erratically decreasing trends in sea ice concentration.
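A minimal sketch of the graph construction described above: grid cells become nodes, and two cells are joined when their anomaly series are highly correlated. It assumes Pearson correlation and a fixed positive threshold; the abstract does not state which correlation measure or cut-off the project used. All function names and the toy series are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def correlation_graph(series_by_cell, threshold):
    """Adjacency sets joining cells whose correlation is at least threshold."""
    cells = list(series_by_cell)
    adjacency = {cell: set() for cell in cells}
    for i, u in enumerate(cells):
        for v in cells[i + 1:]:
            if pearson(series_by_cell[u], series_by_cell[v]) >= threshold:
                adjacency[u].add(v)
                adjacency[v].add(u)
    return adjacency


def connected_components(adjacency):
    """List of node sets, one per connected component (depth-first search)."""
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


def clustering_coefficient(adjacency, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    neighbors = list(adjacency[node])
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for i, u in enumerate(neighbors)
                for v in neighbors[i + 1:] if v in adjacency[u])
    return 2 * links / (k * (k - 1))


# Hypothetical anomaly series for four cells.
toy_series = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [1, 2, 3, 5], "d": [4, 3, 2, 1]}
toy_graph = correlation_graph(toy_series, threshold=0.9)
toy_components = connected_components(toy_graph)
```

In the toy data, cells a, b, and c move together and form one component, while cell d trends the opposite way and stays isolated.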

Contributors: Wallace-Patterson, Chloe Rae (Author) / Syrotiuk, Violet (Thesis director) / Colbourn, Charles (Committee member) / Montgomery, Douglas (Committee member) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Computer Science and Engineering Program (Contributor)

Created: 2013-05