Matching Items (6)
Description
Global violent conflict has become an increasing problem in recent decades, especially on the African continent. Civil wars, terrorism, riots, and political violence have wrought havoc not only on civilian lives but also on economic foundations. Trade networks are one way to measure these economic foundations. To summarize trade networks, the clustering coefficient and trade quantity/value summation measures are used. To understand the effects of global trade on violent conflict, Pearson product-moment correlations are utilized. This work details a comparison of African national economies and violent conflict events using the clustering coefficient, trade summation measures, and the Pearson correlation coefficient.
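As a rough illustration of the two measures named in this abstract, the sketch below computes a local clustering coefficient on an undirected graph and a Pearson product-moment correlation. The toy network, the country labels, and the function names are hypothetical and are not taken from the thesis, which may use weighted or directed variants.

```python
from itertools import combinations
from math import sqrt


def clustering_coefficient(adj, node):
    """Local clustering coefficient of `node` in an undirected graph.

    `adj` maps each node to the set of its neighbours.
    """
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # fewer than two neighbours: no neighbour pairs to close
    # Count neighbour pairs that are themselves connected.
    links = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


# Hypothetical toy trade network: an edge means two countries trade.
trade = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
```

In a study of this kind, one would compute a per-country network measure for each year and correlate that series with a count of conflict events for the same years.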
Contributors: Kadambi, Sagarika Sanjay (Author) / Maciejewski, Ross (Thesis director) / Shutters, Shade (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created: 2017-05
Description
This work explores the development of a visual analytics tool for geodemographic exploration in an online environment. We mine 78 million records from the United States white pages, link the location data to demographic data (specifically income) from the United States Census Bureau, and allow users to interactively compare distributions of names with regard to spatial location similarity and income. To enable interactive similarity exploration, we explore methods of pre-processing the data as well as on-the-fly lookups. As data becomes larger and more complex, the development of appropriate data storage and analytics solutions has become even more critical for enabling online visualization. We discuss problems faced in implementation, design decisions, and directions for future work.
Contributors: Ibarra, Jose Luis (Author) / Maciejewski, Ross (Thesis director) / Mack, Elizabeth (Committee member) / Longley, Paul (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)
Created: 2014-05
Description
Due to the popularity of the movie industry, a film's opening weekend box-office performance is of great interest not only to movie studios but to the general public as well. In hopes of maximizing a film's opening weekend revenue, movie studios invest heavily in pre-release advertisement. The most visible advertisement is the movie trailer, which, in no more than two minutes and thirty seconds, serves as many people's first introduction to a film. The question, however, is how we can be confident that a trailer will succeed in its promotional task and bring in the audience a studio expects. In this thesis, we use machine learning classification techniques to determine the effectiveness of a movie trailer in the promotion of its namesake. We accomplish this by creating a predictive model that automatically analyzes the audio and visual characteristics of a movie trailer to predict whether a film's opening will be successful, defined as earning at least 35% of the film's production budget during its first U.S. box-office weekend. Our predictive model performed reasonably well, achieving an accuracy of 68.09% in a binary classification. Accuracy increased to 78.62% when including genre in our predictive model.
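The success criterion and the accuracy metric described in this abstract can be sketched as follows. This is only an illustration of the labeling rule and of how binary accuracy is computed. The function names and the sample figures are hypothetical, and the thesis's feature extraction and classifier are not reproduced here.

```python
def opening_is_successful(opening_weekend_gross, production_budget, threshold=0.35):
    """Label a film as successful if its first U.S. weekend gross reaches
    `threshold` (35% by default) of its production budget."""
    if production_budget <= 0:
        raise ValueError("production budget must be positive")
    return opening_weekend_gross >= threshold * production_budget


def accuracy(predicted, actual):
    """Fraction of binary predictions that match the true labels."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty label lists of equal length")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


# Hypothetical films: (opening weekend gross, production budget) in dollars.
films = [(40_000_000, 100_000_000), (30_000_000, 100_000_000)]
labels = [opening_is_successful(gross, budget) for gross, budget in films]
```

Here `labels` holds the ground-truth classes that a trailer-based classifier would be trained to predict.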
Contributors: Williams, Terrance D'Mitri (Author) / Pon-Barry, Heather (Thesis director) / Zafarani, Reza (Committee member) / Maciejewski, Ross (Committee member) / Barrett, The Honors College (Contributor) / Computer Science and Engineering Program (Contributor)
Created: 2014-05
Description
Implementing a distributed algorithm is more complicated than implementing a non-distributed algorithm. This is because distributed systems involve the coordination of different processes, each of which has only a partial view of the global system state. The only way to share information in a distributed system is by message passing. Tasks that are straightforward in a non-distributed system, like deciding on the value of a global system state, can be quite complicated to achieve in a distributed system [1]. On top of the difficulties caused by the distributed nature of the computations, distributed systems typically need to operate normally even if some of the nodes in the system are faulty, which further adds to the uncertainty that processes have about the global state. Many factors make the implementation of distributed algorithms difficult. Design patterns [2] are useful in simplifying the development of general algorithms. A design pattern describes a high-level solution to a common, abstract problem that many systems may face. Common structural, creational, and behavioral problems are identified and elegantly solved by design patterns. By identifying the features that an algorithm uses, and framing each feature as one of the common problems that a specific design pattern solves, designing a robust implementation of an algorithm becomes more manageable. In this way, design patterns can aid the implementation of algorithms. Unfortunately, design patterns are typically not discussed when developing distributed algorithms. Because correctly developing a distributed algorithm is difficult, many papers (e.g., [1], [3], [4]) focus on verifying the correctness of the developed algorithm. Papers that are more practical ([5], [6]) establish that their algorithm is correct and efficient enough to be practical. However, papers on distributed algorithms usually make little mention of design patterns.
The goal of this work was to gain experience implementing distributed systems, including learning the application of design patterns and related technical topics. This was achieved by implementing a currently unpublished algorithm that is tentatively called Bakery Consensus. Bakery Consensus is a replicated state-machine protocol that can tolerate servers with Byzantine faults but assumes non-faulty clients. The algorithm also establishes non-skipping timestamps for each operation completed by the replicated state machine. The design of the structure, communication, and creation of the different system parts depended heavily upon the book Design Patterns [2]. After the system was implemented, the success of the implementation of its various parts was assessed by their ability to satisfy the SOLID [7] principles as well as their ability to establish low coupling and high cohesion [8]. The rest of this paper is organized as follows. We begin by providing background information about distributed algorithms, including replicated state-machine protocols and the Bakery Consensus algorithm. Section 3 gives background on several design patterns and software engineering principles that were used in the development process. Section 4 discusses the well-designed parts of the system that used design patterns, and how these design patterns were chosen. Section 5 discusses well-designed system parts that relied upon other technical topics. Section 6 discusses system parts that need redesign. The conclusion summarizes what was accomplished by the implementation process and the lessons learned about design patterns for distributed algorithms.
Contributors: Stoutenburg, Tristan Kaleb (Author) / Bazzi, Rida (Thesis director) / Richa, Andrea (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created: 2017-05
Description
Education has been at the forefront of many issues in Arizona over the past several years, with concerns over lack of funding sparking the Red for Ed movement. However, despite the push for educational change, there remain many barriers to education, including a lack of visibility into how Arizona schools are performing at the legislative district level. While some information is released at the school district level, much of it is limited and can become obscure to legislators when a school district lies on the boundary between two legislative districts. Moreover, much of this information is in the form of raw spreadsheets and is often fragmented between government websites and educational organizations. As such, a visualization dashboard that clearly identifies schools and their relative performance within each legislative district would be an extremely valuable tool to legislative bodies and the Arizona public. Although this dashboard and research are rough drafts of a larger concept, they would ideally increase transparency regarding public information about these districts and allow legislators to use the dashboard as a tool for greater understanding and more effective policymaking.

Contributors: Colyar, Justin Dallas (Author) / Michael, Katina (Thesis director) / Maciejewski, Ross (Committee member) / Tate, Luke (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created: 2021-05
Description
Multi-view learning, a subfield of machine learning that aims to improve model performance by training on multiple views of the data, has been studied extensively in the past decades. It is typically applied in contexts where the input features naturally form multiple groups or views. An example of a naturally multi-view context is a data set of websites, where each website is described not only by the text on the page but also by the text of hyperlinks pointing to the page. More recently, various studies have demonstrated the initial success of applying multi-view learning to single-view data with multiple artificially constructed views. However, a systematic study of the effectiveness of such artificially constructed views is lacking. To bridge this gap, this thesis begins by providing a high-level overview of multi-view learning with the co-training algorithm. Co-training is a classic semi-supervised learning algorithm that takes advantage of both labelled and unlabelled examples in the data set for training. Then, the thesis presents a web-based tool developed in Python that allows users to experiment with and compare the performance of multiple view construction approaches on various data sets. The view construction approaches supported by the tool include subsampling, Optimal Feature Set Partitioning, and the genetic algorithm. Finally, the thesis presents an empirical comparison of the performance of these approaches, not only against one another but also against traditional single-view models. The findings show that a simple subsampling approach combined with co-training often outperforms both the other view construction approaches and traditional single-view methods.
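To make the idea of artificially constructed views concrete, the sketch below builds views from single-view data by random feature subsampling. It assumes that "subsampling" means drawing a random subset of feature columns for each view, with overlap between views allowed. The abstract does not specify the exact procedure, so this is one plausible reading rather than the thesis's implementation, and all names are hypothetical.

```python
import random


def subsample_views(n_features, n_views=2, view_size=None, seed=0):
    """Return one sorted list of feature indices per view.

    Each view is an independent random sample of feature columns, so
    different views may share some features (an assumption of this sketch).
    """
    if view_size is None:
        view_size = max(1, n_features // n_views)
    if not 1 <= view_size <= n_features:
        raise ValueError("view_size must be between 1 and n_features")
    rng = random.Random(seed)  # seeded for reproducible views
    return [
        sorted(rng.sample(range(n_features), view_size))
        for _ in range(n_views)
    ]


def project(rows, feature_indices):
    """Restrict every row of a data set to the columns of one view."""
    return [[row[i] for i in feature_indices] for row in rows]


# Hypothetical single-view data set with six features per example.
data = [
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
    [1.1, 1.2, 1.3, 1.4, 1.5, 1.6],
]
views = subsample_views(n_features=6, n_views=2, seed=42)
view_data = [project(data, indices) for indices in views]
```

In co-training, a separate classifier would then be trained on each entry of `view_data`, and each classifier's most confident predictions on unlabelled examples would be added to the labelled pool used to retrain the other.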
Contributors: Aksoy, Kaan (Author) / Maciejewski, Ross (Thesis director) / He, Jingrui (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created: 2019-12