Matching Items (26)

Input-Elicitation Methods for Crowdsourced Human Computation

Description

Collecting accurate collective decisions via crowdsourcing is challenging due to cognitive biases, varying worker expertise, and varying subjective scales. This work investigates new ways to determine collective decisions by prompting users to provide input in multiple formats. A crowdsourced task is created that aims to determine ground truth by collecting information in two different ways: rankings and numerical estimates. Results indicate that accurate collective decisions can be achieved with fewer people when ordinal and cardinal information is collected and aggregated together using consensus-based, multimodal models. We also show that presenting users with larger problems produces more valuable ordinal information and is a more efficient way to collect an aggregate ranking. As a result, we suggest that input elicitation be more widely considered in future crowdsourcing work and incorporated into future platforms to improve accuracy and efficiency.
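
Below is a minimal sketch of the kind of multi-format aggregation the abstract describes, assuming a Borda count for the ordinal input and a normalized average for the cardinal input; the thesis's consensus-based, multimodal models are not reproduced here, and the example items and estimates are hypothetical.

```python
# Combining ordinal (rankings) and cardinal (numerical estimates) worker input
# into one aggregate ordering. Borda counting and simple averaging are
# illustrative stand-ins, not the thesis's consensus-based, multimodal models.
from collections import defaultdict

def aggregate(rankings, estimates, weight=0.5):
    """rankings: list of item-id lists, best first.
    estimates: list of {item id: numerical estimate} dicts.
    Returns items sorted from highest to lowest combined score."""
    items = {i for r in rankings for i in r}

    # Ordinal signal: Borda count, normalized to [0, 1].
    borda = defaultdict(float)
    for r in rankings:
        n = len(r)
        for pos, item in enumerate(r):
            borda[item] += (n - 1 - pos) / (n - 1) if n > 1 else 1.0
    for item in borda:
        borda[item] /= len(rankings)

    # Cardinal signal: mean estimate per item, min-max normalized.
    means = {i: sum(e[i] for e in estimates if i in e) /
                max(1, sum(1 for e in estimates if i in e)) for i in items}
    lo, hi = min(means.values()), max(means.values())
    span = (hi - lo) or 1.0
    cardinal = {i: (means[i] - lo) / span for i in items}

    combined = {i: weight * borda[i] + (1 - weight) * cardinal[i] for i in items}
    return sorted(items, key=combined.get, reverse=True)

print(aggregate([["a", "b", "c"], ["a", "c", "b"]],
                [{"a": 9.0, "b": 4.0, "c": 5.0}, {"a": 8.0, "b": 3.0, "c": 6.0}]))
```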

Contributors

Date Created
2020-05

Prescription Information Extraction from Electronic Health Records using BiLSTM-CRF and Word Embeddings

Description

Medical records are increasingly being recorded in the form of electronic health records (EHRs), with a significant amount of patient data recorded as unstructured natural language text. Consequently, being able to extract and utilize the clinical data present within these records is an important step in furthering clinical care. One important aspect of these records is the presence of prescription information. Existing techniques for extracting prescription information (which includes medication names, dosages, frequencies, reasons for taking, and mode of administration) from unstructured text have focused on the application of rule- and classifier-based methods. While state-of-the-art systems can be effective in extracting many types of information, they require significant effort to develop hand-crafted rules and conduct effective feature engineering. This paper presents the use of a bidirectional LSTM with a CRF tagging layer, initialized with precomputed word embeddings, for extracting prescription information from sentences without requiring significant feature engineering. The experimental results, obtained on the i2b2 2009 dataset, achieve a macro F1 measure of 0.8562, with scores above 0.9449 on four of the six categories, indicating significant potential for this model.
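
As a rough illustration of the architecture named in the title, here is a minimal PyTorch sketch of a bidirectional LSTM token tagger that produces per-token emission scores; the CRF decoding layer and the precomputed embeddings from the paper are only noted in comments, and the vocabulary and tag-set sizes are placeholders.

```python
# Minimal BiLSTM token tagger (emission scores only; the paper adds a CRF
# layer on top and initializes the embeddings with precomputed word vectors).
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    def __init__(self, vocab_size, tagset_size, embed_dim=100, hidden_dim=128):
        super().__init__()
        # In the paper's setup these weights would be loaded from precomputed
        # word embeddings rather than trained from scratch.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        self.emit = nn.Linear(2 * hidden_dim, tagset_size)

    def forward(self, token_ids):              # (batch, seq_len)
        out, _ = self.lstm(self.embed(token_ids))
        return self.emit(out)                  # (batch, seq_len, num_tags)

# Usage: emission scores for a toy batch; a CRF layer would decode the
# highest-scoring tag sequence from these scores.
model = BiLSTMTagger(vocab_size=5000, tagset_size=13)
scores = model(torch.randint(0, 5000, (2, 20)))
print(scores.shape)  # torch.Size([2, 20, 13])
```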

Contributors

Date Created
2018-05

Data Management Behind Machine Learning

Description

This thesis dives into the world of artificial intelligence by exploring the functionality of a single-layer artificial neural network through a simple housing price classification example, while simultaneously considering its impact from a data management perspective at both the software and hardware levels. To begin this study, the universally accepted model of an artificial neuron is broken down into its key components and analyzed for functionality by relating it back to its biological counterpart. The role of a neuron is then described in the context of a neural network, with equal emphasis on how training proceeds for an individual neuron and for the network as a whole. Using supervised learning, the neural network is trained on three main factors for housing price classification: the total number of rooms, the number of bathrooms, and the square footage. Once trained with most of the generated data set, it is tested for accuracy by introducing the remainder of the data set and observing how closely its computed output for each set of inputs compares to the target value. From a programming perspective, the artificial neuron is implemented in C so that it is more closely tied to the operating system, which makes the profiler data collected during the program's execution more precise. The program is designed to break down each stage of the neuron's training process into distinct functions. In addition to this functional decomposition, the struct data type is used as the underlying data structure, both to represent the neuron and to hold its training and test data. Once the neuron is fully trained, its test results are graphed to visually depict how well it learned from its sample training set. Finally, the profiler data is analyzed to describe how the program operated from a data management perspective at the software and hardware levels.
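
The thesis implements the neuron in C; purely as an illustration of the supervised training loop it describes, here is a minimal Python sketch of a single neuron classifying housing data from three features, with made-up sample values and an arbitrary learning rate.

```python
# Minimal sketch of the single-neuron training loop described above (the
# thesis implements this in C with structs; the feature values, learning
# rate, and step activation here are illustrative choices).
import random

def step(x):
    return 1 if x >= 0 else 0          # 1 = "expensive", 0 = "affordable"

def train(samples, epochs=50, lr=0.01):
    # samples: list of ((rooms, bathrooms, sqft), label) pairs
    w = [random.uniform(-0.5, 0.5) for _ in range(3)]
    b = 0.0
    for _ in range(epochs):
        for features, target in samples:
            out = step(sum(wi * xi for wi, xi in zip(w, features)) + b)
            err = target - out
            w = [wi + lr * err * xi for wi, xi in zip(w, features)]
            b += lr * err
    return w, b

# Made-up training data: square footage given in thousands.
data = [((6, 2, 1.4), 0), ((9, 4, 3.2), 1), ((5, 1, 0.9), 0), ((10, 5, 4.0), 1)]
weights, bias = train(data)
print(weights, bias)
```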

Contributors

Date Created
2018-05

Behavior Trees + Finite State Machines: A Hybrid Game AI Framework

Description

One of the core components of many video games is their artificial intelligence. Through AI, a game can tell stories, generate challenges, and create encounters for the player to overcome. Even though AI has continued to advance through the implementation of neural networks and machine learning, game AI tends to implement a series of states or decisions instead to give the illusion of intelligence. Despite this limitation, games can still generate a wide range of experiences for the player. The Hybrid Game AI Framework is an AI system that combines the benefits of two commonly used approaches to developing game AI: Behavior Trees and Finite State Machines. Developed in the Unity Game Engine and the C# programming language, this AI Framework represents the research that went into studying modern approaches to game AI and my own attempt at implementing the techniques learned. Object-oriented programming concepts such as inheritance, abstraction, and low coupling are utilized with the intent to create game AI that's easy to implement and expand upon. The final goal was to create a flexible yet structured AI data structure while also minimizing drawbacks by combining Behavior Trees and Finite State Machines.
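
The framework itself is written in C# for Unity; as a language-agnostic illustration of the hybrid idea, here is a minimal Python sketch in which a behavior-tree Selector composes ordinary leaves with a leaf that wraps a small finite state machine. All class, state, and agent names are illustrative.

```python
# Minimal sketch of the hybrid idea: a behavior-tree Selector whose children
# include a leaf that runs an internal finite state machine.
SUCCESS, FAILURE, RUNNING = "success", "failure", "running"

class Selector:
    """Ticks children in order until one does not fail."""
    def __init__(self, *children):
        self.children = children
    def tick(self, agent):
        for child in self.children:
            status = child.tick(agent)
            if status != FAILURE:
                return status
        return FAILURE

class Condition:
    def __init__(self, predicate):
        self.predicate = predicate
    def tick(self, agent):
        return SUCCESS if self.predicate(agent) else FAILURE

class FSMLeaf:
    """Behavior-tree leaf whose internal logic is a finite state machine."""
    def __init__(self, transitions, initial):
        self.transitions = transitions     # {state: function(agent) -> next state}
        self.state = initial
    def tick(self, agent):
        self.state = self.transitions[self.state](agent)
        return SUCCESS if self.state == "idle" else RUNNING

# Usage: the Selector prefers a "flee when hurt" check (stubbed as a
# Condition) and otherwise advances the patrol FSM one step per tick.
patrol = FSMLeaf({"move": lambda a: "scan", "scan": lambda a: "idle",
                  "idle": lambda a: "move"}, initial="move")
root = Selector(Condition(lambda a: a["health"] < 20), patrol)
print(root.tick({"health": 100}))   # "running": the FSM leaf is patrolling
```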

Contributors

Date Created
2018-05

An Analysis of IoT and AI Applications in Retail Business Environments

Description

This thesis examines the applications of the Internet of Things and Artificial Intelligence within small-to-medium sized retail businesses. These technologies have become a common aspect of the modern business environment, yet many business owners remain unfamiliar enough with these concepts that they cannot fully utilize these tools. The complexity behind IoT and AI is distilled here into benefits for a brick-and-mortar store in regards to security, logistics, profit optimization, operations, and analytics. While these technologies can contribute to a business’s success, they can also come with a financial cost that is prohibitively high. In order to investigate which aspects of a business can benefit the most from these technologies, interviews with small-to-medium business owners were conducted and paired with an analysis of published research. These interviews surfaced specific pain points and issues that could potentially be solved by these technologies. The analysis conducted in this thesis gives a detailed summary of this research and provides a business model for two small businesses to apply the Internet of Things and Artificial Intelligence to these pain points while staying within their financial budgets.

Contributors

Date Created
2019-05

Applications of Machine Learning to Animation and Computer Graphics for Optimized Real-Time Performance

Description

This thesis surveys and analyzes applications of machine learning techniques to the fields of animation and computer graphics. Data-driven techniques utilizing machine learning have in recent years been successfully applied to many subfields of animation and computer graphics. These include, but are not limited to, fluid dynamics, kinematics, and character modeling. I argue that such applications offer significant advantages which will be pivotal in advancing the fields of animation and computer graphics. Further, I argue these advantages are especially relevant in real-time implementations when working with finite computational resources.

Contributors

Date Created
2019-05

Procedural Scene Generation from Natural Language

Description

While there are many existing systems that take natural language descriptions and use them to generate images or text, few systems exist that generate 3D renderings or environments from natural language. Most of those systems are very limited in scope and require precise, predefined language to work, or large, well-tagged datasets for their models. In this project I attempt to apply concepts from NLP and procedural generation to build a system that can generate a rough scene approximating a natural language description in a 3D environment, using a free-use database of models. The primary objective of this system, rather than a completely accurate representation, is to generate a useful or interesting result. Such a system could assist designers who utilize 3D scenes or environments in their work.
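
As a very rough illustration of the pipeline the abstract describes, here is a minimal Python sketch that extracts recognizable object nouns and counts from a description and assigns them random positions; the model catalog, number-word table, and placement logic are hypothetical stand-ins for the project's actual NLP and procedural-generation techniques.

```python
# Illustrative sketch only: a keyword pass that turns a free-form description
# into rough object placements. MODEL_CATALOG and the random placement are
# hypothetical stand-ins for the real model database and scene logic.
import random
import re

MODEL_CATALOG = {"table", "chair", "lamp", "sofa", "plant"}            # hypothetical
NUMBER_WORDS = {"a": 1, "an": 1, "one": 1, "two": 2, "three": 3, "four": 4}

def rough_scene(description, room_size=10.0):
    words = re.findall(r"[a-z]+", description.lower())
    placements = []
    for prev, word in zip([""] + words, words):
        noun = word[:-1] if word.endswith("s") else word               # crude plural
        if noun in MODEL_CATALOG:
            for _ in range(NUMBER_WORDS.get(prev, 1)):
                placements.append({
                    "model": noun,
                    "position": (round(random.uniform(0, room_size), 2), 0.0,
                                 round(random.uniform(0, room_size), 2)),
                })
    return placements

print(rough_scene("A living room with two chairs, a sofa, and one lamp"))
```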

Contributors

Date Created
2019-05

Predicting Outcome of a Pitch Given the Type of Pitch for any Baseball Scenario

Description

This thesis serves as a baseline for the potential of prediction through machine learning (ML) in baseball. Hopefully, it will also serve as motivation for future work to expand on and reach the potential of sabermetrics, advanced Statcast data, and machine learning. The problem this thesis attempts to solve is predicting the outcome of a pitch: given proper pitch data and situational data, is it possible to predict the result of that pitch? The outcome refers to the specific result of a pitch beyond ball or strike; for example, if the hitter puts the ball in play for a double, this thesis shows how I attempted to predict that type of outcome. Before diving into my methods, I take a deep look into sabermetrics, advanced statistics, and the history of the two in Major League Baseball. After this, I describe my machine learning experiment. First, I found a dataset suitable for training a pitch prediction model; I then analyzed the features and used feature engineering to select a set of 16 features; and finally, I trained and tested a pair of ML models on the data, a decision tree classifier and a random forest classifier. Each classifier performed at around 60% accuracy. I also experimented with a neural network approach using a long short-term memory (LSTM) model to improve my score, but this approach requires more feature engineering to beat the simpler classifiers. In this thesis, I show examples of five hitters on whom I test the models and the accuracy for each hitter. This work shows promise that advanced classification models (likely requiring more feature engineering) can provide even better predictions, perhaps with 70% accuracy or higher. There is much potential for future work to improve on this thesis, mainly through the proper construction of a neural network, more in-depth feature analysis/selection/extraction, and data visualization.
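
Below is a minimal scikit-learn sketch of the classifier setup described above; the toy rows and feature columns are hypothetical placeholders for the 16 engineered Statcast-style features used in the thesis.

```python
# Minimal random-forest sketch of the pitch-outcome classifier. The toy
# DataFrame stands in for the real dataset of pitch and situational data.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

pitches = pd.DataFrame({                      # hypothetical stand-in rows
    "pitch_type":    ["FF", "SL", "FF", "CU", "FF", "SL", "CU", "FF"],
    "release_speed": [95.1, 84.3, 96.0, 78.2, 94.7, 85.0, 79.1, 93.8],
    "balls":         [0, 1, 2, 1, 3, 0, 2, 1],
    "strikes":       [0, 2, 1, 1, 2, 0, 2, 1],
    "outcome":       ["ball", "strike", "single", "ball",
                      "strike", "double", "ball", "flyout"],
})
X = pd.get_dummies(pitches.drop(columns="outcome"), columns=["pitch_type"])
y = pitches["outcome"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25,
                                                    random_state=42)
model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))            # thesis reports ~60% on real data
```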

Contributors

Date Created
2020-05

Automated Vulnerability/Adversary Testing Using AI/ML Algorithms

Description

Vulnerability testing and evaluation is a regular task for cyber-security groups. Conducting these tasks manually takes a great deal of time and may not be thorough. Automating them helps speed up the rate at which experts can test systems. However, script-based or static programs that run automatically often lack the versatility required to properly replace human analysis. With the advances in Artificial Intelligence and Machine Learning, a utility can be developed that creates penetration-testing plans rather than relying on manual testing of vulnerabilities. A variety of existing cyber-security programs and utilities provide an API layer that commonly interacts with the Python environment. Given how common AI/ML tools are within the Python ecosystem, a plugin-like interface can be developed to feed any AI/ML program real-world data and receive a response or report in return. Using Python 2.7+, Python 3.6+, pymdptoolbox, and POMDPy, a program was developed that ingests real-world data from scanning tools and returns a suggested course of action to be used by analysts, providing a practical validation of the algorithms in a real-world setting. This program was able to successfully navigate a test network and produce the results expected for the target machines without requiring human analysis of the network. Using POMDP-based systems for more cyber-security tasks may be a valuable direction for future development and could help ease the burden faced in a fast-paced field.
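
As an illustration of the planning idea behind such a utility, here is a minimal value-iteration sketch over a toy MDP whose states and rewards stand in for scan findings; the thesis itself uses pymdptoolbox and POMDPy (and partially observable models) rather than the hand-rolled solver shown here.

```python
# Toy MDP planning sketch: given transition probabilities and rewards derived
# from scan data, value iteration yields a suggested action per state.
import numpy as np

# states: 0 = unscanned host, 1 = vulnerability found, 2 = access gained
# actions: 0 = scan, 1 = exploit        (hypothetical model for illustration)
P = np.array([  # P[action, state, next_state]
    [[0.2, 0.8, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],   # scan
    [[1.0, 0.0, 0.0], [0.0, 0.3, 0.7], [0.0, 0.0, 1.0]],   # exploit
])
R = np.array([[-1, -5], [-1, 10], [0, 0]])   # R[state, action]

def value_iteration(P, R, gamma=0.9, iters=200):
    n_actions, n_states, _ = P.shape
    V = np.zeros(n_states)
    for _ in range(iters):
        Q = R.T + gamma * P @ V           # Q[action, state]
        V = Q.max(axis=0)
    return Q.argmax(axis=0), V            # best action per state, state values

policy, values = value_iteration(P, R)
print(policy)   # e.g. [0 1 0]: scan first, exploit once a vulnerability is found
```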

Contributors

Date Created
2020-05

AI-Based Autonomous Security Assessment Tool

Description

As automation research into penetration testing has developed, several methods have been proposed as suitable control mechanisms for pentesting frameworks. These include Markov decision processes (MDPs), partially observable Markov decision processes (POMDPs), and POMDPs utilizing reinforcement learning. Since much work has already been done automating other aspects of the pentesting process with exploit frameworks and scanning tools, automated planning is the next focal point in this field. This paper presents a fully integrated solution comprising a POMDP-based planning algorithm, the Nessus scanning utility, and MITRE's CALDERA pentesting platform, linked together to create an autonomous AI attack platform with scanning, planning, and attack capabilities.
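
Purely as an illustration of the scan, plan, and attack loop that such an integration implies, here is a minimal Python sketch in which every function and class is a hypothetical stub; none of these are actual Nessus or CALDERA API calls, and the real system drives those tools through their own interfaces with a POMDP planner in between.

```python
# Hypothetical stubs standing in for the real components: Nessus for
# scanning, a POMDP planner for choosing actions, CALDERA for execution.
def nessus_scan(target):                  # stub: would query the Nessus API
    return {"host": target, "open_ports": [22, 80], "findings": ["weak-ssh-creds"]}

def caldera_execute(action, target):      # stub: would launch a CALDERA ability
    return {"action": action, "target": target, "succeeded": True}

class Planner:                            # stub: the real planner solves a POMDP
    def plan(self, scan_result):
        return ["brute-force-ssh"] if "weak-ssh-creds" in scan_result["findings"] else []

def assess(target):
    scan = nessus_scan(target)                                        # scanning
    results = [caldera_execute(a, target) for a in Planner().plan(scan)]  # planning + attack
    return {"scan": scan, "attacks": results}

print(assess("10.0.0.5"))
```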

Contributors

Date Created
2020-05