Matching Items (223)

Optimal Modeling of Knots in Wood

Description

A model has been developed to modify Euler-Bernoulli beam theory for wooden beams, using visible properties of wood knot defects. Treating knots in a beam as a system of two ellipses that change the local bending stiffness has been shown to improve the fit of a theoretical beam displacement function to edge-line deflection data extracted from digital imagery of experimentally loaded beams. In addition, an Ellipse Logistic Model (ELM) has been proposed, using L1-regularized logistic regression, to predict the impact of a knot on the displacement of a beam. By classifying a knot as severely positive or negative versus mildly positive or negative, ELM can identify knots that lead to large changes in beam deflection without over-emphasizing knots that may not be a problem. Using ELM with a regression-fit Young's modulus on three-point bending of Douglas fir, it is possible to estimate the effects a knot will have on the shape of the resulting displacement curve.
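
As a rough illustration of the regression-fit Young's modulus step, the sketch below uses the standard Euler-Bernoulli deflection for a simply supported beam under a central point load. It does not include the knot-ellipse stiffness modification described above. The function names, load, span, and second moment of area are illustrative assumptions, and the "measured" deflections are generated synthetically from the formula itself.

```python
def deflection_shape(x, load, span, inertia):
    """Deflection times E for a simply supported beam with a central point load.

    Standard Euler-Bernoulli result: w(x) = P*x*(3L^2 - 4x^2) / (48*E*I) for
    0 <= x <= L/2, mirrored about midspan. Returns w(x) * E.
    """
    if x > span / 2:
        x = span - x  # symmetry about midspan
    return load * x * (3 * span**2 - 4 * x**2) / (48 * inertia)


def fit_youngs_modulus(xs, deflections, load, span, inertia):
    """Least-squares fit of E from measured deflections w_i ~ g(x_i) / E."""
    g = [deflection_shape(x, load, span, inertia) for x in xs]
    # Minimise sum (w_i - c*g_i)^2 over c = 1/E; the closed form is below.
    c = sum(gi * wi for gi, wi in zip(g, deflections)) / sum(gi * gi for gi in g)
    return 1.0 / c


# Hypothetical values (SI units), chosen only for illustration.
LOAD, SPAN, INERTIA, TRUE_E = 1000.0, 2.0, 8.0e-6, 12.0e9
xs = [0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75]
synthetic = [deflection_shape(x, LOAD, SPAN, INERTIA) / TRUE_E for x in xs]
estimated_e = fit_youngs_modulus(xs, synthetic, LOAD, SPAN, INERTIA)
```

Because the synthetic data contain no noise, the fit recovers the modulus used to generate them; with real edge-line data the residuals would indicate where a constant-stiffness model falls short.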

Date Created
  • 2015-05

Query System for epiDMS and EnergyPlus

Description

With the development of technology, there has been a dramatic increase in the number of machine learning programs. These complex programs draw conclusions and can predict or perform actions based on models from previous runs or on input information. However, such programs require storing a very large amount of data. Queries allow users to extract only the information that is useful for their investigation. The purpose of this thesis was to create a system with two important components: querying and visualization. Metadata was stored in Sedna as XML, and time series data was stored in OpenTSDB as JSON. To connect the two databases, the time series ID was stored as a metric in the XML metadata. Queries should be simple and flexible, and should return all data that fits the query parameters. The query language used was an extension of XQuery FLWOR that added time series parameters. Visualizations should be easy to understand and organized so that important information and details are easy to find. Because a query may return a large amount of data, a multivariate heat map was used to visualize the time series results. The two programs that the system performed queries on were EnergyPlus and the Epidemic Simulation Data Management System (epiDMS). Such a system makes it easier for people in the projects' fields to find the relationships between metadata that lead to the desired results over time. Over the course of the thesis project, the overall software was completed; however, it must still be optimized to handle the enormous amount of data expected from the system.
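
The idea of linking the two stores through a time series ID kept in the metadata can be sketched as follows. This is only an in-memory illustration: the dictionaries stand in for the XML and time series stores, the field names (`simulation`, `model`, `tsuid`) are hypothetical, and nothing here uses the Sedna, OpenTSDB, or XQuery interfaces.

```python
# Hypothetical in-memory stand-ins for the two stores; not the Sedna or OpenTSDB APIs.
metadata = [
    {"simulation": "run-a", "model": "building", "tsuid": "ts-1"},
    {"simulation": "run-b", "model": "epidemic", "tsuid": "ts-2"},
]
time_series = {
    "ts-1": [(0, 20.5), (60, 21.0), (120, 21.4)],
    "ts-2": [(0, 3.0), (60, 7.0), (120, 15.0)],
}


def query(where, start, end):
    """Filter metadata first, then follow each stored ID into the time series store."""
    results = {}
    for record in metadata:
        if all(record.get(key) == value for key, value in where.items()):
            points = time_series.get(record["tsuid"], [])
            # Keep only the points inside the requested time window.
            results[record["simulation"]] = [
                (t, v) for t, v in points if start <= t <= end
            ]
    return results


selected = query({"model": "epidemic"}, start=60, end=120)
```

The metadata filter plays the role of the FLWOR `where` clause, and the `start` and `end` arguments stand in for the added time series parameters.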

Date Created
  • 2015-05

Open-Source Feature Selection Tool for Medical Imaging Diagnosis

Description

Open-source image analytics and data mining software is widely available but can be overly complicated and non-intuitive for medical physicians and researchers to use. The ASU-Mayo Clinic Imaging Informatics Lab has developed an in-house pipeline to process medical images, extract imaging features, and develop multi-parametric models to assist disease staging and diagnosis. The tools have been used extensively in a number of medical studies, including studies of brain tumors, breast cancer, liver cancer, Alzheimer's disease, and migraine. Recognizing the need among users in the medical field for a simplified interface and streamlined functionality, this project aims to democratize this pipeline so that it is more readily available to health practitioners and third-party developers.

Date Created
  • 2016-12

Learning Users Visual Preferences: Building a Recommendation System for Instagram

Description

Social media users are inundated with information. This is especially true on Instagram, a social media service based on sharing photos, where for many users missing important posts is a common issue. By creating a recommendation system that learns each user's preferences and gives them a curated list of posts, the information overload issue can be mitigated in order to enhance the user experience for Instagram users. This paper explores methods for creating such a recommendation system. The proposed method employs a learning model called "Factorization Machines", which combines the advantages of linear models and latent factor models. In this work I derived features from Instagram post data, including the image, social data about the post, and information about the user who created the post. I also collected user-post interaction data describing which users "liked" which posts, and this was used in models leveraging latent factors. The proposed model successfully improves the rate of interesting content seen by the user by anywhere from 2 to 12 times.
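
To show how a Factorization Machine combines a linear model with latent factors, the sketch below scores one feature vector with the standard second-order form: a bias, a linear term, and pairwise interactions weighted by dot products of per-feature latent vectors. The parameter values and feature vector are hypothetical, and the abstract does not say how the thesis's model was configured or trained.

```python
def fm_predict_naive(x, w0, w, v):
    """Second-order Factorization Machine score, written directly from the definition."""
    n = len(x)
    score = w0 + sum(w[i] * x[i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(vi * vj for vi, vj in zip(v[i], v[j]))
            score += dot * x[i] * x[j]
    return score


def fm_predict_fast(x, w0, w, v):
    """Same score in O(k*n) using the sum-of-squares identity for the pairwise term."""
    n, k = len(x), len(v[0])
    score = w0 + sum(w[i] * x[i] for i in range(n))
    for f in range(k):
        s = sum(v[i][f] * x[i] for i in range(n))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(n))
        score += 0.5 * (s * s - s_sq)
    return score


# Hypothetical parameters and features, for illustration only.
x = [1.0, 0.0, 2.0]
w0, w = 0.1, [0.2, -0.3, 0.05]
v = [[0.1, 0.2], [0.3, -0.1], [-0.2, 0.4]]
naive_score = fm_predict_naive(x, w0, w, v)
fast_score = fm_predict_fast(x, w0, w, v)
```

Both functions compute the same score; the second avoids the double loop over feature pairs, which matters when feature vectors are large and sparse.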

Date Created
  • 2016-12

Reddit Predicts Swings in the Stock Market: r/WorldNews and Using Machine Learning to Predict Changes in Stock Price

Description

In this paper, I will show that news headlines of global events can predict changes in stock price by using machine learning and eight years of data from r/WorldNews, a popular forum on Reddit.com. My data is confined to the top 25 daily posts on the forum, and due to the implicit filtering mechanism in the online community, these 25 posts are representative of the most popular news headlines and influential global events of the day. Hence, these posts shine a light on how large-scale social and political events affect the stock market. Using a Logistic Regression classifier and a Naive Bayes classifier, I am able to predict a binary change in stock price with approximately 85% accuracy using term-feature vectors gathered from the news headlines. The accuracy, precision, and recall results closely rival the best models in this field of research. In addition to the results, I will also describe the mathematical underpinnings of the two models, preceded by a general investigation of the intersection between the multiple academic disciplines related to this project. These range from social science to computer science and from statistics to philosophy. The goal of this additional discussion is to further illustrate the interdisciplinary nature of the research and, hopefully, to inspire a non-monolithic mindset when further investigations are pursued.
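
A minimal sketch of classifying headlines from term counts with multinomial Naive Bayes is given below. It assumes whitespace tokenisation and add-one smoothing, which the abstract does not specify, and the headlines and labels are invented toy examples rather than data from r/WorldNews.

```python
import math
from collections import Counter


def train_naive_bayes(docs, labels):
    """Fit a multinomial Naive Bayes model on whitespace-tokenised headlines."""
    vocab = {tok for doc in docs for tok in doc.lower().split()}
    class_docs = Counter(labels)
    token_counts = {c: Counter() for c in class_docs}
    for doc, label in zip(docs, labels):
        token_counts[label].update(doc.lower().split())
    return vocab, class_docs, token_counts


def predict(model, doc):
    """Return the class with the highest log posterior, using add-one smoothing."""
    vocab, class_docs, token_counts = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_docs.items():
        score = math.log(n_docs / total_docs)
        denom = sum(token_counts[label].values()) + len(vocab)
        for tok in doc.lower().split():
            if tok in vocab:  # ignore tokens never seen in training
                score += math.log((token_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy headlines and labels, not data from the thesis.
headlines = [
    "markets rally after trade deal",
    "trade deal boosts growth",
    "conflict escalates near border",
    "sanctions and conflict hit exports",
]
labels = ["up", "up", "down", "down"]
model = train_naive_bayes(headlines, labels)
prediction = predict(model, "new trade deal announced")
```

Working in log space avoids numerical underflow when many term probabilities are multiplied together.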

Date Created
  • 2016-12

Data Management Behind Machine Learning

Description

This thesis dives into the world of artificial intelligence by exploring the functionality of a single-layer artificial neural network through a simple housing price classification example, while simultaneously considering its impact from a data management perspective at both the software and hardware levels. To begin this study, the universally accepted model of an artificial neuron is broken down into its key components and then analyzed for functionality by relating it back to its biological counterpart. The role of a neuron is then described in the context of a neural network, with equal emphasis placed on how training works for an individual neuron and for an entire network. Using the technique of supervised learning, the neural network is trained with three main factors for housing price classification: a house's total number of rooms, number of bathrooms, and square footage. Once trained with most of the generated data set, it is tested for accuracy by introducing the remainder of the data set and observing how closely its computed output for each set of inputs compares to the target value. From a programming perspective, the artificial neuron is implemented in C so that it is more closely tied to the operating system, which makes the profiler data collected during the program's execution more precise. The program is designed to break down each stage of the neuron's training process into distinct functions. In addition to using more functional code, the project uses the struct data type as its underlying data structure, both to represent the neuron and to hold the neuron's training and test data. Once the neuron is fully trained, its test results are graphed to visually depict how well it learned from its sample training set. Finally, the profiler data is analyzed to describe how the program operated from a data management perspective at the software and hardware levels.
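
The train-then-test workflow for a single neuron can be sketched as below. The thesis implements its neuron in C with structs; this Python version only mirrors the idea, and the sigmoid activation, cross-entropy loss, learning rate, epoch count, and sample values are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Neuron:
    """One sigmoid neuron; a Python stand-in for the C struct described above."""
    weights: list = field(default_factory=lambda: [0.0, 0.0, 0.0])
    bias: float = 0.0

    def output(self, inputs):
        z = self.bias + sum(w * x for w, x in zip(self.weights, inputs))
        return 1.0 / (1.0 + math.exp(-z))


def mean_loss(neuron, samples):
    """Average cross-entropy between neuron outputs and 0/1 targets."""
    total = 0.0
    for inputs, target in samples:
        p = neuron.output(inputs)
        total -= target * math.log(p) + (1 - target) * math.log(1 - p)
    return total / len(samples)


def train(neuron, samples, rate=0.5, epochs=500):
    """Batch gradient descent on the cross-entropy loss."""
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(neuron.weights), 0.0
        for inputs, target in samples:
            error = neuron.output(inputs) - target
            grad_b += error
            for i, x in enumerate(inputs):
                grad_w[i] += error * x
        neuron.bias -= rate * grad_b / len(samples)
        for i in range(len(neuron.weights)):
            neuron.weights[i] -= rate * grad_w[i] / len(samples)


# Invented samples: (rooms, bathrooms, square footage), each scaled to 0..1;
# target 1 marks the higher price class.
training = [
    ([0.2, 0.2, 0.1], 0), ([0.3, 0.2, 0.2], 0), ([0.4, 0.4, 0.3], 0),
    ([0.6, 0.6, 0.7], 1), ([0.8, 0.6, 0.8], 1), ([0.9, 0.8, 0.9], 1),
]
held_out = [([0.1, 0.2, 0.2], 0), ([0.9, 1.0, 0.8], 1)]

neuron = Neuron()
initial_loss = mean_loss(neuron, training)
train(neuron, training)
final_loss = mean_loss(neuron, training)
test_accuracy = sum(
    (neuron.output(x) >= 0.5) == bool(t) for x, t in held_out
) / len(held_out)
```

Splitting each training stage into its own function follows the structure the description attributes to the C program, which also makes each stage easy to profile separately.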

Date Created
  • 2018-05

Machine Learning Enabled Analytics for Health-Related Demographics: a Case Study Identifying Important Factors in Cardiac Disease

Description

The use of machine learning for analytics has increased exponentially in the past few years due to its ability to identify hidden insights in data. It also has a plethora of applications in healthcare, ranging from improving image recognition in CT scans to extracting semantic meaning from thousands of medical form PDFs. Currently in the BioElectrical Systems and Technology Lab, there is a biosensor in development for which data is retrieved and analyzed manually. As a proof of concept, this project uses a neural network architecture to automatically parse and classify a cardiac disease data set, as well as to explore health-related factors impacting cardiac disease in patients of all ages.

Date Created
  • 2018-05

Analyzing LinkedIn Profiles Using Machine Learning

Description

Understanding the skills required to work in an industry is a difficult task with many potential uses. By being able to predict the industry of a person based on their skills, professional social networks could improve search with automated tagging, advertisers could target more carefully, and students could better find a career path that fits their skill set. The aim of this project is to apply deep learning to the world of professional networking. Deep learning is a type of machine learning that has recently been making breakthroughs in the analysis of complex datasets that previously were not of much use. Initially the goal was to apply deep learning to the skills-to-company relationship, but a lack of quality data required a change to the skills-to-industry relationship. To accomplish the new goal, a database of LinkedIn profiles from various industries was gathered and processed. From this dataset a model was created to take a list of skills and output an industry that people with those skills work in. Such a model has value in the insights it provides, allowing users to determine what industry fits a skill set, identify key skills for industries, and locate the industries in which possible candidates may best fit. Various models were trained and tested on a skills-to-industry dataset. The model was able to learn similarities between industries and predict the most likely industries for each profile's skill set.

Date Created
  • 2017-12

Machine Learning: A Sentiment Analysis of Customer Reviews

Description

Machine learning is the process of training a computer with algorithms to learn from data and make informed predictions. In a world where large amounts of data are constantly collected, machine learning is an important tool to analyze this data to find patterns and learn useful information from it. Machine learning applications expand to numerous fields; however, I chose to focus on machine learning with a business perspective for this thesis, specifically e-commerce.

The e-commerce market utilizes information to target customers and drive business. More and more online services have become available, allowing consumers to make purchases and interact with an online system. For example, Amazon is one of the largest Internet-based retail companies. As people shop through this website, Amazon gathers huge amounts of data on its customers from personal information to shopping history to viewing history. After purchasing a product, the customer may leave reviews and give a rating based on their experience. Performing analytics on all of this data can provide insights into making more informed business and marketing decisions that can lead to business growth and also improve the customer experience.

For this thesis, I have trained binary classification models on a publicly available product review dataset from Amazon to predict whether a review has a positive or negative sentiment. The sentiment analysis process includes analyzing and encoding the human language, then extracting the sentiment from the resulting values. In the business world, sentiment analysis provides value by revealing insights into customer opinions and their behaviors. In this thesis, I will explain how to perform a sentiment analysis and analyze several different machine learning models. The algorithms for which I compared the results are KNN, Logistic Regression, Decision Trees, Random Forest, Naïve Bayes, Linear Support Vector Machines, and Support Vector Machines with an RBF kernel.

Date Created
  • 2020-05

Improving upon the State-of-the-Art in Multimodal Emotional Recognition in Dialogue

Description

Emotion recognition in conversation has applications within numerous domains such as affective computing and medicine. Recent methods for emotion recognition jointly utilize conversational data over several modalities including audio, video, and text. However, state-of-the-art frameworks for this task do not focus on the feature extraction and feature fusion steps of this process. This thesis aims to improve the state-of-the-art method by incorporating two components to better accomplish these steps. By doing so, we are able to produce improved representations for the text modality and better model the relationships between all modalities. This paper proposes two methods which focus on these concepts and provide improved accuracy over the state-of-the-art framework for multimodal emotion recognition in dialogue.

Date Created
  • 2020-05