Search Content

Ensemble Learning on Deep Neural Networks for Image Caption Generation

Description

Capturing the information in an image into a natural language sentence is

considered a difficult problem to be solved by computers. Image captioning involves not just detecting objects from images but understanding the interactions between the objects to be translated into relevant captions. So, expertise in the fields of computer vision…

Capturing the information in an image into a natural language sentence is

considered a difficult problem to be solved by computers. Image captioning involves not just detecting objects from images but understanding the interactions between the objects to be translated into relevant captions. So, expertise in the fields of computer vision paired with natural language processing are supposed to be crucial for this purpose. The sequence to sequence modelling strategy of deep neural networks is the traditional approach to generate a sequential list of words which are combined to represent the image. But these models suffer from the problem of high variance by not being able to generalize well on the training data.

The main focus of this thesis is to reduce the variance factor which will help in generating better captions. To achieve this, Ensemble Learning techniques have been explored, which have the reputation of solving the high variance problem that occurs in machine learning algorithms. Three different ensemble techniques namely, k-fold ensemble, bootstrap aggregation ensemble and boosting ensemble have been evaluated in this thesis. For each of these techniques, three output combination approaches have been analyzed. Extensive experiments have been conducted on the Flickr8k dataset which has a collection of 8000 images and 5 different captions for every image. The bleu score performance metric, which is considered to be the standard for evaluating natural language processing (NLP) problems, is used to evaluate the predictions. Based on this metric, the analysis shows that ensemble learning performs significantly better and generates more meaningful captions compared to any of the individual models used.

ContributorsKatpally, Harshitha (Author) / Bansal, Ajay (Thesis advisor) / Acuna, Ruben (Committee member) / Gonzalez-Sanchez, Javier (Committee member) / Arizona State University (Publisher)

Created2019

Lambda Starship: A Video Game for Teaching Functional Programming with Lisp

Description

The functional programming paradigm is able to provide clean and concise solutions to many common programming problems, as well as promote safer, more testable code by encouraging an isolation of state-modifying behavior. Functional programming is finding its way into traditionally object-oriented and imperative languages, most notably with the introduction of…

The functional programming paradigm is able to provide clean and concise solutions to many common programming problems, as well as promote safer, more testable code by encouraging an isolation of state-modifying behavior. Functional programming is finding its way into traditionally object-oriented and imperative languages, most notably with the introduction of Java 8 and in LINQ for C#. However, no functional programming language has achieved widespread adoption, meaning that students without a formal computer science background who learn technology on-demand for personal projects or for business may not come across functional programming in a significant way. Programmers need a reason to spend time learning these concepts to not miss out on the subtle but profound benefits they provide. I propose the use of a video game as an environment in which learning functional programming is the player's goal. In this carefully constructed video game, learning functional programming is the key to progression. Players will be motivated to learn and will be given an immediate chance to test and demonstrate their understanding. The game, named Lambda Starship (stylized as (lambda () starship)), is a 3D first-person video game. It takes place in a spaceship that, due to extreme magnetic interference, has lost all on-board software while leaving the hardware completely intact. The player is tasked to write software using functional programming paradigms to replace the old software and bring the spaceship back to a working state. Throughout the process, the player is guided by an in-game manual and other descriptive resources. The game is implemented in Unity and scripted using C#. The game's educational and entertainment value was evaluated with a study case. 24 undergraduate students at Arizona State University (ASU) played the game and were surveyed detailing their experience. During play, user statistics were recorded automatically, providing a data-driven way to analyze where players struggled with the concepts introduced in the game. Reception was neutral or positive in both the entertainment and educational sides of the game. A few players expressed concerns about the manual in its form factor and engagement value.

ContributorsCompton, Tyler Alexander (Author) / Gonzalez-Sanchez, Javier (Thesis director) / Bansal, Srividya (Committee member) / Software Engineering (Contributor) / Barrett, The Honors College (Contributor)

Created2018-05

The Future of Brain-Computer Interaction: A Potential Brain-Aiding Device of the Future

Description

Brains and computers have been interacting since the invention of the computer. These two entities have worked together to accomplish a monumental set of goals, from landing man on the moon to helping to understand how the universe works on the most microscopic levels, and everything in between. As the…

Brains and computers have been interacting since the invention of the computer. These two entities have worked together to accomplish a monumental set of goals, from landing man on the moon to helping to understand how the universe works on the most microscopic levels, and everything in between. As the years have gone on, the extent and depth of interaction between brains and computers have consistently widened, to the point where computers help brains with their thinking in virtually infinite everyday situations around the world. The first purpose of this research project was to conduct a brief review for the purposes of gaining a sound understanding of how both brains and computers operate at fundamental levels, and what it is about these two entities that allow them to work evermore seamlessly as the years go on. Next, a history of interaction between brains and computers was developed, which expanded upon the first task and helped to contribute to visions of future brain-computer interaction (BCI). The subsequent and primary task of this research project was to develop a theoretical framework for a potential brain-aiding device of the future. This was done by conducting an extensive literature review regarding the most advanced BCI technology in modern times and expanding upon the findings to argue feasibility of the future device and its components. Next, social predictions regarding the acceptance and use of the new technology were made by designing and executing a survey based on the Unified Theory of the Acceptance and Use of Technology (UTAUT). Finally, general economic predictions were inferred by examining several relationships between money and computers over time.

ContributorsThum, Giuseppe Edwardo (Author) / Gaffar, Ashraf (Thesis director) / Gonzalez-Sanchez, Javier (Committee member) / College of Integrative Sciences and Arts (Contributor) / Barrett, The Honors College (Contributor)

Created2017-05

Modeling and Design Analysis of Facial Expressions of Humanoid Social Robots Using Deep Learning Techniques

Description

A lot of research can be seen in the field of social robotics that majorly concentrate on various aspects of social robots including design of mechanical parts and their move- ment, cognitive speech and face recognition capabilities. Several robots have been developed with the intention of being social, like humans,…

A lot of research can be seen in the field of social robotics that majorly concentrate on various aspects of social robots including design of mechanical parts and their move- ment, cognitive speech and face recognition capabilities. Several robots have been developed with the intention of being social, like humans, without much emphasis on how human-like they actually look, in terms of expressions and behavior. Fur- thermore, a substantial disparity can be seen in the success of results of any research involving ”humanizing” the robots’ behavior, or making it behave more human-like as opposed to research into biped movement, movement of individual body parts like arms, fingers, eyeballs, or human-like appearance itself. The research in this paper in- volves understanding why the research on facial expressions of social humanoid robots fails where it is not accepted completely in the current society owing to the uncanny valley theory. This paper identifies the problem with the current facial expression research as information retrieval problem. This paper identifies the current research method in the design of facial expressions of social robots, followed by using deep learning as similarity evaluation technique to measure the humanness of the facial ex- pressions developed from the current technique and further suggests a novel solution to the facial expression design of humanoids using deep learning.

ContributorsMurthy, Shweta (Author) / Gaffar, Ashraf (Thesis advisor) / Ghazarian, Arbi (Committee member) / Gonzalez-Sanchez, Javier (Committee member) / Arizona State University (Publisher)

Created2017

A comparative analysis of graph vs relational database for instructional module development system

Description

In today's data-driven world, every datum is connected to a large amount of data. Relational databases have been proving itself a pioneer in the field of data storage and manipulation since 1970s. But more recently they have been challenged by NoSQL graph databases in handling data models which have an…

In today's data-driven world, every datum is connected to a large amount of data. Relational databases have been proving itself a pioneer in the field of data storage and manipulation since 1970s. But more recently they have been challenged by NoSQL graph databases in handling data models which have an inherent graphical representation. Graph databases with the ability to store physical relationships between two nodes and native graph processing technique have been doing exceptionally well in graph data storage and management for applications like recommendation engines, biological modeling, network modeling, social media applications, etc.

Instructional Module Development System (IMODS) is a web-based software system that guides STEM instructors through the complex task of curriculum design, ensures tight alignment between various components of a course (i.e., learning objectives, content, assessments), and provides relevant information about research-based pedagogical and assessment strategies. The data model of IMODS is highly connected and has an inherent graphical representation between all its entities with numerous relationships between them. This thesis focuses on developing an algorithm to determine completeness of course design developed using IMODS. As part of this research objective, the study also analyzes the data model for best fit database to run these algorithms. As part of this thesis, two separate applications abstracting the data model of IMODS have been developed - one with Neo4j (graph database) and another with PostgreSQL (relational database). The research objectives of the thesis are as follows: (i) evaluate the performance of Neo4j and PostgreSQL in handling complex queries that will be fired throughout the life cycle of the course design process; (ii) devise an algorithm to determine the completeness of a course design developed using IMODS. This thesis presents the process of creating data model for PostgreSQL and converting it into a graph data model to be abstracted by Neo4j, creating SQL and CYPHER scripts for undertaking experiments on both platforms, testing and elaborate analysis of the results and evaluation of the databases in the context of IMODS.

ContributorsSaha, Abir Lal (Author) / Bansal, Srividya (Thesis advisor) / Bansal, Ajay (Committee member) / Gonzalez-Sanchez, Javier (Committee member) / Arizona State University (Publisher)

Created2017

Next-Generation Smart Cars: Towards a More Intelligent Interactive Infotainment System

Description

Today, in a world of automation, the impact of Artificial Intelligence can be seen in every aspect of our lives. Starting from smart homes to self-driving cars everything is run using intelligent, adaptive technologies. In this thesis, an attempt is made to analyze the correlation between driving quality and its…

Today, in a world of automation, the impact of Artificial Intelligence can be seen in every aspect of our lives. Starting from smart homes to self-driving cars everything is run using intelligent, adaptive technologies. In this thesis, an attempt is made to analyze the correlation between driving quality and its impact on the use of car infotainment system and vice versa and hence the driver distraction. Various internal and external driving factors have been identified to understand the dependency and seriousness of driver distraction caused due to the car infotainment system. We have seen a number UI/UX changes, speech recognition advancements in cars to reduce distraction. But reducing the number of casualties on road is still a persisting problem in hand as the cognitive load on the driver is considered to be one of the primary reasons for distractions leading to casualties. In this research, a pathway has been provided to move towards building an artificially intelligent, adaptive and interactive infotainment which is trained to behave differently by analyzing the driving quality without the intervention of the driver. The aim is to not only shift focus of the driver from screen to street view, but to also change the inherent behavior of the infotainment system based on the driving statistics at that point in time without the need for driver intervention.

ContributorsSuresh, Seema (Author) / Gaffar, Ashraf (Thesis advisor) / Sodemann, Angela (Committee member) / Gonzalez-Sanchez, Javier (Committee member) / Arizona State University (Publisher)

Created2017

Feature Adaptive Ray Tracing of Subdivision Surfaces

Description

Subdivision surfaces have gained more and more traction since it became the standard surface representation in the movie industry for many years. And Catmull-Clark subdivision scheme is the most popular one for handling polygonal meshes. After its introduction, Catmull-Clark surfaces have been extended to several eminent ways, including the handling…

Subdivision surfaces have gained more and more traction since it became the standard surface representation in the movie industry for many years. And Catmull-Clark subdivision scheme is the most popular one for handling polygonal meshes. After its introduction, Catmull-Clark surfaces have been extended to several eminent ways, including the handling of boundaries, infinitely sharp creases, semi-sharp creases, and hierarchically defined detail. For ray tracing of subdivision surfaces, a common way is to construct spatial bounding volume hierarchies on top of input control mesh. However, a high-level refined subdivision surface not only requires a substantial amount of memory storage, but also causes slow and inefficient ray tracing. In this thesis, it presents a new way to improve the efficiency of ray tracing of subdivision surfaces, while the quality is not as good as general methods.

ContributorsKe, Shujian (Author) / Amresh, Ashish (Thesis advisor) / Femiani, John (Committee member) / Gonzalez-Sanchez, Javier (Committee member) / Arizona State University (Publisher)

Created2017

Affect-driven self-adaptation: a manufacturing vision with a software product line paradigm

Description

Affect signals what humans care about and is involved in rational decision-making and action selection. Many technologies may be improved by the capability to recognize human affect and to respond adaptively by appropriately modifying their operation. This capability, named affect-driven self-adaptation, benefits systems as diverse as learning environments, healthcare applications,…

Affect signals what humans care about and is involved in rational decision-making and action selection. Many technologies may be improved by the capability to recognize human affect and to respond adaptively by appropriately modifying their operation. This capability, named affect-driven self-adaptation, benefits systems as diverse as learning environments, healthcare applications, and video games, and indeed has the potential to improve systems that interact intimately with users across all sectors of society. The main challenge is that existing approaches to advancing affect-driven self-adaptive systems typically limit their applicability by supporting the creation of one-of-a-kind systems with hard-wired affect recognition and self-adaptation capabilities, which are brittle, costly to change, and difficult to reuse. A solution to this limitation is to leverage the development of affect-driven self-adaptive systems with a manufacturing vision.

This dissertation demonstrates how using a software product line paradigm can jumpstart the development of affect-driven self-adaptive systems with that manufacturing vision. Applying a software product line approach to the affect-driven self-adaptive domain provides a comprehensive, flexible and reusable infrastructure of components with mechanisms to monitor a user’s affect and his/her contextual interaction with a system, to detect opportunities for improvements, to select a course of action, and to effect changes. It also provides a domain-specific architecture and well-documented process guidelines, which facilitate an understanding of the organization of affect-driven self-adaptive systems and their implementation by systematically customizing the infrastructure to effectively address the particular requirements of specific systems.

The software product line approach is evaluated by applying it in the development of learning environments and video games that demonstrate the significant potential of the solution, across diverse development scenarios and applications.

The key contributions of this work include extending self-adaptive system modeling, implementing a reusable infrastructure, and leveraging the use of patterns to exploit the commonalities between systems in the affect-driven self-adaptation domain.

ContributorsGonzalez-Sanchez, Javier (Author) / Burleson, Winslow (Thesis advisor) / Collofello, James (Thesis advisor) / Garlan, David (Committee member) / Sarjoughian, Hessam S. (Committee member) / Atkinson, Robert (Committee member) / Arizona State University (Publisher)

Created2016

Affective Computing and the Association for Computing’s Code of Ethics

Description

Affective computing allows computers to monitor and influence people’s affects, in other words emotions. Currently, there is a lot of research exploring what can be done with this technology. There are many fields, such as education, healthcare, and marketing, that this technology can transform. However, it is important to question…

Affective computing allows computers to monitor and influence people’s affects, in other words emotions. Currently, there is a lot of research exploring what can be done with this technology. There are many fields, such as education, healthcare, and marketing, that this technology can transform. However, it is important to question what should be done. There are unique ethical considerations in regards to affective computing that haven't been explored. The purpose of this study is to understand the user’s perspective of affective computing in regards to the Association of Computing Machinery (ACM) Code of Ethics, to ultimately start developing a better understanding of these ethical concerns. For this study, participants were required to watch three different videos and answer a questionnaire, all while wearing an Emotiv EPOC+ EEG headset that measures their emotions. Using the information gathered, the study explores the ethics of affective computing through the user’s perspective.

ContributorsInjejikian, Angelica (Author) / Gonzalez-Sanchez, Javier (Thesis director) / Chavez-Echeagaray, Maria Elena (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created2021-05

A Neural Network Model for a Tutoring Companion Supporting Students in a Programming with Java Course

Description

Feedback represents a vital component of the learning process and is especially important for Computer Science students. With class sizes that are often large, it can be challenging to provide individualized feedback to students. Consistent, constructive, supportive feedback through a tutoring companion can scaffold the learning process for students.

This work…

Feedback represents a vital component of the learning process and is especially important for Computer Science students. With class sizes that are often large, it can be challenging to provide individualized feedback to students. Consistent, constructive, supportive feedback through a tutoring companion can scaffold the learning process for students.

This work contributes to the construction of a tutoring companion designed to provide this feedback to students. It aims to bridge the gap between the messages the compiler delivers, and the support required for a novice student to understand the problem and fix their code. Particularly, it provides support for students learning about recursion in a beginning university Java programming course. Besides also providing affective support, a tutoring companion could be more effective when it is embedded into the environment that the student is already using, instead of an additional tool for the student to learn. The proposed Tutoring Companion is embedded into the Eclipse Integrated Development Environment (IDE).

This thesis focuses on the reasoning model for the Tutoring Companion and is developed using the techniques of a neural network. While a student uses the IDE, the Tutoring Companion collects 16 data points, including the presence of certain key words, cyclomatic complexity, and error messages from the compiler, every time it detects an event, such as a run attempt, debug attempt, or a request for help, in the IDE. This data is used as inputs to the neural network. The neural network produces a correlating single output code for the feedback to be provided to the student, which is displayed in the IDE.

The effectiveness of the approach is examined among 38 Computer Science students who solve a programming assignment while the Tutoring Companion assists them. Data is collected from these interactions, including all inputs and outputs for the neural network, and students are surveyed regarding their experience. Results suggest that students feel supported while working with the Companion and promising potential for using a neural network with an embedded companion in the future. Challenges in developing an embedded companion are discussed, as well as opportunities for future work.

ContributorsDay, Melissa (Author) / Gonzalez-Sanchez, Javier (Thesis advisor) / Bansal, Ajay (Committee member) / Mehlhase, Alexandra (Committee member) / Arizona State University (Publisher)

Created2019