Matching Items (2)
Description
Effective communication and engineering are not a natural pairing. The incongruence arises because engineering students focus on making, designing, and analyzing; since these are the core functions of the field, there is no direct focus on developing communication skills. This honors thesis explores the role of, and expectations for, student engineers to present and communicate ideas within the undergraduate engineering education experience. The researchers interviewed faculty about their perspectives on students' presentation skills to inform the design of a workshop series of interventions intended to make engineering students better communicators.
Contributors: Albin, Joshua Alexander (Co-author) / Brancati, Sara (Co-author) / Lande, Micah (Thesis director) / Martin, Thomas (Committee member) / Industrial, Systems and Operations Engineering Program (Contributor) / Software Engineering (Contributor) / Barrett, The Honors College (Contributor)
Created: 2018-05
Description

The aim of this project is to understand the basic algorithmic components of the transformer deep learning architecture. At a high level, a transformer is a machine learning model built around a self-attention mechanism, which weighs the significant parts of sequential input data; this makes it very useful for solving problems in natural language processing and computer vision. Earlier architectures applied to these problems, such as convolutional neural networks and recurrent neural networks, introduce the vanishing gradient problem when an input becomes too long (which essentially means the network loses its memory and halts learning) and have a slow training time in general. Because the transformer replaces recurrence with attention, it has a much better "memory" and a faster training time, which makes it a more effective architecture for solving these problems.

Most of this project will be spent producing a survey that captures the current state of research on the transformer, along with the background material needed to understand it. First, I will do a keyword search of the most-cited and up-to-date peer-reviewed publications on transformers to understand them conceptually. Next, I will investigate the programming frameworks required to implement the architecture, and use them to implement a simplified version of it or follow an accessible guide or tutorial. Once the programming aspect of the architecture is understood, I will implement a transformer based on the academic paper "Attention Is All You Need". I will then slightly tweak this model, using my understanding of the architecture, to improve performance. Once finished, the details of the implementation (successes, failures, process, and inner workings) will be evaluated and reported, along with the fundamental concepts surveyed.

The motivation behind this project is to explore the rapidly growing area of AI algorithms; the transformer in particular was chosen because it is a major milestone for engineering with AI and software. Since their introduction, transformers have provided a very effective way of solving natural language processing tasks, which has allowed related applications to succeed with high speed while maintaining accuracy. The same type of model can now be applied to more cutting-edge natural language processing applications, such as extracting semantic information from a text description and generating an image to satisfy it.
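To make the self-attention mechanism concrete, below is a minimal sketch of the scaled dot-product attention described in "Attention Is All You Need", written in Python with NumPy. This is an illustrative sketch, not the implementation produced in the thesis; the function names and the toy input are assumptions made here for demonstration, and a real transformer would derive Q, K, and V from learned linear projections.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max before exponentiating for numerical stability.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention (illustrative sketch).

    Q, K: arrays of shape (seq_len, d_k); V: array of shape (seq_len, d_v).
    Returns the attended values and the attention weight matrix.
    """
    d_k = Q.shape[-1]
    # Similarity of every query with every key, scaled by sqrt(d_k)
    # so the softmax does not saturate as d_k grows.
    scores = Q @ K.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V, weights

# Toy usage: a sequence of 4 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
# Reusing x for Q, K, and V keeps the sketch self-contained.
out, attn = scaled_dot_product_attention(x, x, x)
print(attn.shape)  # (4, 4): how much each token attends to every other token
```

The attention weight matrix is what gives the transformer its "memory": every token can attend directly to every other token in one step, rather than passing information through a recurrent chain where gradients can vanish over long inputs.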

Contributors: Cereghini, Nicola (Author) / Acuna, Ruben (Thesis director) / Bansal, Ajay (Committee member) / Barrett, The Honors College (Contributor) / Software Engineering (Contributor)
Created: 2023-05