Filtering by
- All Subjects: language modeling
- All Subjects: NVIDIA
- Creators: Software Engineering
- Creators: Coomber, Wesley Poblete
- Member of: Theses and Dissertations
The aim of this project is to understand the basic algorithmic components of the transformer deep learning architecture. At a high level, a transformer is a machine learning model based off of a recurrent neural network that adopts a self-attention mechanism, which can weigh significant parts of sequential input data which is very useful for solving problems in natural language processing and computer vision. There are other approaches to solving these problems which have been implemented in the past (i.e., convolutional neural networks and recurrent neural networks), but these architectures introduce the issue of the vanishing gradient problem when an input becomes too long (which essentially means the network loses its memory and halts learning) and have a slow training time in general. The transformer architecture’s features enable a much better “memory” and a faster training time, which makes it a more optimal architecture in solving problems. Most of this project will be spent producing a survey that captures the current state of research on the transformer, and any background material to understand it. First, I will do a keyword search of the most well cited and up-to-date peer reviewed publications on transformers to understand them conceptually. Next, I will investigate any necessary programming frameworks that will be required to implement the architecture. I will use this to implement a simplified version of the architecture or follow an easy to use guide or tutorial in implementing the architecture. Once the programming aspect of the architecture is understood, I will then Implement a transformer based on the academic paper “Attention is All You Need”. I will then slightly tweak this model using my understanding of the architecture to improve performance. Once finished, the details (i.e., successes, failures, process and inner workings) of the implementation will be evaluated and reported, as well as the fundamental concepts surveyed. The motivation behind this project is to explore the rapidly growing area of AI algorithms, and the transformer algorithm in particular was chosen because it is a major milestone for engineering with AI and software. Since their introduction, transformers have provided a very effective way of solving natural language processing, which has allowed any related applications to succeed with high speed while maintaining accuracy. Since then, this type of model can be applied to more cutting edge natural language processing applications, such as extracting semantic information from a text description and generating an image to satisfy it.
With the recent focus of attention towards remote work and mobile computing, the possibility of taking a powerful workstation wherever needed is enticing. However, even emerging laptops today struggle to compete with desktops in terms of cost, maintenance, and future upgrades. The price point of a powerful laptop is considerably higher compared to an equally powerful desktop computer, and most laptops are manufactured in a way that makes upgrading parts of the machine difficult or impossible, forcing a complete purchase in the event of failure or a component needing an upgrade. In the case where someone already owns a desktop computer and must be mobile, instead of needing to purchase a second device at full price, it may be possible to develop a low-cost computer that has just enough power to connect to the existing desktop and run all processing there, using the mobile device only as a user interface. This thesis will explore the development of a custom PCB that utilizes a Raspberry Pi Computer Module 4, as well as the development of a fork of the Open Source project Moonlight to stream a host machine's screen to a remote client. This implementation will be compared against other existing remote desktop solutions to analyze it's performance and quality.