Matching Items (74)
Description
Commonly, image processing is handled on a CPU that is connected to the image sensor by a wire. In these far-sensor processing architectures, there is energy loss associated with sending data across an interconnect from the sensor to the CPU. In an effort to increase energy efficiency, near-sensor processing architectures have been developed, in which the sensor and processor are stacked directly on top of each other. This reduces energy loss associated with sending data off-sensor. However, processing near the image sensor causes the sensor to heat up. Reports of thermal noise in near-sensor processing architectures motivated us to study how temperature affects image quality on a commercial image sensor and how thermal noise affects computer vision task accuracy. We analyzed image noise across nine different temperatures and three sensor configurations to determine how image noise responds to an increase in temperature. Ultimately, our team used this information, along with transient analysis of a stacked image sensor’s thermal behavior, to recommend thermal management strategies that leverage the benefits of near-sensor processing and prevent accuracy loss at problematic temperatures.
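As a hedged illustration of the kind of noise measurement this analysis involves, the sketch below computes a temporal-noise figure (per-pixel standard deviation across repeated captures of a static scene, averaged over pixels). The temperature set points, frame counts, and toy noise model are placeholders, not the thesis's measured data:

```python
import numpy as np

def temporal_noise(frames):
    """Per-pixel std. dev. across repeated captures of a static scene,
    averaged over pixels to give one scalar noise figure (in DN)."""
    stack = np.stack(frames).astype(np.float64)  # shape (N, H, W)
    return np.std(stack, axis=0).mean()

# Synthetic stand-in for real captures: noise grows with temperature.
rng = np.random.default_rng(0)
temperatures_c = [25, 35, 45, 55, 65, 75, 85, 95, 105]  # hypothetical set points
for t in temperatures_c:
    sigma = 1.0 + 0.05 * (t - 25)  # toy noise model, not measured sensor data
    frames = [100 + rng.normal(0, sigma, (480, 640)) for _ in range(16)]
    print(f"{t:3d} degC: temporal noise = {temporal_noise(frames):.2f} DN")
```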
Contributors: Jones, Britton Steele (Author) / LiKamWa, Robert (Thesis director) / Jayasuriya, Suren (Committee member) / Watts College of Public Service & Community Solutions (Contributor) / Electrical Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created: 2020-12
Description
The use of Artificial Intelligence in assistive systems is growing in application and efficiency. From self-driving cars to medical and surgical robots to unsupervised industrial co-robots, the use of AI and robotics to eliminate human error in high-stress environments and to perform automated tasks is advancing society’s status quo. Understanding of co-robotics has expanded not only in industry but in research as well. The National Science Foundation (NSF) defines co-robots as “...a robot whose main purpose is to work with people or other robots to accomplish a goal” (NSF, 1). The latest iteration of its National Robotics Initiative, NRI-2.0, focuses on creating co-robots optimized for ‘scalability, customizability, lowering barriers to entry, and societal impact’ (NSF, 1). While many avenues have been explored for applying co-robotics to create more efficient processes and sustainable lifestyles, this project focused on societal-impact co-robotics in the field of human safety and well-being. Introducing a co-robotics and computer vision AI solution for first responder assistance would bring awareness and efficiency to public safety. Real-time identification techniques would give first responders a greater range of awareness in high-stress situations. Environmental features collected through sensors (camera and radar) could be used to identify people and objects in environments where visual impairment and obstruction are severe (e.g., burning buildings and smoke-filled rooms). Information about situational conditions (environmental readings, locations of other occupants, etc.) could be transmitted to first responders in emergencies, maximizing situational awareness. This would not only aid first responders in evaluating emergency situations but would also provide data that helps determine the most effective course of action.
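As a sketch of the situational-awareness message such a system might transmit, the snippet below packages hypothetical detections and environmental readings into JSON. The schema, field names, and values are illustrative assumptions, not part of the project:

```python
import json
import time

def situation_report(detections, readings, room_id):
    """Bundle detector output and sensor readings into one message a
    responder's device could receive (schema is hypothetical)."""
    return json.dumps({
        "timestamp": time.time(),
        "room": room_id,
        "occupants": [d for d in detections if d["label"] == "person"],
        "objects": [d for d in detections if d["label"] != "person"],
        "environment": readings,
    })

# Example with stand-in values from a camera/radar fusion step.
detections = [
    {"label": "person", "confidence": 0.91, "box": [120, 80, 210, 300]},
    {"label": "door", "confidence": 0.84, "box": [400, 60, 520, 310]},
]
readings = {"temperature_c": 64.0, "smoke_density": 0.72}
print(situation_report(detections, readings, room_id="2F-east"))
```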
Contributors: Scott, Kylel D (Author) / Benjamin, Victor (Thesis director) / Liu, Xiao (Committee member) / Engineering Programs (Contributor) / College of Integrative Sciences and Arts (Contributor) / Department of Information Systems (Contributor) / Barrett, The Honors College (Contributor)
Created: 2020-12
Description
A video clip carries not merely an aggregation of static entities, but also a variety of interactions and relations among those entities. Challenges remain for a video captioning system to generate natural language descriptions that focus on the most salient interactions and align with latent aspects beyond direct observation. This work presents a Commonsense knowledge Anchored Video cAptioNing (dubbed CAVAN) approach. CAVAN exploits inferential commonsense knowledge to assist the training of a video captioning model through a novel paradigm for sentence-level semantic alignment. Specifically, each training caption is complemented with commonsense knowledge queried from the generic knowledge atlas ATOMIC, forming a commonsense-caption entailment corpus. A BERT-based language entailment model trained on this corpus then serves as a commonsense discriminator during the training of the video captioning model, penalizing it for generating semantically misaligned captions. Extensive empirical evaluations on the MSR-VTT, V2C and VATEX datasets show that CAVAN consistently improves the quality of generated captions and achieves a higher keyword hit rate. Ablation results validate the effectiveness of CAVAN and reveal that commonsense knowledge contributes to video caption generation.
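A minimal sketch of the sentence-level alignment penalty described above, using Hugging Face Transformers: the checkpoint, example strings, and loss weight are assumptions, and in CAVAN the entailment model would first be trained on the commonsense-caption entailment corpus rather than used with an untrained classification head as here:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Stand-in for a BERT entailment model trained on the commonsense-caption
# corpus (here: a base checkpoint with an untrained 2-class head).
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
nli = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

def alignment_score(commonsense, caption):
    """P(entailment) between a queried commonsense inference and a
    generated caption (label 1 = entailed, by convention here)."""
    inputs = tok(commonsense, caption, return_tensors="pt", truncation=True)
    logits = nli(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1]

# The discriminator's penalty is added to the usual captioning loss,
# discouraging semantically misaligned generations.
caption_nll = torch.tensor(2.3)  # placeholder captioning loss
score = alignment_score("PersonX wants to celebrate",        # hypothetical ATOMIC inference
                        "a man blows out candles on a cake")  # hypothetical caption
loss = caption_nll + 0.5 * (1.0 - score)  # 0.5 is an assumed weight
print(float(loss))
```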
Contributors: Shao, Huiliang (Author) / Yang, Yezhou (Thesis advisor) / Jayasuriya, Suren (Committee member) / Xiao, Chaowei (Committee member) / Arizona State University (Publisher)
Created: 2022
Description

Recent advancements in machine learning methods have allowed companies to develop advanced computer-vision-aided production lines that take advantage of the raw and labeled data captured by high-definition cameras mounted at vantage points on the factory floor. We experiment with two different methods of developing one such system to automatically track key components on a production line. By tracking the state of these key components using object detection, we can accurately determine and report production line metrics like part arrival and start/stop times for key factory processes. We began by collecting and labeling raw image data from the cameras overlooking the factory floor. Using that data, we trained two dedicated object detection models. Our training utilized transfer learning, starting from a Faster R-CNN ResNet model trained on Microsoft’s COCO dataset. The first model we developed is a binary classifier that detects the state of a single object, while the second model is a multiclass classifier that detects the state of two distinct objects on the factory floor. Both models achieved over 95% classification and localization accuracy on our test datasets. Having two additional classes did not affect the classification or localization accuracy of the multiclass model compared to the binary model.
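A sketch of the transfer-learning step, assuming the setup maps onto torchvision's COCO-pretrained Faster R-CNN with a ResNet-50 FPN backbone (the exact backbone and the class counts below are assumptions, not stated in the thesis):

```python
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

def build_detector(num_classes):
    """Start from a Faster R-CNN pretrained on COCO and swap in a new
    box-predictor head sized for our classes (including background)."""
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    return model

binary_model = build_detector(num_classes=3)      # background + 2 states of one part (assumed)
multiclass_model = build_detector(num_classes=5)  # background + 2 states x 2 parts (assumed)
```

Replacing only the box-predictor head keeps the COCO-pretrained backbone weights, which is what makes transfer learning viable with a modest amount of labeled factory-floor data.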

Contributors: Paulson, Hunter (Author) / Ju, Feng (Thesis director) / Balasubramanian, Ramkumar (Committee member) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Computer Science and Engineering Program (Contributor)
Created: 2022-05