ASU Electronic Theses and Dissertations
This collection includes most ASU Theses and Dissertations from 2011 to the present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about each dissertation/thesis includes degree information, committee members, an abstract, and supporting data or media.
In addition to the electronic theses found in the ASU Digital Repository, ASU Theses and Dissertations can be found in the ASU Library Catalog.
Dissertations and Theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries. For more information or questions about this collection, visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.
Filtering by
- All Subjects: Electrical Engineering
The expression and perception of emotions vary across speakers and cultures; determining features and classification methods that generalize well to different conditions is therefore highly desirable. A method based on latent topic models is proposed to learn supra-segmental features from low-level acoustic descriptors. The derived features outperform state-of-the-art approaches over multiple databases. Cross-corpus studies are conducted to determine the ability of these features to generalize across different databases. The proposed method is also applied to derive features from facial expressions; a multi-modal fusion overcomes the deficiencies of a speech-only approach and further improves recognition performance.
Besides affecting the acoustic properties of speech, emotions have a strong influence over speech articulation kinematics. A learning approach that constrains a classifier trained on acoustic descriptors to also model articulatory data is proposed here. This method requires articulatory information only during the training stage, thus overcoming the challenges inherent to large-scale articulatory data collection, while simultaneously exploiting the correlations between articulation kinematics and acoustic descriptors to improve the accuracy of emotion recognition systems.
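The idea of using articulatory data only at training time can be sketched as a multi-task objective: a shared representation feeds both an emotion classifier and an auxiliary articulatory regressor, so the articulatory loss shapes the representation during training but is discarded at test time. The sketch below is a minimal linear illustration on synthetic data, not the thesis's actual model; all shapes, losses, and hyperparameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in data (shapes and values are assumptions): acoustic
# descriptors X, emotion labels y, and articulatory trajectories A
# that are available only during training.
n, d_ac, d_art, n_cls, h = 200, 12, 4, 3, 8
X = rng.normal(size=(n, d_ac))
y = np.argmax(X @ rng.normal(size=(d_ac, n_cls)), axis=1)
A = X @ rng.normal(size=(d_ac, d_art))          # correlated with acoustics
Y = np.eye(n_cls)[y]                            # one-hot labels

# Shared linear representation with two heads: an emotion classifier
# and an auxiliary articulatory regressor that constrains it.
W_sh = 0.1 * rng.normal(size=(d_ac, h))
W_cls = 0.1 * rng.normal(size=(h, n_cls))
W_art = 0.1 * rng.normal(size=(h, d_art))
lam, lr = 0.1, 0.3

for _ in range(2000):
    Z = X @ W_sh
    logits = Z @ W_cls
    P = np.exp(logits - logits.max(axis=1, keepdims=True))
    P /= P.sum(axis=1, keepdims=True)           # softmax probabilities
    R = Z @ W_art - A                           # articulatory residual
    dlogits = (P - Y) / n                       # cross-entropy gradient
    dR = 2 * lam * R / n                        # weighted squared-error grad
    dZ = dlogits @ W_cls.T + dR @ W_art.T       # both losses shape Z
    W_cls -= lr * Z.T @ dlogits
    W_art -= lr * Z.T @ dR
    W_sh -= lr * X.T @ dZ

# At test time, only acoustic features and the classifier path are used;
# the articulatory head is dropped.
acc = float((np.argmax(X @ W_sh @ W_cls, axis=1) == y).mean())
print(round(acc, 2))
```

The articulatory head enters only through the gradient on the shared weights `W_sh`, which is the sense in which training-time-only side information can regularize a classifier deployed on acoustics alone.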
Identifying context from ambient sounds in a lifelogging scenario requires feature extraction, segmentation and annotation techniques capable of efficiently handling long duration audio recordings; a complete framework for such applications is presented. The performance is evaluated on real world data and accompanied by a prototypical Android-based user interface.
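A minimal sketch of the segmentation stage for long recordings, assuming a simple frame-energy criterion (the framework's actual features and segmentation method are not detailed in this abstract): frame the signal, threshold per-frame log energy, and merge adjacent active frames into time-stamped segments.

```python
import numpy as np

def energy_segments(signal, sr, frame_ms=25, hop_ms=10, threshold_db=-30.0):
    """Split a long recording into contiguous high-energy regions.

    Returns a list of (start_sec, end_sec) segments whose frame-level
    log energy exceeds threshold_db.
    """
    frame = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    n_frames = 1 + max(0, (len(signal) - frame) // hop)
    energy_db = np.array([
        10 * np.log10(np.mean(signal[i * hop:i * hop + frame] ** 2) + 1e-12)
        for i in range(n_frames)])
    active = energy_db > threshold_db
    segments, start = [], None
    for i, a in enumerate(active):
        if a and start is None:
            start = i                                  # segment opens
        elif not a and start is not None:
            segments.append((start * hop / sr, (i * hop + frame) / sr))
            start = None                               # segment closes
    if start is not None:                              # open at end of file
        segments.append((start * hop / sr, (n_frames * hop + frame) / sr))
    return segments

# Synthetic example: 1 s of near-silence, 1 s of tone, 1 s of near-silence.
rng = np.random.default_rng(0)
sr = 16000
t = np.arange(sr) / sr
sig = np.concatenate([0.001 * rng.normal(size=sr),
                      0.5 * np.sin(2 * np.pi * 440 * t),
                      0.001 * rng.normal(size=sr)])
segs = energy_segments(sig, sr)
print(segs)
```

Because the pass over the audio is a single streaming loop over frames, this style of segmentation handles long-duration lifelogging recordings without holding per-frame features for the whole file in memory.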
The proposed methods are also assessed in terms of computation and implementation complexity. Software and field programmable gate array based implementations are considered for emotion recognition, while virtual platforms are used to model the complexities of lifelogging. The derived metrics are used to determine the feasibility of these methods for applications requiring real-time capabilities and low power consumption.
Fisheye cameras capture a much wider field of view than conventional cameras. The large field of view comes at the price of non-linear distortions introduced near the boundaries of the images captured by such cameras. Despite this drawback, they are used increasingly in computer vision, robotics, reconnaissance, astrophotography, surveillance, and automotive applications. The images captured by such cameras can be corrected for distortion if the cameras are calibrated and the distortion function is determined. Calibration also allows fisheye cameras to be used in tasks involving metric scene measurement, metric scene reconstruction, and other simultaneous localization and mapping (SLAM) algorithms.
This thesis presents a calibration toolbox (FisheyeCDC Toolbox) that implements a collection of the most widely used techniques for fisheye camera calibration in one package. This enables an inexperienced user to calibrate their own camera without a theoretical understanding of computer vision and camera calibration. The thesis also explores applications of calibration such as distortion correction and 3D reconstruction.
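To illustrate the kind of distortion function such a calibration estimates, the sketch below uses the equidistant polynomial fisheye model (the model used by, for example, OpenCV's fisheye module; the toolbox itself bundles several techniques, not necessarily this one). The coefficients are hypothetical; a real calibration estimates them from checkerboard images.

```python
import numpy as np

# Equidistant polynomial fisheye model: a ray at angle theta from the
# optical axis lands at image radius r = f * theta_d, where
#   theta_d = theta * (1 + k1*t^2 + k2*t^4 + k3*t^6 + k4*t^8), t = theta.

def distort_theta(theta, k):
    """Forward model: undistorted incidence angle -> distorted angle."""
    t2 = theta ** 2
    return theta * (1 + k[0] * t2 + k[1] * t2**2 + k[2] * t2**3 + k[3] * t2**4)

def undistort_theta(theta_d, k, iters=20):
    """Invert the polynomial by fixed-point iteration, as distortion-
    correction code commonly does: start from theta_d and refine."""
    theta = theta_d.copy()
    for _ in range(iters):
        t2 = theta ** 2
        theta = theta_d / (1 + k[0] * t2 + k[1] * t2**2
                           + k[2] * t2**3 + k[3] * t2**4)
    return theta

k = np.array([0.05, -0.01, 0.002, -0.0004])   # assumed coefficients
theta = np.linspace(0.0, 1.2, 7)              # incidence angles (radians)
theta_rec = undistort_theta(distort_theta(theta, k), k)
err = float(np.max(np.abs(theta_rec - theta)))
print(err < 1e-6)
```

Distortion correction then amounts to applying this inverse per pixel: map each output pixel to an angle, invert the polynomial, and sample the fisheye image at the corresponding distorted radius.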
We approach the problem by building a hardware prototype and characterizing the end-to-end system bottlenecks of power and performance. The prototype has 6 IMX274 cameras and uses an Nvidia Jetson TX2 development board for capture and computation. We found that capture is bottlenecked by sensor power and data rates across interfaces, whereas compute is limited by the total number of computations per frame. Our characterization shows that redundant capture and redundant computation lead to high power, a huge memory footprint, and high latency. Existing systems lack hardware-software co-design, leading to excessive data transfers across interfaces and expensive computations within individual subsystems. Finally, we propose mechanisms to optimize the system for low power and low latency. We emphasize the importance of co-designing the different subsystems to reduce and reuse data. For example, reusing the motion vectors of the ISP stage reduces the memory footprint of the stereo correspondence stage. Our estimates show that pipelining and parallelization on a custom FPGA can achieve real-time stitching.
Localizing a non-line-of-sight (NLOS) object hidden from both the camera and the illumination source is a challenging task with vital applications including surveillance and robotics. Recent NLOS reconstruction advances have been achieved using time-resolved measurements, but acquiring these measurements requires expensive and specialized detectors and laser sources. This work proposes a data-driven approach for NLOS 3D localization that requires only a conventional camera and projector. The localization is posed both as a voxel classification problem and as a regression problem. An accuracy of greater than 90% is achieved in localizing an NLOS object to a 5 cm × 5 cm × 5 cm voxel on real data. With the regression approach, an object of width 10 cm is localized to within approximately 1.5 cm. To generalize to line-of-sight (LOS) scenes with non-planar surfaces, an adaptive lighting algorithm is adopted. This algorithm, based on radiosity, identifies and illuminates the LOS scene patches that contribute most to the NLOS light paths, and can factor in system power constraints. Improvements ranging from 6% to 15% in accuracy with a non-planar LOS wall using adaptive lighting are reported, demonstrating the advantage of combining the physics of light transport with active illumination for data-driven NLOS imaging.
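The voxelized classification formulation can be illustrated with a small sketch: the hidden volume is discretized into 5 cm voxels, localizing to a voxel becomes predicting a class index, and the voxel centre is the reported position (the regression variant predicts coordinates directly). The grid extents and example point below are assumptions for illustration.

```python
import numpy as np

VOXEL = 0.05                      # 5 cm voxel edge (metres)
ORIGIN = np.zeros(3)              # corner of the hidden volume
DIMS = (8, 8, 8)                  # assumed 40 cm cube of hidden space

def point_to_class(p):
    """3D position in the hidden volume -> flat voxel class label."""
    idx = np.floor((np.asarray(p) - ORIGIN) / VOXEL).astype(int)
    idx = np.clip(idx, 0, np.array(DIMS) - 1)   # clamp to the grid
    return int(np.ravel_multi_index(tuple(idx), DIMS))

def class_to_center(c):
    """Flat voxel class label -> voxel centre (the reported location)."""
    idx = np.array(np.unravel_index(c, DIMS))
    return ORIGIN + (idx + 0.5) * VOXEL

p = np.array([0.12, 0.31, 0.07])  # example hidden-object position
c = point_to_class(p)
center = class_to_center(c)
print(c, bool(np.max(np.abs(center - p)) <= VOXEL / 2))
```

Quantizing this way bounds the worst-case localization error of a correct classification to half a voxel edge per axis, which is why a classifier correct at the 5 cm voxel level can be refined by a regression head when finer accuracy is needed.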