Matching Items (3)

Filtering by

All Subjects: Electrical Engineering
Genre: Doctoral Dissertation

Re-sonification of objects, events, and environments

Description

Digital sound synthesis allows the creation of a great variety of sounds. Focusing on interesting or ecologically valid sounds for music, simulation, aesthetics, or other purposes limits the otherwise vast digital audio palette. Tools for creating such sounds vary from arbitrary methods of altering recordings to precise simulations of vibrating objects. In this work, methods of sound synthesis by re-sonification are considered. Re-sonification, herein, refers to the general process of analyzing, possibly transforming, and resynthesizing or reusing recorded sounds in meaningful ways, to convey information. Applied to soundscapes, re-sonification is presented as a means of conveying activity within an environment. Applied to the sounds of objects, this work examines modeling the perception of objects as well as their physical properties and the ability to simulate interactive events with such objects.

To create soundscapes to re-sonify geographic environments, a method of automated soundscape design is presented. Using recorded sounds that are classified based on acoustic, social, semantic, and geographic information, this method produces stochastically generated soundscapes to re-sonify selected geographic areas. Drawing on prior knowledge, local sounds and those deemed similar comprise a locale's soundscape.

In the context of re-sonifying events, this work examines processes for modeling and estimating the excitations of sounding objects. These include plucking, striking, rubbing, and any interaction that imparts energy into a system, affecting the resultant sound. A method of estimating a linear system's input, constrained to a signal-subspace, is presented and applied toward improving the estimation of percussive excitations for re-sonification.

To work toward robust recording-based modeling and re-sonification of objects, new implementations of banded waveguide (BWG) models are proposed for object modeling and sound synthesis. Previous implementations of BWGs use arbitrary model parameters and may produce a range of simulations that do not match digital waveguide or modal models of the same design. Subject to linear excitations, some models proposed here behave identically to other equivalently designed physical models. Under nonlinear interactions, such as bowing, many of the proposed implementations exhibit improvements in the attack characteristics of synthesized sounds.
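The abstract compares banded waveguide models against modal models of the same design. As a rough illustration of what a modal model computes, the sketch below synthesizes the impulse response of an object as a sum of exponentially decaying sinusoids. It is not the dissertation's BWG implementation; the function name, the mode frequencies, decay rates, and amplitudes are all hypothetical values chosen for illustration.

```python
import math


def modal_impulse_response(modes, sample_rate=44100, duration=0.5):
    """Sum of exponentially decaying sinusoids, one per mode.

    modes: list of (frequency_hz, decay_per_second, amplitude) tuples.
    Returns a list of float samples.
    """
    n = int(sample_rate * duration)
    out = [0.0] * n
    for freq, decay, amp in modes:
        for i in range(n):
            t = i / sample_rate
            out[i] += amp * math.exp(-decay * t) * math.sin(2.0 * math.pi * freq * t)
    return out


# Hypothetical mode set (assumption, not taken from the dissertation).
modes = [(220.0, 6.0, 1.0), (607.0, 9.0, 0.5), (1190.0, 14.0, 0.25)]
samples = modal_impulse_response(modes)
```

Under a linear excitation, the output of such a model is the excitation convolved with this impulse response, which is the sense in which two equivalently designed linear models can be compared sample by sample.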

Contributors: Fink, Alex M (Author) / Spanias, Andreas S (Thesis advisor) / Cook, Perry R. (Committee member) / Turaga, Pavan (Committee member) / Tsakalis, Konstantinos (Committee member) / Arizona State University (Publisher)

Created: 2013

Efficient perceptual super-resolution

Description

Super-Resolution (SR) techniques have been widely developed to increase image resolution by fusing several Low-Resolution (LR) images of the same scene, overcoming sensor hardware limitations and reducing media impairments in a cost-effective manner. When choosing a solution for the SR problem, there is always a trade-off between computational efficiency and High-Resolution (HR) image quality. Existing SR approaches suffer from extremely high computational requirements due to the large number of unknowns to be estimated in the solution of the SR inverse problem. This thesis proposes efficient iterative SR techniques based on Visual Attention (VA) and perceptual modeling of the human visual system.

In the first part of this thesis, an efficient ATtentive-SELective Perceptual-based (AT-SELP) SR framework is presented, where only a subset of perceptually significant active pixels is selected for processing by the SR algorithm, based on a local contrast sensitivity threshold model and a proposed low-complexity saliency detector. The proposed saliency detector utilizes a probability of detection rule inspired by concepts of luminance masking and visual attention.

The second part of this thesis further enhances the efficiency of selective SR approaches by presenting an ATtentive (AT) SR framework that is completely driven by VA region detectors. Additionally, different VA techniques that combine several low-level features, such as center-surround differences in intensity and orientation, patch luminance and contrast, bandpass outputs of patch luminance and contrast, and difference of Gaussians of luminance intensity, are integrated and analyzed to illustrate the effectiveness of the proposed selective SR frameworks. The proposed AT-SELP SR and AT-SR frameworks are shown to be flexible by integrating a Maximum A Posteriori (MAP)-based SR algorithm as well as a fast two-stage Fusion-Restoration (FR) SR estimator. With the proposed selective SR frameworks, simulation results show a significant average reduction in computational complexity with comparable visual quality, in terms of quantitative metrics such as PSNR, SNR or MAE gains, and subjective assessment.

The third part of this thesis proposes a Perceptually Weighted (WP) SR technique that incorporates unequal weighting parameters in the cost function of iterative SR problems. The proposed approach is inspired by the unequal processing that the Human Visual System (HVS) applies to different local features in an image. Simulation results show enhanced reconstruction quality and faster convergence rates when the technique is applied to the MAP-based and FR-based SR schemes.
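For readers unfamiliar with the quantitative metrics named above, the following is a minimal sketch of how PSNR and MAE are commonly computed between a reference image and a reconstruction. The function names, the 8-bit peak value of 255, and the tiny 2x2 example images are assumptions made for illustration; the thesis does not specify its exact evaluation code.

```python
import math


def mae(ref, test):
    """Mean absolute error between two equally sized images (lists of rows)."""
    diffs = [abs(r - t)
             for row_r, row_t in zip(ref, test)
             for r, t in zip(row_r, row_t)]
    return sum(diffs) / len(diffs)


def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    sq = [(r - t) ** 2
          for row_r, row_t in zip(ref, test)
          for r, t in zip(row_r, row_t)]
    mse = sum(sq) / len(sq)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical 2x2 grayscale images (assumption, for illustration only).
reference = [[100, 110], [120, 130]]
estimate = [[101, 108], [120, 133]]

mae_value = mae(reference, estimate)    # (1 + 2 + 0 + 3) / 4 = 1.5
psnr_value = psnr(reference, estimate)  # MSE = 3.5, roughly 42.7 dB
```

A higher PSNR and a lower MAE both indicate a reconstruction closer to the reference, which is why the abstract reports them as gains alongside subjective assessment.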

Contributors: Sadaka, Nabil (Author) / Karam, Lina J (Thesis advisor) / Spanias, Andreas S (Committee member) / Papandreou-Suppappola, Antonia (Committee member) / Abousleman, Glen P (Committee member) / Goryll, Michael (Committee member) / Arizona State University (Publisher)

Created: 2011

Image reconstruction, classification, and tracking for compressed sensing imaging and video

Description

Compressed sensing (CS) is a novel approach to collecting and analyzing data of all types. By exploiting prior knowledge of the compressibility of many naturally-occurring signals, specially designed sensors can dramatically undersample the data of interest and still achieve high performance. However, the generated data are pseudorandomly mixed and must be processed before use. In this work, a model of a single-pixel compressive video camera is used to explore the problems of performing inference based on these undersampled measurements. Three broad types of inference from CS measurements are considered: recovery of video frames, target tracking, and object classification/detection. Potential applications include automated surveillance, autonomous navigation, and medical imaging and diagnosis.
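The measurement process of a single-pixel compressive camera can be sketched as a set of inner products between the scene and pseudorandom patterns, with far fewer measurements than pixels. The sketch below is an assumption-laden toy model, not the camera model used in the dissertation: the function names are hypothetical, and the patterns are drawn as ±1 values for simplicity (physical micromirror patterns are typically binary on/off).

```python
import random


def make_patterns(num_measurements, num_pixels, seed=0):
    """Pseudorandom +/-1 measurement patterns, one row per measurement."""
    rng = random.Random(seed)
    return [[rng.choice((-1.0, 1.0)) for _ in range(num_pixels)]
            for _ in range(num_measurements)]


def measure(patterns, scene):
    """Each measurement is the inner product of one pattern with the scene."""
    return [sum(p * x for p, x in zip(row, scene)) for row in patterns]


# Hypothetical 64-pixel scene with only two nonzero pixels (a sparse scene).
scene = [0.0] * 64
scene[10] = 1.0
scene[40] = 0.5

patterns = make_patterns(16, 64)  # 16 measurements for 64 pixels (4x undersampling)
y = measure(patterns, scene)
```

Every entry of `y` mixes contributions from all pixels, which illustrates why the measurements are not directly viewable and must be processed before use.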

Recovery of CS video frames is far more complex than recovery of still images, which are known to be (approximately) sparse in a linear basis such as the discrete cosine transform. By combining sparsity of individual frames with an optical flow-based model of inter-frame dependence, the perceptual quality and peak signal-to-noise ratio (PSNR) of reconstructed frames are improved. The efficacy of this approach is demonstrated for the cases of a priori known image motion and unknown but constant image-wide motion.
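To make the notion of sparsity in a discrete cosine transform basis concrete, here is a small one-dimensional sketch. The signal is deliberately constructed from two cosine basis functions, so it is exactly sparse by design; natural images are only approximately sparse. The function name `dct2` and the signal parameters are assumptions for illustration.

```python
import math


def dct2(x):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


n = 32
# Hypothetical signal built from DCT basis functions 2 and 5.
signal = [3.0 * math.cos(math.pi * (i + 0.5) * 2 / n)
          + 1.0 * math.cos(math.pi * (i + 0.5) * 5 / n)
          for i in range(n)]

coeffs = dct2(signal)
# Indices of coefficients that are not numerically zero.
significant = [k for k, c in enumerate(coeffs) if abs(c) > 1e-9]
```

Here 32 samples are described by just two nonzero coefficients; this kind of compressibility is the prior knowledge that CS reconstruction exploits.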

Although video sequences can be reconstructed from CS measurements, the process is computationally costly. In autonomous systems, this reconstruction step is unnecessary if higher-level conclusions can be drawn directly from the CS data. A tracking algorithm is described and evaluated which can hold target vehicles at very high levels of compression where reconstruction of video frames fails. The algorithm performs tracking by detection using a particle filter with likelihood given by a maximum average correlation height (MACH) target template model.
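The predict, weight, resample loop of a particle filter can be sketched in one dimension as follows. This is a generic bootstrap particle filter and not the dissertation's tracker: a Gaussian likelihood on a noisy position measurement stands in for the MACH template likelihood, and the motion model, noise levels, and function name are all assumptions made for illustration.

```python
import math
import random


def particle_filter(observations, num_particles=500,
                    motion_std=1.0, obs_std=2.0, seed=0):
    """Bootstrap particle filter for a 1-D position; returns one estimate per observation."""
    rng = random.Random(seed)
    # Initialise particles uniformly over an assumed search range.
    particles = [rng.uniform(0.0, 100.0) for _ in range(num_particles)]
    estimates = []
    for z in observations:
        # Predict: random-walk motion model.
        particles = [p + rng.gauss(0.0, motion_std) for p in particles]
        # Weight: Gaussian likelihood of the observation given each particle
        # (stand-in for a template-based likelihood).
        weights = [math.exp(-0.5 * ((z - p) / obs_std) ** 2) for p in particles]
        total = sum(weights)
        if total == 0.0:
            weights = [1.0 / num_particles] * num_particles
        else:
            weights = [w / total for w in weights]
        # Estimate: weighted mean of the particles.
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        # Resample particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=num_particles)
    return estimates


# Hypothetical target moving at constant speed, observed with Gaussian noise.
truth = [20.0 + 0.5 * t for t in range(30)]
noise_rng = random.Random(1)
observations = [x + noise_rng.gauss(0.0, 2.0) for x in truth]
estimates = particle_filter(observations)
```

In a tracker that works directly on CS data, only the weighting step would change: the likelihood would be evaluated from the compressive measurements rather than from a reconstructed frame.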

Motivated by possible improvements over the MACH filter-based likelihood estimation of the tracking algorithm, the application of deep learning models to detection and classification of compressively sensed images is explored. In tests, a Deep Boltzmann Machine trained on CS measurements outperforms a naive reconstruct-first approach.

Taken together, progress in these three areas of CS inference has the potential to lower system cost and improve performance, opening up new applications of CS video cameras.

Contributors: Braun, Henry Carlton (Author) / Turaga, Pavan K (Thesis advisor) / Spanias, Andreas S (Thesis advisor) / Tepedelenlioğlu, Cihan (Committee member) / Berisha, Visar (Committee member) / Arizona State University (Publisher)

Created: 2016