Search Content

Augmented image classification using image registration techniques

Description

Advancements in computer vision and machine learning have added a new dimension to remote sensing applications with the aid of imagery analysis techniques. Applications such as autonomous navigation and terrain classification which make use of image classification techniques are challenging problems and research is still being carried out to find…

Advancements in computer vision and machine learning have added a new dimension to remote sensing applications with the aid of imagery analysis techniques. Applications such as autonomous navigation and terrain classification which make use of image classification techniques are challenging problems and research is still being carried out to find better solutions. In this thesis, a novel method is proposed which uses image registration techniques to provide better image classification. This method reduces the error rate of classification by performing image registration of the images with the previously obtained images before performing classification. The motivation behind this is the fact that images that are obtained in the same region which need to be classified will not differ significantly in characteristics. Hence, registration will provide an image that matches closer to the previously obtained image, thus providing better classification. To illustrate that the proposed method works, naïve Bayes and iterative closest point (ICP) algorithms are used for the image classification and registration stages respectively. This implementation was tested extensively in simulation using synthetic images and using a real life data set called the Defense Advanced Research Project Agency (DARPA) Learning Applied to Ground Robots (LAGR) dataset. The results show that the ICP algorithm does help in better classification with Naïve Bayes by reducing the error rate by an average of about 10% in the synthetic data and by about 7% on the actual datasets used.

ContributorsMuralidhar, Ashwini (Author) / Saripalli, Srikanth (Thesis advisor) / Papandreou-Suppappola, Antonia (Committee member) / Turaga, Pavan (Committee member) / Arizona State University (Publisher)

Created2011

Designing m-health modules with sensor interfaces for DSP education

Description

Advancements in mobile technologies have significantly enhanced the capabilities of mobile devices to serve as powerful platforms for sensing, processing, and visualization. Surges in the sensing technology and the abundance of data have enabled the use of these portable devices for real-time data analysis and decision-making in digital signal processing…

Advancements in mobile technologies have significantly enhanced the capabilities of mobile devices to serve as powerful platforms for sensing, processing, and visualization. Surges in the sensing technology and the abundance of data have enabled the use of these portable devices for real-time data analysis and decision-making in digital signal processing (DSP) applications. Most of the current efforts in DSP education focus on building tools to facilitate understanding of the mathematical principles. However, there is a disconnect between real-world data processing problems and the material presented in a DSP course. Sophisticated mobile interfaces and apps can potentially play a crucial role in providing a hands-on-experience with modern DSP applications to students. In this work, a new paradigm of DSP learning is explored by building an interactive easy-to-use health monitoring application for use in DSP courses. This is motivated by the increasing commercial interest in employing mobile phones for real-time health monitoring tasks. The idea is to exploit the computational abilities of the Android platform to build m-Health modules with sensor interfaces. In particular, appropriate sensing modalities have been identified, and a suite of software functionalities have been developed. Within the existing framework of the AJDSP app, a graphical programming environment, interfaces to on-board and external sensor hardware have also been developed to acquire and process physiological data. The set of sensor signals that can be monitored include electrocardiogram (ECG), photoplethysmogram (PPG), accelerometer signal, and galvanic skin response (GSR). The proposed m-Health modules can be used to estimate parameters such as heart rate, oxygen saturation, step count, and heart rate variability. A set of laboratory exercises have been designed to demonstrate the use of these modules in DSP courses. The app was evaluated through several workshops involving graduate and undergraduate students in signal processing majors at Arizona State University. The usefulness of the software modules in enhancing student understanding of signals, sensors and DSP systems were analyzed. Student opinions about the app and the proposed m-health modules evidenced the merits of integrating tools for mobile sensing and processing in a DSP curriculum, and familiarizing students with challenges in modern data-driven applications.

ContributorsRajan, Deepta (Author) / Spanias, Andreas (Thesis advisor) / Frakes, David (Committee member) / Turaga, Pavan (Committee member) / Arizona State University (Publisher)

Created2013

Re-sonification of objects, events, and environments

Description

Digital sound synthesis allows the creation of a great variety of sounds. Focusing on interesting or ecologically valid sounds for music, simulation, aesthetics, or other purposes limits the otherwise vast digital audio palette. Tools for creating such sounds vary from arbitrary methods of altering recordings to precise simulations of vibrating…

Digital sound synthesis allows the creation of a great variety of sounds. Focusing on interesting or ecologically valid sounds for music, simulation, aesthetics, or other purposes limits the otherwise vast digital audio palette. Tools for creating such sounds vary from arbitrary methods of altering recordings to precise simulations of vibrating objects. In this work, methods of sound synthesis by re-sonification are considered. Re-sonification, herein, refers to the general process of analyzing, possibly transforming, and resynthesizing or reusing recorded sounds in meaningful ways, to convey information. Applied to soundscapes, re-sonification is presented as a means of conveying activity within an environment. Applied to the sounds of objects, this work examines modeling the perception of objects as well as their physical properties and the ability to simulate interactive events with such objects. To create soundscapes to re-sonify geographic environments, a method of automated soundscape design is presented. Using recorded sounds that are classified based on acoustic, social, semantic, and geographic information, this method produces stochastically generated soundscapes to re-sonify selected geographic areas. Drawing on prior knowledge, local sounds and those deemed similar comprise a locale's soundscape. In the context of re-sonifying events, this work examines processes for modeling and estimating the excitations of sounding objects. These include plucking, striking, rubbing, and any interaction that imparts energy into a system, affecting the resultant sound. A method of estimating a linear system's input, constrained to a signal-subspace, is presented and applied toward improving the estimation of percussive excitations for re-sonification. To work toward robust recording-based modeling and re-sonification of objects, new implementations of banded waveguide (BWG) models are proposed for object modeling and sound synthesis. Previous implementations of BWGs use arbitrary model parameters and may produce a range of simulations that do not match digital waveguide or modal models of the same design. Subject to linear excitations, some models proposed here behave identically to other equivalently designed physical models. Under nonlinear interactions, such as bowing, many of the proposed implementations exhibit improvements in the attack characteristics of synthesized sounds.

ContributorsFink, Alex M (Author) / Spanias, Andreas S (Thesis advisor) / Cook, Perry R. (Committee member) / Turaga, Pavan (Committee member) / Tsakalis, Konstantinos (Committee member) / Arizona State University (Publisher)

Created2013

Audio processing and loudness estimation algorithms with iOS simulations

Description

The processing power and storage capacity of portable devices have improved considerably over the past decade. This has motivated the implementation of sophisticated audio and other signal processing algorithms on such mobile devices. Of particular interest in this thesis is audio/speech processing based on perceptual criteria. Specifically, estimation of parameters…

The processing power and storage capacity of portable devices have improved considerably over the past decade. This has motivated the implementation of sophisticated audio and other signal processing algorithms on such mobile devices. Of particular interest in this thesis is audio/speech processing based on perceptual criteria. Specifically, estimation of parameters from human auditory models, such as auditory patterns and loudness, involves computationally intensive operations which can strain device resources. Hence, strategies for implementing computationally efficient human auditory models for loudness estimation have been studied in this thesis. Existing algorithms for reducing computations in auditory pattern and loudness estimation have been examined and improved algorithms have been proposed to overcome limitations of these methods. In addition, real-time applications such as perceptual loudness estimation and loudness equalization using auditory models have also been implemented. A software implementation of loudness estimation on iOS devices is also reported in this thesis. In addition to the loudness estimation algorithms and software, in this thesis project we also created new illustrations of speech and audio processing concepts for research and education. As a result, a new suite of speech/audio DSP functions was developed and integrated as part of the award-winning educational iOS App 'iJDSP." These functions are described in detail in this thesis. Several enhancements in the architecture of the application have also been introduced for providing the supporting framework for speech/audio processing. Frame-by-frame processing and visualization functionalities have been developed to facilitate speech/audio processing. In addition, facilities for easy sound recording, processing and audio rendering have also been developed to provide students, practitioners and researchers with an enriched DSP simulation tool. Simulations and assessments have been also developed for use in classes and training of practitioners and students.

ContributorsKalyanasundaram, Girish (Author) / Spanias, Andreas S (Thesis advisor) / Tepedelenlioğlu, Cihan (Committee member) / Berisha, Visar (Committee member) / Arizona State University (Publisher)

Created2013

New directions in sparse models for image analysis and restoration

Description

Effective modeling of high dimensional data is crucial in information processing and machine learning. Classical subspace methods have been very effective in such applications. However, over the past few decades, there has been considerable research towards the development of new modeling paradigms that go beyond subspace methods. This dissertation focuses…

Effective modeling of high dimensional data is crucial in information processing and machine learning. Classical subspace methods have been very effective in such applications. However, over the past few decades, there has been considerable research towards the development of new modeling paradigms that go beyond subspace methods. This dissertation focuses on the study of sparse models and their interplay with modern machine learning techniques such as manifold, ensemble and graph-based methods, along with their applications in image analysis and recovery. By considering graph relations between data samples while learning sparse models, graph-embedded codes can be obtained for use in unsupervised, supervised and semi-supervised problems. Using experiments on standard datasets, it is demonstrated that the codes obtained from the proposed methods outperform several baseline algorithms. In order to facilitate sparse learning with large scale data, the paradigm of ensemble sparse coding is proposed, and different strategies for constructing weak base models are developed. Experiments with image recovery and clustering demonstrate that these ensemble models perform better when compared to conventional sparse coding frameworks. When examples from the data manifold are available, manifold constraints can be incorporated with sparse models and two approaches are proposed to combine sparse coding with manifold projection. The improved performance of the proposed techniques in comparison to sparse coding approaches is demonstrated using several image recovery experiments. In addition to these approaches, it might be required in some applications to combine multiple sparse models with different regularizations. In particular, combining an unconstrained sparse model with non-negative sparse coding is important in image analysis, and it poses several algorithmic and theoretical challenges. A convex and an efficient greedy algorithm for recovering combined representations are proposed. Theoretical guarantees on sparsity thresholds for exact recovery using these algorithms are derived and recovery performance is also demonstrated using simulations on synthetic data. Finally, the problem of non-linear compressive sensing, where the measurement process is carried out in feature space obtained using non-linear transformations, is considered. An optimized non-linear measurement system is proposed, and improvements in recovery performance are demonstrated in comparison to using random measurements as well as optimized linear measurements.

ContributorsNatesan Ramamurthy, Karthikeyan (Author) / Spanias, Andreas (Thesis advisor) / Tsakalis, Konstantinos (Committee member) / Karam, Lina (Committee member) / Turaga, Pavan (Committee member) / Arizona State University (Publisher)

Created2013

Sparse methods in image understanding and computer vision

Description

Image understanding has been playing an increasingly crucial role in vision applications. Sparse models form an important component in image understanding, since the statistics of natural images reveal the presence of sparse structure. Sparse methods lead to parsimonious models, in addition to being efficient for large scale learning. In sparse…

Image understanding has been playing an increasingly crucial role in vision applications. Sparse models form an important component in image understanding, since the statistics of natural images reveal the presence of sparse structure. Sparse methods lead to parsimonious models, in addition to being efficient for large scale learning. In sparse modeling, data is represented as a sparse linear combination of atoms from a "dictionary" matrix. This dissertation focuses on understanding different aspects of sparse learning, thereby enhancing the use of sparse methods by incorporating tools from machine learning. With the growing need to adapt models for large scale data, it is important to design dictionaries that can model the entire data space and not just the samples considered. By exploiting the relation of dictionary learning to 1-D subspace clustering, a multilevel dictionary learning algorithm is developed, and it is shown to outperform conventional sparse models in compressed recovery, and image denoising. Theoretical aspects of learning such as algorithmic stability and generalization are considered, and ensemble learning is incorporated for effective large scale learning. In addition to building strategies for efficiently implementing 1-D subspace clustering, a discriminative clustering approach is designed to estimate the unknown mixing process in blind source separation. By exploiting the non-linear relation between the image descriptors, and allowing the use of multiple features, sparse methods can be made more effective in recognition problems. The idea of multiple kernel sparse representations is developed, and algorithms for learning dictionaries in the feature space are presented. Using object recognition experiments on standard datasets it is shown that the proposed approaches outperform other sparse coding-based recognition frameworks. Furthermore, a segmentation technique based on multiple kernel sparse representations is developed, and successfully applied for automated brain tumor identification. Using sparse codes to define the relation between data samples can lead to a more robust graph embedding for unsupervised clustering. By performing discriminative embedding using sparse coding-based graphs, an algorithm for measuring the glomerular number in kidney MRI images is developed. Finally, approaches to build dictionaries for local sparse coding of image descriptors are presented, and applied to object recognition and image retrieval.

ContributorsJayaraman Thiagarajan, Jayaraman (Author) / Spanias, Andreas (Thesis advisor) / Frakes, David (Committee member) / Tepedelenlioğlu, Cihan (Committee member) / Turaga, Pavan (Committee member) / Arizona State University (Publisher)

Created2013

Kinematic analysis and quantitative evaluation for reach movements in stroke rehabilitation

Description

In this thesis, quantitative evaluation of quality of movement during stroke rehabilitation will be discussed. Previous research on stroke rehabilitation in hospital has been shown to be effective. In this thesis, we study various issues that arise when creating a home-based system that can be deployed in a patient's home.…

In this thesis, quantitative evaluation of quality of movement during stroke rehabilitation will be discussed. Previous research on stroke rehabilitation in hospital has been shown to be effective. In this thesis, we study various issues that arise when creating a home-based system that can be deployed in a patient's home. Limitation of motion capture due to reduced number of sensors leads to problems with design of kinematic features for quantitative evaluation. Also, the hierarchical three-level tasks of rehabilitation requires new design of kinematic features. In this thesis, the design of kinematic features for a home based stroke rehabilitation system will be presented. Results of the most challenging classifier are shown and proves the effectiveness of the design. Comparison between modern classification techniques and low computational cost threshold based classification with same features will also be shown.

ContributorsCheng, Long (Author) / Turaga, Pavan (Thesis advisor) / Arizona State University (Publisher)

Created2012

Upper body motion analysis using kinect for stroke rehabilitation at the home

Description

Motion capture using cost-effective sensing technology is challenging and the huge success of Microsoft Kinect has been attracting researchers to uncover the potential of using this technology into computer vision applications. In this thesis, an upper-body motion analysis in a home-based system for stroke rehabilitation using novel RGB-D camera -…

Motion capture using cost-effective sensing technology is challenging and the huge success of Microsoft Kinect has been attracting researchers to uncover the potential of using this technology into computer vision applications. In this thesis, an upper-body motion analysis in a home-based system for stroke rehabilitation using novel RGB-D camera - Kinect is presented. We address this problem by ﬁrst conducting a systematic analysis of the usability of Kinect for motion analysis in stroke rehabilitation. Then a hybrid upper body tracking approach is proposed which combines off-the-shelf skeleton tracking with a novel depth-fused mean shift tracking method. We proposed several kinematic features reliably extracted from the proposed inexpensive and portable motion capture system and classiﬁers that correlate torso movement to clinical measures of unimpaired and impaired. Experiment results show that the proposed sensing and analysis works reliably on measuring torso movement quality and is promising for end-point tracking. The system is currently being deployed for large-scale evaluations.

ContributorsDu, Tingfang (Author) / Turaga, Pavan (Thesis advisor) / Spanias, Andreas (Committee member) / Rikakis, Thanassis (Committee member) / Arizona State University (Publisher)

Created2012

Exploring video denoising using matrix completion

Description

Video denoising has been an important task in many multimedia and computer vision applications. Recent developments in the matrix completion theory and emergence of new numerical methods which can efficiently solve the matrix completion problem have paved the way for exploration of new techniques for some classical image processing tasks.…

Video denoising has been an important task in many multimedia and computer vision applications. Recent developments in the matrix completion theory and emergence of new numerical methods which can efficiently solve the matrix completion problem have paved the way for exploration of new techniques for some classical image processing tasks. Recent literature shows that many computer vision and image processing problems can be solved by using the matrix completion theory. This thesis explores the application of matrix completion in video denoising. A state-of-the-art video denoising algorithm in which the denoising task is modeled as a matrix completion problem is chosen for detailed study. The contribution of this thesis lies in both providing extensive analysis to bridge the gap in existing literature on matrix completion frame work for video denoising and also in proposing some novel techniques to improve the performance of the chosen denoising algorithm. The chosen algorithm is implemented for thorough analysis. Experiments and discussions are presented to enable better understanding of the problem. Instability shown by the algorithm at some parameter values in a particular case of low levels of pure Gaussian noise is identified. Artifacts introduced in such cases are analyzed. A novel way of grouping structurally-relevant patches is proposed to improve the algorithm. Experiments show that this technique is useful, especially in videos containing high amounts of motion. Based on the observation that matrix completion is not suitable for denoising patches containing relatively low amount of image details, a framework is designed to separate patches corresponding to low structured regions from a noisy image. Experiments are conducted by not subjecting such patches to matrix completion, instead denoising such patches in a different way. The resulting improvement in performance suggests that denoising low structured patches does not require a complex method like matrix completion and in fact it is counter-productive to subject such patches to matrix completion. These results also indicate the inherent limitation of matrix completion to deal with cases in which noise dominates the structural properties of an image. A novel method for introducing priorities to the ranked patches in matrix completion is also presented. Results showed that this method yields improved performance in general. It is observed that the artifacts in presence of low levels of pure Gaussian noise appear differently after introducing priorities to the patches and the artifacts occur at a wider range of parameter values. Results and discussion suggesting future ways to explore this problem are also presented.

ContributorsMaguluri, Hima Bindu (Author) / Li, Baoxin (Thesis advisor) / Turaga, Pavan (Committee member) / Claveau, Claude (Committee member) / Arizona State University (Publisher)

Created2013

Efficient perceptual super-resolution

Description

Super-Resolution (SR) techniques are widely developed to increase image resolution by fusing several Low-Resolution (LR) images of the same scene to overcome sensor hardware limitations and reduce media impairments in a cost-effective manner. When choosing a solution for the SR problem, there is always a trade-off between computational efficiency and…

Super-Resolution (SR) techniques are widely developed to increase image resolution by fusing several Low-Resolution (LR) images of the same scene to overcome sensor hardware limitations and reduce media impairments in a cost-effective manner. When choosing a solution for the SR problem, there is always a trade-off between computational efficiency and High-Resolution (HR) image quality. Existing SR approaches suffer from extremely high computational requirements due to the high number of unknowns to be estimated in the solution of the SR inverse problem. This thesis proposes efficient iterative SR techniques based on Visual Attention (VA) and perceptual modeling of the human visual system. In the first part of this thesis, an efficient ATtentive-SELective Perceptual-based (AT-SELP) SR framework is presented, where only a subset of perceptually significant active pixels is selected for processing by the SR algorithm based on a local contrast sensitivity threshold model and a proposed low complexity saliency detector. The proposed saliency detector utilizes a probability of detection rule inspired by concepts of luminance masking and visual attention. The second part of this thesis further enhances on the efficiency of selective SR approaches by presenting an ATtentive (AT) SR framework that is completely driven by VA region detectors. Additionally, different VA techniques that combine several low-level features, such as center-surround differences in intensity and orientation, patch luminance and contrast, bandpass outputs of patch luminance and contrast, and difference of Gaussians of luminance intensity are integrated and analyzed to illustrate the effectiveness of the proposed selective SR frameworks. The proposed AT-SELP SR and AT-SR frameworks proved to be flexible by integrating a Maximum A Posteriori (MAP)-based SR algorithm as well as a fast two-stage Fusion-Restoration (FR) SR estimator. By adopting the proposed selective SR frameworks, simulation results show significant reduction on average in computational complexity with comparable visual quality in terms of quantitative metrics such as PSNR, SNR or MAE gains, and subjective assessment. The third part of this thesis proposes a Perceptually Weighted (WP) SR technique that incorporates unequal weighting parameters in the cost function of iterative SR problems. The proposed approach is inspired by the unequal processing of the Human Visual System (HVS) to different local image features in an image. Simulation results show an enhanced reconstruction quality and faster convergence rates when applied to the MAP-based and FR-based SR schemes.

ContributorsSadaka, Nabil (Author) / Karam, Lina J (Thesis advisor) / Spanias, Andreas S (Committee member) / Papandreou-Suppappola, Antonia (Committee member) / Abousleman, Glen P (Committee member) / Goryll, Michael (Committee member) / Arizona State University (Publisher)

Created2011

Filtering by