Sparse methods in image understanding and computer vision

Description

Image understanding has been playing an increasingly crucial role in vision applications. Sparse models form an important component in image understanding, since the statistics of natural images reveal the presence of sparse structure. Sparse methods lead to parsimonious models, in addition to being efficient for large scale learning. In sparse modeling, data is represented as a sparse linear combination of atoms from a "dictionary" matrix. This dissertation focuses on understanding different aspects of sparse learning, thereby enhancing the use of sparse methods by incorporating tools from machine learning. With the growing need to adapt models for large scale data, it is important to design dictionaries that can model the entire data space and not just the samples considered. By exploiting the relation of dictionary learning to 1-D subspace clustering, a multilevel dictionary learning algorithm is developed, and it is shown to outperform conventional sparse models in compressed recovery, and image denoising. Theoretical aspects of learning such as algorithmic stability and generalization are considered, and ensemble learning is incorporated for effective large scale learning. In addition to building strategies for efficiently implementing 1-D subspace clustering, a discriminative clustering approach is designed to estimate the unknown mixing process in blind source separation. By exploiting the non-linear relation between the image descriptors, and allowing the use of multiple features, sparse methods can be made more effective in recognition problems. The idea of multiple kernel sparse representations is developed, and algorithms for learning dictionaries in the feature space are presented. Using object recognition experiments on standard datasets it is shown that the proposed approaches outperform other sparse coding-based recognition frameworks. 
Furthermore, a segmentation technique based on multiple kernel sparse representations is developed and successfully applied to automated brain tumor identification. Using sparse codes to define the relation between data samples can lead to a more robust graph embedding for unsupervised clustering. By performing discriminative embedding using sparse coding-based graphs, an algorithm for measuring the glomerular number in kidney MRI images is developed. Finally, approaches to build dictionaries for local sparse coding of image descriptors are presented and applied to object recognition and image retrieval.
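
The sparse model described above, in which a signal is approximated by a few atoms of a dictionary matrix, can be illustrated with a short sketch. This is not the dissertation's multilevel dictionary learning algorithm; it is a plain Orthogonal Matching Pursuit coder over a random dictionary, and the names `omp`, `D`, `x` and `k` as well as the synthetic data are assumptions made only for illustration.

```python
import numpy as np

def omp(D, x, k):
    """Greedy sparse approximation of x using at most k columns (atoms) of D."""
    residual = x.copy()
    support = []
    coeffs = np.zeros(0)
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:
            break  # no new atom improves the fit
        support.append(j)
        # Re-fit all selected coefficients by least squares.
        coeffs, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coeffs
    code = np.zeros(D.shape[1])
    code[support] = coeffs
    return code

# Hypothetical data: a random dictionary with unit-norm atoms and a 3-sparse signal.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0)
true_code = np.zeros(128)
true_code[[5, 40, 99]] = [1.5, -2.0, 1.0]
x = D @ true_code

code = omp(D, x, k=3)
print("non-zeros:", np.count_nonzero(code))
print("residual norm:", np.linalg.norm(x - D @ code))
```

The returned vector has at most `k` non-zero entries, which is the sense in which the representation is sparse; a learned dictionary would replace the random `D` used here.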

Contributors: Jayaraman Thiagarajan, Jayaraman (Author) / Spanias, Andreas (Thesis advisor) / Frakes, David (Committee member) / Tepedelenlioğlu, Cihan (Committee member) / Turaga, Pavan (Committee member) / Arizona State University (Publisher)

Created: 2013

New directions in sparse models for image analysis and restoration

Description

Effective modeling of high dimensional data is crucial in information processing and machine learning. Classical subspace methods have been very effective in such applications. However, over the past few decades, there has been considerable research towards the development of new modeling paradigms that go beyond subspace methods. This dissertation focuses on the study of sparse models and their interplay with modern machine learning techniques such as manifold, ensemble and graph-based methods, along with their applications in image analysis and recovery. By considering graph relations between data samples while learning sparse models, graph-embedded codes can be obtained for use in unsupervised, supervised and semi-supervised problems. Using experiments on standard datasets, it is demonstrated that the codes obtained from the proposed methods outperform several baseline algorithms. In order to facilitate sparse learning with large scale data, the paradigm of ensemble sparse coding is proposed, and different strategies for constructing weak base models are developed. Experiments with image recovery and clustering demonstrate that these ensemble models perform better when compared to conventional sparse coding frameworks. When examples from the data manifold are available, manifold constraints can be incorporated with sparse models and two approaches are proposed to combine sparse coding with manifold projection. The improved performance of the proposed techniques in comparison to sparse coding approaches is demonstrated using several image recovery experiments. In addition to these approaches, it might be required in some applications to combine multiple sparse models with different regularizations. In particular, combining an unconstrained sparse model with non-negative sparse coding is important in image analysis, and it poses several algorithmic and theoretical challenges. 
A convex algorithm and an efficient greedy algorithm for recovering combined representations are proposed. Theoretical guarantees on sparsity thresholds for exact recovery using these algorithms are derived, and recovery performance is also demonstrated using simulations on synthetic data. Finally, the problem of non-linear compressive sensing, where the measurement process is carried out in a feature space obtained using non-linear transformations, is considered. An optimized non-linear measurement system is proposed, and improvements in recovery performance are demonstrated in comparison to using random measurements as well as optimized linear measurements.

Contributors: Natesan Ramamurthy, Karthikeyan (Author) / Spanias, Andreas (Thesis advisor) / Tsakalis, Konstantinos (Committee member) / Karam, Lina (Committee member) / Turaga, Pavan (Committee member) / Arizona State University (Publisher)

Created: 2013

Reconstruction-free inference from compressive measurements

Description

As a promising solution to the problem of acquiring and storing large amounts of image and video data, spatial-multiplexing camera architectures have received a lot of attention in the recent past. Such architectures have the attractive feature of combining the two-step process of acquisition and compression of pixel measurements in a conventional camera into a single step. A popular variant is the single-pixel camera, which obtains measurements of the scene using a pseudo-random measurement matrix. Advances in compressive sensing (CS) theory in the past decade have supplied the tools that, in theory, allow near-perfect reconstruction of an image from these measurements even for sub-Nyquist sampling rates. However, current state-of-the-art reconstruction algorithms suffer from two drawbacks: they are (1) computationally very expensive and (2) incapable of yielding high-fidelity reconstructions for high compression ratios. In computer vision, the final goal is usually to perform an inference task using the acquired images, not signal recovery. With this motivation, this thesis considers the possibility of inference directly from compressed measurements, thereby obviating the need to use expensive reconstruction algorithms. It is often the case that non-linear features are used for inference tasks in computer vision. However, it is currently unclear how to extract such features from compressed measurements. Instead, using the theoretical basis provided by the Johnson-Lindenstrauss lemma, discriminative features using smashed correlation filters are derived, and it is shown that it is indeed possible to perform reconstruction-free inference at high compression ratios with only a marginal loss in accuracy. As a specific inference problem in computer vision, face recognition is considered, mainly beyond the visible spectrum, such as in the short-wave infrared (SWIR) region, where sensors are expensive.
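
The idea that correlations survive a random projection, which is what the Johnson-Lindenstrauss lemma suggests, can be sketched in a few lines. The example below is simple template matching in the compressed domain, not the smashed correlation filters derived in the thesis; the dimensions, the Gaussian measurement matrix `Phi` and the synthetic templates are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, num_templates = 1024, 256, 5  # signal length, number of measurements, classes

# Hypothetical unit-norm templates standing in for class exemplars.
templates = rng.standard_normal((num_templates, n))
templates /= np.linalg.norm(templates, axis=1, keepdims=True)

# A noisy observation of template 2.
signal = templates[2] + 0.01 * rng.standard_normal(n)

# Random measurement matrix, scaled so that inner products are preserved on average.
Phi = rng.standard_normal((m, n)) / np.sqrt(m)
y = Phi @ signal                       # compressed measurements of the scene
compressed_templates = templates @ Phi.T  # templates pushed through the same operator

# Correlate in the original domain and in the compressed domain.
scores_full = templates @ signal
scores_compressed = compressed_templates @ y

match_full = int(np.argmax(scores_full))
match_compressed = int(np.argmax(scores_compressed))
print(match_full, match_compressed)
```

No image is reconstructed at any point: the decision is taken directly from the `m` measurements, at a quarter of the original dimension in this sketch.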

Contributors: Lohit, Suhas Anand (Author) / Turaga, Pavan (Thesis advisor) / Spanias, Andreas (Committee member) / Li, Baoxin (Committee member) / Arizona State University (Publisher)

Created: 2015

Exploring latent structure in data: algorithms and implementations

Description

Feature representation for raw data is one of the most important components in a machine learning system. Traditionally, features are hand-crafted by domain experts, which can often be a time-consuming process. Furthermore, such features do not generalize well to unseen data and novel tasks. Recently, there have been many efforts to generate data-driven representations using clustering and sparse models. This dissertation focuses on building data-driven unsupervised models for analyzing raw data and developing efficient feature representations.

Simultaneous segmentation and feature extraction approaches for silicon-pores sensor data are considered. Aggregating data into a matrix and performing low rank and sparse matrix decompositions with additional smoothness constraints are proposed to solve this problem. Comparison of several variants of the approaches and results for signal de-noising and translocation/trapping event extraction are presented. Algorithms to improve transform-domain features for ion-channel time-series signals based on matrix completion are presented. The improved features achieve better performance in classification tasks and in reducing the false alarm rates when applied to analyte detection.

Developing representations for multimedia is an important and challenging problem with applications ranging from scene recognition, multi-media retrieval and personal life-logging systems to field robot navigation. In this dissertation, we present a new framework for feature extraction for challenging natural environment sounds. Proposed features outperform traditional spectral features on challenging environmental sound datasets. Several algorithms are proposed that perform supervised tasks such as recognition and tag annotation. Ensemble methods are proposed to improve the tag annotation process.

To facilitate the use of large datasets, fast implementations are developed for sparse coding, the key component in our algorithms. Several strategies to speed up the Orthogonal Matching Pursuit algorithm using CUDA kernels on a GPU are proposed. Implementations are also developed for a large scale image retrieval system. Image-based "exact search" and "visually similar search" using the image patch sparse codes are performed. Results demonstrate a large speed-up over CPU implementations, and good retrieval performance is also achieved.

Contributors: Sattigeri, Prasanna S (Author) / Spanias, Andreas (Thesis advisor) / Thornton, Trevor (Committee member) / Goryll, Michael (Committee member) / Tsakalis, Konstantinos (Committee member) / Arizona State University (Publisher)

Created: 2014

The Capabilities and Obstacles of Integrating Machine Learning into a Supply Chain

Description

Only an Executive Summary of the project is included.
The goal of this project is to develop a deeper understanding of how machine learning pertains to the business world and how business professionals can capitalize on its capabilities. It explores the end-to-end process of integrating a machine learning model and the tradeoffs and obstacles to consider. This topic is extremely pertinent today as big data grows and the use of machine learning and artificial intelligence expands across industries and functional roles. The approach I took was to expand on a project I championed as a Microsoft intern, where I facilitated the integration of a forecasting machine learning model into the business firsthand. I supplement my findings from the experience with research on machine learning as a disruptive technology. This paper will not delve into the technical aspects of coding a machine learning model, but rather provide a holistic overview of developing the model from a business perspective. My findings show that, while the advantages of machine learning are large and widespread, a lack of visibility and transparency into the algorithms behind machine learning, the necessity for large amounts of data, and the overall complexity of creating accurate models are all tradeoffs to consider when deciding whether or not machine learning is suitable for a certain objective. The results of this paper are important for increasing business professionals' understanding of the capabilities and obstacles of integrating machine learning into their business operations.

Contributors: Verma, Ria (Author) / Goegan, Brian (Thesis director) / Moore, James (Committee member) / Department of Information Systems (Contributor) / Department of Supply Chain Management (Contributor) / Department of Economics (Contributor) / Barrett, The Honors College (Contributor)

Created: 2019-05

Graph-based estimation of information divergence functions

Description

Information divergence functions, such as the Kullback-Leibler divergence or the Hellinger distance, play a critical role in statistical signal processing and information theory; however, estimating them can be a challenge. Most often, parametric assumptions are made about the two distributions to estimate the divergence of interest. In cases where no parametric model fits the data, non-parametric density estimation is used. In statistical signal processing applications, Gaussianity is usually assumed, since closed-form expressions for common divergence measures have been derived for this family of distributions. Parametric assumptions are preferred when it is known that the data follows the model; however, this is rarely the case in real-world scenarios. Non-parametric density estimators are characterized by a very large number of parameters that have to be tuned with costly cross-validation. In this dissertation we focus on a specific family of non-parametric estimators, called direct estimators, that bypass density estimation completely and directly estimate the quantity of interest from the data. We introduce a new divergence measure, the $D_p$-divergence, that can be estimated directly from samples without parametric assumptions on the distribution. We show that the $D_p$-divergence bounds the binary, cross-domain, and multi-class Bayes error rates and, in certain cases, provides provably tighter bounds than the Hellinger divergence. In addition, we propose a new methodology that allows the experimenter to construct direct estimators for existing divergence measures or to construct new divergence measures with custom properties that are tailored to the application. To examine the practical efficacy of these new methods, we evaluate them in a statistical learning framework on a series of real-world data science problems involving speech-based monitoring of neuro-motor disorders.
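
A direct estimator of the kind described above can be sketched with a minimum spanning tree. The abstract does not state the estimator's form, so the following is an assumption drawn from the published literature on this divergence: build the Euclidean minimum spanning tree over the pooled samples, count the edges that join points from different samples (the Friedman-Rafsky statistic), and rescale the count. The function names and the synthetic Gaussian data are hypothetical.

```python
import numpy as np

def mst_edges(points):
    """Edges of the Euclidean minimum spanning tree (Prim's algorithm, O(n^2))."""
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = dist[0].copy()           # cheapest known link from the tree to each node
    parent = np.zeros(n, dtype=int)  # tree endpoint of that cheapest link
    edges = []
    for _ in range(n - 1):
        j = int(np.argmin(np.where(in_tree, np.inf, best)))
        edges.append((int(parent[j]), j))
        in_tree[j] = True
        closer = dist[j] < best
        best = np.where(closer, dist[j], best)
        parent = np.where(closer, j, parent)
    return edges

def dp_divergence_estimate(xf, xg):
    """Assumed form: 1 - C * (Nf + Ng) / (2 * Nf * Ng), with C the cross-sample edge count."""
    points = np.vstack([xf, xg])
    labels = np.r_[np.zeros(len(xf), dtype=int), np.ones(len(xg), dtype=int)]
    cross = sum(int(labels[i] != labels[j]) for i, j in mst_edges(points))
    nf, ng = len(xf), len(xg)
    return 1.0 - cross * (nf + ng) / (2.0 * nf * ng)

rng = np.random.default_rng(0)
same = dp_divergence_estimate(rng.standard_normal((100, 2)),
                              rng.standard_normal((100, 2)))
far = dp_divergence_estimate(rng.standard_normal((100, 2)),
                             rng.standard_normal((100, 2)) + 50.0)
print(same, far)
```

No density is estimated at any step. When the two samples come from the same distribution, roughly half of the tree edges cross between samples and the estimate is near zero; when the samples are well separated, only one edge crosses and the estimate approaches one.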

Contributors: Wisler, Alan (Author) / Berisha, Visar (Thesis advisor) / Spanias, Andreas (Thesis advisor) / Liss, Julie (Committee member) / Bliss, Daniel (Committee member) / Arizona State University (Publisher)

Created: 2017

Photovoltaic Array Fault Detection and Optimization Using Machine Learning

Description

The increasing demand for clean energy solutions requires not just expansion but also improvements in the efficiency of renewable sources such as solar. This requires analytics for each panel regarding voltage, current, temperature, and irradiance. This project involves the development of machine learning algorithms along with a data logger for the purpose of photovoltaic (PV) monitoring and control. Machine learning is used for fault classification. Once a fault is detected, the system can reconfigure its topology to minimize the power losses. Fault detection accuracy was demonstrated to be over 90%, and topology reconfiguration was shown to increase power output by as much as 5%.

Contributors: Navas, John (Author) / Spanias, Andreas (Thesis director) / Rao, Sunil (Committee member) / Electrical Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2021-05

Quantum Machine Learning for Optical and SAR Classification

Description

We present in this paper a method to compare the scene classification accuracy of C-band synthetic aperture radar (SAR) and optical images utilizing both classical and quantum computing algorithms. This REU study uses data from the Sentinel satellites. The dataset contains (i) synthetic aperture radar images collected from the Sentinel-1 satellite and (ii) optical images for the same area as the SAR images collected from the Sentinel-2 satellite. We utilize classical neural networks to classify four classes of images. We then use Quantum Convolutional Neural Networks and deep learning techniques to take advantage of machine learning to help the system train, learn, and identify at a higher classification accuracy. A hybrid quantum-classical model that is trained on the Sentinel1-2 dataset is proposed, and its performance is then compared against the classical model in terms of classification accuracy.

Contributors: Miller, Leslie (Author) / Spanias, Andreas (Thesis director) / Uehara, Glen (Committee member) / Barrett, The Honors College (Contributor) / Electrical Engineering Program (Contributor)

Created: 2023-05

Evaluation of Machine Learning Techniques for Pneumonia Detection

Description

Although a relatively new technology, machine learning has rapidly demonstrated its many uses. One potential application of machine learning is the diagnosis of ailments in medical imaging. Ideally, through classification methods, a computer program would be able to identify different medical conditions when provided with an X-ray or other such scan. This would be very beneficial for overworked doctors, and could act as a potential crutch to aid in giving accurate diagnoses. For this thesis project, five different machine-learning algorithms were tested on two datasets containing 5,856 lung X-ray scans labeled as either “Pneumonia” or “Normal”. The goal was to determine which algorithm achieved the highest accuracy, as well as how preprocessing the data affected the accuracy of the models. The following supervised-learning methods were tested: support vector machines, logistic regression, decision trees, random forest, and a convolutional neural network (CNN). Each model was tuned independently to achieve maximum performance before accuracy metrics were generated to compare the models against each other. Additionally, the effect of resizing images on model performance was investigated. Overall, the convolutional neural network proved to be the superior model for pneumonia detection, with 91% accuracy. After resizing to 28x28, CNN accuracy decreased to 85%. The random forest model performed second best. The traditional machine learning models achieved higher accuracy on the 28x28 PneumoniaMNIST dataset than on the HD Chest X-Ray dataset. Resizing the Chest X-Ray images had minimal effect on traditional model performance when they were resized to 28x28 or larger.

Contributors: Vollkommer, Margie (Author) / Spanias, Andreas (Thesis director) / Sivaraman Narayanaswamy, Vivek (Committee member) / Barrett, The Honors College (Contributor) / Harrington Bioengineering Program (Contributor)

Created: 2023-05

Distributed Learning and Data Collection with Strategic Agents

Description

The presence of strategic agents can pose unique challenges to data collection and distributed learning. This dissertation first explores the social network dimension of data collection markets, and then focuses on how strategic agents can be efficiently and effectively incentivized to cooperate in distributed machine learning frameworks. The first problem explores the impact of social learning in collecting and trading unverifiable information, where a data collector purchases data from users through a payment mechanism. Each user starts with a personal signal which represents the knowledge about the underlying state the data collector desires to learn. Through social interactions, each user also acquires additional information from his neighbors in the social network. It is revealed that both the data collector and the users can benefit from social learning, which drives down the privacy costs and helps to improve the state estimation for a given total payment budget. In the second half, a federated learning scheme to train a global learning model with strategic agents, who are not bound to contribute their resources unconditionally, is considered. Since the agents are not obliged to provide their true stochastic gradient updates and the server is not capable of directly validating the authenticity of reported updates, the learning process may reach a noncooperative equilibrium. First, the actions of the agents are assumed to be binary: cooperative or defective. If the cooperative action is taken, the agent sends a privacy-preserved version of the stochastic gradient signal. If the defective action is taken, the agent sends an arbitrary uninformative noise signal. Furthermore, this setup is extended to scenarios with more general action spaces, where the quality of the stochastic gradient updates has a range of discrete levels. 
The proposed methodology evaluates each agent's stochastic gradient according to a reference gradient estimate which is constructed from the gradients provided by other agents, and rewards the agent based on that evaluation.
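
The evaluation step described above can be sketched as follows. The abstract does not specify how the reference gradient is built or how the evaluation is scored, so this sketch makes two assumptions: the reference for each agent is the plain average of the other agents' reports, and the score is the negative squared distance to that reference. The function name and the synthetic reports are hypothetical.

```python
import numpy as np

def leave_one_out_scores(gradients):
    """Score each reported gradient against the mean of all other reports."""
    gradients = np.asarray(gradients, dtype=float)
    n = len(gradients)
    total = gradients.sum(axis=0)
    scores = np.empty(n)
    for i in range(n):
        # Reference estimate built only from the other agents' reports.
        reference = (total - gradients[i]) / (n - 1)
        # Assumed scoring rule: closer to the reference means a higher score.
        scores[i] = -np.sum((gradients[i] - reference) ** 2)
    return scores

rng = np.random.default_rng(0)
true_grad = np.ones(50)

# Four cooperative agents report noisy versions of the true gradient ...
reports = [true_grad + 0.1 * rng.standard_normal(50) for _ in range(4)]
# ... and one defective agent reports uninformative noise.
reports.append(rng.standard_normal(50))

scores = leave_one_out_scores(reports)
print(np.round(scores, 2))
```

Because an agent's own report never enters its reference, the agent cannot raise its score by shifting the reference toward itself; in this toy run the defective agent receives the lowest score.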

Contributors: Akbay, Abdullah Basar (Author) / Tepedelenlioğlu, Cihan (Thesis advisor) / Spanias, Andreas (Committee member) / Kosut, Oliver (Committee member) / Ewaisha, Ahmed (Committee member) / Arizona State University (Publisher)

Created: 2023