Matching Items (16)
Filtering by

Clear all filters

149915-Thumbnail Image.png
Description
Spotlight mode synthetic aperture radar (SAR) imaging involves a tomo- graphic reconstruction from projections, necessitating acquisition of large amounts of data in order to form a moderately sized image. Since typical SAR sensors are hosted on mobile platforms, it is common to have limitations on SAR data acquisi- tion, storage

Spotlight mode synthetic aperture radar (SAR) imaging involves a tomo- graphic reconstruction from projections, necessitating acquisition of large amounts of data in order to form a moderately sized image. Since typical SAR sensors are hosted on mobile platforms, it is common to have limitations on SAR data acquisi- tion, storage and communication that can lead to data corruption and a resulting degradation of image quality. It is convenient to consider corrupted samples as missing, creating a sparsely sampled aperture. A sparse aperture would also result from compressive sensing, which is a very attractive concept for data intensive sen- sors such as SAR. Recent developments in sparse decomposition algorithms can be applied to the problem of SAR image formation from a sparsely sampled aperture. Two modified sparse decomposition algorithms are developed, based on well known existing algorithms, modified to be practical in application on modest computa- tional resources. The two algorithms are demonstrated on real-world SAR images. Algorithm performance with respect to super-resolution, noise, coherent speckle and target/clutter decomposition is explored. These algorithms yield more accu- rate image reconstruction from sparsely sampled apertures than classical spectral estimators. At the current state of development, sparse image reconstruction using these two algorithms require about two orders of magnitude greater processing time than classical SAR image formation.
ContributorsWerth, Nicholas (Author) / Karam, Lina (Thesis advisor) / Papandreou-Suppappola, Antonia (Committee member) / Spanias, Andreas (Committee member) / Arizona State University (Publisher)
Created2011
151656-Thumbnail Image.png
Description
Image resolution limits the extent to which zooming enhances clarity, restricts the size digital photographs can be printed at, and, in the context of medical images, can prevent a diagnosis. Interpolation is the supplementing of known data with estimated values based on a function or model involving some or all

Image resolution limits the extent to which zooming enhances clarity, restricts the size digital photographs can be printed at, and, in the context of medical images, can prevent a diagnosis. Interpolation is the supplementing of known data with estimated values based on a function or model involving some or all of the known samples. The selection of the contributing data points and the specifics of how they are used to define the interpolated values influences how effectively the interpolation algorithm is able to estimate the underlying, continuous signal. The main contributions of this dissertation are three fold: 1) Reframing edge-directed interpolation of a single image as an intensity-based registration problem. 2) Providing an analytical framework for intensity-based registration using control grid constraints. 3) Quantitative assessment of the new, single-image enlargement algorithm based on analytical intensity-based registration. In addition to single image resizing, the new methods and analytical approaches were extended to address a wide range of applications including volumetric (multi-slice) image interpolation, video deinterlacing, motion detection, and atmospheric distortion correction. Overall, the new approaches generate results that more accurately reflect the underlying signals than less computationally demanding approaches and with lower processing requirements and fewer restrictions than methods with comparable accuracy.
ContributorsZwart, Christine M. (Author) / Frakes, David H (Thesis advisor) / Karam, Lina (Committee member) / Kodibagkar, Vikram (Committee member) / Spanias, Andreas (Committee member) / Towe, Bruce (Committee member) / Arizona State University (Publisher)
Created2013
190757-Thumbnail Image.png
Description
Huge advancements have been made over the years in terms of modern image-sensing hardware and visual computing algorithms (e.g. computer vision, image processing, computational photography). However, to this day, there still exists a current gap between the hardware and software design in an imaging system, which silos one research domain

Huge advancements have been made over the years in terms of modern image-sensing hardware and visual computing algorithms (e.g. computer vision, image processing, computational photography). However, to this day, there still exists a current gap between the hardware and software design in an imaging system, which silos one research domain from another. Bridging this gap is the key to unlocking new visual computing capabilities for end applications in commercial photography, industrial inspection, and robotics. This thesis explores avenues where hardware-software co-design of image sensors can be leveraged to replace conventional hardware components in an imaging system with software for enhanced reconfigurability. As a result, the user can program the image sensor in a way best suited to the end application. This is referred to as software-defined imaging (SDI), where image sensor behavior can be altered by the system software depending on the user's needs. The scope of this thesis covers the development and deployment of SDI algorithms for low-power computer vision. Strategies for sparse spatial sampling have been developed in this thesis for power optimization of the vision sensor. This dissertation shows how a hardware-compatible state-of-the-art object tracker can be coupled with a Kalman filter for energy gains at the sensor level. Extensive experiments reveal how adaptive spatial sampling of image frames with this hardware-friendly framework offers attractive energy-accuracy tradeoffs. Another thrust of this thesis is to demonstrate the benefits of reinforcement learning in this research avenue. A major finding reported in this dissertation shows how neural-network-based reinforcement learning can be exploited for the adaptive subsampling framework to achieve improved sampling performance, thereby optimizing the energy efficiency of the image sensor. The last thrust of this thesis is to leverage emerging event-based SDI technology for building a low-power navigation system. A homography estimation pipeline has been proposed in this thesis which couples the right data representation with a differential scale-invariant feature transform (SIFT) module to extract rich visual cues from event streams. Positional encoding is leveraged with a multilayer perceptron (MLP) network to get robust homography estimation from event data.
ContributorsIqbal, Odrika (Author) / Jayasuriya, Suren (Thesis advisor) / Spanias, Andreas (Thesis advisor) / LiKamWa, Robert (Committee member) / Owens, Chris (Committee member) / Arizona State University (Publisher)
Created2023
168287-Thumbnail Image.png
Description
Dealing with relational data structures is central to a wide-range of applications including social networks, epidemic modeling, molecular chemistry, medicine, energy distribution, and transportation. Machine learning models that can exploit the inherent structural/relational bias in the graph structured data have gained prominence in recent times. A recurring idea that appears

Dealing with relational data structures is central to a wide-range of applications including social networks, epidemic modeling, molecular chemistry, medicine, energy distribution, and transportation. Machine learning models that can exploit the inherent structural/relational bias in the graph structured data have gained prominence in recent times. A recurring idea that appears in all approaches is to encode the nodes in the graph (or the entire graph) as low-dimensional vectors also known as embeddings, prior to carrying out downstream task-specific learning. It is crucial to eliminate hand-crafted features and instead directly incorporate the structural inductive bias into the deep learning architectures. In this dissertation, deep learning models that directly operate on graph structured data are proposed for effective representation learning. A literature review on existing graph representation learning is provided in the beginning of the dissertation. The primary focus of dissertation is on building novel graph neural network architectures that are robust against adversarial attacks. The proposed graph neural network models are extended to multiplex graphs (heterogeneous graphs). Finally, a relational neural network model is proposed to operate on a human structural connectome. For every research contribution of this dissertation, several empirical studies are conducted on benchmark datasets. The proposed graph neural network models, approaches, and architectures demonstrate significant performance improvements in comparison to the existing state-of-the-art graph embedding strategies.
ContributorsShanthamallu, Uday Shankar (Author) / Spanias, Andreas (Thesis advisor) / Thiagarajan, Jayaraman J (Committee member) / Tepedelenlioğlu, Cihan (Committee member) / Berisha, Visar (Committee member) / Arizona State University (Publisher)
Created2021
161801-Thumbnail Image.png
Description
High-dimensional data is omnipresent in modern industrial systems. An imaging sensor in a manufacturing plant a can take images of millions of pixels or a sensor may collect months of data at very granular time steps. Dimensionality reduction techniques are commonly used for dealing with such data. In addition, outliers

High-dimensional data is omnipresent in modern industrial systems. An imaging sensor in a manufacturing plant a can take images of millions of pixels or a sensor may collect months of data at very granular time steps. Dimensionality reduction techniques are commonly used for dealing with such data. In addition, outliers typically exist in such data, which may be of direct or indirect interest given the nature of the problem that is being solved. Current research does not address the interdependent nature of dimensionality reduction and outliers. Some works ignore the existence of outliers altogether—which discredits the robustness of these methods in real life—while others provide suboptimal, often band-aid solutions. In this dissertation, I propose novel methods to achieve outlier-awareness in various dimensionality reduction methods. The problem is considered from many different angles depend- ing on the dimensionality reduction technique used (e.g., deep autoencoder, tensors), the nature of the application (e.g., manufacturing, transportation) and the outlier structure (e.g., sparse point anomalies, novelties).
ContributorsSergin, Nurettin Dorukhan (Author) / Yan, Hao (Thesis advisor) / Li, Jing (Committee member) / Wu, Teresa (Committee member) / Tsung, Fugee (Committee member) / Arizona State University (Publisher)
Created2021
190765-Thumbnail Image.png
Description
Speech analysis for clinical applications has emerged as a burgeoning field, providing valuable insights into an individual's physical and physiological state. Researchers have explored speech features for clinical applications, such as diagnosing, predicting, and monitoring various pathologies. Before presenting the new deep learning frameworks, this thesis introduces a study on

Speech analysis for clinical applications has emerged as a burgeoning field, providing valuable insights into an individual's physical and physiological state. Researchers have explored speech features for clinical applications, such as diagnosing, predicting, and monitoring various pathologies. Before presenting the new deep learning frameworks, this thesis introduces a study on conventional acoustic feature changes in subjects with post-traumatic headache (PTH) attributed to mild traumatic brain injury (mTBI). This work demonstrates the effectiveness of using speech signals to assess the pathological status of individuals. At the same time, it highlights some of the limitations of conventional acoustic and linguistic features, such as low repeatability and generalizability. Two critical characteristics of speech features are (1) good robustness, as speech features need to generalize across different corpora, and (2) high repeatability, as speech features need to be invariant to all confounding factors except the pathological state of targets. This thesis presents two research thrusts in the context of speech signals in clinical applications that focus on improving the robustness and repeatability of speech features, respectively. The first thrust introduces a deep learning framework to generate acoustic feature embeddings sensitive to vocal quality and robust across different corpora. A contrastive loss combined with a classification loss is used to train the model jointly, and data-warping techniques are employed to improve the robustness of embeddings. Empirical results demonstrate that the proposed method achieves high in-corpus and cross-corpus classification accuracy and generates good embeddings sensitive to voice quality and robust across different corpora. The second thrust introduces using the intra-class correlation coefficient (ICC) to evaluate the repeatability of embeddings. A novel regularizer, the ICC regularizer, is proposed to regularize deep neural networks to produce embeddings with higher repeatability. This ICC regularizer is implemented and applied to three speech applications: a clinical application, speaker verification, and voice style conversion. The experimental results reveal that the ICC regularizer improves the repeatability of learned embeddings compared to the contrastive loss, leading to enhanced performance in downstream tasks.
ContributorsZhang, Jianwei (Author) / Jayasuriya, Suren (Thesis advisor) / Berisha, Visar (Thesis advisor) / Liss, Julie (Committee member) / Spanias, Andreas (Committee member) / Arizona State University (Publisher)
Created2023
187456-Thumbnail Image.png
Description
The past decade witnessed the success of deep learning models in various applications of computer vision and natural language processing. This success can be predominantly attributed to the (i) availability of large amounts of training data; (ii) access of domain aware knowledge; (iii) i.i.d assumption between the train and target

The past decade witnessed the success of deep learning models in various applications of computer vision and natural language processing. This success can be predominantly attributed to the (i) availability of large amounts of training data; (ii) access of domain aware knowledge; (iii) i.i.d assumption between the train and target distributions and (iv) belief on existing metrics as reliable indicators of performance. When any of these assumptions are violated, the models exhibit brittleness producing adversely varied behavior. This dissertation focuses on methods for accurate model design and characterization that enhance process reliability when certain assumptions are not met. With the need to safely adopt artificial intelligence tools in practice, it is vital to build reliable failure detectors that indicate regimes where the model must not be invoked. To that end, an error predictor trained with a self-calibration objective is developed to estimate loss consistent with the underlying model. The properties of the error predictor are described and their utility in supporting introspection via feature importances and counterfactual explanations is elucidated. While such an approach can signal data regime changes, it is critical to calibrate models using regimes of inlier (training) and outlier data to prevent under- and over-generalization in models i.e., incorrectly identifying inliers as outliers and vice-versa. By identifying the space for specifying inliers and outliers, an anomaly detector that can effectively flag data of varying semantic complexities in medical imaging is next developed. Uncertainty quantification in deep learning models involves identifying sources of failure and characterizing model confidence to enable actionability. A training strategy is developed that allows the accurate estimation of model uncertainties and its benefits are demonstrated for active learning and generalization gap prediction. This helps identify insufficiently sampled regimes and representation insufficiency in models. In addition, the task of deep inversion under data scarce scenarios is considered, which in practice requires a prior to control the optimization. By identifying limitations in existing work, data priors powered by generative models and deep model priors are designed for audio restoration. With relevant empirical studies on a variety of benchmarks, the need for such design strategies is demonstrated.
ContributorsNarayanaswamy, Vivek Sivaraman (Author) / Spanias, Andreas (Thesis advisor) / J. Thiagarajan, Jayaraman (Committee member) / Berisha, Visar (Committee member) / Tepedelenlioğlu, Cihan (Committee member) / Arizona State University (Publisher)
Created2023
156587-Thumbnail Image.png
Description
Modern machine learning systems leverage data and features from multiple modalities to gain more predictive power. In most scenarios, the modalities are vastly different and the acquired data are heterogeneous in nature. Consequently, building highly effective fusion algorithms is at the core to achieve improved model robustness and inferencing performance.

Modern machine learning systems leverage data and features from multiple modalities to gain more predictive power. In most scenarios, the modalities are vastly different and the acquired data are heterogeneous in nature. Consequently, building highly effective fusion algorithms is at the core to achieve improved model robustness and inferencing performance. This dissertation focuses on the representation learning approaches as the fusion strategy. Specifically, the objective is to learn the shared latent representation which jointly exploit the structural information encoded in all modalities, such that a straightforward learning model can be adopted to obtain the prediction.

We first consider sensor fusion, a typical multimodal fusion problem critical to building a pervasive computing platform. A systematic fusion technique is described to support both multiple sensors and descriptors for activity recognition. Targeted to learn the optimal combination of kernels, Multiple Kernel Learning (MKL) algorithms have been successfully applied to numerous fusion problems in computer vision etc. Utilizing the MKL formulation, next we describe an auto-context algorithm for learning image context via the fusion with low-level descriptors. Furthermore, a principled fusion algorithm using deep learning to optimize kernel machines is developed. By bridging deep architectures with kernel optimization, this approach leverages the benefits of both paradigms and is applied to a wide variety of fusion problems.

In many real-world applications, the modalities exhibit highly specific data structures, such as time sequences and graphs, and consequently, special design of the learning architecture is needed. In order to improve the temporal modeling for multivariate sequences, we developed two architectures centered around attention models. A novel clinical time series analysis model is proposed for several critical problems in healthcare. Another model coupled with triplet ranking loss as metric learning framework is described to better solve speaker diarization. Compared to state-of-the-art recurrent networks, these attention-based multivariate analysis tools achieve improved performance while having a lower computational complexity. Finally, in order to perform community detection on multilayer graphs, a fusion algorithm is described to derive node embedding from word embedding techniques and also exploit the complementary relational information contained in each layer of the graph.
ContributorsSong, Huan (Author) / Spanias, Andreas (Thesis advisor) / Thiagarajan, Jayaraman (Committee member) / Berisha, Visar (Committee member) / Tepedelenlioğlu, Cihan (Committee member) / Arizona State University (Publisher)
Created2018
157308-Thumbnail Image.png
Description
Image-based process monitoring has recently attracted increasing attention due to the advancement of the sensing technologies. However, existing process monitoring methods fail to fully utilize the spatial information of images due to their complex characteristics including the high dimensionality and complex spatial structures. Recent advancement of the unsupervised deep models

Image-based process monitoring has recently attracted increasing attention due to the advancement of the sensing technologies. However, existing process monitoring methods fail to fully utilize the spatial information of images due to their complex characteristics including the high dimensionality and complex spatial structures. Recent advancement of the unsupervised deep models such as a generative adversarial network (GAN) and generative adversarial autoencoder (AAE) has enabled to learn the complex spatial structures automatically. Inspired by this advancement, we propose an anomaly detection framework based on the AAE for unsupervised anomaly detection for images. AAE combines the power of GAN with the variational autoencoder, which serves as a nonlinear dimension reduction technique with regularization from the discriminator. Based on this, we propose a monitoring statistic efficiently capturing the change of the image data. The performance of the proposed AAE-based anomaly detection algorithm is validated through a simulation study and real case study for rolling defect detection.
ContributorsYeh, Huai-Ming (Author) / Yan, Hao (Thesis advisor) / Pan, Rong (Committee member) / Li, Jing (Committee member) / Arizona State University (Publisher)
Created2019
154086-Thumbnail Image.png
Description
Discriminative learning when training and test data belong to different distributions is a challenging and complex task. Often times we have very few or no labeled data from the test or target distribution, but we may have plenty of labeled data from one or multiple related sources with different distributions.

Discriminative learning when training and test data belong to different distributions is a challenging and complex task. Often times we have very few or no labeled data from the test or target distribution, but we may have plenty of labeled data from one or multiple related sources with different distributions. Due to its capability of migrating knowledge from related domains, transfer learning has shown to be effective for cross-domain learning problems. In this dissertation, I carry out research along this direction with a particular focus on designing efficient and effective algorithms for BioImaging and Bilingual applications. Specifically, I propose deep transfer learning algorithms which combine transfer learning and deep learning to improve image annotation performance. Firstly, I propose to generate the deep features for the Drosophila embryo images via pretrained deep models and build linear classifiers on top of the deep features. Secondly, I propose to fine-tune the pretrained model with a small amount of labeled images. The time complexity and performance of deep transfer learning methodologies are investigated. Promising results have demonstrated the knowledge transfer ability of proposed deep transfer algorithms. Moreover, I propose a novel Robust Principal Component Analysis (RPCA) approach to process the noisy images in advance. In addition, I also present a two-stage re-weighting framework for general domain adaptation problems. The distribution of source domain is mapped towards the target domain in the first stage, and an adaptive learning model is proposed in the second stage to incorporate label information from the target domain if it is available. Then the proposed model is applied to tackle cross lingual spam detection problem at LinkedIn’s website. Our experimental results on real data demonstrate the efficiency and effectiveness of the proposed algorithms.
ContributorsSun, Qian (Author) / Ye, Jieping (Committee member) / Xue, Guoliang (Committee member) / Liu, Huan (Committee member) / Li, Jing (Committee member) / Arizona State University (Publisher)
Created2015