Towards robust semantic attribute learning in visual computing

Description

The rapid growth of social media in recent years provides a large amount of user-generated visual objects, e.g., images and videos. Advanced semantic understanding approaches on such visual objects are desired to better serve applications such as human-machine interaction, image retrieval, etc. Semantic visual attributes have been proposed and utilized in multiple visual computing tasks to bridge the so-called "semantic gap" between extractable low-level feature representations and high-level semantic understanding of the visual objects.

Despite years of research, several problems in semantic attribute learning remain unsolved. First, real-world applications usually involve hundreds of attributes, which requires great effort to acquire a sufficient amount of labeled data for model learning. Second, existing attribute learning work for visual objects focuses primarily on images, with semantic analysis of videos left largely unexplored.

In this dissertation I conduct innovative research and propose novel approaches to tackle the aforementioned problems. In particular, I propose robust and accurate learning frameworks for both attribute ranking and prediction by exploring the correlation among multiple attributes and utilizing various types of label information. Furthermore, I propose a video-based skill coaching framework by extending attribute learning to the video domain for robust motion skill analysis. Experiments on various types of applications and datasets, and comparisons with multiple state-of-the-art baseline approaches, confirm that my proposed approaches can achieve significant performance improvements for the general attribute learning problem.

Contributors: Chen, Lin (Author) / Li, Baoxin (Thesis advisor) / Turaga, Pavan (Committee member) / Wang, Yalin (Committee member) / Liu, Huan (Committee member) / Arizona State University (Publisher)

Created: 2016

Multi-task learning and its applications to biomedical informatics

Description

In many fields, such as information retrieval, computer vision and biomedical informatics, one needs to build predictive models for a set of related machine learning tasks. Traditionally these tasks are treated independently and the inference is done separately for each task, which ignores important connections among the tasks. Multi-task learning aims at simultaneously building models for all tasks in order to improve the generalization performance, leveraging the inherent relatedness of these tasks. In this thesis, I first propose a clustered multi-task learning (CMTL) formulation, which simultaneously learns task models and performs task clustering. I provide theoretical analysis to establish the equivalence between the CMTL formulation and alternating structure optimization, which learns a shared low-dimensional hypothesis space for different tasks. Then I present two real-world biomedical informatics applications which can benefit from multi-task learning. In the first application, I study the disease progression problem and present multi-task learning formulations for disease progression. In these formulations, the prediction at each time point is a regression task, and multiple tasks at different time points are learned simultaneously, leveraging the temporal smoothness among the tasks. The proposed formulations have been tested extensively on predicting the progression of Alzheimer's disease, and experimental results demonstrate the effectiveness of the proposed models. In the second application, I present a novel data-driven framework for densifying electronic medical records (EMR) to overcome the sparsity problem in predictive modeling using EMR. The densification of each patient is a learning task, and the proposed algorithm simultaneously densifies all patients. As such, the densification of one patient leverages useful information from other patients.
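The idea of learning one regression task per time point while encouraging temporal smoothness can be illustrated with a small sketch. This is not the thesis's actual formulation: it assumes a squared-error loss and a squared-difference penalty between the models of consecutive time points, and the names `temporal_mtl_objective` and `smooth` are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def temporal_mtl_objective(
    weights: List[Vector],   # weights[t] is the linear model for time point t
    features: List[Vector],  # features[i] is the feature vector of subject i
    targets: List[Vector],   # targets[t][i] is the outcome of subject i at time point t
    smooth: float,           # assumed strength of the temporal smoothness penalty
) -> float:
    """Squared regression loss over all tasks plus a temporal smoothness term."""
    # Each time point is its own regression task; sum the squared errors.
    loss = 0.0
    for w_t, y_t in zip(weights, targets):
        for x_i, y_ti in zip(features, y_t):
            loss += (dot(w_t, x_i) - y_ti) ** 2
    # Penalize changes between the models of consecutive time points.
    penalty = 0.0
    for w_t, w_next in zip(weights, weights[1:]):
        penalty += sum((a - b) ** 2 for a, b in zip(w_t, w_next))
    return loss + smooth * penalty


if __name__ == "__main__":
    features = [[1.0, 0.0], [0.0, 1.0]]
    targets = [[1.0, 2.0], [1.5, 2.5]]
    shared_model = [[1.0, 2.0], [1.0, 2.0]]    # same weights at both time points
    separate_models = [[1.0, 2.0], [1.5, 2.5]]  # fits each time point exactly
    print(temporal_mtl_objective(shared_model, features, targets, smooth=1.0))
    print(temporal_mtl_objective(separate_models, features, targets, smooth=1.0))
```

In this toy setting the penalty trades fit against smoothness: the separately fitted models have zero regression loss but pay for the change between time points, while the shared model pays only in regression loss.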

Contributors: Zhou, Jiayu (Author) / Ye, Jieping (Thesis advisor) / Mittelmann, Hans (Committee member) / Li, Baoxin (Committee member) / Wang, Yalin (Committee member) / Arizona State University (Publisher)

Created: 2014

Semantic sparse learning in images and videos

Description

Many learning models have been proposed for various tasks in visual computing. Popular examples include hidden Markov models and support vector machines. Recently, sparse-representation-based learning methods have attracted a lot of attention in the computer vision field, largely because of their impressive performance in many applications. In the literature, many such sparse learning methods focus on designing or applying learning techniques for a given feature space, without much explicit consideration of possible interactions between the underlying semantics of the visual data and the employed learning technique. Rich semantic information in most visual data, if properly incorporated into algorithm design, should help achieve improved performance while delivering intuitive interpretation of the algorithmic outcomes. My study addresses the problem of how to explicitly consider the semantic information of the visual data in sparse learning algorithms. In this work, I identify four problems which are of great importance and broad interest to the community.
Specifically, a novel approach is proposed to incorporate label information to learn a dictionary which is not only reconstructive but also discriminative; considering the formation process of face images, a novel image decomposition approach for an ensemble of correlated images is proposed, where a subspace is built from the decomposition and applied to face recognition; based on the observation that the foreground (or salient) objects are sparse in the input domain and the background is sparse in the frequency domain, a novel and efficient spatio-temporal saliency detection algorithm is proposed to identify the salient regions in video; and a novel hidden Markov model learning approach is proposed that utilizes a sparse set of pairwise comparisons among the data, which are easier to obtain and more meaningful and consistent than traditional labels in many scenarios, e.g., evaluating motion skills in surgical simulations. In those four problems, different types of semantic information are modeled and incorporated in designing sparse learning algorithms for the corresponding visual computing tasks. Several real-world applications are selected to demonstrate the effectiveness of the proposed methods, including face recognition, spatio-temporal saliency detection, abnormality detection, spatio-temporal interest point detection, motion analysis and emotion recognition. In those applications, data of different modalities are involved, ranging from audio signals and images to video. Experiments on large-scale real-world data, with comparisons to state-of-the-art methods, confirm that the proposed approaches deliver salient advantages, showing that adding such semantic information dramatically improves the performance of general sparse learning methods.

Contributors: Zhang, Qiang (Author) / Li, Baoxin (Thesis advisor) / Turaga, Pavan (Committee member) / Wang, Yalin (Committee member) / Ye, Jieping (Committee member) / Arizona State University (Publisher)

Created: 2014

Cluster metrics and temporal coherency in pixel based matrices

Description

In this thesis, the application of pixel-based vertical axes used within parallel coordinate plots is explored in an attempt to improve how existing tools can explain complex multivariate interactions across temporal data. Several promising visualization techniques are combined, including visual boosting to allow for quicker consumption of large data sets, the bond energy algorithm to find finer patterns and anomalies through contrast, multi-dimensional scaling, flow lines, user-guided clustering, and row-column ordering. User input is applied to precomputed data sets to provide real-time interaction. The general applicability of the techniques is tested against industrial trade, social networking, financial, and sparse data sets of varying dimensionality.

Contributors: Hayden, Thomas (Author) / Maciejewski, Ross (Thesis advisor) / Wang, Yalin (Committee member) / Runger, George C. (Committee member) / Mack, Elizabeth (Committee member) / Arizona State University (Publisher)

Created: 2014

Graph-based sparse learning: models, algorithms, and applications

Description

Sparse learning is a powerful tool to generate models of high-dimensional data with high interpretability, and it has many important applications in areas such as bioinformatics, medical image processing, and computer vision. Recently, a priori structural information has been shown to be powerful for improving the performance of sparse learning models. A graph is a fundamental way to represent structural information of features. This dissertation focuses on graph-based sparse learning. The first part of this dissertation aims to integrate a graph into sparse learning to improve the performance. Specifically, the problem of feature grouping and selection over a given undirected graph is considered. Three models are proposed along with efficient solvers to achieve simultaneous feature grouping and selection, enhancing estimation accuracy. One major challenge is that solving large-scale graph-based sparse learning problems remains computationally expensive. An efficient, scalable, and parallel algorithm for one widely used graph-based sparse learning approach, called anisotropic total variation regularization, is therefore proposed by explicitly exploring the structure of a graph. The second part of this dissertation focuses on uncovering the graph structure from the data. Two issues in graphical modeling are considered. One is the joint estimation of multiple graphical models using a fused lasso penalty and the other is the estimation of hierarchical graphical models. The key technical contribution is to establish the necessary and sufficient condition for the graphs to be decomposable. Based on this key property, a simple screening rule is presented, which reduces the size of the optimization problem, dramatically reducing the computational cost.
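To make the graph-based penalty concrete, the sketch below evaluates an anisotropic total variation term on a feature graph: the weighted sum of absolute differences between coefficients joined by an edge. It only computes the penalty value and does not reproduce the dissertation's solvers; the function name `graph_tv_penalty` and the optional `edge_weights` argument are hypothetical.

```python
from typing import Dict, Iterable, Optional, Sequence, Tuple

Edge = Tuple[int, int]


def graph_tv_penalty(
    coef: Sequence[float],
    edges: Iterable[Edge],
    edge_weights: Optional[Dict[Edge, float]] = None,
) -> float:
    """Sum of weighted absolute coefficient differences over the graph's edges."""
    total = 0.0
    for i, j in edges:
        # Edges without an explicit weight default to weight 1.0 (an assumption).
        weight = 1.0 if edge_weights is None else edge_weights.get((i, j), 1.0)
        total += weight * abs(coef[i] - coef[j])
    return total


if __name__ == "__main__":
    chain = [(0, 1), (1, 2), (2, 3)]  # a path graph over four features
    grouped = [2.0, 2.0, 5.0, 5.0]    # neighbouring features share values
    scattered = [2.0, 5.0, 2.0, 5.0]  # neighbouring features disagree
    print(graph_tv_penalty(grouped, chain))
    print(graph_tv_penalty(scattered, chain))
```

On a chain graph this term is the difference part of the one-dimensional fused lasso penalty. Coefficient vectors whose connected features take equal values incur a smaller penalty, which is what encourages feature grouping.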

Contributors: Yang, Sen (Author) / Ye, Jieping (Thesis advisor) / Wonka, Peter (Thesis advisor) / Wang, Yalin (Committee member) / Li, Jing (Committee member) / Arizona State University (Publisher)

Created: 2014

Developing new methods for analyzing urban energy use in buildings: historic turnover, spatial patterns, and future forecasting

Description

Energy use within urban building stocks is continuing to increase globally as populations expand and access to electricity improves. This projected increase in demand could require deployment of new generation capacity, but there is potential to offset some of this demand through modification of the buildings themselves. Building stocks are quasi-permanent infrastructures which have enduring influence on urban energy consumption, and research is needed to understand: 1) how development patterns constrain energy use decisions and 2) how cities can achieve energy and environmental goals given the constraints of the stock. This requires a thorough evaluation of both the growth of the stock and the spatial distribution of use throughout the city. In this dissertation, a case study in Los Angeles County, California (LAC) is used to quantify urban growth, forecast future energy use under climate change, and make recommendations for mitigating energy consumption increases. A reproducible methodological framework is included for application to other urban areas.

In LAC, residential electricity demand could increase by as much as 55-68% between 2020 and 2060. Building technology lock-in has constricted the options for mitigating energy demand: major changes to the building stock itself are not possible because only a small portion of the stock is turned over every year. Aggressive and timely efficiency upgrades to residential appliances and building thermal shells can significantly offset the projected increases, potentially avoiding installation of new generation capacity, but regulations on new construction will likely be ineffectual due to the long residence time of the stock (60+ years and increasing). These findings can be extrapolated to other U.S. cities where the majority of urban expansion has already occurred, such as the older cities on the eastern coast. The U.S. population is projected to increase 40% by 2060, with growth occurring in the warmer southern and western regions. In these growing cities, improving new construction can help offset electricity demand increases before the city reaches the lock-in phase.

Contributors: Reyna, Janet Lorel (Author) / Chester, Mikhail V (Thesis advisor) / Gurney, Kevin (Committee member) / Reddy, T. Agami (Committee member) / Rey, Sergio (Committee member) / Arizona State University (Publisher)

Created: 2016

Evaluating Tessellation and Screen-Space Ambient Occlusion in WebGL-Based Real-Time Application

Description

Tessellation and Screen-Space Ambient Occlusion are algorithms that have been widely used in real-time rendering in the past decade. They aim to enhance the details of the mesh, cast better shadow effects and improve the quality of the rendered images in real time. WebGL is a web-based graphics library derived from OpenGL ES and used for rendering in web applications. It is relatively new and has been evolving rapidly; as a result, it supports only a subset of the rendering features normally supported by desktop applications. This thesis evaluates Curved PN-Triangles tessellation with Screen Space Ambient Occlusion (SSAO), Horizon-Based Ambient Occlusion (HBAO) and Horizon-Based Ambient Occlusion Plus (HBAO+) in a WebGL-based real-time application, compares its performance to that of a desktop-based application, and discusses the capabilities, limitations and bottlenecks of WebGL 1.0.

Contributors: Li, Chenyang (Author) / Amresh, Ashish (Thesis advisor) / Wang, Yalin (Thesis advisor) / Kobayashi, Yoshihiro (Committee member) / Arizona State University (Publisher)

Created: 2017

Metal Complexes for Organic Optoelectronic Applications

Description

Organic optoelectronic devices have drawn extensive attention over the past two decades. Two major applications for organic optoelectronic devices are efficient organic photovoltaic devices (OPV) and organic light emitting diodes (OLED). Organic solar cells have been proven to be compatible with low-cost, large-area bulk processing technology and possess high absorption efficiencies compared to inorganic solar cells. Organic light emitting diodes are a promising approach for display and solid state lighting applications. To improve the efficiency, stability, and materials variety for organic optoelectronic devices, several emissive materials, absorber-type materials, and charge transporting materials were developed and employed in various device settings. Optical, electrical, and photophysical studies of the organic materials and their corresponding devices were thoroughly carried out. In this thesis, Chapter 1 provides an introduction to the background knowledge of the OPV and OLED research fields. Chapter 2 discusses new porphyrin derivatives, azatetrabenzylporphyrins, for OPV and near-infrared OLED applications. A modified synthetic method is utilized to increase the reaction yield of the azatetrabenzylporphyrin materials, and their photophysical and electrochemical properties are studied. OPV devices are also fabricated using zinc azatetrabenzylporphyrin as the donor material. Pt(II) azatetrabenzylporphyrin was also synthesized and used in near-infrared OLEDs to achieve emission beyond 800 nm with reasonable external quantum efficiencies. Chapter 3 discusses the synthesis, characterization, and device evaluation of a series of tetradentate platinum and palladium complexes for single-doped white OLED applications and RGB white OLED applications. Devices employing some of the developed emitters demonstrated impressively high external quantum efficiencies within the range of 22%-27% for various emitter concentrations.
The palladium complex Pd3O3 enables the fabrication of stable devices achieving nearly 1000 h at 1000 cd/m² without any outcoupling enhancement while simultaneously achieving peak external quantum efficiencies of 19.9%. Chapter 4 discusses tetradentate platinum and palladium complexes as deep blue emissive materials for display and lighting applications. The platinum complex PtNON achieved a peak external quantum efficiency of 24.4% and CIE coordinates of (0.18, 0.31) in a device structure designed for charge confinement, and the palladium complex Pd2O2 exhibited a peak external quantum efficiency of up to 19.2%.

Contributors: Huang, Liang (Author) / Li, Jian (Thesis advisor) / Adams, James (Committee member) / Alford, Terry (Committee member) / Arizona State University (Publisher)

Created: 2017

Gene Network Inference via Sequence Alignment and Rectification

Description

While reading DNA in some capacity has been possible for decades, the ability to accurately edit genomes at scale has remained elusive. Novel techniques have been introduced recently to aid in the writing of DNA sequences. While writing DNA is more accessible, it still remains expensive, justifying the increased interest in in silico predictions of cell behavior. In order to accurately predict the behavior of cells, it is necessary to extensively model the cell environment, including gene-to-gene interactions, as completely as possible.

Significant algorithmic advances have been made for identifying these interactions, but despite these improvements current techniques fail to infer some edges and fail to capture some complexities in the network. Much of this limitation is due to heavily underdetermined problems, whereby tens of thousands of variables are to be inferred using datasets with the power to resolve only a small fraction of the variables. Additionally, failure to correctly resolve gene isoforms using short reads contributes significantly to noise in gene quantification measures.

This dissertation introduces novel mathematical models, machine learning techniques, and biological techniques to solve the problems described above. Mathematical models are proposed for simulation of gene network motifs and for raw read simulation. Machine learning techniques are shown for DNA sequence matching and DNA sequence correction.

Results provide novel insights into the low-level functionality of gene networks. Also shown is the ability to use normalization techniques to aggregate data for gene network inference, leading to larger data sets while minimizing increases in inter-experimental noise. Results also demonstrate that the high error rates of third-generation sequencing differ significantly from previous error profiles, and that these errors can be modeled, simulated, and rectified. Finally, techniques are provided for amending this DNA error that preserve the benefits of third-generation sequencing.

Contributors: Faucon, Philippe Christophe (Author) / Liu, Huan (Thesis advisor) / Wang, Xiao (Committee member) / Crook, Sharon M (Committee member) / Wang, Yalin (Committee member) / Sarjoughian, Hessam S. (Committee member) / Arizona State University (Publisher)

Created: 2017

Deep Learning based Classification of FDG-PET Data for Alzheimer's Disease

Description

Alzheimer's disease (AD) is a progressive neurodegenerative disease that affects the brain gradually and worsens over time. Reliable and early diagnosis of AD and its prodromal stages (i.e., mild cognitive impairment (MCI)) is essential. Fluorodeoxyglucose (FDG) positron emission tomography (PET) measures the decline in the regional cerebral metabolic rate for glucose, offering a reliable metabolic biomarker even in presymptomatic AD patients. PET scans provide functional information that is unique and unavailable using other types of imaging. The computational efficacy of FDG-PET data alone for the classification of the various Alzheimer's diagnostic categories (AD, MCI (LMCI, EMCI), Control) has not been studied. This serves as motivation to correctly classify the various diagnostic categories using FDG-PET data. Deep learning has recently been applied to the analysis of structural and functional brain imaging data. This thesis introduces a deep learning based classification technique that uses neural networks with dimensionality reduction techniques to classify the different stages of AD based on FDG-PET image analysis.

This thesis develops a classification method to investigate the performance of FDG-PET as an effective biomarker for Alzheimer's clinical group classification. This involves dimensionality reduction using Probabilistic Principal Component Analysis on max-pooled data and mean-pooled data, followed by a Multilayer Feed Forward Neural Network which performs binary classification. Max-pooled features result in better classification performance than mean-pooled features. Additionally, experiments are done to investigate whether the addition of important demographic features, such as the Functional Activities Questionnaire (FAQ) and gene information, helps improve performance. Classification results indicate that our designed classifiers achieve competitive results, and perform better with the addition of demographic features.
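The difference between max pooling and mean pooling can be illustrated on a toy signal. The sketch below is a one-dimensional simplification with non-overlapping windows; the thesis pools FDG-PET image data, and its window sizes and implementation are not given here, so the function name `pool_1d` and all values are hypothetical.

```python
from statistics import mean
from typing import Callable, List, Sequence


def pool_1d(
    values: Sequence[float],
    window: int,
    reduce: Callable[[Sequence[float]], float],
) -> List[float]:
    """Apply `reduce` to consecutive non-overlapping windows of `values`.

    Trailing values that do not fill a complete window are dropped.
    """
    last_start = len(values) - window + 1
    return [reduce(values[i:i + window]) for i in range(0, last_start, window)]


if __name__ == "__main__":
    intensities = [1.0, 5.0, 2.0, 2.0, 9.0, 0.0]  # made-up voxel intensities
    print(pool_1d(intensities, 2, max))   # keeps the strongest value per window
    print(pool_1d(intensities, 2, mean))  # averages each window
```

Both reductions shrink the input by the window size before any further dimensionality reduction; max pooling preserves peak responses, whereas mean pooling smooths them.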

Contributors: Singh, Shibani (Author) / Wang, Yalin (Thesis advisor) / Li, Baoxin (Committee member) / Liang, Jianming (Committee member) / Arizona State University (Publisher)

Created: 2017