Description
The rapid escalation of technology and the widespread emergence of modern technological equipment have resulted in the generation of enormous amounts of digital data (in the form of images, videos and text). This has expanded the possibility of solving real-world problems using computational learning frameworks. However, while gathering a large amount of data is cheap and easy, annotating it with class labels is an expensive process in terms of time, labor and human expertise. This has paved the way for research in the field of active learning. Such algorithms automatically select the salient and exemplar instances from large quantities of unlabeled data and are effective in reducing human labeling effort in inducing classification models. To utilize the possible presence of multiple labeling agents, there have been attempts towards a batch mode form of active learning, where a batch of data instances is selected simultaneously for manual annotation. This dissertation is aimed at the development of novel batch mode active learning algorithms to reduce manual effort in training classification models in real-world multimedia pattern recognition applications. Four major contributions are proposed in this work: (i) a framework for dynamic batch mode active learning, where the batch size and the specific data instances to be queried are selected adaptively through a single formulation, based on the complexity of the data stream in question, (ii) a batch mode active learning strategy for fuzzy label classification problems, where there is an inherent imprecision and vagueness in the class label definitions, (iii) batch mode active learning algorithms based on convex relaxations of an NP-hard integer quadratic programming (IQP) problem, with guaranteed bounds on the solution quality, and (iv) an active matrix completion algorithm and its application to solve several variants of the active learning problem (transductive active learning, multi-label active learning, active feature acquisition and active learning for regression). These contributions are validated on the face recognition and facial expression recognition problems (which are commonly encountered in real-world applications like robotics, security and assistive technology for the blind and the visually impaired) and also on collaborative filtering applications like movie recommendation.
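The abstract summarizes the batch-selection idea only at a high level. As a rough illustration (and not the dissertation's IQP-based formulation), the sketch below selects a batch of unlabeled points by combining uncertainty sampling with a simple diversity penalty; the function names, the greedy heuristic and the cosine-similarity penalty are assumptions made for illustration.

```python
# A minimal sketch of batch-mode active learning via uncertainty sampling with a
# greedy diversity penalty. This is an illustrative heuristic, not the IQP-based
# formulation described in the abstract; all names and weights are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

def select_batch(model, X_unlabeled, batch_size=10, diversity_weight=0.5):
    """Greedily pick a batch that is both uncertain and mutually dissimilar."""
    proba = model.predict_proba(X_unlabeled)
    # Uncertainty = 1 - max class probability (higher means less confident).
    uncertainty = 1.0 - proba.max(axis=1)
    chosen = []
    for _ in range(batch_size):
        if chosen:
            # Penalize points similar to already-chosen ones (cosine similarity).
            sims = X_unlabeled @ X_unlabeled[chosen].T
            norms = (np.linalg.norm(X_unlabeled, axis=1, keepdims=True)
                     * np.linalg.norm(X_unlabeled[chosen], axis=1) + 1e-12)
            penalty = (sims / norms).max(axis=1)
        else:
            penalty = np.zeros(len(X_unlabeled))
        score = uncertainty - diversity_weight * penalty
        score[chosen] = -np.inf          # never pick the same point twice
        chosen.append(int(np.argmax(score)))
    return chosen

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_lab, y_lab = rng.normal(size=(50, 5)), rng.integers(0, 2, 50)
    X_pool = rng.normal(size=(500, 5))
    clf = LogisticRegression().fit(X_lab, y_lab)
    print(select_batch(clf, X_pool, batch_size=5))
```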
Contributors: Chakraborty, Shayok (Author) / Panchanathan, Sethuraman (Thesis advisor) / Balasubramanian, Vineeth N. (Committee member) / Li, Baoxin (Committee member) / Mittelmann, Hans (Committee member) / Ye, Jieping (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Traditional approaches to modeling microgrids include the behavior of each inverter operating in a particular network configuration and at a particular operating point. Such models quickly become computationally intensive for large systems. Similarly, traditional approaches to control do not use advanced methodologies and suffer from poor performance and limited operating range. In this document, a linear model is derived for an inverter connected to the Thevenin equivalent of a microgrid. This model is then compared to a nonlinear simulation model and analyzed using the open and closed loop systems in both the time and frequency domains. The modeling error is quantified with emphasis on its use for controller design purposes. Control design examples are given using a Glover-McFarlane controller, gain-scheduled Glover-McFarlane controller, and bumpless transfer controller, which are compared to the standard droop control approach. These examples serve as a guide to illustrate the use of multi-variable modeling techniques in the context of robust controller design and show that gain-scheduled MIMO control techniques can extend the operating range of a microgrid. A hardware implementation is used to compare constant-gain droop controllers with Glover-McFarlane controllers and shows a clear advantage of the Glover-McFarlane approach.
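The abstract compares Glover-McFarlane designs against the standard droop control baseline. As a point of reference only, the sketch below implements a textbook P-f / Q-V droop law; the numerical droop gains and ratings are illustrative assumptions, and the Glover-McFarlane controllers themselves (which require a full plant model) are not reproduced here.

```python
# A minimal sketch of the conventional P-f / Q-V droop law used as the baseline
# for comparison in the abstract. All numerical values are illustrative only.
def droop_setpoints(P, Q, f_nom=60.0, V_nom=1.0, m_p=0.01, n_q=0.05,
                    P_rated=1.0, Q_rated=1.0):
    """Return frequency and voltage references for an inverter given its
    measured active power P and reactive power Q (per-unit)."""
    f_ref = f_nom - m_p * f_nom * (P / P_rated)   # frequency droops with active power
    V_ref = V_nom - n_q * V_nom * (Q / Q_rated)   # voltage droops with reactive power
    return f_ref, V_ref

if __name__ == "__main__":
    print(droop_setpoints(P=0.8, Q=0.2))  # (59.52, 0.99) with the defaults above
```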
Contributors: Steenis, Joel (Author) / Ayyanar, Raja (Thesis advisor) / Mittelmann, Hans (Committee member) / Tsakalis, Konstantinos (Committee member) / Tylavsky, Daniel (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
In many fields one needs to build predictive models for a set of related machine learning tasks, such as information retrieval, computer vision and biomedical informatics. Traditionally these tasks are treated independently and the inference is done separately for each task, which ignores important connections among the tasks. Multi-task learning aims at simultaneously building models for all tasks in order to improve the generalization performance, leveraging the inherent relatedness of these tasks. In this thesis, I first propose a clustered multi-task learning (CMTL) formulation, which simultaneously learns task models and performs task clustering. I provide theoretical analysis to establish the equivalence between the CMTL formulation and alternating structure optimization, which learns a shared low-dimensional hypothesis space for different tasks. Then I present two real-world biomedical informatics applications which can benefit from multi-task learning. In the first application, I study the disease progression problem and present multi-task learning formulations for disease progression. In these formulations, the prediction at each time point is a regression task and multiple tasks at different time points are learned simultaneously, leveraging the temporal smoothness among the tasks. The proposed formulations have been tested extensively on predicting the progression of Alzheimer's disease, and experimental results demonstrate the effectiveness of the proposed models. In the second application, I present a novel data-driven framework for densifying electronic medical records (EMR) to overcome the sparsity problem in predictive modeling using EMR. The densification of each patient's record is a learning task, and the proposed algorithm simultaneously densifies all patients' records. As such, the densification of one patient's record leverages useful information from other patients.
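As a rough illustration of the temporal-smoothness idea used in the disease-progression formulations (not the author's exact algorithm), the sketch below fits one regression task per time point while penalizing differences between the weight vectors of adjacent time points; the penalty weights and the plain gradient-descent solver are assumptions for illustration.

```python
# A minimal sketch of multi-task regression with a temporal-smoothness penalty,
# in the spirit of the disease-progression formulations described above.
import numpy as np

def temporal_mtl(X, Y, lam_ridge=1.0, lam_smooth=10.0, lr=1e-2, iters=2000):
    """X: (n, d) features; Y: (n, T) targets, one column per time point.
    Minimizes ||XW - Y||^2 + lam_ridge*||W||^2 + lam_smooth*||W R||^2,
    where R is the column-difference operator encouraging w_t ~ w_{t+1}."""
    n, d = X.shape
    T = Y.shape[1]
    R = np.zeros((T, T - 1))
    R[np.arange(T - 1), np.arange(T - 1)] = 1.0
    R[np.arange(1, T), np.arange(T - 1)] = -1.0
    RRt = R @ R.T
    W = np.zeros((d, T))
    for _ in range(iters):
        grad = 2 * X.T @ (X @ W - Y) + 2 * lam_ridge * W + 2 * lam_smooth * W @ RRt
        W -= lr * grad / n                       # plain gradient step
    return W

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))
    # Slowly drifting true weights across 5 time points (synthetic data).
    w_true = rng.normal(size=(10, 1)) + 0.05 * np.cumsum(rng.normal(size=(10, 5)), axis=1)
    Y = X @ w_true + 0.1 * rng.normal(size=(200, 5))
    W = temporal_mtl(X, Y)
    print(round(float(np.linalg.norm(W - w_true)), 3))
```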
Contributors: Zhou, Jiayu (Author) / Ye, Jieping (Thesis advisor) / Mittelmann, Hans (Committee member) / Li, Baoxin (Committee member) / Wang, Yalin (Committee member) / Arizona State University (Publisher)
Created: 2014
Description
This thesis considers two problems in the control of robotic swarms. Firstly, it addresses a trajectory planning and task allocation problem for a swarm of resource-constrained robots that cannot localize or communicate with each other and that exhibit stochasticity in their motion and task switching policies. We model the population dynamics of the robotic swarm as a set of advection-diffusion-reaction (ADR) partial differential equations (PDEs).

Specifically, we consider a linear parabolic PDE model that is bilinear in the robots' velocity and task-switching rates. These parameters constitute a set of time-dependent control variables that can be optimized and transmitted to the robots prior to their deployment or broadcasted in real time. The planning and allocation problem can then be formulated as a PDE-constrained optimization problem, which we solve using techniques from optimal control. Simulations of a commercial pollination scenario validate the ability of our control approach to drive a robotic swarm to achieve predefined spatial distributions of activity over a closed domain, which may contain obstacles. Secondly, we consider a mapping problem wherein a robotic swarm is deployed over a closed domain and it is necessary to reconstruct the unknown spatial distribution of a feature of interest. The ADR-based primitives result in a coefficient identification problem for the corresponding system of PDEs. To deal with the inherent ill-posedness of the problem, we frame it as an optimization problem. We validate our approach through simulations and show that reconstruction of the spatially-dependent coefficient can be achieved with considerable accuracy using temporal information alone.
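As a rough illustration of the kind of ADR model described above (not the thesis' PDE-constrained optimization), the sketch below integrates a one-dimensional advection-diffusion-reaction equation in which a density of moving robots drifts, diffuses and switches to a stationary working state at a constant rate; all parameter values and the periodic boundary handling are illustrative assumptions.

```python
# A minimal finite-difference sketch of an advection-diffusion-reaction (ADR)
# model: a density of moving robots is advected with velocity v, diffuses with
# coefficient D, and switches to a "working" state at rate k. Illustrative only.
import numpy as np

def simulate_adr(nx=200, L=1.0, T=0.5, v=0.5, D=0.01, k=2.0):
    dx = L / nx
    dt = 0.2 * min(dx / abs(v), dx**2 / (2 * D))   # simple CFL-style time step
    x = np.arange(nx) * dx                         # periodic grid on [0, L)
    rho = np.exp(-((x - 0.2) ** 2) / 0.005)        # moving robots, initial blob
    work = np.zeros_like(rho)                      # robots that switched to working
    for _ in range(int(T / dt)):
        adv = -v * (rho - np.roll(rho, 1)) / dx                      # upwind advection
        diff = D * (np.roll(rho, -1) - 2 * rho + np.roll(rho, 1)) / dx**2
        rho, work = rho + dt * (adv + diff - k * rho), work + dt * k * rho
    return x, rho, work

if __name__ == "__main__":
    x, rho, work = simulate_adr()
    dx = x[1] - x[0]
    print(f"moving mass {rho.sum() * dx:.3f}, working mass {work.sum() * dx:.3f}")
```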
Contributors: Elamvazhuthi, Karthik (Author) / Berman, Spring Melody (Thesis advisor) / Peet, Matthew Monnig (Committee member) / Mittelmann, Hans (Committee member) / Arizona State University (Publisher)
Created: 2014
Description
A new method of adaptive mesh generation for the computation of fluid flows is investigated. The method utilizes gradients of the flow solution to adapt the size and stretching of elements or volumes in the computational mesh as is commonly done in the conventional Hessian approach. However, in the new method, higher-order gradients are used in place of the Hessian. The method is applied to the finite element solution of the incompressible Navier-Stokes equations on model problems. Results indicate that a significant efficiency benefit is realized.
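As a rough illustration of gradient-driven mesh adaptation in one dimension (not the thesis' higher-order scheme), the sketch below equidistributes mesh nodes with respect to a density built from a repeated numerical derivative of the current solution; the choice of the third derivative and the scaling are assumptions for illustration.

```python
# A minimal 1-D sketch of gradient-based mesh adaptation: node spacing is made
# small where a (higher-order) derivative of the current solution is large.
import numpy as np

def adapted_nodes(u, x, order=3, n_new=60, floor=1e-3):
    """Place n_new nodes so spacing is small where |d^order u / dx^order| is large."""
    du = u.copy()
    for _ in range(order):
        du = np.gradient(du, x)                   # repeated numerical differentiation
    density = np.abs(du) + floor                  # floor keeps smooth regions meshed
    cdf = np.cumsum(density)
    cdf = (cdf - cdf[0]) / (cdf[-1] - cdf[0])     # normalized cumulative density
    targets = np.linspace(0.0, 1.0, n_new)
    return np.interp(targets, cdf, x)             # equidistribute nodes w.r.t. density

if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 400)
    u = np.tanh(50 * (x - 0.5))                   # sharp internal layer at x = 0.5
    nodes = adapted_nodes(u, x)
    print(round(float(np.diff(nodes).min()), 5), round(float(np.diff(nodes).max()), 5))
```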
Contributors: Shortridge, Randall (Author) / Chen, Kang Ping (Thesis advisor) / Herrmann, Marcus (Thesis advisor) / Wells, Valana (Committee member) / Huang, Huei-Ping (Committee member) / Mittelmann, Hans (Committee member) / Arizona State University (Publisher)
Created: 2011
Description
This dissertation involves three problems that are all related by the use of the singular value decomposition (SVD) or generalized singular value decomposition (GSVD). The specific problems are (i) derivation of a generalized singular value expansion (GSVE), (ii) analysis of the properties of the chi-squared method for regularization parameter selection in the case of nonnormal data and (iii) formulation of a partial canonical correlation concept for continuous time stochastic processes. The finite dimensional SVD has an infinite dimensional generalization to compact operators. However, the form of the finite dimensional GSVD developed in, e.g., Van Loan does not extend directly to infinite dimensions as a result of a key step in the proof that is specific to the matrix case. Thus, the first problem of interest is to find an infinite dimensional version of the GSVD. One such GSVE for compact operators on separable Hilbert spaces is developed. The second problem concerns regularization parameter estimation. The chi-squared method for nonnormal data is considered. A form of the optimized regularization criterion that pertains to measured data or signals with nonnormal noise is derived. Large sample theory for phi-mixing processes is used to derive a central limit theorem for the chi-squared criterion that holds under certain conditions. Departures from normality are seen to manifest in the need for a possibly different scale factor in normalization rather than what would be used under the assumption of normality. The consequences of our large sample work are illustrated by empirical experiments. For the third problem, a new approach is examined for studying the relationships between a collection of functional random variables. The idea is based on the work of Sunder that provides mappings to connect the elements of algebraic and orthogonal direct sums of subspaces in a Hilbert space. When combined with a key isometry associated with a particular Hilbert space indexed stochastic process, this leads to a useful formulation for situations that involve the study of several second order processes. In particular, using our approach with two processes provides an independent derivation of the functional canonical correlation analysis (CCA) results of Eubank and Hsing. For more than two processes, a rigorous derivation of the functional partial canonical correlation analysis (PCCA) concept that applies to both finite and infinite dimensional settings is obtained.
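As a simplified illustration of the chi-squared idea for regularization parameter selection (the dissertation's analysis for nonnormal data is not reproduced here), the sketch below chooses the Tikhonov parameter so that the whitened Tikhonov functional at its minimizer equals the number of data points, its expected value under a chi-squared model; the SVD-based evaluation and the bracketing interval are assumptions for illustration.

```python
# A minimal sketch of a chi-squared-style choice of the Tikhonov parameter: pick
# lambda so that the whitened functional at its minimizer matches the number of
# data points. Simplified illustration only; not the nonnormal-data criterion.
import numpy as np
from scipy.optimize import brentq

def chi_squared_lambda(A, b, sigma):
    m, _ = A.shape
    U, s, _ = np.linalg.svd(A, full_matrices=True)
    beta = U.T @ b                                  # b in the left singular basis
    def functional(lam):
        f = s**2 / (s**2 + lam**2)                  # Tikhonov filter factors
        res2 = np.sum(((1 - f) * beta[:len(s)])**2) + np.sum(beta[len(s):]**2)
        reg2 = lam**2 * np.sum((f * beta[:len(s)] / s)**2)
        return (res2 + reg2) / sigma**2 - m         # zero when functional == E[chi^2_m]
    return brentq(functional, 1e-8, 1e4)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.normal(size=(100, 40)) / 10.0
    x_true = rng.normal(size=40)
    sigma = 0.05
    b = A @ x_true + sigma * rng.normal(size=100)
    print(round(chi_squared_lambda(A, b, sigma), 4))
```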
Contributors: Huang, Qing (Author) / Eubank, Randall (Thesis advisor) / Renaut, Rosemary (Thesis advisor) / Cochran, Douglas (Committee member) / Gelb, Anne (Committee member) / Young, Dennis (Committee member) / Arizona State University (Publisher)
Created: 2012
Description
Modern measurement schemes for linear dynamical systems are typically designed so that different sensors can be scheduled to be used at each time step. To determine which sensors to use, various metrics have been suggested. One possible such metric is the observability of the system. Observability is a binary condition determining whether a finite number of measurements suffice to recover the initial state. However, to employ observability for sensor scheduling, the binary definition needs to be expanded so that one can measure how observable a system is with a particular measurement scheme, i.e., one needs a metric of observability. Most methods utilizing an observability metric address sensor selection rather than sensor scheduling. In this dissertation we present a new approach that utilizes observability for sensor scheduling by employing the condition number of the observability matrix as the metric and using column subset selection to create an algorithm that chooses which sensors to use at each time step. To this end we use a rank-revealing QR factorization algorithm to select sensors. Several numerical experiments are used to demonstrate the performance of the proposed scheme.
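As a rough illustration of the scheduling idea described above (not the dissertation's algorithm), the sketch below ranks the candidate measurement directions at each time step with a pivoted, rank-revealing QR factorization and keeps the leading ones, then reports the condition number of the assembled observability matrix; the greedy per-step selection is an assumption for illustration.

```python
# A minimal sketch of sensor scheduling by column subset selection: at each time
# step the candidate measurement directions (rows of C A^t) are ranked with a
# pivoted QR factorization and the leading p are kept. Illustration only.
import numpy as np
from scipy.linalg import qr

def schedule_sensors(A, C, horizon, p):
    """Pick p of C's rows at each of `horizon` steps via pivoted QR."""
    n = A.shape[0]
    At = np.eye(n)
    chosen, blocks = [], []
    for _ in range(horizon):
        directions = C @ At                       # candidate rows c_i A^t
        # Pivoted QR on the transpose ranks the most independent directions first.
        _, _, piv = qr(directions.T, pivoting=True)
        sel = piv[:p]
        chosen.append(sorted(sel.tolist()))
        blocks.append(directions[sel])
        At = At @ A
    O = np.vstack(blocks)                         # scheduled observability matrix
    return chosen, np.linalg.cond(O)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.normal(size=(6, 6)) / 3.0
    C = rng.normal(size=(4, 6))                   # four candidate sensors
    sched, kappa = schedule_sensors(A, C, horizon=4, p=2)
    print(sched, round(kappa, 2))
```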
Contributors: Ilkturk, Utku (Author) / Gelb, Anne (Thesis advisor) / Platte, Rodrigo (Thesis advisor) / Cochran, Douglas (Committee member) / Renaut, Rosemary (Committee member) / Armbruster, Dieter (Committee member) / Arizona State University (Publisher)
Created: 2015
Description
The Kuramoto model is an archetypal model for studying synchronization in groups of nonidentical oscillators, where oscillators are imbued with their own frequency and coupled with other oscillators through a network of interactions. As the coupling strength increases, there is a bifurcation to complete synchronization where all oscillators move with the same frequency and show a collective rhythm. Kuramoto-like dynamics are considered a relevant model for instabilities of the AC power grid, which operates in synchrony under standard conditions but exhibits, in a state of failure, segmentation of the grid into desynchronized clusters. In this dissertation, the minimum coupling strength required to ensure total frequency synchronization in a Kuramoto system, called the critical coupling, is investigated. For coupling strengths below the critical coupling, clusters of oscillators form where oscillators within a cluster are on average oscillating with the same long-term frequency. A unified order-parameter-based approach is developed to create approximations of the critical coupling. Some of the new approximations provide strict lower bounds for the critical coupling. In addition, these approximations allow for predictions of the partially synchronized clusters that emerge in the bifurcation from the synchronized state. Merging the order-parameter approach with graph-theoretical concepts leads to a characterization of this bifurcation as a weighted graph partitioning problem on arbitrary networks, which then leads to an optimization problem that can efficiently estimate the partially synchronized clusters. Numerical experiments on random Kuramoto systems show the high accuracy of these methods. An interpretation of the methods in the context of power systems is provided.
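As a rough illustration of the dynamics and the order parameter discussed above (the critical-coupling approximations themselves are not reproduced), the sketch below integrates an all-to-all Kuramoto system with forward Euler and reports the order parameter for several coupling strengths; the oscillator count, frequency distribution and step sizes are illustrative assumptions.

```python
# A minimal sketch of the Kuramoto dynamics and the order parameter: N oscillators
# with random natural frequencies and all-to-all coupling of strength K.
import numpy as np

def order_parameter(theta):
    return np.abs(np.exp(1j * theta).mean())      # r = 1 means full phase coherence

def simulate_kuramoto(K, N=200, dt=0.01, steps=5000, seed=0):
    rng = np.random.default_rng(seed)
    omega = rng.normal(0.0, 1.0, N)               # natural frequencies
    theta = rng.uniform(0.0, 2 * np.pi, N)
    for _ in range(steps):
        # Mean-field form of the all-to-all coupling term:
        # K*|z|*sin(psi - theta_i) equals (K/N) * sum_j sin(theta_j - theta_i).
        z = np.exp(1j * theta).mean()
        theta += dt * (omega + K * np.abs(z) * np.sin(np.angle(z) - theta))
    return order_parameter(theta)

if __name__ == "__main__":
    for K in (0.5, 1.0, 1.6, 2.5):
        print(f"K = {K:.1f}  ->  r = {simulate_kuramoto(K):.2f}")
```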
Contributors: Gilg, Brady (Author) / Armbruster, Dieter (Thesis advisor) / Mittelmann, Hans (Committee member) / Scaglione, Anna (Committee member) / Strogatz, Steven (Committee member) / Welfert, Bruno (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
A specific species of the genus Geobacter exhibits useful electrical properties when processing a molecule often found in waste water. A team at ASU including Dr. Cèsar Torres and Dr. Sudeep Popat used that species to create a special type of solid oxide fuel cell we refer to as a microbial fuel cell. The possible chemical processes and properties of the reactions used by the Geobacter are investigated indirectly by taking measurements of the electrode-electrolyte interface of the microbial fuel cell using Electrochemical Impedance Spectroscopy to obtain the value of the fuel cell's complex impedance at specific frequencies. Investigation of the multiple polarization processes which give rise to the measured impedance values is difficult to do directly, and so examination of the distribution function of relaxation times (DRT) is considered instead. The DRT is related to the measured complex impedance values using a general, non-physical equivalent circuit model. That model is originally given in terms of a Fredholm integral equation with a non-square-integrable kernel, which makes the inverse problem of determining the DRT given the impedance measurements ill-posed. The original integral equation is rewritten in terms of new variables into an equation relating the complex impedance to the convolution of a function based upon the original integral kernel and a related but separate distribution function, which we call the convolutional distribution function. This new convolutional equation is solved by reducing the convolution to a pointwise product using the Fourier transform and then solving the inverse problem by pointwise division and application of a filter function (equivalent to regularization). The inverse Fourier transform is then taken to obtain the convolutional distribution function. In the literature, the convolutional distribution function is then examined and certain values of a specific, less general equivalent circuit model are calculated, from which aspects of the original chemical processes are derived. We attempted instead to determine the original DRT directly from the calculated convolutional distribution function. This method proved less useful in practice because certain values fixed at the time of the experiment meant the original DRT could only be recovered over a window that would not normally contain the desired information. This limits any attempt to extend the solution for the convolutional distribution function to the original DRT. Further research may determine a method for interpreting the convolutional distribution function without an equivalent circuit model, as is done with the regularization method used to solve directly for the original DRT.
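As a rough illustration of the deconvolution step described above (not the EIS analysis itself), the sketch below models a measured signal as the convolution of a known kernel with an unknown distribution, diagonalizes the convolution with the FFT, and stabilizes the pointwise division with a low-pass filter acting as regularization; the kernel, filter and noise level are illustrative assumptions.

```python
# A minimal sketch of regularized Fourier deconvolution: the convolution becomes a
# pointwise product in the frequency domain, and the division is stabilized with a
# Gaussian low-pass filter. Kernel and filter choices are illustrative only.
import numpy as np

def deconvolve(measured, kernel, cutoff=0.08):
    """Recover the distribution from measured ~ kernel * distribution."""
    n = len(measured)
    K = np.fft.fft(np.fft.ifftshift(kernel))      # kernel is given centered on the grid
    M = np.fft.fft(measured)
    freqs = np.fft.fftfreq(n)
    filt = np.exp(-(freqs / cutoff) ** 2)         # low-pass filter ~ regularization
    # Wiener-style stabilized division avoids amplifying noise where K ~ 0.
    D = M * np.conj(K) / (np.abs(K) ** 2 + 1e-6) * filt
    return np.real(np.fft.ifft(D))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.linspace(-10, 10, 512)
    true_dist = np.exp(-((x - 2) ** 2)) + 0.5 * np.exp(-((x + 3) ** 2) / 0.5)
    kernel = 1.0 / np.cosh(x)                     # broad, smooth convolution kernel
    measured = np.convolve(true_dist, kernel, mode="same")
    measured += 0.01 * rng.normal(size=len(measured))
    est = deconvolve(measured, kernel)
    print(round(float(true_dist.max()), 3), round(float(est.max()), 3))
```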
Contributors: Baker, Robert Simpson (Author) / Renaut, Rosemary (Thesis director) / Kostelich, Eric (Committee member) / School of Mathematical and Statistical Sciences (Contributor) / Barrett, The Honors College (Contributor)
Created: 2016-05
Description
Deconvolution of noisy data is an ill-posed problem and requires some form of regularization to stabilize its solution. Tikhonov regularization is the most common method used, but it depends on the choice of a regularization parameter λ, which must generally be estimated using one of several common methods. These methods can be computationally intensive, so I consider their behavior when only a portion of the sampled data is used. I show that the results of these methods converge as the sampling resolution increases, and use this to suggest a method of downsampling to estimate λ. I then present numerical results showing that this method can be feasible, and propose future avenues of inquiry.
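As a rough illustration of the downsampling idea (not the thesis' analysis), the sketch below evaluates the GCV criterion for the Tikhonov parameter on the full data and on a row-subsampled version of the same deblurring problem so the two estimates can be compared; the blur operator, noise level and λ grid are illustrative assumptions.

```python
# A minimal sketch: choose the Tikhonov parameter by GCV on the full data and on a
# downsampled subset of the rows, for comparison. Standard textbook GCV form.
import numpy as np

def gcv_lambda(A, b, lambdas):
    m = A.shape[0]
    U, s, _ = np.linalg.svd(A, full_matrices=True)
    beta = U.T @ b
    best_lam, best_val = None, np.inf
    for lam in lambdas:
        f = s**2 / (s**2 + lam**2)                          # Tikhonov filter factors
        res2 = np.sum(((1 - f) * beta[:len(s)])**2) + np.sum(beta[len(s):]**2)
        val = m * res2 / (m - np.sum(f))**2                 # GCV function
        if val < best_val:
            best_lam, best_val = lam, val
    return best_lam

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 400
    t = np.linspace(0, 1, n)
    A = np.exp(-((t[:, None] - t[None, :]) ** 2) / 0.001) / n   # Gaussian blur operator
    x_true = (t > 0.3).astype(float) - (t > 0.7).astype(float)  # box signal
    b = A @ x_true + 1e-3 * rng.normal(size=n)
    lambdas = np.logspace(-6, 0, 60)
    lam_full = gcv_lambda(A, b, lambdas)
    lam_down = gcv_lambda(A[::4], b[::4], lambdas)               # every 4th sample
    print(lam_full, lam_down)
```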
Contributors: Hansen, Jakob Kristian (Author) / Renaut, Rosemary (Thesis director) / Cochran, Douglas (Committee member) / Barrett, The Honors College (Contributor) / School of Music (Contributor) / Economics Program in CLAS (Contributor) / School of Mathematical and Statistical Sciences (Contributor)
Created: 2015-05