Batch mode active learning for multimedia pattern recognition

Description

The rapid advance of technology and the widespread availability of modern equipment have resulted in the generation of vast amounts of digital data (in the form of images, videos and text). This has expanded the possibility of solving real world problems using computational learning frameworks. However, while gathering a large amount of data is cheap and easy, annotating it with class labels is an expensive process in terms of time, labor and human expertise. This has paved the way for research in the field of active learning. Such algorithms automatically select the salient and exemplar instances from large quantities of unlabeled data and are effective in reducing the human labeling effort needed to induce classification models. To utilize the possible presence of multiple labeling agents, there have been attempts towards a batch mode form of active learning, where a batch of data instances is selected simultaneously for manual annotation. This dissertation is aimed at the development of novel batch mode active learning algorithms to reduce manual effort in training classification models in real world multimedia pattern recognition applications.
Four major contributions are proposed in this work: (i) a framework for dynamic batch mode active learning, where the batch size and the specific data instances to be queried are selected adaptively through a single formulation, based on the complexity of the data stream in question, (ii) a batch mode active learning strategy for fuzzy label classification problems, where there is an inherent imprecision and vagueness in the class label definitions, (iii) batch mode active learning algorithms based on convex relaxations of an NP-hard integer quadratic programming (IQP) problem, with guaranteed bounds on the solution quality, and (iv) an active matrix completion algorithm and its application to solve several variants of the active learning problem (transductive active learning, multi-label active learning, active feature acquisition and active learning for regression). These contributions are validated on the face recognition and facial expression recognition problems (which are commonly encountered in real world applications like robotics, security and assistive technology for the blind and the visually impaired) and also on collaborative filtering applications like movie recommendation.
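As a rough illustration of what selecting a batch of instances for annotation can look like, the sketch below ranks unlabeled instances by predictive entropy and returns the top few. It is a generic uncertainty-sampling baseline, not any of the dissertation's algorithms; the function names and the toy probabilities are assumptions made for the example.

```python
import math

def predictive_entropy(probs):
    """Shannon entropy of one class-probability vector (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_batch(class_probs, batch_size):
    """Return indices of the batch_size most uncertain unlabeled instances.

    class_probs: one probability vector per unlabeled instance, as produced
    by some current classifier (hypothetical here).
    """
    scores = [predictive_entropy(p) for p in class_probs]
    # Sort indices from most to least uncertain; ties keep their original order.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:batch_size]

# Toy pool of four unlabeled instances with two-class predictions.
pool = [[0.98, 0.02], [0.55, 0.45], [0.70, 0.30], [0.50, 0.50]]
batch = select_batch(pool, 2)  # indices to send to the human annotators
```

A plain ranking like this scores each instance independently, so it can pick near-duplicate instances for the same batch.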

Contributors: Chakraborty, Shayok (Author) / Panchanathan, Sethuraman (Thesis advisor) / Balasubramanian, Vineeth N. (Committee member) / Li, Baoxin (Committee member) / Mittelmann, Hans (Committee member) / Ye, Jieping (Committee member) / Arizona State University (Publisher)

Created: 2013

Optimized Hydrodynamics of Bio-Inspired Locomotive Swimmers

Description

This work aims to address the design optimization of bio-inspired locomotive devices in collective swimming by developing a computational methodology which combines surrogate-based optimization with high-fidelity fluid-structure interaction (FSI) simulations of thunniform swimmers. Three main phases highlight the contribution and novelty of the current work. The first phase includes the development and benchmarking of a constrained surrogate-based optimization algorithm which is appropriate to the current design problem. Additionally, new FSI techniques, such as a volume-conservation scheme, have been developed to enhance the accuracy and speed of the simulations. The second phase involves an investigation of the optimized hydrodynamics of a solitary accelerating self-propelled thunniform swimmer during start-up. The third phase extends the analysis to include the optimized hydrodynamics of accelerating swimmers in phalanx schools. Future work includes extending the analysis to the optimized hydrodynamics of steady-state and accelerating swimmers in a diamond-shaped school. The results of the first phase indicate that the proposed optimization algorithm maintains a competitive performance when compared to other gradient-based and gradient-free methods in dealing with expensive simulation-based black-box optimization problems with constraints. In addition, the proposed optimization algorithm is capable of ensuring strictly feasible candidates during the optimization procedure, which is a desirable property in applied engineering problems where design variables must remain feasible for simulations or experiments not to fail. The results of the second phase indicate that the optimized kinematic gait of a solitary accelerating swimmer generates the reverse Karman vortex street associated with high propulsive efficiency.
Moreover, the efficiency of sub-optimum modes in solitary swimming is found to increase with both the tail amplitude and the effective flapping length of the swimmer, and a new scaling law is proposed to capture these trends. Results of the third phase indicate that the optimal midline kinematics in accelerating phalanx schools resemble those of accelerating solitary swimmers. The optimal separation distance in a phalanx school is shown to be around 2L (where L is the swimmer's total length). Furthermore, separation distance is shown to have a stronger effect, ceteris paribus, on the propulsion efficiency of a school when compared to phase synchronization.

Contributors: Abouhussein, Ahmed (Author) / Peet, Yulia (Thesis advisor) / Adrian, Ronald (Committee member) / Kim, Jeonglae (Committee member) / Kasbaoui, Mohamed (Committee member) / Mittelmann, Hans (Committee member) / Arizona State University (Publisher)

Created: 2022

Multi-task learning and its applications to biomedical informatics

Description

In many fields, such as information retrieval, computer vision and biomedical informatics, one needs to build predictive models for a set of related machine learning tasks. Traditionally these tasks are treated independently and the inference is done separately for each task, which ignores important connections among the tasks. Multi-task learning aims at simultaneously building models for all tasks in order to improve the generalization performance, leveraging the inherent relatedness of these tasks. In this thesis, I first propose a clustered multi-task learning (CMTL) formulation, which simultaneously learns task models and performs task clustering. I provide theoretical analysis to establish the equivalence between the CMTL formulation and alternating structure optimization, which learns a shared low-dimensional hypothesis space for different tasks. Then I present two real-world biomedical informatics applications which can benefit from multi-task learning. In the first application, I study the disease progression problem and present multi-task learning formulations for disease progression. In the formulations, the prediction at each time point is a regression task, and multiple tasks at different time points are learned simultaneously, leveraging the temporal smoothness among the tasks. The proposed formulations have been tested extensively on predicting the progression of Alzheimer's disease, and experimental results demonstrate the effectiveness of the proposed models. In the second application, I present a novel data-driven framework for densifying electronic medical records (EMR) to overcome the sparsity problem in predictive modeling using EMR. The densification of each patient is a learning task, and the proposed algorithm simultaneously densifies all patients. As such, the densification of one patient leverages useful information from other patients.
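One common way to write down "a regression task per time point, coupled by temporal smoothness" is a least-squares objective with a penalty on the difference between consecutive task models. The form below is only a sketch under assumed notation and is not taken from the thesis: X is a feature matrix shared across the T time points, y_t is the target vector at time point t, w_t is the model for that time point, and \lambda \ge 0 is a hypothetical smoothness weight. The thesis formulations may include further terms.

```latex
\min_{w_1, \dots, w_T} \;
  \sum_{t=1}^{T} \lVert X w_t - y_t \rVert_2^2
  \;+\; \lambda \sum_{t=1}^{T-1} \lVert w_{t+1} - w_t \rVert_2^2
```

With \lambda = 0 the tasks decouple into T independent regressions; larger values of \lambda pull the models at neighboring time points toward each other.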

Contributors: Zhou, Jiayu (Author) / Ye, Jieping (Thesis advisor) / Mittelmann, Hans (Committee member) / Li, Baoxin (Committee member) / Wang, Yalin (Committee member) / Arizona State University (Publisher)

Created: 2014

Test-based falsification and conformance testing for cyber-physical systems

Description

In this dissertation, two problems are addressed in the verification and control of Cyber-Physical Systems (CPS):

1) Falsification: given a CPS, and a property of interest that the CPS must satisfy under all allowed operating conditions, does the CPS violate, i.e. falsify, the property?

2) Conformance testing: given a model of a CPS, and an implementation of that CPS on an embedded platform, how can we characterize the properties satisfied by the implementation, given the properties satisfied by the model?

Both problems arise in the context of Model-Based Design (MBD) of CPS: in MBD, the designers start from a set of formal requirements that the system-to-be-designed must satisfy.

A first model of the system is created.

Because it may not be possible to formally verify the CPS model against the requirements, falsification tries to verify whether the model satisfies the requirements by searching for behavior that violates them.

In the first part of this dissertation, I present improved methods for finding falsifying behaviors of CPS when properties are expressed in Metric Temporal Logic (MTL).

These methods leverage the notion of robust semantics of MTL formulae: if a falsifier exists, it is in the neighborhood of local minimizers of the robustness function.

The proposed algorithms compute descent directions of the robustness function in the space of initial conditions and input signals, and provably converge to local minima of the robustness function.

The initial model of the CPS is then iteratively refined by modeling previously ignored phenomena, adding more functionality, etc., with each refinement resulting in a new model.

Many of the refinements in the MBD process described above do not provide an a priori guaranteed relation between the successive models.

Thus, the second problem above arises: how to quantify the distance between two successive models M_n and M_{n+1}?

If M_n has been verified to satisfy the specification, can it be guaranteed that M_{n+1} also satisfies the same, or some closely related, specification?

This dissertation answers both questions for a general class of CPS, and properties expressed in MTL.
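To make the idea of a robustness function concrete, the sketch below evaluates the robustness of the simple property "the state always stays below a bound" on sampled traces and then searches a handful of initial conditions for the one with the lowest robustness. A negative value means the trace violates the property. The toy dynamics, the function names and the coarse candidate search are all assumptions for illustration; the dissertation's methods compute descent directions rather than enumerating candidates.

```python
def simulate(x0, steps=5):
    """Toy discrete-time system x[k+1] = 0.5 * x[k] + 1 (hypothetical model)."""
    trace = [float(x0)]
    for _ in range(steps):
        trace.append(0.5 * trace[-1] + 1.0)
    return trace

def robustness_always_below(trace, bound):
    """Robustness of 'always (x < bound)' on a finite trace: min over time of bound - x."""
    return min(bound - x for x in trace)

def falsify(candidates, bound):
    """Pick the initial condition whose trace has the lowest robustness."""
    best_x0 = min(
        candidates,
        key=lambda x0: robustness_always_below(simulate(x0), bound),
    )
    return best_x0, robustness_always_below(simulate(best_x0), bound)

# Search five initial conditions for a violation of 'always (x < 3.5)'.
x0, rob = falsify([0, 1, 2, 3, 4], bound=3.5)
falsified = rob < 0  # a negative robustness means the property is violated
```

In this toy run the search settles on the initial condition 4, whose trace starts above the bound, so the property is falsified.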

Contributors: Abbas, Houssam Y (Author) / Fainekos, Georgios (Thesis advisor) / Duman, Tolga (Thesis advisor) / Mittelmann, Hans (Committee member) / Tsakalis, Konstantinos (Committee member) / Arizona State University (Publisher)

Created: 2015
