Search Content

Batch mode active learning for multimedia pattern recognition

Description

The rapid escalation of technology and the widespread emergence of modern technological equipments have resulted in the generation of humongous amounts of digital data (in the form of images, videos and text). This has expanded the possibility of solving real world problems using computational learning frameworks. However, while gathering a…

The rapid escalation of technology and the widespread emergence of modern technological equipments have resulted in the generation of humongous amounts of digital data (in the form of images, videos and text). This has expanded the possibility of solving real world problems using computational learning frameworks. However, while gathering a large amount of data is cheap and easy, annotating them with class labels is an expensive process in terms of time, labor and human expertise. This has paved the way for research in the field of active learning. Such algorithms automatically select the salient and exemplar instances from large quantities of unlabeled data and are effective in reducing human labeling effort in inducing classification models. To utilize the possible presence of multiple labeling agents, there have been attempts towards a batch mode form of active learning, where a batch of data instances is selected simultaneously for manual annotation. This dissertation is aimed at the development of novel batch mode active learning algorithms to reduce manual effort in training classification models in real world multimedia pattern recognition applications. Four major contributions are proposed in this work: $(i)$ a framework for dynamic batch mode active learning, where the batch size and the specific data instances to be queried are selected adaptively through a single formulation, based on the complexity of the data stream in question, $(ii)$ a batch mode active learning strategy for fuzzy label classification problems, where there is an inherent imprecision and vagueness in the class label definitions, $(iii)$ batch mode active learning algorithms based on convex relaxations of an NP-hard integer quadratic programming (IQP) problem, with guaranteed bounds on the solution quality and $(iv)$ an active matrix completion algorithm and its application to solve several variants of the active learning problem (transductive active learning, multi-label active learning, active feature acquisition and active learning for regression). These contributions are validated on the face recognition and facial expression recognition problems (which are commonly encountered in real world applications like robotics, security and assistive technology for the blind and the visually impaired) and also on collaborative filtering applications like movie recommendation.

ContributorsChakraborty, Shayok (Author) / Panchanathan, Sethuraman (Thesis advisor) / Balasubramanian, Vineeth N. (Committee member) / Li, Baoxin (Committee member) / Mittelmann, Hans (Committee member) / Ye, Jieping (Committee member) / Arizona State University (Publisher)

Created2013

Re-sonification of objects, events, and environments

Description

Digital sound synthesis allows the creation of a great variety of sounds. Focusing on interesting or ecologically valid sounds for music, simulation, aesthetics, or other purposes limits the otherwise vast digital audio palette. Tools for creating such sounds vary from arbitrary methods of altering recordings to precise simulations of vibrating…

Digital sound synthesis allows the creation of a great variety of sounds. Focusing on interesting or ecologically valid sounds for music, simulation, aesthetics, or other purposes limits the otherwise vast digital audio palette. Tools for creating such sounds vary from arbitrary methods of altering recordings to precise simulations of vibrating objects. In this work, methods of sound synthesis by re-sonification are considered. Re-sonification, herein, refers to the general process of analyzing, possibly transforming, and resynthesizing or reusing recorded sounds in meaningful ways, to convey information. Applied to soundscapes, re-sonification is presented as a means of conveying activity within an environment. Applied to the sounds of objects, this work examines modeling the perception of objects as well as their physical properties and the ability to simulate interactive events with such objects. To create soundscapes to re-sonify geographic environments, a method of automated soundscape design is presented. Using recorded sounds that are classified based on acoustic, social, semantic, and geographic information, this method produces stochastically generated soundscapes to re-sonify selected geographic areas. Drawing on prior knowledge, local sounds and those deemed similar comprise a locale's soundscape. In the context of re-sonifying events, this work examines processes for modeling and estimating the excitations of sounding objects. These include plucking, striking, rubbing, and any interaction that imparts energy into a system, affecting the resultant sound. A method of estimating a linear system's input, constrained to a signal-subspace, is presented and applied toward improving the estimation of percussive excitations for re-sonification. To work toward robust recording-based modeling and re-sonification of objects, new implementations of banded waveguide (BWG) models are proposed for object modeling and sound synthesis. Previous implementations of BWGs use arbitrary model parameters and may produce a range of simulations that do not match digital waveguide or modal models of the same design. Subject to linear excitations, some models proposed here behave identically to other equivalently designed physical models. Under nonlinear interactions, such as bowing, many of the proposed implementations exhibit improvements in the attack characteristics of synthesized sounds.

ContributorsFink, Alex M (Author) / Spanias, Andreas S (Thesis advisor) / Cook, Perry R. (Committee member) / Turaga, Pavan (Committee member) / Tsakalis, Konstantinos (Committee member) / Arizona State University (Publisher)

Created2013

Modeling and control for microgrids

Description

Traditional approaches to modeling microgrids include the behavior of each inverter operating in a particular network configuration and at a particular operating point. Such models quickly become computationally intensive for large systems. Similarly, traditional approaches to control do not use advanced methodologies and suffer from poor performance and limited operating…

Traditional approaches to modeling microgrids include the behavior of each inverter operating in a particular network configuration and at a particular operating point. Such models quickly become computationally intensive for large systems. Similarly, traditional approaches to control do not use advanced methodologies and suffer from poor performance and limited operating range. In this document a linear model is derived for an inverter connected to the Thevenin equivalent of a microgrid. This model is then compared to a nonlinear simulation model and analyzed using the open and closed loop systems in both the time and frequency domains. The modeling error is quantified with emphasis on its use for controller design purposes. Control design examples are given using a Glover McFarlane controller, gain sched- uled Glover McFarlane controller, and bumpless transfer controller which are compared to the standard droop control approach. These examples serve as a guide to illustrate the use of multi-variable modeling techniques in the context of robust controller design and show that gain scheduled MIMO control techniques can extend the operating range of a microgrid. A hardware implementation is used to compare constant gain droop controllers with Glover McFarlane controllers and shows a clear advantage of the Glover McFarlane approach.

ContributorsSteenis, Joel (Author) / Ayyanar, Raja (Thesis advisor) / Mittelmann, Hans (Committee member) / Tsakalis, Konstantinos (Committee member) / Tylavsky, Daniel (Committee member) / Arizona State University (Publisher)

Created2013

Multipath mitigating correlation kernels

Description

Autonomous vehicle control systems utilize real-time kinematic Global Navigation Satellite Systems (GNSS) receivers to provide a position within two-centimeter of truth. GNSS receivers utilize the satellite signal time of arrival estimates to solve for position; and multipath corrupts the time of arrival estimates with a time-varying bias. Time of arrival…

Autonomous vehicle control systems utilize real-time kinematic Global Navigation Satellite Systems (GNSS) receivers to provide a position within two-centimeter of truth. GNSS receivers utilize the satellite signal time of arrival estimates to solve for position; and multipath corrupts the time of arrival estimates with a time-varying bias. Time of arrival estimates are based upon accurate direct sequence spread spectrum (DSSS) code and carrier phase tracking. Current multipath mitigating GNSS solutions include fixed radiation pattern antennas and windowed delay-lock loop code phase discriminators. A new multipath mitigating code tracking algorithm is introduced that utilizes a non-symmetric correlation kernel to reject multipath. Independent parameters provide a means to trade-off code tracking discriminant gain against multipath mitigation performance. The algorithm performance is characterized in terms of multipath phase error bias, phase error estimation variance, tracking range, tracking ambiguity and implementation complexity. The algorithm is suitable for modernized GNSS signals including Binary Phase Shift Keyed (BPSK) and a variety of Binary Offset Keyed (BOC) signals. The algorithm compensates for unbalanced code sequences to ensure a code tracking bias does not result from the use of asymmetric correlation kernels. The algorithm does not require explicit knowledge of the propagation channel model. Design recommendations for selecting the algorithm parameters to mitigate precorrelation filter distortion are also provided.

ContributorsMiller, Steven (Author) / Spanias, Andreas (Thesis advisor) / Tepedelenlioğlu, Cihan (Committee member) / Tsakalis, Konstantinos (Committee member) / Zhang, Junshan (Committee member) / Arizona State University (Publisher)

Created2013

The influence of dome size, parent vessel angle, and coil packing density on coil embolization treatment in cerebral aneurysms

Description

A cerebral aneurysm is a bulging of a blood vessel in the brain. Aneurysmal rupture affects 25,000 people each year and is associated with a 45% mortality rate. Therefore, it is critically important to treat cerebral aneurysms effectively before they rupture. Endovascular coiling is the most effective treatment for cerebral…

A cerebral aneurysm is a bulging of a blood vessel in the brain. Aneurysmal rupture affects 25,000 people each year and is associated with a 45% mortality rate. Therefore, it is critically important to treat cerebral aneurysms effectively before they rupture. Endovascular coiling is the most effective treatment for cerebral aneurysms. During coiling process, series of metallic coils are deployed into the aneurysmal sack with the intent of reaching a sufficient packing density (PD). Coils packing can facilitate thrombus formation and help seal off the aneurysm from circulation over time. While coiling is effective, high rates of treatment failure have been associated with basilar tip aneurysms (BTAs). Treatment failure may be related to geometrical features of the aneurysm. The purpose of this study was to investigate the influence of dome size, parent vessel (PV) angle, and PD on post-treatment aneurysmal hemodynamics using both computational fluid dynamics (CFD) and particle image velocimetry (PIV). Flows in four idealized BTA models with a combination of dome sizes and two different PV angles were simulated using CFD and then validated against PIV data. Percent reductions in post-treatment aneurysmal velocity and cross-neck (CN) flow as well as percent coverage of low wall shear stress (WSS) area were analyzed. In all models, aneurysmal velocity and CN flow decreased after coiling, while low WSS area increased. However, with increasing PD, further reductions were observed in aneurysmal velocity and CN flow, but minimal changes were observed in low WSS area. Overall, coil PD had the greatest impact while dome size has greater impact than PV angle on aneurysmal hemodynamics. These findings lead to a conclusion that combinations of treatment goals and geometric factor may play key roles in coil embolization treatment outcomes, and support that different treatment timing may be a critical factor in treatment optimization.

ContributorsIndahlastari, Aprinda (Author) / Frakes, David (Thesis advisor) / Chong, Brian (Committee member) / Muthuswamy, Jitendran (Committee member) / Arizona State University (Publisher)

Created2013

Building adaptive computational systems for physiological and biomedical data

Description

In recent years, machine learning and data mining technologies have received growing attention in several areas such as recommendation systems, natural language processing, speech and handwriting recognition, image processing and biomedical domain. Many of these applications which deal with physiological and biomedical data require person specific or person adaptive systems.…

In recent years, machine learning and data mining technologies have received growing attention in several areas such as recommendation systems, natural language processing, speech and handwriting recognition, image processing and biomedical domain. Many of these applications which deal with physiological and biomedical data require person specific or person adaptive systems. The greatest challenge in developing such systems is the subject-dependent data variations or subject-based variability in physiological and biomedical data, which leads to difference in data distributions making the task of modeling these data, using traditional machine learning algorithms, complex and challenging. As a result, despite the wide application of machine learning, efficient deployment of its principles to model real-world data is still a challenge. This dissertation addresses the problem of subject based variability in physiological and biomedical data and proposes person adaptive prediction models based on novel transfer and active learning algorithms, an emerging field in machine learning. One of the significant contributions of this dissertation is a person adaptive method, for early detection of muscle fatigue using Surface Electromyogram signals, based on a new multi-source transfer learning algorithm. This dissertation also proposes a subject-independent algorithm for grading the progression of muscle fatigue from 0 to 1 level in a test subject, during isometric or dynamic contractions, at real-time. Besides subject based variability, biomedical image data also varies due to variations in their imaging techniques, leading to distribution differences between the image databases. Hence a classifier learned on one database may perform poorly on the other database. Another significant contribution of this dissertation has been the design and development of an efficient biomedical image data annotation framework, based on a novel combination of transfer learning and a new batch-mode active learning method, capable of addressing the distribution differences across databases. The methodologies developed in this dissertation are relevant and applicable to a large set of computing problems where there is a high variation of data between subjects or sources, such as face detection, pose detection and speech recognition. From a broader perspective, these frameworks can be viewed as a first step towards design of automated adaptive systems for real world data.

ContributorsChattopadhyay, Rita (Author) / Panchanathan, Sethuraman (Thesis advisor) / Ye, Jieping (Thesis advisor) / Li, Baoxin (Committee member) / Santello, Marco (Committee member) / Arizona State University (Publisher)

Created2013

Robust implementation of NL2KR system and it's application in iRODS domain

Description

Currently, to interact with computer based systems one needs to learn the specific interface language of that system. In most cases, interaction would be much easier if it could be done in natural language. For that, we will need a module which understands natural language and automatically translates it to…

Currently, to interact with computer based systems one needs to learn the specific interface language of that system. In most cases, interaction would be much easier if it could be done in natural language. For that, we will need a module which understands natural language and automatically translates it to the interface language of the system. NL2KR (Natural language to knowledge representation) v.1 system is a prototype of such a system. It is a learning based system that learns new meanings of words in terms of lambda-calculus formulas given an initial lexicon of some words and their meanings and a training corpus of sentences with their translations. As a part of this thesis, we take the prototype NL2KR v.1 system and enhance various components of it to make it usable for somewhat substantial and useful interface languages. We revamped the lexicon learning components, Inverse-lambda and Generalization modules, and redesigned the lexicon learning algorithm which uses these components to learn new meanings of words. Similarly, we re-developed an inbuilt parser of the system in Answer Set Programming (ASP) and also integrated external parser with the system. Apart from this, we added some new rich features like various system configurations and memory cache in the learning component of the NL2KR system. These enhancements helped in learning more meanings of the words, boosted performance of the system by reducing the computation time by a factor of 8 and improved the usability of the system. We evaluated the NL2KR system on iRODS domain. iRODS is a rule-oriented data system, which helps in managing large set of computer files using policies. This system provides a Rule-Oriented interface langauge whose syntactic structure is like any procedural programming language (eg. C). However, direct translation of natural language (NL) to this interface language is difficult. So, for automatic translation of NL to this language, we define a simple intermediate Policy Declarative Language (IPDL) to represent the knowledge in the policies, which then can be directly translated to iRODS rules. We develop a corpus of 100 policy statements and manually translate them to IPDL langauge. This corpus is then used for the evaluation of NL2KR system. We performed 10 fold cross validation on the system. Furthermore, using this corpus, we illustrate how different components of our NL2KR system work.

ContributorsKumbhare, Kanchan Ravishankar (Author) / Baral, Chitta (Thesis advisor) / Ye, Jieping (Committee member) / Li, Baoxin (Committee member) / Arizona State University (Publisher)

Created2013

EMG-based robot control interfaces: beyond decoding

Description

Electromyogram (EMG)-based control interfaces are increasingly used in robot teleoperation, prosthetic devices control and also in controlling robotic exoskeletons. Over the last two decades researchers have come up with a plethora of decoding functions to map myoelectric signals to robot motions. However, this requires a lot of training and validation…

Electromyogram (EMG)-based control interfaces are increasingly used in robot teleoperation, prosthetic devices control and also in controlling robotic exoskeletons. Over the last two decades researchers have come up with a plethora of decoding functions to map myoelectric signals to robot motions. However, this requires a lot of training and validation data sets, while the parameters of the decoding function are specific for each subject. In this thesis we propose a new methodology that doesn't require training and is not user-specific. The main idea is to supplement the decoding functional error with the human ability to learn inverse model of an arbitrary mapping function. We have shown that the subjects gradually learned the control strategy and their learning rates improved. We also worked on identifying an optimized control scheme that would be even more effective and easy to learn for the subjects. Optimization was done by taking into account that muscles act in synergies while performing a motion task. The low-dimensional representation of the neural activity was used to control a two-dimensional task. Results showed that in the case of reduced dimensionality mapping, the subjects were able to learn to control the device in a slower pace, however they were able to reach and retain the same level of controllability. To summarize, we were able to build an EMG-based controller for robot devices that would work for any subject, without any training or decoding function, suggesting human-embedded controllers for robotic devices.

ContributorsAntuvan, Chris Wilson (Author) / Artemiadis, Panagiotis (Thesis advisor) / Muthuswamy, Jitendran (Committee member) / Santos, Veronica J (Committee member) / Arizona State University (Publisher)

Created2013

Undergraduate signal processing laboratories on the Android platform

Description

The field of education has been immensely benefited by major breakthroughs in technology. The arrival of computers and the internet made student-teacher interaction from different parts of the world viable, increasing the reach of the educator to hitherto remote corners of the world. The arrival of mobile phones in the…

The field of education has been immensely benefited by major breakthroughs in technology. The arrival of computers and the internet made student-teacher interaction from different parts of the world viable, increasing the reach of the educator to hitherto remote corners of the world. The arrival of mobile phones in the recent past has the potential to provide the next paradigm shift in the way education is conducted. It combines the universal reach and powerful visualization capabilities of the computer with intimacy and portability. Engineering education is a field which can exploit the benefits of mobile devices to enhance learning and spread essential technical know-how to different parts of the world. In this thesis, I present AJDSP, an Android application evolved from JDSP, providing an intuitive and a easy to use environment for signal processing education. AJDSP is a graphical programming laboratory for digital signal processing developed for the Android platform. It is designed to provide utility; both as a supplement to traditional classroom learning and as a tool for self-learning. The architecture of AJDSP is based on the Model-View-Controller paradigm optimized for the Android platform. The extensive set of function modules cover a wide range of basic signal processing areas such as convolution, fast Fourier transform, z transform and filter design. The simple and intuitive user interface inspired from iJDSP is designed to facilitate ease of navigation and to provide the user with an intimate learning environment. Rich visualizations necessary to understand mathematically intensive signal processing algorithms have been incorporated into the software. Interactive demonstrations boosting student understanding of concepts like convolution and the relation between different signal domains have also been developed. A set of detailed assessments to evaluate the application has been conducted for graduate and senior-level undergraduate students.

ContributorsRanganath, Suhas (Author) / Spanias, Andreas (Thesis advisor) / Tepedelenlioğlu, Cihan (Committee member) / Tsakalis, Konstantinos (Committee member) / Arizona State University (Publisher)

Created2013

Classifying everyday activity through label propagation with sparse training data

Description

We solve the problem of activity verification in the context of sustainability. Activity verification is the process of proving the user assertions pertaining to a certain activity performed by the user. Our motivation lies in incentivizing the user for engaging in sustainable activities like taking public transport or recycling. Such…

We solve the problem of activity verification in the context of sustainability. Activity verification is the process of proving the user assertions pertaining to a certain activity performed by the user. Our motivation lies in incentivizing the user for engaging in sustainable activities like taking public transport or recycling. Such incentivization schemes require the system to verify the claim made by the user. The system verifies these claims by analyzing the supporting evidence captured by the user while performing the activity. The proliferation of portable smart-phones in the past few years has provided us with a ubiquitous and relatively cheap platform, having multiple sensors like accelerometer, gyroscope, microphone etc. to capture this evidence data in-situ. In this research, we investigate the supervised and semi-supervised learning techniques for activity verification. Both these techniques make use the data set constructed using the evidence submitted by the user. Supervised learning makes use of annotated evidence data to build a function to predict the class labels of the unlabeled data points. The evidence data captured can be either unimodal or multimodal in nature. We use the accelerometer data as evidence for transportation mode verification and image data as evidence for recycling verification. After training the system, we achieve maximum accuracy of 94% when classifying the transport mode and 81% when detecting recycle activity. In the case of recycle verification, we could improve the classification accuracy by asking the user for more evidence. We present some techniques to ask the user for the next best piece of evidence that maximizes the probability of classification. Using these techniques for detecting recycle activity, the accuracy increases to 93%. The major disadvantage of using supervised models is that it requires extensive annotated training data, which expensive to collect. Due to the limited training data, we look at the graph based inductive semi-supervised learning methods to propagate the labels among the unlabeled samples. In the semi-supervised approach, we represent each instance in the data set as a node in the graph. Since it is a complete graph, edges interconnect these nodes, with each edge having some weight representing the similarity between the points. We propagate the labels in this graph, based on the proximity of the data points to the labeled nodes. We estimate the performance of these algorithms by measuring how close the probability distribution of the data after label propagation is to the probability distribution of the ground truth data. Since labeling has a cost associated with it, in this thesis we propose two algorithms that help us in selecting minimum number of labeled points to propagate the labels accurately. Our proposed algorithm achieves a maximum of 73% increase in performance when compared to the baseline algorithm.

ContributorsDesai, Vaishnav (Author) / Sundaram, Hari (Thesis advisor) / Li, Baoxin (Thesis advisor) / Turaga, Pavan (Committee member) / Arizona State University (Publisher)

Created2013

Filtering by