Search Content

Batch mode active learning for multimedia pattern recognition

Description

The rapid escalation of technology and the widespread emergence of modern technological equipments have resulted in the generation of humongous amounts of digital data (in the form of images, videos and text). This has expanded the possibility of solving real world problems using computational learning frameworks. However, while gathering a…

The rapid escalation of technology and the widespread emergence of modern technological equipments have resulted in the generation of humongous amounts of digital data (in the form of images, videos and text). This has expanded the possibility of solving real world problems using computational learning frameworks. However, while gathering a large amount of data is cheap and easy, annotating them with class labels is an expensive process in terms of time, labor and human expertise. This has paved the way for research in the field of active learning. Such algorithms automatically select the salient and exemplar instances from large quantities of unlabeled data and are effective in reducing human labeling effort in inducing classification models. To utilize the possible presence of multiple labeling agents, there have been attempts towards a batch mode form of active learning, where a batch of data instances is selected simultaneously for manual annotation. This dissertation is aimed at the development of novel batch mode active learning algorithms to reduce manual effort in training classification models in real world multimedia pattern recognition applications. Four major contributions are proposed in this work: $(i)$ a framework for dynamic batch mode active learning, where the batch size and the specific data instances to be queried are selected adaptively through a single formulation, based on the complexity of the data stream in question, $(ii)$ a batch mode active learning strategy for fuzzy label classification problems, where there is an inherent imprecision and vagueness in the class label definitions, $(iii)$ batch mode active learning algorithms based on convex relaxations of an NP-hard integer quadratic programming (IQP) problem, with guaranteed bounds on the solution quality and $(iv)$ an active matrix completion algorithm and its application to solve several variants of the active learning problem (transductive active learning, multi-label active learning, active feature acquisition and active learning for regression). These contributions are validated on the face recognition and facial expression recognition problems (which are commonly encountered in real world applications like robotics, security and assistive technology for the blind and the visually impaired) and also on collaborative filtering applications like movie recommendation.

ContributorsChakraborty, Shayok (Author) / Panchanathan, Sethuraman (Thesis advisor) / Balasubramanian, Vineeth N. (Committee member) / Li, Baoxin (Committee member) / Mittelmann, Hans (Committee member) / Ye, Jieping (Committee member) / Arizona State University (Publisher)

Created2013

Building adaptive computational systems for physiological and biomedical data

Description

In recent years, machine learning and data mining technologies have received growing attention in several areas such as recommendation systems, natural language processing, speech and handwriting recognition, image processing and biomedical domain. Many of these applications which deal with physiological and biomedical data require person specific or person adaptive systems.…

In recent years, machine learning and data mining technologies have received growing attention in several areas such as recommendation systems, natural language processing, speech and handwriting recognition, image processing and biomedical domain. Many of these applications which deal with physiological and biomedical data require person specific or person adaptive systems. The greatest challenge in developing such systems is the subject-dependent data variations or subject-based variability in physiological and biomedical data, which leads to difference in data distributions making the task of modeling these data, using traditional machine learning algorithms, complex and challenging. As a result, despite the wide application of machine learning, efficient deployment of its principles to model real-world data is still a challenge. This dissertation addresses the problem of subject based variability in physiological and biomedical data and proposes person adaptive prediction models based on novel transfer and active learning algorithms, an emerging field in machine learning. One of the significant contributions of this dissertation is a person adaptive method, for early detection of muscle fatigue using Surface Electromyogram signals, based on a new multi-source transfer learning algorithm. This dissertation also proposes a subject-independent algorithm for grading the progression of muscle fatigue from 0 to 1 level in a test subject, during isometric or dynamic contractions, at real-time. Besides subject based variability, biomedical image data also varies due to variations in their imaging techniques, leading to distribution differences between the image databases. Hence a classifier learned on one database may perform poorly on the other database. Another significant contribution of this dissertation has been the design and development of an efficient biomedical image data annotation framework, based on a novel combination of transfer learning and a new batch-mode active learning method, capable of addressing the distribution differences across databases. The methodologies developed in this dissertation are relevant and applicable to a large set of computing problems where there is a high variation of data between subjects or sources, such as face detection, pose detection and speech recognition. From a broader perspective, these frameworks can be viewed as a first step towards design of automated adaptive systems for real world data.

ContributorsChattopadhyay, Rita (Author) / Panchanathan, Sethuraman (Thesis advisor) / Ye, Jieping (Thesis advisor) / Li, Baoxin (Committee member) / Santello, Marco (Committee member) / Arizona State University (Publisher)

Created2013

Robust implementation of NL2KR system and it's application in iRODS domain

Description

Currently, to interact with computer based systems one needs to learn the specific interface language of that system. In most cases, interaction would be much easier if it could be done in natural language. For that, we will need a module which understands natural language and automatically translates it to…

Currently, to interact with computer based systems one needs to learn the specific interface language of that system. In most cases, interaction would be much easier if it could be done in natural language. For that, we will need a module which understands natural language and automatically translates it to the interface language of the system. NL2KR (Natural language to knowledge representation) v.1 system is a prototype of such a system. It is a learning based system that learns new meanings of words in terms of lambda-calculus formulas given an initial lexicon of some words and their meanings and a training corpus of sentences with their translations. As a part of this thesis, we take the prototype NL2KR v.1 system and enhance various components of it to make it usable for somewhat substantial and useful interface languages. We revamped the lexicon learning components, Inverse-lambda and Generalization modules, and redesigned the lexicon learning algorithm which uses these components to learn new meanings of words. Similarly, we re-developed an inbuilt parser of the system in Answer Set Programming (ASP) and also integrated external parser with the system. Apart from this, we added some new rich features like various system configurations and memory cache in the learning component of the NL2KR system. These enhancements helped in learning more meanings of the words, boosted performance of the system by reducing the computation time by a factor of 8 and improved the usability of the system. We evaluated the NL2KR system on iRODS domain. iRODS is a rule-oriented data system, which helps in managing large set of computer files using policies. This system provides a Rule-Oriented interface langauge whose syntactic structure is like any procedural programming language (eg. C). However, direct translation of natural language (NL) to this interface language is difficult. So, for automatic translation of NL to this language, we define a simple intermediate Policy Declarative Language (IPDL) to represent the knowledge in the policies, which then can be directly translated to iRODS rules. We develop a corpus of 100 policy statements and manually translate them to IPDL langauge. This corpus is then used for the evaluation of NL2KR system. We performed 10 fold cross validation on the system. Furthermore, using this corpus, we illustrate how different components of our NL2KR system work.

ContributorsKumbhare, Kanchan Ravishankar (Author) / Baral, Chitta (Thesis advisor) / Ye, Jieping (Committee member) / Li, Baoxin (Committee member) / Arizona State University (Publisher)

Created2013

Texture structure analysis

Description

Texture analysis plays an important role in applications like automated pattern inspection, image and video compression, content-based image retrieval, remote-sensing, medical imaging and document processing, to name a few. Texture Structure Analysis is the process of studying the structure present in the textures. This structure can be expressed in terms…

Texture analysis plays an important role in applications like automated pattern inspection, image and video compression, content-based image retrieval, remote-sensing, medical imaging and document processing, to name a few. Texture Structure Analysis is the process of studying the structure present in the textures. This structure can be expressed in terms of perceived regularity. Our human visual system (HVS) uses the perceived regularity as one of the important pre-attentive cues in low-level image understanding. Similar to the HVS, image processing and computer vision systems can make fast and efficient decisions if they can quantify this regularity automatically. In this work, the problem of quantifying the degree of perceived regularity when looking at an arbitrary texture is introduced and addressed. One key contribution of this work is in proposing an objective no-reference perceptual texture regularity metric based on visual saliency. Other key contributions include an adaptive texture synthesis method based on texture regularity, and a low-complexity reduced-reference visual quality metric for assessing the quality of synthesized textures. In order to use the best performing visual attention model on textures, the performance of the most popular visual attention models to predict the visual saliency on textures is evaluated. Since there is no publicly available database with ground-truth saliency maps on images with exclusive texture content, a new eye-tracking database is systematically built. Using the Visual Saliency Map (VSM) generated by the best visual attention model, the proposed texture regularity metric is computed. The proposed metric is based on the observation that VSM characteristics differ between textures of differing regularity. The proposed texture regularity metric is based on two texture regularity scores, namely a textural similarity score and a spatial distribution score. In order to evaluate the performance of the proposed regularity metric, a texture regularity database called RegTEX, is built as a part of this work. It is shown through subjective testing that the proposed metric has a strong correlation with the Mean Opinion Score (MOS) for the perceived regularity of textures. The proposed method is also shown to be robust to geometric and photometric transformations and outperforms some of the popular texture regularity metrics in predicting the perceived regularity. The impact of the proposed metric to improve the performance of many image-processing applications is also presented. The influence of the perceived texture regularity on the perceptual quality of synthesized textures is demonstrated through building a synthesized textures database named SynTEX. It is shown through subjective testing that textures with different degrees of perceived regularities exhibit different degrees of vulnerability to artifacts resulting from different texture synthesis approaches. This work also proposes an algorithm for adaptively selecting the appropriate texture synthesis method based on the perceived regularity of the original texture. A reduced-reference texture quality metric for texture synthesis is also proposed as part of this work. The metric is based on the change in perceived regularity and the change in perceived granularity between the original and the synthesized textures. The perceived granularity is quantified through a new granularity metric that is proposed in this work. It is shown through subjective testing that the proposed quality metric, using just 2 parameters, has a strong correlation with the MOS for the fidelity of synthesized textures and outperforms the state-of-the-art full-reference quality metrics on 3 different texture databases. Finally, the ability of the proposed regularity metric in predicting the perceived degradation of textures due to compression and blur artifacts is also established.

ContributorsVaradarajan, Srenivas (Author) / Karam, Lina J (Thesis advisor) / Chakrabarti, Chaitali (Committee member) / Li, Baoxin (Committee member) / Tepedelenlioğlu, Cihan (Committee member) / Arizona State University (Publisher)

Created2014

Emerging neural coincidences in rats agranular medial and agranular lateral cortices during learning of a directional choice task

Description

To uncover the neural correlates to go-directed behavior, single unit action potentials are considered fundamental computing units and have been examined by different analytical methodologies under a broad set of hypotheses. Using a behaving rat performing a directional choice learning task, we aim to study changes in rat's cortical neural…

To uncover the neural correlates to go-directed behavior, single unit action potentials are considered fundamental computing units and have been examined by different analytical methodologies under a broad set of hypotheses. Using a behaving rat performing a directional choice learning task, we aim to study changes in rat's cortical neural patterns while he improved his task performance accuracy from chance to 80% or higher. Specifically, simultaneous multi-channel single unit neural recordings from the rat's agranular medial (AGm) and Agranular lateral (AGl) cortices were analyzed using joint peristimulus time histogram (JPSTHs), which effectively unveils firing coincidences in neural action potentials. My results based on data from six rats revealed that coincidences of pair-wise neural action potentials are higher when rats were performing the task than they were not at the learning stage, and this trend abated after the rats learned the task. Another finding is that the coincidences at the learning stage are stronger than that when the rats learned the task especially when they were performing the task. Therefore, this coincidence measure is the highest when the rats were performing the task at the learning stage. This may suggest that neural coincidences play a role in the coordination and communication among populations of neurons engaged in a purposeful act. Additionally, attention and working memory may have contributed to the modulation of neural coincidences during the designed task.

ContributorsCheng, Bing (Author) / Si, Jennie (Thesis advisor) / Chae, Junseok (Committee member) / Seo, Jae-Sun (Committee member) / Arizona State University (Publisher)

Created2014

The effects of a visual disability on marital relationships

Description

This qualitative study examines the major changes in relationship closeness of married couples when one spouse acquires a vision disability. Turning Points analysis and Retrospective Interview Technique (RIT) were utilized which required participants to plot their relational journey on a graph after the onset of the disability. A sample of…

This qualitative study examines the major changes in relationship closeness of married couples when one spouse acquires a vision disability. Turning Points analysis and Retrospective Interview Technique (RIT) were utilized which required participants to plot their relational journey on a graph after the onset of the disability. A sample of 32 participants generating 100 unique turning points and 32 RIT graphs lent in-depth insight into the less explored area of the impact of a visual disability on marital relationships. A constant comparison method employed for the analysis of these turning points revealed six major categories, which include Change in Relational Dynamics, Realization of the Disability, Regaining Normality in Life, Resilience, Reactions to Assistance, and Dealing with the Disability. These turning points differ in terms of their positive or negative impact on the relational closeness between partners. In addition, the 32 individual RIT graphs were also analyzed and were grouped into four categories based on visual similarity, which include Erratic Relational Restoration, Erratic Relational Increase, Consistent Closeness and Gradual Relational Increase. Results provide theoretical contributions to disability and marriage literature. Implications for the application of turning points to the study of post-disability marital relationships are also discussed, and research directions identified.

ContributorsBhagchandani, Bhoomika (Author) / Kassing, Jeffrey W. (Thesis advisor) / Kelley, Douglas L. (Committee member) / Fisher, Carla L. (Committee member) / Li, Baoxin (Committee member) / Arizona State University (Publisher)

Created2014

Integrated Distributed Power Management System for Photovoltaic

Description

Photovoltaic (PV) systems are affected by converter losses, partial shading and other mismatches in the panels. This dissertation introduces a sub-panel maximum power point tracking (MPPT) architecture together with an integrated CMOS current sensor circuit on a chip to reduce the mismatch effects, losses and increase the efficiency of the…

Photovoltaic (PV) systems are affected by converter losses, partial shading and other mismatches in the panels. This dissertation introduces a sub-panel maximum power point tracking (MPPT) architecture together with an integrated CMOS current sensor circuit on a chip to reduce the mismatch effects, losses and increase the efficiency of the PV system. The sub-panel MPPT increases the efficiency of the PV during the shading and replaces the bypass diodes in the panels with an integrated MPPT and DC-DC regulator. For the integrated MPPT and regulator, the research developed an integrated standard CMOS low power and high common mode range Current-to-Digital Converter (IDC) circuit and its application for DC-DC regulator and MPPT. The proposed charge based CMOS switched-capacitor circuit directly digitizes the output current of the DC-DC regulator without an analog-to-digital converter (ADC) and the need for high-voltage process technology. Compared to the resistor based current-sensing methods that requires current-to-voltage circuit, gain block and ADC, the proposed CMOS IDC is a low-power efficient integrated circuit that achieves high resolution, lower complexity, and lower power consumption. The IDC circuit is fabricated on a 0.7 um CMOS process, occupies 2mm x 2mm and consumes less than 27mW. The IDC circuit has been tested and used for boost DC-DC regulator and MPPT for photo-voltaic system. The DC-DC converter has an efficiency of 95%. The sub-module level power optimization improves the output power of a shaded panel by up to 20%, compared to panel MPPT with bypass diodes.

ContributorsMarti-Arbona, Edgar (Author) / Kiaei, Sayfe (Thesis advisor) / Bakkaloglu, Bertan (Committee member) / Kitchen, Jennifer (Committee member) / Seo, Jae-Sun (Committee member) / Arizona State University (Publisher)

Created2014

Multi-task learning and its applications to biomedical informatics

Description

In many fields one needs to build predictive models for a set of related machine learning tasks, such as information retrieval, computer vision and biomedical informatics. Traditionally these tasks are treated independently and the inference is done separately for each task, which ignores important connections among the tasks. Multi-task learning…

In many fields one needs to build predictive models for a set of related machine learning tasks, such as information retrieval, computer vision and biomedical informatics. Traditionally these tasks are treated independently and the inference is done separately for each task, which ignores important connections among the tasks. Multi-task learning aims at simultaneously building models for all tasks in order to improve the generalization performance, leveraging inherent relatedness of these tasks. In this thesis, I firstly propose a clustered multi-task learning (CMTL) formulation, which simultaneously learns task models and performs task clustering. I provide theoretical analysis to establish the equivalence between the CMTL formulation and the alternating structure optimization, which learns a shared low-dimensional hypothesis space for different tasks. Then I present two real-world biomedical informatics applications which can benefit from multi-task learning. In the first application, I study the disease progression problem and present multi-task learning formulations for disease progression. In the formulations, the prediction at each point is a regression task and multiple tasks at different time points are learned simultaneously, leveraging the temporal smoothness among the tasks. The proposed formulations have been tested extensively on predicting the progression of the Alzheimer's disease, and experimental results demonstrate the effectiveness of the proposed models. In the second application, I present a novel data-driven framework for densifying the electronic medical records (EMR) to overcome the sparsity problem in predictive modeling using EMR. The densification of each patient is a learning task, and the proposed algorithm simultaneously densify all patients. As such, the densification of one patient leverages useful information from other patients.

ContributorsZhou, Jiayu (Author) / Ye, Jieping (Thesis advisor) / Mittelmann, Hans (Committee member) / Li, Baoxin (Committee member) / Wang, Yalin (Committee member) / Arizona State University (Publisher)

Created2014

Automated place and route methodologies for multi-project test chips

Description

This work describes the development of automated flows to generate pad rings, mixed signal power grids, and mega cells in a multi-project test chip. There were three major design flows that were created to create the test chip. The first was the pad ring which was used as the staring…

This work describes the development of automated flows to generate pad rings, mixed signal power grids, and mega cells in a multi-project test chip. There were three major design flows that were created to create the test chip. The first was the pad ring which was used as the staring block for creating the test chip. This flow put all of the signals for the chip in the order that was wanted along the outside of the die along with creation of the power ring that is used to supply the chip with a robust power source.

The second flow that was created was used to put together a flash block that is based off of a XILIX XCFXXP. This flow was somewhat similar to how the pad ring flow worked except that optimizations and a clock tree was added into the flow. There was a couple of design redoes due to timing and orientation constraints.

Finally, the last flow that was created was the top level flow which is where all of the components are combined together to create a finished test chip ready for fabrication. The main components that were used were the finished flash block, HERMES, test structures, and a clock instance along with the pad ring flow for the creation of the pad ring and power ring.

Also discussed is some work that was done on a previous multi-project test chip. The work that was done was the creation of power gaters that were used like switches to turn the power on and off for some flash modules. To control the power gaters the functionality change of some pad drivers was done so that they output a higher voltage than what is seen in the core of the chip.

ContributorsLieb, Christopher (Author) / Clark, Lawrence (Thesis advisor) / Holbert, Keith E. (Committee member) / Seo, Jae-Sun (Committee member) / Arizona State University (Publisher)

Created2015

Probabilistic topic models for human emotion analysis

Description

While discrete emotions like joy, anger, disgust etc. are quite popular, continuous

emotion dimensions like arousal and valence are gaining popularity within the research

community due to an increase in the availability of datasets annotated with these

emotions. Unlike the discrete emotions, continuous emotions allow modeling of subtle

and complex affect dimensions but are…

While discrete emotions like joy, anger, disgust etc. are quite popular, continuous

emotion dimensions like arousal and valence are gaining popularity within the research

community due to an increase in the availability of datasets annotated with these

emotions. Unlike the discrete emotions, continuous emotions allow modeling of subtle

and complex affect dimensions but are difficult to predict.

Dimension reduction techniques form the core of emotion recognition systems and

help create a new feature space that is more helpful in predicting emotions. But these

techniques do not necessarily guarantee a better predictive capability as most of them

are unsupervised, especially in regression learning. In emotion recognition literature,

supervised dimension reduction techniques have not been explored much and in this

work a solution is provided through probabilistic topic models. Topic models provide

a strong probabilistic framework to embed new learning paradigms and modalities.

In this thesis, the graphical structure of Latent Dirichlet Allocation has been explored

and new models tuned to emotion recognition and change detection have been built.

In this work, it has been shown that the double mixture structure of topic models

helps 1) to visualize feature patterns, and 2) to project features onto a topic simplex

that is more predictive of human emotions, when compared to popular techniques

like PCA and KernelPCA. Traditionally, topic models have been used on quantized

features but in this work, a continuous topic model called the Dirichlet Gaussian

Mixture model has been proposed. Evaluation of DGMM has shown that while modeling

videos, performance of LDA models can be replicated even without quantizing

the features. Until now, topic models have not been explored in a supervised context

of video analysis and thus a Regularized supervised topic model (RSLDA) that

models video and audio features is introduced. RSLDA learning algorithm performs

both dimension reduction and regularized linear regression simultaneously, and has outperformed supervised dimension reduction techniques like SPCA and Correlation

based feature selection algorithms. In a first of its kind, two new topic models, Adaptive

temporal topic model (ATTM) and SLDA for change detection (SLDACD) have

been developed for predicting concept drift in time series data. These models do not

assume independence of consecutive frames and outperform traditional topic models

in detecting local and global changes respectively.

ContributorsLade, Prasanth (Author) / Panchanathan, Sethuraman (Thesis advisor) / Davulcu, Hasan (Committee member) / Li, Baoxin (Committee member) / Balasubramanian, Vineeth N (Committee member) / Arizona State University (Publisher)

Created2015

ASU Electronic Theses and Dissertations