
A study of boosting based transfer learning for activity and gesture recognition

Description

Real-world environments are characterized by non-stationary and continuously evolving data. Learning a classification model on this data would require a framework that is able to adapt itself to newer circumstances. Under such circumstances, transfer learning has come to be a dependable methodology for improving classification performance with reduced training costs and without the need for explicit relearning from scratch. In this thesis, a novel instance transfer technique that adapts a "Cost-sensitive" variation of AdaBoost is presented. The method capitalizes on the theoretical and functional properties of AdaBoost to selectively reuse outdated training instances obtained from a "source" domain to effectively classify unseen instances occurring in a different, but related, "target" domain. The algorithm is evaluated on real-world classification problems, namely accelerometer-based 3D gesture recognition, smart home activity recognition, and text categorization. The performance on these datasets is analyzed and evaluated against popular boosting-based instance transfer techniques. In addition, supporting empirical studies that investigate some of the less explored bottlenecks of boosting-based instance transfer methods are presented to assess the suitability and effectiveness of this form of knowledge transfer.

Contributors: Venkatesan, Ashok (Author) / Panchanathan, Sethuraman (Thesis advisor) / Li, Baoxin (Committee member) / Ye, Jieping (Committee member) / Arizona State University (Publisher)

Created: 2011

Conformal predictions in multimedia pattern recognition

Description

The fields of pattern recognition and machine learning are on a fundamental quest to design systems that can learn the way humans do. One important aspect of human intelligence that has so far not been given sufficient attention is the capability of humans to express when they are certain about a decision, or when they are not. Machine learning techniques today are not yet fully equipped to be trusted with this critical task. This work seeks to address this fundamental knowledge gap. Existing approaches that provide a measure of confidence on a prediction such as learning algorithms based on the Bayesian theory or the Probably Approximately Correct theory require strong assumptions or often produce results that are not practical or reliable. The recently developed Conformal Predictions (CP) framework - which is based on the principles of hypothesis testing, transductive inference and algorithmic randomness - provides a game-theoretic approach to the estimation of confidence with several desirable properties such as online calibration and generalizability to all classification and regression methods. This dissertation builds on the CP theory to compute reliable confidence measures that aid decision-making in real-world problems through: (i) Development of a methodology for learning a kernel function (or distance metric) for optimal and accurate conformal predictors; (ii) Validation of the calibration properties of the CP framework when applied to multi-classifier (or multi-regressor) fusion; and (iii) Development of a methodology to extend the CP framework to continuous learning, by using the framework for online active learning. These contributions are validated on four real-world problems from the domains of healthcare and assistive technologies: two classification-based applications (risk prediction in cardiac decision support and multimodal person recognition), and two regression-based applications (head pose estimation and saliency prediction in images). 
The results obtained show that: (i) multiple kernel learning can effectively increase efficiency in the CP framework; (ii) quantile p-value combination methods provide a viable solution for fusion in the CP framework; and (iii) eigendecomposition of p-value difference matrices can serve as effective measures for online active learning; demonstrating promise and potential in using these contributions in multimedia pattern recognition problems in real-world settings.

Contributors: Nallure Balasubramanian, Vineeth (Author) / Panchanathan, Sethuraman (Thesis advisor) / Ye, Jieping (Committee member) / Li, Baoxin (Committee member) / Vovk, Vladimir (Committee member) / Arizona State University (Publisher)

Created: 2010

Population Receptive Field Prediction with Convolutional Neural Networks

Description

The Population Receptive Field (pRF) model is widely used to predict the location (retinotopy) and size of receptive fields on the visual space. Doing so allows for the creation of a mapping from locations in the visual field to the associated groups of neurons in the cortical region (within the visual cortex of the brain). However, using the pRF model is very time consuming. Past research has focused on the creation of Convolutional Neural Networks (CNN) to mimic the pRF model in a fraction of the time, and they have worked well under highly controlled conditions. However, these models have not been thoroughly tested on real human data. This thesis focused on adapting one of these CNNs to accurately predict the retinotopy of a real human subject using a dataset from the Human Connectome Project. The results show promise towards creating a fully functioning CNN, but they also expose new challenges that must be overcome before the model could be used to predict the retinotopy of new human subjects.

Contributors: Burgard, Braeden (Author) / Wang, Yalin (Thesis director) / Ta, Duyan (Committee member) / Barrett, The Honors College (Contributor) / School of International Letters and Cultures (Contributor) / Computer Science and Engineering Program (Contributor) / School of Mathematical and Statistical Sciences (Contributor)

Created: 2022-05

Voltage Pulse Production for RRAM Crossbar Array ASIC for Machine Learning Applications

Description

Most machine learning algorithms, and specifically neural networks, utilize vector-matrix multiplication (VMM) to process information, but these calculations are CPU-intensive and can have long run-times. This issue is fundamentally outlined by the von Neumann bottleneck. Because of this undesirable expense associated with performing VMM via software, the exploration of new ways to perform the same calculations via hardware has grown more popular. When performed with hardware that is specialized to perform these calculations, VMM becomes far more power-efficient and less time-consuming. This project expands upon those principles and seeks to validate the use of RRAM in this hardware. The flexibility of the conductance of RRAM makes these devices a strong contender for hardware-driven VMM calculation for neural network computing. The conductance of these devices is affected by the pulse width of a voltage signal sent across the devices at each node. This pulse is produced on-chip and can be modified by user inputs. The design of this pulse-producing circuit, as well as the simulated and physical functionality of the design, is discussed in this Honors Thesis. Simulation and physical testing of the pulse-producing design on the ASIC have verified correct operation of the design. This operation is imperative to the future ability of the ASIC to perform accurate VMM.

Contributors: Pearson, Katherine (Author) / Barnaby, Hugh (Thesis director) / Wilson, Donald (Committee member) / Barrett, The Honors College (Contributor) / Electrical Engineering Program (Contributor) / School of International Letters and Cultures (Contributor)

Created: 2022-05

Using Machine Learning Classification Techniques to Predict Recessionary Periods in the U.S. Economy

Description

The goal of this research project is to determine how beneficial machine learning (ML) techniques can be in predicting recessions. Past work has utilized a multitude of classification methods, from Probit models to linear Support Vector Machines (SVMs), and obtained accuracies nearing 60-70%, where some models even predicted the Great Recession based on data from the previous 50 years. This paper builds on past work by starting with less complex classification techniques that are more broadly used in recession forecasting and ending with more complex ML models that produce higher accuracies than their more primitive counterparts. Many models were tested in this analysis, and the findings corroborate past work showing that the SVM methodology produces more accurate results than currently used probit models, and add that other ML models produced sufficient accuracy as well.

Contributors: Hogan, Carter (Author) / McCulloch, Robert (Thesis director) / Pereira, Claudiney (Committee member) / Barrett, The Honors College (Contributor) / School of International Letters and Cultures (Contributor) / Economics Program in CLAS (Contributor) / School of Mathematical and Statistical Sciences (Contributor)

Created: 2022-05

Building a Machine Learning Model to Predict Spring Wheat Crop Yield in Yuma, Arizona

Description

Machine learning (ML) has been on the rise in many fields, including agriculture. It is used for many things, including crop yield prediction, which is meant to help farmers decide when and what to grow based on the model. Many models have been built for various crops and areas of the world utilizing various sources of data. However, there is yet to exist a model designed to predict any crop’s yield in Yuma, Arizona, one of the premier places to grow crops in America. For this, I built a dataset from farm documentation that describes the actions taken before, during, and after a crop is grown. To supplement this data, ecological data such as temperature, heat units, soil type, and soil water holding capacity were also included. I used this dataset to train various regression models and found that the farm data was useful, but only when used in conjunction with the ecological data.

Contributors: Johnson, Nicholas (Author) / Kerner, Hannah (Thesis director) / Bandaru, Varaprasad (Committee member) / Barrett, The Honors College (Contributor) / School of International Letters and Cultures (Contributor) / Computer Science and Engineering Program (Contributor)

Created: 2024-05

Domain Adaptive Computational Models for Computer Vision

Description

The widespread adoption of computer vision models is often constrained by the issue of domain mismatch. Models that are trained with data belonging to one distribution perform poorly when tested with data from a different distribution. Variations in vision-based data can be attributed to the following reasons, viz., differences in image quality (resolution, brightness, occlusion and color), changes in camera perspective, dissimilar backgrounds and an inherent diversity of the samples themselves. Machine learning techniques like transfer learning are employed to adapt computational models across distributions. Domain adaptation is a special case of transfer learning, where knowledge from a source domain is transferred to a target domain in the form of learned models and efficient feature representations.

The dissertation outlines novel domain adaptation approaches across different feature spaces: (i) a linear Support Vector Machine model for domain alignment; (ii) a nonlinear kernel-based approach that embeds domain-aligned data for enhanced classification; (iii) a hierarchical model implemented using deep learning that estimates domain-aligned hash values for the source and target data; and (iv) a proposal for a feature selection technique to reduce cross-domain disparity. These adaptation procedures are tested and validated across a range of computer vision applications like object classification, facial expression recognition, digit recognition, and activity recognition. The dissertation also provides a unique perspective of domain adaptation literature from the point of view of linear, nonlinear and hierarchical feature spaces. The dissertation concludes with a discussion on the future directions for research that highlight the role of domain adaptation in an era of rapid advancements in artificial intelligence.

Contributors: Demakethepalli Venkateswara, Hemanth (Author) / Panchanathan, Sethuraman (Thesis advisor) / Li, Baoxin (Committee member) / Davulcu, Hasan (Committee member) / Ye, Jieping (Committee member) / Chakraborty, Shayok (Committee member) / Arizona State University (Publisher)

Created: 2017

Learning Transferable Data Representations Using Deep Generative Models

Description

Machine learning models convert raw data in the form of video, images, audio, text, etc. into feature representations that are convenient for computational processing. Deep neural networks have proven to be very efficient feature extractors for a variety of machine learning tasks. Generative models based on deep neural networks introduce constraints on the feature space to learn transferable and disentangled representations. Transferable feature representations help in training machine learning models that are robust across different distributions of data. For example, with the application of transferable features in domain adaptation, models trained on a source distribution can be applied to data from a target distribution even though the distributions may be different. In style transfer and image-to-image translation, disentangled representations allow for the separation of style and content when translating images.

This thesis examines learning transferable data representations in novel deep generative models. The Semi-Supervised Adversarial Translator (SAT) utilizes adversarial methods and cross-domain weight sharing in a neural network to extract transferable representations. These transferable interpretations can then be decoded into the original image or a similar image in another domain. The Explicit Disentangling Network (EDN) utilizes generative methods to disentangle images into their core attributes and then segments sets of related attributes. The EDN can separate these attributes by controlling the flow of information using a novel combination of losses and network architecture. This separation of attributes allows precise modifications to specific components of the data representation, boosting the performance of machine learning tasks. The effectiveness of these models is evaluated across domain adaptation, style transfer, and image-to-image translation tasks.

Contributors: Eusebio, Jose Miguel Ang (Author) / Panchanathan, Sethuraman (Thesis advisor) / Davulcu, Hasan (Committee member) / Venkateswara, Hemanth (Committee member) / Arizona State University (Publisher)

Created: 2018

Language Image Transformer

Description

Humans perceive the environment using multiple modalities like vision, speech (language), touch, taste, and smell. The knowledge obtained from one modality usually complements the other. Learning through several modalities helps in constructing an accurate model of the environment. Most of the current vision and language models are modality-specific and, in many cases, extensively use deep-learning-based attention mechanisms for learning powerful representations. This work discusses the role of attention in associating vision and language for generating shared representation. Language Image Transformer (LIT) is proposed for learning multi-modal representations of the environment. It uses a training objective based on Contrastive Predictive Coding (CPC) to maximize the Mutual Information (MI) between the visual and linguistic representations. It learns the relationship between the modalities using the proposed cross-modal attention layers. It is trained and evaluated using the captioning datasets MS COCO and Conceptual Captions. The results and the analysis offer a perspective on the use of Mutual Information Maximisation (MIM) for generating generalizable representations across multiple modalities.

Contributors: Ramakrishnan, Raghavendran (Author) / Panchanathan, Sethuraman (Thesis advisor) / Venkateswara, Hemanth Kumar (Thesis advisor) / McDaniel, Troy (Committee member) / Arizona State University (Publisher)

Created: 2020

Generalized Domain Adaptation for Visual Domains

Description

Humans have a great ability to recognize objects in different environments irrespective of their variations. However, the same does not apply to machine learning models, which are unable to generalize to images of objects from different domains. The generalization of these models to new data is constrained by the domain gap. Many factors such as image background, image resolution, color, camera perspective and variations in the objects are responsible for the domain gap between the training data (source domain) and testing data (target domain). Domain adaptation algorithms aim to overcome the domain gap between the source and target domains and learn robust models that can perform well across both domains.

This thesis provides solutions for the standard problem of unsupervised domain adaptation (UDA) and the more generic problem of generalized domain adaptation (GDA). The contributions of this thesis are as follows. (1) Certain and Consistent Domain Adaptation model for closed-set unsupervised domain adaptation by aligning the features of the source and target domain using deep neural networks. (2) A multi-adversarial deep learning model for generalized domain adaptation. (3) A gating model that detects out-of-distribution samples for generalized domain adaptation.

The models were tested across multiple computer vision datasets for domain adaptation.

The dissertation concludes with a discussion on the proposed approaches and future directions for research in closed set and generalized domain adaptation.

Contributors: Nagabandi, Bhadrinath (Author) / Panchanathan, Sethuraman (Thesis advisor) / Venkateswara, Hemanth (Thesis advisor) / McDaniel, Troy (Committee member) / Arizona State University (Publisher)

Created: 2020
