Within and Cross-Corpus Speech Emotion Recognition Using Latent Topic Model-Based Features

Description

Owing to the suprasegmental behavior of emotional speech, turn-level features have demonstrated greater success than frame-level features for recognition-related tasks. Conventionally, such features are obtained via a brute-force collection of statistics over frames, which loses important local information and degrades performance. To overcome these limitations, a novel feature extraction approach using latent topic models (LTMs) is presented in this study. Speech is assumed to consist of a mixture of emotion-specific topics, where the latter capture emotionally salient information from the co-occurrences of frame-level acoustic features and yield better descriptors. Specifically, a supervised replicated softmax model (sRSM), based on restricted Boltzmann machines and distributed representations, is proposed to learn naturally discriminative topics. The proposed features are evaluated for the recognition of categorical or continuous emotional attributes via within and cross-corpus experiments conducted over acted and spontaneous expressions. In a within-corpus scenario, sRSM outperforms competing LTMs, while obtaining a significant improvement of 16.75% over popular statistics-based turn-level features for valence-based classification, which is considered to be a difficult task using only speech. Further analyses with respect to the turn duration show that the improvement is even more significant, 35%, on longer turns (>6 s), which is highly desirable for current turn-based practices. In a cross-corpus scenario, two novel adaptation-based approaches, instance selection and weight regularization, are proposed to reduce the inherent bias due to varying annotation procedures and cultural perceptions across databases. Experimental results indicate a natural, yet less severe, deterioration in performance of only 2.6% and 2.7%, thereby highlighting the generalization ability of the proposed features.

Contributors: Shah, Mohit (Author) / Chakrabarti, Chaitali (Author) / Spanias, Andreas (Author) / Ira A. Fulton Schools of Engineering (Contributor)

Created: 2015-01-25

Estimation of subspace occupancy

Description

The ability to identify unoccupied resources in the radio spectrum is a key capability for opportunistic users in a cognitive radio environment. This paper draws upon and extends geometrically based ideas in statistical signal processing to develop estimators for the rank and the occupied subspace in a multi-user environment from multiple temporal samples of the signal received at a single antenna. These estimators enable identification of resources, such as the orthogonal complement of the occupied subspace, that may be exploitable by an opportunistic user. This concept is supported by simulations showing the estimation of the number of users in a simple CDMA system using a maximum a posteriori (MAP) estimate for the rank. It was found that with suitable parameters, such as high SNR, a sufficient number of time epochs, and codes of appropriate length, the number of users could be correctly estimated using the MAP estimator even when the noise variance is unknown. Additionally, the process of identifying the maximum likelihood estimate of the orthogonal projector onto the unoccupied subspace is discussed.

Contributors: Beaudet, Kaitlyn (Author) / Cochran, Douglas (Thesis advisor) / Turaga, Pavan (Committee member) / Berisha, Visar (Committee member) / Arizona State University (Publisher)

Created: 2014

Advances in Motion Estimators for Applications in Computer Vision

Description

Motion estimation is a core task in computer vision and many applications utilize optical flow methods as fundamental tools to analyze motion in images and videos. Optical flow is the apparent motion of objects in image sequences that results from relative motion between the objects and the imaging perspective. Today, optical flow fields are utilized to solve problems in various areas such as object detection and tracking, interpolation, visual odometry, etc. In this dissertation, three problems from different areas of computer vision and the solutions that make use of modified optical flow methods are explained.

The contributions of this dissertation are approaches and frameworks that introduce i) a new optical flow-based interpolation method to achieve minimally divergent velocimetry data, ii) a framework that improves the accuracy of change detection algorithms in synthetic aperture radar (SAR) images, and iii) a set of new methods to integrate Proton Magnetic Resonance Spectroscopy (1H-MRSI) data into three-dimensional (3D) neuronavigation systems for tumor biopsies.

In the first application an optical flow-based approach for the interpolation of minimally divergent velocimetry data is proposed. The velocimetry data of incompressible fluids contain signals that describe the flow velocity. The approach uses the additional flow velocity information to guide the interpolation process towards reduced divergence in the interpolated data.

In the second application, a framework that mainly consists of optical flow methods and other image processing and computer vision techniques is proposed to improve object extraction from synthetic aperture radar images. The proposed framework is used to distinguish between actual motion and motion detected due to misregistration in SAR image sets. It can lead to more accurate and meaningful change detection and improve object extraction from SAR datasets.

In the third application, a set of new methods is proposed that aims to improve upon the current state of the art in neuronavigation through the use of detailed three-dimensional (3D) 1H-MRSI data. The result is a progressive form of online MRSI-guided neuronavigation that is demonstrated through phantom validation and clinical application.

Contributors: Kanberoglu, Berkay (Author) / Frakes, David (Thesis advisor) / Turaga, Pavan (Thesis advisor) / Spanias, Andreas (Committee member) / Berisha, Visar (Committee member) / Arizona State University (Publisher)

Created: 2018

Interactive Tango Milonga: An Interactive Dance System for Argentine Tango Social Dance

Description

When dancers are granted agency over music, as in interactive dance systems, the actors are most often concerned with the problem of creating a staged performance for an audience. However, as is reflected by the above quote, the practice of Argentine tango social dance is most concerned with participants' internal experience and their relationship to the broader tango community. In this dissertation I explore creative approaches to enrich the sense of connection, that is, the experience of oneness with a partner and complete immersion in music and dance, for Argentine tango dancers by providing agency over musical activities through the use of interactive technology. Specifically, I create an interactive dance system that allows tango dancers to affect and create music via their movements in the context of social dance. The motivations for this work are multifold: 1) to intensify the embodied experience of the interplay between dance and music, individual and partner, couple and community, 2) to create a shared experience of the conventions of tango dance, and 3) to innovate Argentine tango social dance practice for the purposes of education and increasing musicality in dancers.

Contributors: Brown, Courtney Douglass (Author) / Paine, Garth (Thesis advisor) / Feisst, Sabine (Committee member) / Turaga, Pavan (Committee member) / Arizona State University (Publisher)

Created: 2017

Real time estimation and prediction of similarity in human activity using factor oracle algorithm

Description

Human motion is defined as an amalgamation of several physical traits such as bipedal locomotion, posture and manual dexterity, and mental expectation. In addition to the “positive” body form defined by these traits, casting light on the body produces a “negative” of the body: its shadow. We often use “silhouette” interchangeably with “shadow” to emphasize indifference to interior features. In a manner of speaking, the shadow is an alter ego that imitates the individual.

The principal value of shadow is its non-invasive behaviour of reflecting precisely the actions of the individual it is attached to. Nonetheless we can still think of the body’s shadow not as the body but its alter ego.

Based on this premise, my thesis creates an experiential system that extracts the data related to the contour of your human shape and gives it a texture and life of its own, so as to emulate your movements and postures, and to be your extension. In technical terms, my thesis extracts an abstraction from a pre-indexed database, which could be generated from an offline data set or in real time, to complement the actions of a user in front of a low-cost optical motion capture device like the Microsoft Kinect. This abstraction could be the system’s interpretation of the action, which creates modularized art through the abstraction’s ‘similarity’ to the live action.

Through my research, I have developed a stable system that tackles various connotations associated with shadows and the need to determine the ideal features that contribute to the relevance of the actions performed. The implication of Factor Oracle [3] pattern interpretation is tested with a feature bin of videos. The system also supports several Nearest Neighbour search methods and a machine learning module to derive the same output. The overall purpose is to establish this in real time and provide constant feedback to the user. This can be expanded to handle larger dynamic data.

In addition to estimating human actions, my thesis tests various Nearest Neighbour search methods in real time depending upon the data stream. This provides a basis for understanding the varying parameters that complement human activity recognition and feature matching in real time.

Contributors: Seshasayee, Sudarshan Prashanth (Author) / Sha, Xin Wei (Thesis advisor) / Turaga, Pavan (Thesis advisor) / Tinapple, David A (Committee member) / Arizona State University (Publisher)

Created: 2016

Graph-based estimation of information divergence functions

Description

Information divergence functions, such as the Kullback-Leibler divergence or the Hellinger distance, play a critical role in statistical signal processing and information theory; however, estimating them can be a challenge. Most often, parametric assumptions are made about the two distributions to estimate the divergence of interest. In cases where no parametric model fits the data, non-parametric density estimation is used. In statistical signal processing applications, Gaussianity is usually assumed since closed-form expressions for common divergence measures have been derived for this family of distributions. Parametric assumptions are preferred when it is known that the data follows the model; however, this is rarely the case in real-world scenarios. Non-parametric density estimators are characterized by a very large number of parameters that have to be tuned with costly cross-validation. In this dissertation we focus on a specific family of non-parametric estimators, called direct estimators, that bypass density estimation completely and directly estimate the quantity of interest from the data. We introduce a new divergence measure, the $D_p$-divergence, that can be estimated directly from samples without parametric assumptions on the distribution. We show that the $D_p$-divergence bounds the binary, cross-domain, and multi-class Bayes error rates and, in certain cases, provides provably tighter bounds than the Hellinger divergence. In addition, we propose a new methodology that allows the experimenter to construct direct estimators for existing divergence measures or to construct new divergence measures with custom properties that are tailored to the application. To examine the practical efficacy of these new methods, we evaluate them in a statistical learning framework on a series of real-world data science problems involving speech-based monitoring of neuro-motor disorders.
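To illustrate what a direct, graph-based estimator can look like, the sketch below assumes the Friedman-Rafsky minimum-spanning-tree construction that appears in the related literature on the $D_p$-divergence; the abstract itself does not spell out the estimator, so the formula and every name here (`mst_edges`, `dp_divergence_estimate`) are assumptions for illustration only. With $m$ samples from one distribution and $n$ from the other, a Euclidean minimum spanning tree is built over the pooled points, the number $C$ of tree edges joining points from different samples is counted, and the estimate is taken as $1 - C\,(m+n)/(2mn)$.

```python
import math

def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (i, j) index pairs."""
    n = len(points)
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known connection of each node to the tree
    best_parent = [-1] * n
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the node outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_parent[u] >= 0:
            edges.append((best_parent[u], u))
        # Update attachment costs of the remaining nodes.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_parent[v] = u
    return edges

def dp_divergence_estimate(sample_x, sample_y):
    """Assumed estimator: 1 - C * (m + n) / (2 * m * n), with C the cross-sample MST edges."""
    m, n = len(sample_x), len(sample_y)
    pooled = list(sample_x) + list(sample_y)
    labels = [0] * m + [1] * n
    cross = sum(1 for i, j in mst_edges(pooled) if labels[i] != labels[j])
    return 1.0 - cross * (m + n) / (2.0 * m * n)

# Two well-separated toy clusters (made-up data).
cluster_x = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
cluster_y = [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
separated_estimate = dp_divergence_estimate(cluster_x, cluster_y)
```

For these toy clusters the tree contains exactly one cross-sample edge, so the estimate is 1 - 8/32 = 0.75; it approaches 1 as well-separated samples grow. No density is estimated at any point, which is the sense in which such an estimator is direct. For small or heavily interleaved samples the raw value can fall below zero.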

Contributors: Wisler, Alan (Author) / Berisha, Visar (Thesis advisor) / Spanias, Andreas (Thesis advisor) / Liss, Julie (Committee member) / Bliss, Daniel (Committee member) / Arizona State University (Publisher)

Created: 2017

Photovoltaic Array Fault Detection and Optimization Using Machine Learning

Description

The increasing demand for clean energy solutions requires not just expansion but also improvements in the efficiency of renewable sources, such as solar. This requires analytics for each panel regarding voltage, current, temperature, and irradiance. This project involves the development of machine learning algorithms along with a data logger for the purpose of photovoltaic (PV) monitoring and control. Machine learning is used for fault classification. Once a fault is detected, the system can reconfigure its topology to minimize power losses. Fault detection accuracy was demonstrated to be over 90%, and topology reconfiguration was shown to increase power output by as much as 5%.

Contributors: Navas, John (Author) / Spanias, Andreas (Thesis director) / Rao, Sunil (Committee member) / Electrical Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created: 2021-05

Consensus algorithms and distributed structure estimation in wireless sensor networks

Description

Distributed wireless sensor networks (WSNs) have attracted researchers recently due to their advantages such as low power consumption, scalability and robustness to link failures. In sensor networks with no fusion center, consensus is a process where all the sensors in the network achieve global agreement using only local transmissions. In this dissertation, several consensus and consensus-based algorithms in WSNs are studied.

Firstly, a distributed consensus algorithm for estimating the maximum and minimum value of the initial measurements in a sensor network in the presence of communication noise is proposed. In the proposed algorithm, a soft-max approximation together with a non-linear average consensus algorithm is used. A design parameter controls the trade-off between the soft-max error and convergence speed. An analysis of this trade-off gives guidelines for choosing the design parameter for the max estimate. It is also shown that if some prior knowledge of the initial measurements is available, the consensus process can be accelerated.
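The soft-max idea can be sketched as follows. This is a minimal, noiseless illustration and not the dissertation's algorithm: it uses plain linear average consensus on a ring (the abstract describes a non-linear consensus under communication noise), and it assumes every node knows the network size N. The names `average_consensus`, `softmax_max_estimate`, `beta`, and `step` are hypothetical. Each node starts from exp(beta * x_i), the network averages these values, and the max is approximated by (1/beta) * log(N * average).

```python
import math

def average_consensus(values, neighbors, step, iterations):
    """Linear average consensus: every node repeatedly moves toward its neighbors."""
    state = list(values)
    for _ in range(iterations):
        state = [
            x + step * sum(state[j] - x for j in neighbors[i])
            for i, x in enumerate(state)
        ]
    return state

def softmax_max_estimate(measurements, neighbors, beta, step=0.3, iterations=200):
    """Per-node soft-max estimate of max(measurements); assumes N is known to all nodes."""
    n_nodes = len(measurements)
    # The network average of exp(beta * x_i) equals (1/N) * sum_i exp(beta * x_i).
    initial = [math.exp(beta * x) for x in measurements]
    averaged = average_consensus(initial, neighbors, step, iterations)
    # Undo the 1/N factor, then invert the exponential map.
    return [math.log(n_nodes * a) / beta for a in averaged]

# Made-up measurements on a ring of six nodes; each node talks to two neighbors.
measurements = [0.2, 1.5, 0.9, 3.0, 2.1, 0.4]
n = len(measurements)
ring = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
estimates = softmax_max_estimate(measurements, ring, beta=5.0)
```

Because max_i x_i <= (1/beta) * log(sum_i exp(beta * x_i)) <= max_i x_i + log(N)/beta, every node's estimate lands within log(N)/beta above the true maximum once consensus has converged. A larger `beta` shrinks this soft-max error, which is one side of the trade-off controlled by the design parameter. The `step` value must stay below the reciprocal of the largest node degree for this simple update to converge.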

Secondly, a distributed system size estimation algorithm is proposed. The proposed algorithm is based on distributed average consensus and L2 norm estimation. Different sources of error are explicitly discussed, and the distribution of the final estimate is derived. The Cramér-Rao bounds (CRBs) for the system size estimator with average and max consensus strategies are also considered, and different consensus-based system size estimation approaches are compared.
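For intuition about how average consensus can reveal the network size, here is a deliberately simpler, well-known variant, not the L2-norm-based estimator of the dissertation: one designated node starts at 1 and all others at 0, so the network average converges to 1/N and each node can invert its own value. The function name `estimate_network_size` and its parameters are hypothetical, and the sketch assumes noiseless links and a connected graph.

```python
def estimate_network_size(neighbors, seed_node=0, step=0.3, iterations=300):
    """Each node's estimate of N from seeded average consensus (noiseless sketch)."""
    # len(neighbors) is used only to set up the simulation, not by the nodes themselves.
    state = [1.0 if i == seed_node else 0.0 for i in range(len(neighbors))]
    for _ in range(iterations):
        # Every node moves toward its neighbors; the network sum (1.0) is preserved.
        state = [
            x + step * sum(state[j] - x for j in neighbors[i])
            for i, x in enumerate(state)
        ]
    # After convergence every value is close to 1/N, so its reciprocal estimates N.
    return [1.0 / x for x in state]

# Ring of eight nodes (made-up topology).
ring8 = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
size_estimates = estimate_network_size(ring8)
```

This variant needs a single agreed-upon seed node, which is a strong assumption in a network without a fusion center.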

Then, a consensus-based network center and radius estimation algorithm is described. The center localization problem is formulated as a convex optimization problem with a summation form by using soft-max approximation with exponential functions. Distributed optimization methods such as stochastic gradient descent and diffusion adaptation are used to estimate the center. Then, max consensus is used to compute the radius of the network area.

Finally, two distributed estimation algorithms based on average consensus are introduced: a distributed degree distribution estimation algorithm and an algorithm for tracking the dynamics of the desired parameter. Simulation results for all proposed algorithms are provided.

Contributors: Zhang, Sai (Electrical engineer) (Author) / Tepedelenlioğlu, Cihan (Thesis advisor) / Spanias, Andreas (Thesis advisor) / Tsakalis, Kostas (Committee member) / Bliss, Daniel (Committee member) / Arizona State University (Publisher)

Created: 2017