Theses and Dissertations
Description
Speech is generated by articulators acting on a phonatory source. Identifying the phonatory source and recovering the articulatory geometry are individually challenging, ill-posed problems, called speech separation and articulatory inversion, respectively. There is a trade-off between the decomposition and the recovered articulatory geometry because the mapping between an articulatory configuration and the speech produced is not unique. Moreover, measurements obtained only from a microphone sensor provide no invasive insight, which adds further difficulty to an already hard problem. A joint non-invasive estimation strategy that couples articulatory and phonatory knowledge would lead to better articulatory speech synthesis. In this thesis, a joint estimation strategy for speech separation and articulatory geometry recovery is studied. Unlike previous periodic/aperiodic decomposition methods that use stationary speech models within a frame, the proposed method performs a non-stationary speech decomposition. A parametric glottal source model and an articulatory vocal tract response are represented in a dynamic state-space formulation. The unknown parameters of the speech generation components are estimated using sequential Monte Carlo methods under specific assumptions. The proposed approach is compared with other glottal inverse filtering methods, including iterative adaptive inverse filtering, state-space inverse filtering, and the quasi-closed phase method.
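As a rough illustration of what estimating unknown quantities in a state-space model with sequential Monte Carlo can look like, the following is a minimal bootstrap particle filter sketch. It does not reproduce the thesis model: the scalar random-walk state, the Gaussian observation noise, the function name `bootstrap_particle_filter`, and all parameter values are assumptions chosen only to keep the example small.

```python
import math
import random


def bootstrap_particle_filter(observations, n_particles=500,
                              process_std=0.1, obs_std=0.5, seed=0):
    """Toy bootstrap particle filter (hypothetical model, not the thesis's).

    Assumed model: x_t = x_{t-1} + process noise, y_t = x_t + observation noise,
    with both noises zero-mean Gaussian.
    Returns the posterior-mean estimate of x_t at each time step.
    """
    rng = random.Random(seed)
    # Draw the initial particles from a broad standard-normal prior (assumption).
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    estimates = []
    for y in observations:
        # Propagate each particle through the random-walk state equation.
        particles = [x + rng.gauss(0.0, process_std) for x in particles]
        # Weight particles by the Gaussian likelihood, computed in the log domain.
        log_w = [-0.5 * ((y - x) / obs_std) ** 2 for x in particles]
        max_lw = max(log_w)
        weights = [math.exp(lw - max_lw) for lw in log_w]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Posterior-mean estimate from the weighted particles.
        estimates.append(sum(w * x for w, x in zip(weights, particles)))
        # Multinomial resampling to counter weight degeneracy.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


if __name__ == "__main__":
    # Constant observations: the estimate should drift toward the observed value.
    est = bootstrap_particle_filter([1.0] * 50)
    print(round(est[-1], 3))
```

In a model like the one the abstract describes, the state would presumably hold glottal source and vocal tract parameters rather than a single scalar, and the propagation and likelihood steps would be replaced accordingly.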
Contributors: Venkataramani, Adarsh Akkshai (Author) / Papandreou-Suppappola, Antonia (Thesis advisor) / Bliss, Daniel W (Committee member) / Turaga, Pavan (Committee member) / Arizona State University (Publisher)
Created: 2018
Description
Semantic image segmentation has been a key topic in applications involving image processing and computer vision. Owing to the success of, and continued research in, deep learning, many deep learning-based segmentation architectures have been designed for various tasks. In this thesis, deep-learning architectures are developed for a specific application in materials science, namely segmentation for the non-destructive study of the microstructure of aluminum alloy AA 7075. This process requires various imaging tools and methodologies to obtain the ground-truth information. The image dataset, obtained using transmission X-ray microscopy (TXM), consists of raw 2D image specimens captured from the projections at every beam scan. The segmented 2D ground-truth images are obtained by applying reconstruction and filtering algorithms before using a scientific visualization tool for segmentation. These images represent the corrosive behavior caused by precipitate and inclusion particles in the AA 7075 aluminum alloy. The study of the tools that work best for X-ray microscopy-based imaging is still in its early stages.
In this thesis, the underlying concepts behind convolutional neural networks (CNNs) and state-of-the-art semantic segmentation architectures are discussed in detail. The data generation and pre-processing applied to the AA 7075 data are also described, along with the experiments performed on a baseline and four other state-of-the-art segmentation architectures that predict the segmented boundaries from the raw 2D images. A performance analysis based on various factors was also conducted to decide which techniques and tools are best suited to semantic image segmentation for X-ray microscopy-based imaging.
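The abstract does not say which factors the performance analysis used. As one example of a measure commonly applied to segmentation output, the sketch below computes intersection over union (IoU) between a predicted and a ground-truth binary mask. The function name `intersection_over_union`, the list-of-lists mask format, and the choice of IoU itself are assumptions made for illustration, not details taken from the thesis.

```python
def intersection_over_union(pred, truth):
    """IoU between two binary masks given as equally sized 2D lists of 0/1.

    Hypothetical helper for illustration; returns 1.0 when both masks are empty.
    """
    if len(pred) != len(truth) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, truth)
    ):
        raise ValueError("masks must have the same shape")
    intersection = 0
    union = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                intersection += 1  # pixel labeled foreground in both masks
            if p or t:
                union += 1  # pixel labeled foreground in at least one mask
    return 1.0 if union == 0 else intersection / union


if __name__ == "__main__":
    predicted = [[1, 1], [0, 0]]
    ground_truth = [[1, 0], [1, 0]]
    # One shared foreground pixel out of three foreground pixels overall.
    print(intersection_over_union(predicted, ground_truth))
```

For a multi-class problem, the same computation would typically be repeated per class and averaged, which is again an assumption about usage rather than something the abstract states.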
Contributors: Barboza, Daniel (Author) / Turaga, Pavan (Thesis advisor) / Chawla, Nikhilesh (Committee member) / Jayasuriya, Suren (Committee member) / Arizona State University (Publisher)
Created: 2020