Data-Driven Representation Learning in Multimodal Feature Fusion

Description

Modern machine learning systems leverage data and features from multiple modalities to gain more predictive power. In most scenarios, the modalities are vastly different and the acquired data are heterogeneous in nature. Consequently, building highly effective fusion algorithms is central to achieving improved model robustness and inference performance. This dissertation focuses on representation learning approaches as the fusion strategy. Specifically, the objective is to learn a shared latent representation that jointly exploits the structural information encoded in all modalities, such that a straightforward learning model can be adopted to obtain the prediction.

We first consider sensor fusion, a typical multimodal fusion problem critical to building a pervasive computing platform. A systematic fusion technique is described to support both multiple sensors and descriptors for activity recognition. Designed to learn the optimal combination of kernels, Multiple Kernel Learning (MKL) algorithms have been successfully applied to numerous fusion problems in computer vision and other areas. Using the MKL formulation, we next describe an auto-context algorithm for learning image context via fusion with low-level descriptors. Furthermore, a principled fusion algorithm using deep learning to optimize kernel machines is developed. By bridging deep architectures with kernel optimization, this approach leverages the benefits of both paradigms and is applied to a wide variety of fusion problems.
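The kernel-combination idea behind MKL can be illustrated with a minimal sketch. It is not the dissertation's algorithm: the weights below come from a simple kernel-target alignment heuristic, and the function names, the RBF kernels, and the two toy "modalities" are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian (RBF) Gram matrix for the rows of X."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def alignment(K, y):
    """Kernel-target alignment between K and the ideal kernel y y^T."""
    Ky = np.outer(y, y)
    return np.sum(K * Ky) / (np.linalg.norm(K) * np.linalg.norm(Ky))

def combine_kernels(kernels, y):
    """Convex combination of base kernels with alignment-based weights.

    The weighting rule is an assumed heuristic, not a full MKL solver.
    """
    scores = np.array([max(alignment(K, y), 0.0) for K in kernels])
    weights = scores / scores.sum()
    K = sum(w * Kk for w, Kk in zip(weights, kernels))
    return K, weights

rng = np.random.default_rng(0)
X_a = rng.normal(size=(20, 3))   # hypothetical modality A
X_b = rng.normal(size=(20, 5))   # hypothetical modality B
y = np.sign(X_a[:, 0])           # toy labels that depend on modality A only
y[y == 0] = 1.0
K_fused, w = combine_kernels([rbf_kernel(X_a, 0.5), rbf_kernel(X_b, 0.5)], y)
```

The fused Gram matrix `K_fused` can then be passed to any kernel machine that accepts a precomputed kernel. A full MKL method would instead learn the weights jointly with the classifier.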

In many real-world applications, the modalities exhibit highly specific data structures, such as time sequences and graphs, and consequently, special design of the learning architecture is needed. In order to improve the temporal modeling for multivariate sequences, we developed two architectures centered around attention models. A novel clinical time series analysis model is proposed for several critical problems in healthcare. Another model coupled with triplet ranking loss as metric learning framework is described to better solve speaker diarization. Compared to state-of-the-art recurrent networks, these attention-based multivariate analysis tools achieve improved performance while having a lower computational complexity. Finally, in order to perform community detection on multilayer graphs, a fusion algorithm is described to derive node embedding from word embedding techniques and also exploit the complementary relational information contained in each layer of the graph.

Contributors: Song, Huan (Author) / Spanias, Andreas (Thesis advisor) / Thiagarajan, Jayaraman (Committee member) / Berisha, Visar (Committee member) / Tepedelenlioğlu, Cihan (Committee member) / Arizona State University (Publisher)

Created: 2018

Deep Active Learning Explored Across Diverse Label Spaces

Description

Deep learning architectures have been widely explored in computer vision and have demonstrated commendable performance in a variety of applications. A fundamental challenge in training deep networks is the requirement of large amounts of labeled training data. While gathering large quantities of unlabeled data is cheap and easy, annotating the data is an expensive process in terms of time, labor and human expertise. Thus, developing algorithms that minimize the human effort in training deep models is of immense practical importance. Active learning algorithms automatically identify salient and exemplar samples from large amounts of unlabeled data and can contribute maximal information to supervised learning models, thereby reducing the human annotation effort in training machine learning models. The goal of this dissertation is to fuse ideas from deep learning and active learning and design novel deep active learning algorithms. The proposed learning methodologies explore diverse label spaces to solve different computer vision applications. Three major contributions have emerged from this work: (i) a deep active framework for multi-class image classification, (ii) a deep active model with and without label correlation for multi-label image classification and (iii) a deep active paradigm for regression. Extensive empirical studies on a variety of multi-class, multi-label and regression vision datasets corroborate the potential of the proposed methods for real-world applications. Additional contributions include: (i) a multimodal emotion database consisting of recordings of facial expressions, body gestures, vocal expressions and physiological signals of actors enacting various emotions, (ii) four multimodal deep belief network models and (iii) an in-depth analysis of the effect of transfer of multimodal emotion features between source and target networks on classification accuracy and training time. These related contributions help in understanding the challenges involved in training deep learning models and motivate the main goal of this dissertation.
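To make the sample-selection step of active learning concrete, here is a minimal sketch of entropy-based uncertainty sampling, a standard baseline. It is not the selection criterion developed in the dissertation; the function names and the toy probability table are assumptions.

```python
import numpy as np

def predictive_entropy(probs):
    """Entropy of each row of class probabilities (higher means more uncertain)."""
    p = np.clip(probs, 1e-12, 1.0)
    return -np.sum(p * np.log(p), axis=1)

def select_batch(probs, batch_size):
    """Indices of the batch_size most uncertain unlabeled samples."""
    order = np.argsort(-predictive_entropy(probs))
    return order[:batch_size]

# Hypothetical softmax outputs of a model on four unlabeled images.
probs = np.array([
    [0.98, 0.01, 0.01],
    [0.34, 0.33, 0.33],
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
])
query = select_batch(probs, 2)   # samples to send to the human annotator
```

In a full loop, the queried samples would be labeled, added to the training set, and the model retrained before the next selection round.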

Contributors: Ranganathan, Hiranmayi (Author) / Sethuraman, Panchanathan (Thesis advisor) / Papandreou-Suppappola, Antonia (Committee member) / Li, Baoxin (Committee member) / Chakraborty, Shayok (Committee member) / Arizona State University (Publisher)

Created: 2018

Channel Estimation in Half and Full Duplex Relays

Description

Both two-way relays (TWR) and full-duplex (FD) radios are spectrally efficient, and their integration shows great potential to further improve spectral efficiency, which offers a solution for fifth-generation wireless systems. High-quality channel state information (CSI) is a key component for the implementation and performance of the FD TWR system, making channel estimation in FD TWRs crucial.

The impact of channel estimation on spectral efficiency in half-duplex multiple-input-multiple-output (MIMO) TWR systems is investigated. The trade-off between training and data energy is examined. In the case that the two sources are symmetric in power and number of antennas, a closed-form expression for the optimal ratio of data energy to total energy is derived. It can be shown that the achievable rate is a monotonically increasing function of the data length. The asymmetric case is discussed as well.

Efficient and accurate training schemes for FD TWRs are essential for profiting from the inherent spectrally efficient structures of both FD and TWRs. A novel one-block training scheme with a maximum likelihood (ML) estimator is proposed to estimate the channels between the nodes and the residual self-interference (RSI) channel simultaneously. Baseline training schemes are also considered for comparison with the one-block scheme. The Cramér-Rao bounds (CRBs) of the training schemes are derived and analyzed by using the asymptotic properties of Toeplitz matrices. The benefit of estimating the RSI channel is shown analytically in terms of Fisher information.
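As a simplified illustration of training-based channel estimation, the sketch below estimates a single multipath channel from a known training sequence. Under white Gaussian noise the least-squares solution coincides with the ML estimate, and the CRB trace is the noise variance times the trace of the inverse Gram matrix of the Toeplitz training matrix. This is a single-link toy, not the one-block FD TWR scheme with joint RSI estimation; the names, the BPSK training sequence, and the channel taps are assumptions.

```python
import numpy as np

def training_matrix(x, num_taps):
    """Toeplitz convolution matrix X so that X @ h is the 'valid' part of x convolved with h."""
    rows = len(x) - num_taps + 1
    return np.array([x[i:i + num_taps][::-1] for i in range(rows)])

def ls_channel_estimate(x, y, num_taps):
    """Least-squares channel estimate (the ML estimate under white Gaussian noise)."""
    X = training_matrix(x, num_taps)
    h_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return h_hat

def crb_trace(x, num_taps, noise_var):
    """Sum of the per-tap Cramér-Rao bounds for white Gaussian noise of variance noise_var."""
    X = training_matrix(x, num_taps)
    return noise_var * np.real(np.trace(np.linalg.inv(X.conj().T @ X)))

rng = np.random.default_rng(0)
x = rng.choice([-1.0, 1.0], size=64) + 0j          # hypothetical BPSK training sequence
h = np.array([1.0 + 0.2j, 0.5 - 0.1j, 0.1j])       # hypothetical 3-tap channel
noise_var = 1e-3
clean = np.convolve(x, h, mode="valid")
noise = np.sqrt(noise_var / 2) * (rng.normal(size=clean.shape) + 1j * rng.normal(size=clean.shape))
h_hat = ls_channel_estimate(x, clean + noise, num_taps=3)
bound = crb_trace(x, 3, noise_var)
```

Minimizing `crb_trace` over candidate training sequences is one way to compare training designs under this simplified model.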

To obtain fundamental and analytic results of how the RSI affects the spectral efficiency, one-way FD relay systems are studied. Optimal training design and ML channel estimation are proposed to estimate the RSI channel. The CRBs are derived and analyzed in closed-form so that the optimal training sequence can be found via minimizing the CRB. Extensions of the training scheme to frequency-selective channels and multiple relays are also presented.

Simultaneous sensing and transmission in an FD cognitive radio system with MIMO is considered. The trade-off between the transmission rate and the detection accuracy is characterized by the sum-rate of the primary and the secondary users. Different beamforming and combining schemes are proposed and compared.

Contributors: Li, Xiaofeng (Author) / Tepedelenlioğlu, Cihan (Thesis advisor) / Papandreou-Suppappola, Antonia (Committee member) / Bliss, Daniel W (Committee member) / Kosut, Oliver (Committee member) / Arizona State University (Publisher)

Created: 2018

Dynamic Spectrum Sharing in Cognitive Radio and Device-to-Device Systems

Description

Cognitive radio (CR) and device-to-device (D2D) systems are two promising dynamic spectrum access schemes in wireless communication systems that provide improved quality-of-service and efficient spectrum utilization. This dissertation shows that both CR and D2D systems benefit from properly designed cooperation schemes.

In underlay CR systems, where secondary users (SUs) transmit simultaneously with primary users (PUs), reliable communication must be guaranteed for PUs, which is likely to degrade the SUs’ performance. To overcome this issue, cooperation exclusively among SUs is achieved through multi-user diversity (MUD), where each SU is subject to an instantaneous interference constraint at the primary receiver. Therefore, the number of active SUs satisfying this constraint is random. Under different user distributions with the same mean number of SUs, stochastic ordering of SU performance metrics, including bit error rate (BER), outage probability, and ergodic capacity, is made possible even without closed-form expressions. Furthermore, cooperation is assumed between the primary and secondary networks, where those SUs exceeding the interference constraint facilitate the PU’s transmission by relaying its signal. A fundamental performance trade-off between the primary and secondary networks is observed, and it is illustrated that the proposed scheme outperforms non-cooperative underlay CR systems in terms of overall system BER and sum achievable rate.

Similar to conventional cellular networks, CR systems suffer from an overloaded receiver having to manage signals from a large number of users. To address this issue, D2D communication has been proposed, where direct transmission links are established between users in close proximity to offload the system traffic. Several new cooperative spectrum access policies are proposed that allow multiple D2D pairs to coexist in order to improve the spectral efficiency. Despite the additional interference, it is shown that both the cellular user’s (CU) and the individual D2D user's achievable rates can be improved simultaneously when the number of D2D pairs is below a certain threshold, resulting in a significant multiplexing gain in terms of D2D sum rate. This threshold is quantified for different policies using second-order approximations of the average achievable rates for both the CU and the individual D2D user.

Contributors: Zeng, Ruochen (Author) / Tepedelenlioğlu, Cihan (Thesis advisor) / Papandreou-Suppappola, Antonia (Committee member) / Bliss, Daniel (Committee member) / Kosut, Oliver (Committee member) / Arizona State University (Publisher)

Created: 2017

Non-Penetrating Microelectrode Interfaces for Cortical Neuroprosthetic Applications with a Focus on Sensory Encoding: Feasibility and Chronic Performance in Striate Cortex

Description

Growing understanding of the neural code and how to speak it has allowed for notable advancements in neural prosthetics. With commercially available implantable systems with bidirectional neural communication on the horizon, there is an increasing imperative to develop high-resolution interfaces that can survive the environment and be well tolerated by the nervous system under chronic use. The sensory encoding aspect optimally interfaces at a scale sufficient to evoke perception but focal in nature to maximize resolution and evoke more complex and nuanced sensations. Microelectrode arrays can maintain high spatial density, operating on the scale of cortical columns, and can be either penetrating or non-penetrating. The non-penetrating subset sits on the tissue surface without puncturing the parenchyma and is known to engender minimal tissue response and less damage than the penetrating counterpart, improving long-term viability in vivo. Provided non-penetrating microelectrodes can consistently evoke perception and maintain a localized region of activation, they may provide an ideal platform for a high-performing neural prosthesis; this dissertation explores their functional capacity.

The scale at which non-penetrating electrode arrays can interface with cortex is evaluated in the context of extracting useful information. Articulate movements were decoded from surface microelectrodes, and additional spatial analysis revealed unique signal content despite dense electrode spacing. With a basis for data extraction established, the focus shifts towards the information encoding half of neural interfaces. Finite element modeling was used to compare tissue recruitment under surface stimulation across electrode scales. Results indicated that charge density-based metrics provide a reasonable approximation for current levels required to evoke a visual sensation and showed that tissue recruitment increases exponentially with electrode diameter. Micro-scale electrodes (0.1 to 0.3 mm diameter) could sufficiently activate layers II/III in a model tuned to striate cortex while maintaining focal radii of activated tissue.

In vivo testing proceeded in a nonhuman primate model. Stimulation consistently evoked visual percepts at safe current thresholds. Tracking perception thresholds across one year showed stable values with minimal fluctuation. Modulating waveform parameters was found useful in reducing charge requirements to evoke perception. Pulse frequency and phase asymmetry were each used to reduce thresholds, improve charge efficiency, and lower charge per phase, which are charge density metrics associated with tissue damage. No impairments to photic perception were observed during the course of the study, suggesting limited tissue damage from array implantation or electrically induced neurotoxicity. The subject consistently identified stimulation on closely spaced electrodes (2 mm center-to-center) as separate percepts, indicating sub-visual degree discrete resolution may be feasible with this platform. Although continued testing is necessary, preliminary results support epicortical microelectrode arrays as a stable platform for interfacing with neural tissue and a viable option for bidirectional BCI applications.

Contributors: Oswalt, Denise (Author) / Greger, Bradley (Thesis advisor) / Buneo, Christopher (Committee member) / Helms-Tillery, Stephen (Committee member) / Mirzadeh, Zaman (Committee member) / Papandreou-Suppappola, Antonia (Committee member) / Arizona State University (Publisher)

Created: 2018

Low Cost 3D Flow Estimation in Medical Ultrasound

Description

Medical ultrasound imaging is widely used today because it is non-invasive and cost-effective. Flow estimation helps in accurate diagnosis of vascular diseases and adds an important dimension to medical ultrasound imaging. Traditionally, flow estimation is done using Doppler-based methods, which only estimate velocity in the beam direction. Thus, when blood vessels are close to orthogonal to the beam direction, there are large errors in the estimation results. In this dissertation, a low-cost blood flow estimation method that does not have the angle dependency of Doppler-based methods is presented.

First, a velocity estimator based on speckle tracking and synthetic lateral phase is proposed for clutter-free blood flow.

Speckle tracking is based on kernel matching and does not have any angle dependency. While velocity estimation in the axial dimension is accurate, lateral velocity estimation is challenging due to reduced resolution and lack of phase information. This work presents a two-tiered method which estimates the pixel-level movement using the sum of absolute differences, and then estimates the sub-pixel-level movement using synthetic phase information in the lateral dimension. Such a method achieves highly accurate velocity estimation with reduced complexity compared to a cross-correlation-based method. The average bias of the proposed estimation method is less than 2% for plug flow and less than 7% for parabolic flow.
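The pixel-level tier of such a method can be sketched as an exhaustive block-matching search that minimizes the sum of absolute differences (SAD). This is only an illustration of kernel matching: the sub-pixel refinement with synthetic lateral phase is not shown, and the function name, kernel size, search range, and synthetic frames are assumptions.

```python
import numpy as np

def sad_displacement(frame0, frame1, top, left, size, search):
    """Integer displacement (dy, dx) of a kernel, found by minimizing the SAD cost."""
    ref = frame0[top:top + size, left:left + size]
    best, best_cost = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            r, c = top + dy, left + dx
            # Skip candidate kernels that fall outside the second frame.
            if r < 0 or c < 0 or r + size > frame1.shape[0] or c + size > frame1.shape[1]:
                continue
            cost = np.abs(frame1[r:r + size, c:c + size] - ref).sum()
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best

rng = np.random.default_rng(1)
frame0 = rng.normal(size=(64, 64))                      # synthetic speckle-like frame
frame1 = np.roll(frame0, shift=(2, -3), axis=(0, 1))    # same pattern moved by (2, -3) pixels
shift = sad_displacement(frame0, frame1, top=24, left=24, size=16, search=5)
```

Dividing the displacement by the inter-frame interval and scaling by the pixel spacing would convert it to a velocity estimate.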

Blood is always accompanied by clutter, which originates from the vessel wall and surrounding tissues. As the magnitude of the blood signal is usually 40-60 dB lower than the magnitude of the clutter signal, clutter filtering is necessary before blood flow estimation. Clutter filters utilize the high-magnitude and low-frequency features of the clutter signal to effectively remove it from the compound (blood + clutter) signal. Instead of a low-complexity FIR filter or high-complexity SVD-based filters, the second contribution is a power/subspace-iteration-based method for clutter filtering. Excellent clutter filtering performance is achieved for both slow and fast moving clutter with lower complexity compared to SVD-based filters. For instance, use of the proposed method results in the bias being less than 8% and the standard deviation being less than 12% for fast moving clutter when the beam-to-flow angle is $90^\circ$.
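The idea of removing clutter by projecting out a dominant subspace can be sketched with orthogonal (subspace) iteration. This is a generic illustration, not the dissertation's filter: the slow-time covariance formulation, the fixed clutter rank, and the synthetic 40 dB clutter-to-blood data are assumptions.

```python
import numpy as np

def dominant_subspace(R, rank, iters=50, seed=0):
    """Orthogonal (subspace) iteration for the top-`rank` eigenvectors of a Hermitian matrix R."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.normal(size=(R.shape[0], rank)))
    for _ in range(iters):
        Q, _ = np.linalg.qr(R @ Q)   # multiply, then re-orthonormalize
    return Q

def clutter_filter(X, rank):
    """Remove the `rank` strongest slow-time components from X (slow time x pixels)."""
    R = X @ X.conj().T               # slow-time covariance (unnormalized)
    Q = dominant_subspace(R, rank)
    return X - Q @ (Q.conj().T @ X)  # project onto the complement of the clutter subspace

rng = np.random.default_rng(4)
n = np.arange(32)                                        # slow-time index
a = rng.normal(size=200) + 1j * rng.normal(size=200)     # clutter spatial pattern
b = rng.normal(size=200) + 1j * rng.normal(size=200)     # blood spatial pattern
clutter = 100.0 * np.outer(np.exp(2j * np.pi * 0.01 * n), a)   # strong, slowly varying
blood = np.outer(np.exp(2j * np.pi * 0.30 * n), b)             # weak, faster varying
X = clutter + blood
filtered = clutter_filter(X, rank=1)
```

Subspace iteration needs only matrix products and small QR factorizations, which is where the complexity advantage over a full SVD comes from when the clutter rank is low.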

Third, a flow rate estimation method based on kernel power weighting is proposed. As the velocity estimator is a kernel-based method, the estimation accuracy degrades near the vessel boundary. In order to account for kernels that are not fully inside the vessel, fractional weights are given to these kernels based on their signal power. The proposed method achieves excellent flow rate estimation results with less than 8% bias for both slow and fast moving clutters.

The performance of the velocity estimator is also evaluated for challenging models. A 2D version of the two-tiered method is able to accurately estimate velocity vectors in a spinning disk as well as in a carotid bifurcation model, both of which are part of the synthetic aperture vector flow imaging (SA-VFI) challenge of 2018. The proposed method ranked 3rd in the challenge on the testing dataset with carotid bifurcation. The flow estimation method is also evaluated for blood flow in vessels with stenosis. Simulation results show that the proposed method is able to estimate the flow rate with less than 9% bias.

Contributors: Wei, Siyuan (Author) / Chakrabarti, Chaitali (Thesis advisor) / Papandreou-Suppappola, Antonia (Committee member) / Ogras, Umit Y. (Committee member) / Wenisch, Thomas F. (Committee member) / Arizona State University (Publisher)

Created: 2018

Health management and prognostics of complex structures and systems

Description

This dissertation presents the development of structural health monitoring and prognostic health management methodologies for complex structures and systems in the field of mechanical engineering. To overcome various challenges historically associated with complex structures and systems, such as complicated sensing mechanisms, noisy information, and large datasets, a hybrid monitoring framework comprising solid mechanics concepts and data mining technologies is developed. In such a framework, the solid mechanics simulations provide additional insight to the data mining techniques, reducing the dependence of accuracy on the training set, while the data mining approaches fuse and interpret information from the targeted system, enabling real-time monitoring with efficient computation.

In the case of structural health monitoring, ultrasonic guided waves are utilized for damage identification and localization in complex composite structures. Signal processing and data mining techniques are integrated into the damage localization framework, and the converted wave modes, which are induced by the thickness variation due to the presence of delamination, are used as damage indicators. This framework has been validated through experiments and has shown sufficient accuracy in locating delamination in X-COR sandwich composites without the need for baseline information. Besides the localization of internal damage, the Gaussian process machine learning technique is integrated with the finite element method as an online-offline prediction model to predict crack propagation with overloads under biaxial loading conditions; such a probabilistic prognosis model, with a limited number of training examples, has shown increased accuracy over state-of-the-art techniques in predicting crack retardation behaviors induced by overloads. In the case of system-level management, a monitoring framework built on a multivariate Gaussian model is developed to evaluate the anomalous condition of commercial aircraft. This method has been validated using commercial airline data and has shown high sensitivity to variations in aircraft dynamics and pilot operations. Moreover, this framework was also tested on simulated aircraft faults, and its feasibility for real-time monitoring was demonstrated with sufficient computational efficiency.
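Anomaly detection with a multivariate Gaussian model can be sketched as follows: fit a mean and covariance to nominal data, then flag samples whose squared Mahalanobis distance exceeds a threshold. The three synthetic features, the 99th-percentile threshold, and the function names are assumptions, not the framework's actual features or decision rule.

```python
import numpy as np

def fit_gaussian(X):
    """Sample mean and covariance of nominal (healthy) data, one row per sample."""
    return X.mean(axis=0), np.cov(X, rowvar=False)

def mahalanobis_sq(X, mean, cov):
    """Squared Mahalanobis distance of each row of X from the fitted Gaussian."""
    diff = X - mean
    return np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)

rng = np.random.default_rng(2)
true_cov = np.array([[1.0, 0.5, 0.0],
                     [0.5, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])
nominal = rng.multivariate_normal([0.0, 0.0, 0.0], true_cov, size=500)  # hypothetical flight features
mean, cov = fit_gaussian(nominal)

# Assumed rule: flag anything beyond the 99th percentile of the nominal distances.
threshold = np.quantile(mahalanobis_sq(nominal, mean, cov), 0.99)
samples = np.array([[0.1, -0.2, 0.0],     # close to nominal behavior
                    [6.0, -6.0, 6.0]])    # far from nominal behavior
flags = mahalanobis_sq(samples, mean, cov) > threshold
```

Because scoring a new sample is a single quadratic form, this kind of model is cheap enough to evaluate in real time.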

This research is expected to serve as a practical addition to the existing literature while possessing the potential to be adopted in realistic engineering applications.

Contributors: Li, Guoyi (Ph.D.) (Author) / Chattopadhyay, Aditi (Thesis advisor) / Mignolet, Marc (Committee member) / Papandreou-Suppappola, Antonia (Committee member) / Yekani Fard, Masoud (Committee member) / Jiang, Hanqing (Committee member) / Arizona State University (Publisher)

Created: 2019

Finding Community in Learning: Encouraging Group Learning and Cohesiveness in the Workplace

Description

This action research project centered on a group of instructional technology professionals who provide support to instructors at a public university in the United States. The practical goal of this project was to increase collaboration within the team, and to encourage alignment of the team’s efforts in relation to the university’s proposed redesign of its general education curriculum. Using the communities of practice perspective as a model for the team’s development, participants engaged in a sixteen-week activity in which they studied and discussed aspects of the proposed curriculum, and then used that knowledge to observe classes and compare the extent to which classroom pedagogy at the time aligned with the aims of the proposed curriculum. This qualitative action research study then explored how the team used these experiences to construct knowledge and the extent to which the group came to resemble a community of practice. Additionally, this study explored the changes that took place in the group’s capacity to interpret instructional environments. The first major finding was that the group’s identity changed from being one characterized by relationship management with their clientele to one that aligned with the institution’s instructional priorities and could be projected into the future to devise coordinated plans in support of those priorities. A second major finding was that the team developed a group-specific language and a rudimentary capacity to interpret instructional environments as a group.

Contributors: Lang, Andrew (Author) / Gee, Elisabeth (Thesis advisor) / Koro-Ljungberg, Mirka (Committee member) / Hogan, Kelly (Committee member) / Arizona State University (Publisher)

Created: 2019

Model Based Automatic and Robust Spike Sorting for Large Volumes of Multi-channel Extracellular Data

Description

Spike sorting is a critical step for single-unit-based analysis of neural activities extracellularly and simultaneously recorded using multi-channel electrodes. When dealing with recordings from very large numbers of neurons, existing methods, which are mostly semiautomatic in nature, become inadequate.

This dissertation aims at automating the spike sorting process. A high-performance, automatic and computationally efficient spike detection and clustering system, namely the M-Sorter2, is presented. The M-Sorter2 employs the modified multiscale correlation of wavelet coefficients (MCWC) for neural spike detection. At the center of the proposed M-Sorter2 are two automatic spike clustering methods. They share a common hierarchical agglomerative modeling (HAM) model search procedure to strategically form a sequence of mixture models, and a new model selection criterion called difference of model evidence (DoME) to automatically determine the number of clusters. The two methods differ in how they infer model parameters during clustering: one uses robust variational Bayes (RVB) and the other uses robust Expectation-Maximization (REM) for Student’s 𝑡-mixture modeling. The M-Sorter2 is thus a significantly improved approach to sorting as an automatic procedure.

M-Sorter2 was evaluated and benchmarked against popular algorithms using simulated, artificial and real data with ground truth that are openly available to researchers. Simulated datasets with known statistical distributions were first used to illustrate how the clustering algorithms, namely REMHAM and RVBHAM, provide robust clustering results under commonly experienced performance-degrading conditions, such as random initialization of parameters, high dimensionality of data, low signal-to-noise ratio (SNR), ambiguous clusters, and asymmetry in cluster sizes. For the artificial dataset from single-channel recordings, the proposed sorter outperformed Wave_Clus, Plexon’s Offline Sorter and Klusta in most of the comparison cases. For the real dataset from multi-channel electrodes, tetrodes and polytrodes, the proposed sorter outperformed all comparison algorithms in terms of false positive and false negative rates. The software package presented in this dissertation is available for open access.
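The robustness of Student's t modeling comes from down-weighting outlying points in the E-step. The sketch below shows the textbook EM updates for a single multivariate t component with fixed degrees of freedom. It is not the REM or RVB algorithm of the M-Sorter2, which handle full mixtures and model selection; the function name, `nu=3`, and the synthetic data are assumptions.

```python
import numpy as np

def robust_t_fit(X, nu=3.0, iters=30):
    """EM for the location and scatter of one multivariate Student's t component (nu fixed)."""
    n, d = X.shape
    mean = X.mean(axis=0)
    scatter = np.cov(X, rowvar=False)
    for _ in range(iters):
        diff = X - mean
        delta = np.einsum("ij,ij->i", diff @ np.linalg.inv(scatter), diff)
        u = (nu + d) / (nu + delta)                      # E-step: outliers get small weights
        mean = (u[:, None] * X).sum(axis=0) / u.sum()    # M-step: weighted location
        diff = X - mean
        scatter = (u[:, None] * diff).T @ diff / n       # M-step: weighted scatter
    return mean, scatter

rng = np.random.default_rng(3)
inliers = rng.normal(size=(200, 2))                                       # cluster centered at the origin
outliers = np.full((20, 2), 20.0) + rng.normal(scale=0.1, size=(20, 2))   # contaminating points
X = np.vstack([inliers, outliers])
robust_mean, _ = robust_t_fit(X)
sample_mean = X.mean(axis=0)
```

On data like this, the sample mean is pulled toward the contaminating points while the t-based estimate stays near the main cluster, which is the behavior that makes t-mixtures attractive for noisy spike waveforms.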

Contributors: Ma, Weichao (Author) / Si, Jennie (Thesis advisor) / Papandreou-Suppappola, Antonia (Committee member) / He, Jingrui (Committee member) / Helms Tillery, Stephen (Committee member) / Arizona State University (Publisher)

Created: 2019

Remote Sensing For Vital Signs Monitoring Using Advanced Radar Signal Processing Techniques

Description

In the past half century, low-power wireless signals from portable radar sensors, initially continuous-wave (CW) radars and more recently ultra-wideband (UWB) radar systems, have been successfully used to detect physiological movements of stationary human beings.

The thesis starts with a careful review of existing signal processing techniques and state-of-the-art methods for vital signs monitoring using UWB impulse systems. Then an in-depth analysis of various approaches is presented.

Robust heart-rate monitoring methods are proposed based on a novel result: spectrally, the fundamental heartbeat frequency is respiration-interference-limited while its higher-order harmonics are noise-limited. The higher-order statistics related to heartbeat can be a robust indication when the fundamental heartbeat is masked by the strong lower-order harmonics of respiration or, if a phase-based method is used, when phase calibration is not accurate. Analytical spectral analysis is performed to validate that the higher-order harmonics of heartbeat are almost free of respiration interference. Extensive experiments have been conducted to justify an adaptive heart-rate monitoring algorithm. The scenarios of interest are: 1) a single subject, 2) multiple subjects at different ranges, 3) multiple subjects at the same range, and 4) through-wall monitoring.
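The benefit of looking at a heartbeat harmonic instead of the fundamental can be illustrated with a small spectral-peak sketch. It is not the adaptive algorithm of the thesis: the band limits, the Hann window, and the synthetic chest-motion signal (including every amplitude and frequency) are assumptions chosen so that respiration harmonics mask the heartbeat fundamental.

```python
import numpy as np

def heart_rate_from_harmonic(signal, fs, harmonic=2, hr_band=(0.8, 2.0)):
    """Heart rate in Hz from the spectral peak in the band of the given heartbeat harmonic."""
    windowed = (signal - signal.mean()) * np.hanning(len(signal))
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    lo, hi = harmonic * hr_band[0], harmonic * hr_band[1]
    band = (freqs >= lo) & (freqs <= hi)
    peak = freqs[band][np.argmax(spectrum[band])]
    return peak / harmonic

fs = 20.0                               # assumed slow-time sampling rate in Hz
t = np.arange(1200) / fs                # 60 s observation window
# Respiration at 0.25 Hz with decaying harmonics up to 1.25 Hz (assumed amplitudes).
resp = sum(0.5 ** (k - 1) * np.sin(2 * np.pi * 0.25 * k * t) for k in range(1, 6))
# Heartbeat at 1.2 Hz (72 beats per minute) with a weaker second harmonic at 2.4 Hz.
heart = 0.02 * np.sin(2 * np.pi * 1.2 * t) + 0.01 * np.sin(2 * np.pi * 2.4 * t)
noise = 0.001 * np.random.default_rng(5).normal(size=t.shape)
x = resp + heart + noise

hr_fundamental = heart_rate_from_harmonic(x, fs, harmonic=1)   # band shared with respiration harmonics
hr_harmonic = heart_rate_from_harmonic(x, fs, harmonic=2)      # band limited mainly by noise
```

In this toy signal the fundamental band is dominated by a respiration harmonic, so the fundamental-based estimate is biased, while the second-harmonic band contains only the heartbeat component and noise.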

A remote sensing radar system implemented using the proposed adaptive heart-rate estimation algorithm is compared to the competing remote sensing technology, a remote imaging photoplethysmography system, showing promising results.

State-of-the-art methods for vital signs monitoring fundamentally rely on processing the phase variation due to vital signs motions. Their performance is determined by a phase calibration procedure. Existing methods fail to consider the time-varying nature of phase noise. There is no prior knowledge about which of the corrupted complex signal components, the in-phase component (I) or the quadrature component (Q), needs to be corrected. A precise phase calibration routine is proposed based on the respiration pattern. The I/Q samples from every breath are more likely to experience similar motion noise and therefore they should be corrected independently. A high slow-time sampling rate is used to ensure phase calibration accuracy. Occasionally, a 180-degree phase shift error occurs after the initial calibration step and should be corrected as well. All phase trajectories in the I/Q plot are only allowed in certain angular spaces. This precise phase calibration routine is validated through computer simulations incorporating a time-varying phase noise model, a controlled mechanical system, and a human subject experiment.

Contributors: Rong, Yu (Author) / Bliss, Daniel W (Thesis advisor) / Richmond, Christ D (Committee member) / Tepedelenlioğlu, Cihan (Committee member) / Alkhateeb, Ahmed (Committee member) / Arizona State University (Publisher)

Created: 2018
