Search Content

Invariant human pose feature extraction for movement recognition and pose estimation

Description

Reliable extraction of human pose features that are invariant to view angle and body shape changes is critical for advancing human movement analysis. In this dissertation, the multifactor analysis techniques, including the multilinear analysis and the multifactor Gaussian process methods, have been exploited to extract such invariant pose features from…

Reliable extraction of human pose features that are invariant to view angle and body shape changes is critical for advancing human movement analysis. In this dissertation, the multifactor analysis techniques, including the multilinear analysis and the multifactor Gaussian process methods, have been exploited to extract such invariant pose features from video data by decomposing various key contributing factors, such as pose, view angle, and body shape, in the generation of the image observations. Experimental results have shown that the resulting pose features extracted using the proposed methods exhibit excellent invariance properties to changes in view angles and body shapes. Furthermore, using the proposed invariant multifactor pose features, a suite of simple while effective algorithms have been developed to solve the movement recognition and pose estimation problems. Using these proposed algorithms, excellent human movement analysis results have been obtained, and most of them are superior to those obtained from state-of-the-art algorithms on the same testing datasets. Moreover, a number of key movement analysis challenges, including robust online gesture spotting and multi-camera gesture recognition, have also been addressed in this research. To this end, an online gesture spotting framework has been developed to automatically detect and learn non-gesture movement patterns to improve gesture localization and recognition from continuous data streams using a hidden Markov network. In addition, the optimal data fusion scheme has been investigated for multicamera gesture recognition, and the decision-level camera fusion scheme using the product rule has been found to be optimal for gesture recognition using multiple uncalibrated cameras. Furthermore, the challenge of optimal camera selection in multi-camera gesture recognition has also been tackled. A measure to quantify the complementary strength across cameras has been proposed. Experimental results obtained from a real-life gesture recognition dataset have shown that the optimal camera combinations identified according to the proposed complementary measure always lead to the best gesture recognition results.

ContributorsPeng, Bo (Author) / Qian, Gang (Thesis advisor) / Ye, Jieping (Committee member) / Li, Baoxin (Committee member) / Spanias, Andreas (Committee member) / Arizona State University (Publisher)

Created2011

Fully differential difference amplifier based microphone interface circuit and an adaptive signal to noise ratio analog front end for dual channel digital hearing aids

Description

A dual-channel directional digital hearing aid (DHA) front-end using a fully differential difference amplifier (FDDA) based Microphone interface circuit (MIC) for a capacitive Micro Electro Mechanical Systems (MEMS) microphones and an adaptive-power analog font end (AFE) is presented. The Microphone interface circuit based on FDDA converts…

A dual-channel directional digital hearing aid (DHA) front-end using a fully differential difference amplifier (FDDA) based Microphone interface circuit (MIC) for a capacitive Micro Electro Mechanical Systems (MEMS) microphones and an adaptive-power analog font end (AFE) is presented. The Microphone interface circuit based on FDDA converts the capacitance variations into voltage signal, achieves a noise of 32 dB SPL (sound pressure level) and an SNR of 72 dB, additionally it also performs single to differential conversion allowing for fully differential analog signal chain. The analog front-end consists of 40dB VGA and a power scalable continuous time sigma delta ADC, with 68dB SNR dissipating 67u¬W from a 1.2V supply. The ADC implements a self calibrating feedback DAC, for calibrating the 2nd order non-linearity. The VGA and power scalable ADC is fabricated on 0.25 um CMOS TSMC process. The dual channels of the DHA are precisely matched and achieve about 0.5dB gain mismatch, resulting in greater than 5dB directivity index. This will enable a highly integrated and low power DHA

ContributorsNaqvi, Syed Roomi (Author) / Kiaei, Sayfe (Thesis advisor) / Bakkaloglu, Bertan (Committee member) / Chae, Junseok (Committee member) / Barnby, Hugh (Committee member) / Aberle, James T., 1961- (Committee member) / Arizona State University (Publisher)

Created2011

Practical coding schemes for multi-user communications

Description

There are many wireless communication and networking applications that require high transmission rates and reliability with only limited resources in terms of bandwidth, power, hardware complexity etc.. Real-time video streaming, gaming and social networking are a few such examples. Over the years many problems have been addressed towards the goal…

There are many wireless communication and networking applications that require high transmission rates and reliability with only limited resources in terms of bandwidth, power, hardware complexity etc.. Real-time video streaming, gaming and social networking are a few such examples. Over the years many problems have been addressed towards the goal of enabling such applications; however, significant challenges still remain, particularly, in the context of multi-user communications. With the motivation of addressing some of these challenges, the main focus of this dissertation is the design and analysis of capacity approaching coding schemes for several (wireless) multi-user communication scenarios. Specifically, three main themes are studied: superposition coding over broadcast channels, practical coding for binary-input binary-output broadcast channels, and signalling schemes for two-way relay channels. As the first contribution, we propose an analytical tool that allows for reliable comparison of different practical codes and decoding strategies over degraded broadcast channels, even for very low error rates for which simulations are impractical. The second contribution deals with binary-input binary-output degraded broadcast channels, for which an optimal encoding scheme that achieves the capacity boundary is found, and a practical coding scheme is given by concatenation of an outer low density parity check code and an inner (non-linear) mapper that induces desired distribution of "one" in a codeword. The third contribution considers two-way relay channels where the information exchange between two nodes takes place in two transmission phases using a coding scheme called physical-layer network coding. At the relay, a near optimal decoding strategy is derived using a list decoding algorithm, and an approximation is obtained by a joint decoding approach. For the latter scheme, an analytical approximation of the word error rate based on a union bounding technique is computed under the assumption that linear codes are employed at the two nodes exchanging data. Further, when the wireless channel is frequency selective, two decoding strategies at the relay are developed, namely, a near optimal decoding scheme implemented using list decoding, and a reduced complexity detection/decoding scheme utilizing a linear minimum mean squared error based detector followed by a network coded sequence decoder.

ContributorsBhat, Uttam (Author) / Duman, Tolga M. (Thesis advisor) / Tepedelenlioğlu, Cihan (Committee member) / Li, Baoxin (Committee member) / Zhang, Junshan (Committee member) / Arizona State University (Publisher)

Created2011

Temperature compensated, high common mode range, Cu-trace based current shunt monitors design and analysis

Description

Sensing and controlling current flow is a fundamental requirement for many electronic systems, including power management (DC-DC converters and LDOs), battery chargers, electric vehicles, solenoid positioning, motor control, and power monitoring. Current Shunt Monitor (CSM) systems have various applications for precise current monitoring of those aforementioned applications. CSMs enable current…

Sensing and controlling current flow is a fundamental requirement for many electronic systems, including power management (DC-DC converters and LDOs), battery chargers, electric vehicles, solenoid positioning, motor control, and power monitoring. Current Shunt Monitor (CSM) systems have various applications for precise current monitoring of those aforementioned applications. CSMs enable current measurement across an external sense resistor (RS) in series to current flow. Two different types of CSMs designed and characterized in this paper. First design used direct current reading method and the other design used indirect current reading method. Proposed CSM systems can sense power supply current ranging from 1mA to 200mA for the direct current reading topology and from 1mA to 500mA for the indirect current reading topology across a typical board Cu-trace resistance of 1 ohm with less than 10 µV input-referred offset, 0.3 µV/°C offset drift and 0.1% accuracy for both topologies. Proposed systems avoid using a costly zero-temperature coefficient (TC) sense resistor that is normally used in typical CSM systems. Instead, both of the designs used existing Cu-trace on the printed circuit board (PCB) in place of the costly resistor. The systems use chopper stabilization at the front-end amplifier signal path to suppress input-referred offset down to less than 10 µV. Switching current-mode (SI) FIR filtering technique is used at the instrumentation amplifier output to filter out the chopping ripple caused by input offset and flicker noise by averaging half of the phase 1 signal and the other half of the phase 2 signal. In addition, residual offset mainly caused by clock feed-through and charge injection of the chopper switches at the chopping frequency and its multiple frequencies notched out by the since response of the SI-FIR filter. A frequency domain Sigma Delta ADC which is used for the indirect current reading type design enables a digital interface to processor applications with minimally added circuitries to build a simple 1st order Sigma Delta ADC. The CSMs are fabricated on a 0.7µm CMOS process with 3 levels of metal, with maximum Vds tolerance of 8V and operates across a common mode range of 0 to 26V for the direct current reading type and of 0 to 30V for the indirect current reading type achieving less than 10nV/sqrtHz of flicker noise at 100 Hz for both approaches. By using a semi-digital SI-FIR filter, residual chopper offset is suppressed down to 0.5mVpp from a baseline of 8mVpp, which is equivalent to 25dB suppression.

ContributorsYeom, Hyunsoo (Author) / Bakkaloglu, Bertan (Thesis advisor) / Kiaei, Sayfe (Committee member) / Ozev, Sule (Committee member) / Yu, Hongyu (Committee member) / Arizona State University (Publisher)

Created2011

Dissertation on linear asset pricing models

Description

One necessary condition for the two-pass risk premium estimator to be consistent and asymptotically normal is that the rank of the beta matrix in a proposed linear asset pricing model is full column. I first investigate the asymptotic properties of the risk premium estimators and the related t-test and…

One necessary condition for the two-pass risk premium estimator to be consistent and asymptotically normal is that the rank of the beta matrix in a proposed linear asset pricing model is full column. I first investigate the asymptotic properties of the risk premium estimators and the related t-test and Wald test statistics when the full rank condition fails. I show that the beta risk of useless factors or multiple proxy factors for a true factor are priced more often than they should be at the nominal size in the asset pricing models omitting some true factors. While under the null hypothesis that the risk premiums of the true factors are equal to zero, the beta risk of the true factors are priced less often than the nominal size. The simulation results are consistent with the theoretical findings. Hence, the factor selection in a proposed factor model should not be made solely based on their estimated risk premiums. In response to this problem, I propose an alternative estimation of the underlying factor structure. Specifically, I propose to use the linear combination of factors weighted by the eigenvectors of the inner product of estimated beta matrix. I further propose a new method to estimate the rank of the beta matrix in a factor model. For this method, the idiosyncratic components of asset returns are allowed to be correlated both over different cross-sectional units and over different time periods. The estimator I propose is easy to use because it is computed with the eigenvalues of the inner product of an estimated beta matrix. Simulation results show that the proposed method works well even in small samples. The analysis of US individual stock returns suggests that there are six common risk factors in US individual stock returns among the thirteen factor candidates used. The analysis of portfolio returns reveals that the estimated number of common factors changes depending on how the portfolios are constructed. The number of risk sources found from the analysis of portfolio returns is generally smaller than the number found in individual stock returns.

ContributorsWang, Na (Author) / Ahn, Seung C. (Thesis advisor) / Kallberg, Jarl G. (Committee member) / Liu, Crocker H. (Committee member) / Arizona State University (Publisher)

Created2011

Mining semantics from low-level features in multimedia computing

Description

Bridging semantic gap is one of the fundamental problems in multimedia computing and pattern recognition. The challenge of associating low-level signal with their high-level semantic interpretation is mainly due to the fact that semantics are often conveyed implicitly in a context, relying on interactions among multiple levels of concepts or…

Bridging semantic gap is one of the fundamental problems in multimedia computing and pattern recognition. The challenge of associating low-level signal with their high-level semantic interpretation is mainly due to the fact that semantics are often conveyed implicitly in a context, relying on interactions among multiple levels of concepts or low-level data entities. Also, additional domain knowledge may often be indispensable for uncovering the underlying semantics, but in most cases such domain knowledge is not readily available from the acquired media streams. Thus, making use of various types of contextual information and leveraging corresponding domain knowledge are vital for effectively associating high-level semantics with low-level signals with higher accuracies in multimedia computing problems. In this work, novel computational methods are explored and developed for incorporating contextual information/domain knowledge in different forms for multimedia computing and pattern recognition problems. Specifically, a novel Bayesian approach with statistical-sampling-based inference is proposed for incorporating a special type of domain knowledge, spatial prior for the underlying shapes; cross-modality correlations via Kernel Canonical Correlation Analysis is explored and the learnt space is then used for associating multimedia contents in different forms; model contextual information as a graph is leveraged for regulating interactions among high-level semantic concepts (e.g., category labels), low-level input signal (e.g., spatial/temporal structure). Four real-world applications, including visual-to-tactile face conversion, photo tag recommendation, wild web video classification and unconstrained consumer video summarization, are selected to demonstrate the effectiveness of the approaches. These applications range from classic research challenges to emerging tasks in multimedia computing. Results from experiments on large-scale real-world data with comparisons to other state-of-the-art methods and subjective evaluations with end users confirmed that the developed approaches exhibit salient advantages, suggesting that they are promising for leveraging contextual information/domain knowledge for a wide range of multimedia computing and pattern recognition problems.

ContributorsWang, Zhesheng (Author) / Li, Baoxin (Thesis advisor) / Sundaram, Hari (Committee member) / Qian, Gang (Committee member) / Ye, Jieping (Committee member) / Arizona State University (Publisher)

Created2011

Finding provenance data in social media

Description

A statement appearing in social media provides a very significant challenge for determining the provenance of the statement. Provenance describes the origin, custody, and ownership of something. Most statements appearing in social media are not published with corresponding provenance data. However, the same characteristics that make the social media environment…

A statement appearing in social media provides a very significant challenge for determining the provenance of the statement. Provenance describes the origin, custody, and ownership of something. Most statements appearing in social media are not published with corresponding provenance data. However, the same characteristics that make the social media environment challenging, including the massive amounts of data available, large numbers of users, and a highly dynamic environment, provide unique and untapped opportunities for solving the provenance problem for social media. Current approaches for tracking provenance data do not scale for online social media and consequently there is a gap in provenance methodologies and technologies providing exciting research opportunities. The guiding vision is the use of social media information itself to realize a useful amount of provenance data for information in social media. This departs from traditional approaches for data provenance which rely on a central store of provenance information. The contemporary online social media environment is an enormous and constantly updated "central store" that can be mined for provenance information that is not readily made available to the average social media user. This research introduces an approach and builds a foundation aimed at realizing a provenance data capability for social media users that is not accessible today.

ContributorsBarbier, Geoffrey P (Author) / Liu, Huan (Thesis advisor) / Bell, Herbert (Committee member) / Li, Baoxin (Committee member) / Sen, Arunabha (Committee member) / Arizona State University (Publisher)

Created2011

A 280 mW, 0.07 % THD+N class-D audio amplifier using a frequency-domain quantizer

Description

Pulse Density Modulation- (PDM-) based class-D amplifiers can reduce non-linearity and tonal content due to carrier signal in Pulse Width Modulation - (PWM-) based amplifiers. However, their low-voltage analog implementations also require a linear- loop filter and a quantizer. A PDM-based class-D audio amplifier using a frequency-domain quantization is presented…

Pulse Density Modulation- (PDM-) based class-D amplifiers can reduce non-linearity and tonal content due to carrier signal in Pulse Width Modulation - (PWM-) based amplifiers. However, their low-voltage analog implementations also require a linear- loop filter and a quantizer. A PDM-based class-D audio amplifier using a frequency-domain quantization is presented in this paper. The digital-intensive frequency domain approach achieves high linearity under low-supply regimes. An analog comparator and a single-bit quantizer are replaced with a Current-Controlled Oscillator- (ICO-) based frequency discriminator. By using the ICO as a phase integrator, a third-order noise shaping is achieved using only two analog integrators. A single-loop, singlebit class-D audio amplifier is presented with an H-bridge switching power stage, which is designed and fabricated on a 0.18 um CMOS process, with 6 layers of metal achieving a total harmonic distortion plus noise (THD+N) of 0.065% and a peak power efficiency of 80% while driving a 4-ohms loudspeaker load. The amplifier can deliver the output power of 280 mW.

ContributorsLee, Junghan (Author) / Bakkaloglu, Bertan (Thesis advisor) / Kiaei, Sayfe (Committee member) / Ozev, Sule (Committee member) / Song, Hongjiang (Committee member) / Arizona State University (Publisher)

Created2011

Multi-label dimensionality reduction

Description

Multi-label learning, which deals with data associated with multiple labels simultaneously, is ubiquitous in real-world applications. To overcome the curse of dimensionality in multi-label learning, in this thesis I study multi-label dimensionality reduction, which extracts a small number of features by removing the irrelevant, redundant, and noisy information while considering…

Multi-label learning, which deals with data associated with multiple labels simultaneously, is ubiquitous in real-world applications. To overcome the curse of dimensionality in multi-label learning, in this thesis I study multi-label dimensionality reduction, which extracts a small number of features by removing the irrelevant, redundant, and noisy information while considering the correlation among different labels in multi-label learning. Specifically, I propose Hypergraph Spectral Learning (HSL) to perform dimensionality reduction for multi-label data by exploiting correlations among different labels using a hypergraph. The regularization effect on the classical dimensionality reduction algorithm known as Canonical Correlation Analysis (CCA) is elucidated in this thesis. The relationship between CCA and Orthonormalized Partial Least Squares (OPLS) is also investigated. To perform dimensionality reduction efficiently for large-scale problems, two efficient implementations are proposed for a class of dimensionality reduction algorithms, including canonical correlation analysis, orthonormalized partial least squares, linear discriminant analysis, and hypergraph spectral learning. The first approach is a direct least squares approach which allows the use of different regularization penalties, but is applicable under a certain assumption; the second one is a two-stage approach which can be applied in the regularization setting without any assumption. Furthermore, an online implementation for the same class of dimensionality reduction algorithms is proposed when the data comes sequentially. A Matlab toolbox for multi-label dimensionality reduction has been developed and released. The proposed algorithms have been applied successfully in the Drosophila gene expression pattern image annotation. The experimental results on some benchmark data sets in multi-label learning also demonstrate the effectiveness and efficiency of the proposed algorithms.

ContributorsSun, Liang (Author) / Ye, Jieping (Thesis advisor) / Li, Baoxin (Committee member) / Liu, Huan (Committee member) / Mittelmann, Hans D. (Committee member) / Arizona State University (Publisher)

Created2011

A 5 GHz ring-oscillator PLL with active delay-discriminator phase noise cancellation loop

Description

Voltage Control Oscillator (VCO) is one of the most critical blocks in Phase Lock Loops (PLLs). LC-tank VCOs have a superior phase noise performance, however they require bulky passive resonators and often calibration architectures to overcome their limited tuning range. Ring oscillator (RO) based VCOs are attractive for digital technology…

Voltage Control Oscillator (VCO) is one of the most critical blocks in Phase Lock Loops (PLLs). LC-tank VCOs have a superior phase noise performance, however they require bulky passive resonators and often calibration architectures to overcome their limited tuning range. Ring oscillator (RO) based VCOs are attractive for digital technology applications owing to their ease of integration, small die area and scalability in deep submicron processes. However, due to their supply sensitivity and poor phase noise performance, they have limited use in applications demanding low phase noise floor, such as wireless or optical transceivers. Particularly, out-of-band phase noise of RO-based PLLs is dominated by RO performance, which cannot be suppressed by the loop gain, impairing RF receiver's sensitivity or BER of optical clock-data recovery circuits. Wide loop bandwidth PLLs can overcome RO noise penalty, however, they suffer from increased in-band noise due to reference clock, phase-detector and charge-pump. The RO phase noise is determined by the noise coming from active devices, supply, ground and substrate. The authors adopt an auxiliary circuit with inverse delay sensitivity to supply noise, which compensates for the delay variation of inverter cells. Feed-forward noise-cancelling architecture that improves phase noise characteristic of RO based PLLs is presented. The proposed circuit dynamically attenuates RO phase noise contribution outside the PLL bandwidth, or in a preferred band. The implemented noise-cancelling loop potentially enables application of RO based PLL for demanding frequency synthesizers applications, such as optical links or high-speed serial I/Os.

ContributorsMin, Seungkee (Author) / Kiaei, Sayfe (Thesis advisor) / Bakkaloglu, Bertan (Committee member) / Ozev, Sule (Committee member) / Towe, Bruce (Committee member) / Arizona State University (Publisher)

Created2011

Filtering by