Invariant human pose feature extraction for movement recognition and pose estimation

Description

Reliable extraction of human pose features that are invariant to view angle and body shape changes is critical for advancing human movement analysis. In this dissertation, multifactor analysis techniques, including multilinear analysis and multifactor Gaussian process methods, have been exploited to extract such invariant pose features from video data by decomposing various key contributing factors, such as pose, view angle, and body shape, in the generation of the image observations. Experimental results have shown that the resulting pose features extracted using the proposed methods exhibit excellent invariance properties to changes in view angles and body shapes. Furthermore, using the proposed invariant multifactor pose features, a suite of simple yet effective algorithms has been developed to solve the movement recognition and pose estimation problems. Using these proposed algorithms, excellent human movement analysis results have been obtained, and most of them are superior to those obtained from state-of-the-art algorithms on the same testing datasets. Moreover, a number of key movement analysis challenges, including robust online gesture spotting and multi-camera gesture recognition, have also been addressed in this research. To this end, an online gesture spotting framework has been developed to automatically detect and learn non-gesture movement patterns to improve gesture localization and recognition from continuous data streams using a hidden Markov network. In addition, the optimal data fusion scheme has been investigated for multi-camera gesture recognition, and the decision-level camera fusion scheme using the product rule has been found to be optimal for gesture recognition using multiple uncalibrated cameras. Furthermore, the challenge of optimal camera selection in multi-camera gesture recognition has also been tackled. A measure to quantify the complementary strength across cameras has been proposed. Experimental results obtained from a real-life gesture recognition dataset have shown that the optimal camera combinations identified according to the proposed complementary measure always lead to the best gesture recognition results.
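
As a rough illustration of decision-level fusion with the product rule, the sketch below multiplies the class posteriors reported by each camera and picks the class with the largest product. It is a generic textbook form, not the dissertation's implementation; the function name `product_rule_fusion` and the input layout are assumptions made for this example.

```python
import math


def product_rule_fusion(camera_posteriors):
    """Fuse per-camera class posteriors with the product rule.

    camera_posteriors[c][k] is the (hypothetical) posterior that camera c
    assigns to gesture class k. Returns (best_class_index, fused_posteriors).
    """
    num_classes = len(camera_posteriors[0])
    # Sum log-probabilities instead of multiplying, for numerical stability.
    log_scores = [0.0] * num_classes
    for posteriors in camera_posteriors:
        for k, p in enumerate(posteriors):
            log_scores[k] += math.log(max(p, 1e-12))  # floor avoids log(0)
    # Normalise the products back into a probability vector.
    peak = max(log_scores)
    unnormalised = [math.exp(s - peak) for s in log_scores]
    total = sum(unnormalised)
    fused = [u / total for u in unnormalised]
    best = max(range(num_classes), key=fused.__getitem__)
    return best, fused
```

Because the rule only needs each camera's class scores, it does not require the cameras to be calibrated against each other, which is consistent with the uncalibrated setting described above.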

Contributors: Peng, Bo (Author) / Qian, Gang (Thesis advisor) / Ye, Jieping (Committee member) / Li, Baoxin (Committee member) / Spanias, Andreas (Committee member) / Arizona State University (Publisher)

Created: 2011

Ethernet passive optical network dynamic bandwidth allocation study

Description

The Fiber-Wireless (FiWi) network is a future network configuration that uses optical fiber as the backbone transmission medium and provides wireless access for the end user. Our study focuses on the Dynamic Bandwidth Allocation (DBA) algorithm for EPON upstream transmission. DBA, if designed properly, can dramatically improve the packet transmission delay and overall bandwidth utilization. With new DBA components emerging from research, this thesis conducts a comprehensive study of DBA, adding Double Phase Polling coupled with a novel Limited with Share credits Excess distribution method. By conducting a series of simulations of DBAs using different components, we found that grant sizing has the strongest impact on average packet delay, and grant scheduling also has a significant impact on the average packet delay. Grant scheduling has the strongest impact on the stability limit, or maximum achievable channel utilization, whereas grant sizing has only a modest impact on the stability limit. The SPD grant scheduling policy in the Double Phase Polling scheduling framework, coupled with Limited with Share credits Excess distribution grant sizing, produced both the lowest average packet delay and the highest stability limit.

Contributors: Zhao, Du (Author) / Reisslein, Martin (Thesis advisor) / McGarry, Michael (Committee member) / Fowler, John (Committee member) / Arizona State University (Publisher)

Created: 2011

Practical coding schemes for multi-user communications

Description

There are many wireless communication and networking applications that require high transmission rates and reliability with only limited resources in terms of bandwidth, power, hardware complexity, etc. Real-time video streaming, gaming and social networking are a few such examples. Over the years many problems have been addressed towards the goal of enabling such applications; however, significant challenges still remain, particularly in the context of multi-user communications. With the motivation of addressing some of these challenges, the main focus of this dissertation is the design and analysis of capacity-approaching coding schemes for several (wireless) multi-user communication scenarios. Specifically, three main themes are studied: superposition coding over broadcast channels, practical coding for binary-input binary-output broadcast channels, and signalling schemes for two-way relay channels. As the first contribution, we propose an analytical tool that allows for reliable comparison of different practical codes and decoding strategies over degraded broadcast channels, even for very low error rates for which simulations are impractical. The second contribution deals with binary-input binary-output degraded broadcast channels, for which an optimal encoding scheme that achieves the capacity boundary is found, and a practical coding scheme is given by concatenation of an outer low-density parity-check code and an inner (non-linear) mapper that induces the desired distribution of "one" in a codeword. The third contribution considers two-way relay channels where the information exchange between two nodes takes place in two transmission phases using a coding scheme called physical-layer network coding. At the relay, a near-optimal decoding strategy is derived using a list decoding algorithm, and an approximation is obtained by a joint decoding approach. For the latter scheme, an analytical approximation of the word error rate based on a union bounding technique is computed under the assumption that linear codes are employed at the two nodes exchanging data. Further, when the wireless channel is frequency selective, two decoding strategies at the relay are developed, namely, a near-optimal decoding scheme implemented using list decoding, and a reduced-complexity detection/decoding scheme utilizing a linear minimum mean squared error based detector followed by a network coded sequence decoder.

Contributors: Bhat, Uttam (Author) / Duman, Tolga M. (Thesis advisor) / Tepedelenlioğlu, Cihan (Committee member) / Li, Baoxin (Committee member) / Zhang, Junshan (Committee member) / Arizona State University (Publisher)

Created: 2011

Performance of single layer H.264 SVC video over error prone networks

Description

With the tremendous increase in the popularity of networked multimedia applications, video data is expected to account for a large portion of the traffic on the Internet and, more importantly, on next-generation wireless systems. To satisfy a broad range of customers' requirements, two major problems need to be solved. The first problem is the need for a scalable representation of the input video. The recently developed scalable extension of the state-of-the-art H.264/MPEG-4 AVC video coding standard, also known as H.264/SVC (Scalable Video Coding), provides a solution to this problem. The second problem is that wireless transmission media typically introduce errors in the bit stream due to noise, congestion and fading on the channel. Protection against these channel impairments can be realized by the use of forward error correcting (FEC) codes. In this research study, the performance of scalable video coding in the presence of bit errors is studied. The encoded video is channel coded using Reed-Solomon codes to provide acceptable performance in the presence of channel impairments. In the scalable bit stream, some parts of the bit stream are more important than other parts. In the unequal error protection scheme, parity bytes are assigned to the video packets based on their importance. In the equal error protection scheme, parity bytes are assigned based on the length of the message. A quantitative comparison of the two schemes, along with the case where no channel coding is employed, is performed. H.264 SVC single layer video streams for long video sequences of different genres are considered in this study, which serves as a means of effective video characterization. The JSVM reference software, in its current version, does not support decoding of erroneous bit streams. A framework to obtain an H.264 SVC compatible bit stream is modeled in this study. It is concluded that assigning parity bytes based on the distribution of data for different types of frames provides optimum performance. Application of error protection to the bit stream enhances the quality of the decoded video with minimal overhead added to the bit stream.

Contributors: Sundararaman, Hari (Author) / Reisslein, Martin (Thesis advisor) / Seeling, Patrick (Committee member) / Tepedelenlioğlu, Cihan (Committee member) / Arizona State University (Publisher)

Created: 2011

On asynchronous communication systems: capacity bounds and relaying schemes

Description

Practical communication systems are subject to errors due to imperfect time alignment among the communicating nodes. Timing errors can occur in different forms depending on the underlying communication scenario. This doctoral study considers two different classes of asynchronous systems: point-to-point (P2P) communication systems with synchronization errors, and asynchronous cooperative systems. In particular, the focus is on an information theoretic analysis for P2P systems with synchronization errors and on developing new signaling solutions for several asynchronous cooperative communication systems. The first part of the dissertation presents several bounds on the capacity of P2P systems with synchronization errors. First, binary insertion and deletion channels are considered, where lower bounds on the mutual information between the input and output sequences are computed for independent uniformly distributed (i.u.d.) inputs. Then, a channel suffering from both synchronization errors and additive noise is considered as a serial concatenation of a synchronization error-only channel and an additive noise channel. It is proved that the capacity of the original channel is lower bounded in terms of the synchronization error-only channel capacity and the parameters of both channels. On a different front, to better characterize the deletion channel capacity, the capacities of three independent deletion channels with different deletion probabilities are related through an inequality, resulting in the tightest upper bound on the deletion channel capacity for deletion probabilities larger than 0.65. Furthermore, the first non-trivial upper bound on the 2K-ary input deletion channel capacity is provided by relating the 2K-ary input deletion channel capacity to the binary deletion channel capacity through an inequality. The second part of the dissertation develops two new relaying schemes to alleviate asynchronism issues in cooperative communications. The first one is a single carrier (SC)-based scheme providing a spectrally efficient Alamouti code structure at the receiver under flat fading channel conditions by reducing the overhead needed to overcome the asynchronism and obtain spatial diversity. The second one is an orthogonal frequency division multiplexing (OFDM)-based approach useful for asynchronous cooperative systems experiencing excessive relative delays among the relays under frequency-selective channel conditions; it achieves a delay diversity structure at the receiver and extracts spatial diversity.
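
To make the binary deletion channel mentioned above concrete, the following sketch simulates one: each input bit is dropped independently with a fixed probability, and the receiver sees only the surviving bits in order, with no indication of where deletions occurred. The function names and parameters are illustrative assumptions, and the code does not compute any of the capacity bounds from the dissertation.

```python
import random


def iud_bits(length, rng):
    """Draw independent, uniformly distributed (i.u.d.) input bits."""
    return [rng.randint(0, 1) for _ in range(length)]


def deletion_channel(bits, deletion_prob, rng):
    """Drop each bit independently with probability deletion_prob.

    Surviving bits keep their relative order; deletion positions are
    not revealed to the receiver.
    """
    return [b for b in bits if rng.random() >= deletion_prob]


def is_subsequence(short, long):
    """Return True if `short` can be obtained from `long` by deletions only."""
    remaining = iter(long)
    return all(any(x == y for y in remaining) for x in short)
```

A channel output is always a subsequence of its input, which is what makes synchronization errors harder to analyze than substitution errors: many different deletion patterns can produce the same received sequence.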

Contributors: Rahmati, Mojtaba (Author) / Duman, Tolga M. (Thesis advisor) / Zhang, Junshan (Committee member) / Tepedelenlioğlu, Cihan (Committee member) / Reisslein, Martin (Committee member) / Arizona State University (Publisher)

Created: 2013

Performance characterization of communication channels through asymptotic and partial ordering analysis

Description

Asymptotic comparisons of ergodic channel capacity at high and low signal-to-noise ratios (SNRs) are provided for several adaptive transmission schemes over fading channels with general distributions, including optimal power and rate adaptation, rate adaptation only, channel inversion and its variants. Analysis of the high-SNR pre-log constants of the ergodic capacity reveals the existence of constant capacity difference gaps among the schemes with a pre-log constant of 1. Closed-form expressions for these high-SNR capacity difference gaps are derived, which are proportional to the SNR loss between these schemes in dB scale. The largest of these gaps is found to be between the optimal power and rate adaptation scheme and the channel inversion scheme. Based on these expressions, it is shown that the presence of space diversity or multi-user diversity makes channel inversion arbitrarily close to achieving optimal capacity at high SNR with a sufficiently large number of antennas or users. A low-SNR analysis also reveals that the presence of fading provably always improves capacity at sufficiently low SNR, compared to the additive white Gaussian noise (AWGN) case. Numerical results are shown to corroborate our analytical results. This dissertation derives high-SNR asymptotic average error rates over fading channels by relating them to the outage probability, under mild assumptions. The analysis is based on the Tauberian theorem for Laplace-Stieltjes transforms, which is grounded in the notion of regular variation, and applies to a wider range of channel distributions than existing approaches. The theory of regular variation is argued to be the proper mathematical framework for finding sufficient and necessary conditions for outage events to dominate high-SNR error rate performance. It is proved that the diversity order being d and the cumulative distribution function (CDF) of the channel power gain having variation exponent d at 0 imply each other, provided that the instantaneous error rate is upper-bounded by an exponential function of the instantaneous SNR. High-SNR asymptotic average error rates are derived for specific instantaneous error rates. Compared to existing approaches in the literature, the asymptotic expressions are related to the channel distribution in a much simpler manner herein, and related to outage more intuitively. The high-SNR asymptotic error rate is also characterized under diversity combining schemes with the channel power gain of each branch having a regularly varying CDF. Numerical results are shown to corroborate our theoretical analysis. This dissertation studies several problems concerning channel inclusion, which is a partial ordering between discrete memoryless channels (DMCs) proposed by Shannon. Specifically, majorization-based conditions are derived for channel inclusion between certain DMCs. Furthermore, under general conditions, channel equivalence defined through Shannon ordering is shown to be the same as permutation of input and output symbols. The determination of channel inclusion is considered as a convex optimization problem, and the sparsity of the weights related to the representation of the worse DMC in terms of the better one is revealed when channel inclusion holds between two DMCs. To exploit this sparsity, an effective iterative algorithm is established based on modifying the orthogonal matching pursuit algorithm. The extension of channel inclusion to continuous channels and its application in ordering phase noises are briefly addressed.

Contributors: Zhang, Yuan (Author) / Tepedelenlioğlu, Cihan (Thesis advisor) / Zhang, Junshan (Committee member) / Reisslein, Martin (Committee member) / Spanias, Andreas (Committee member) / Arizona State University (Publisher)

Created: 2013

Ensuring safety of model-based generated code for pervasive health monitoring systems

Description

Wireless technologies for health monitoring systems have seen considerable interest in recent years owing to their potential to achieve the vision of pervasive healthcare, that is, healthcare for anyone, anywhere and anytime. The development of wearable wireless medical devices that can sense, compute, and send physiological information to a mobile gateway, forming a Body Sensor Network (BSN), is considered a step towards achieving the vision of pervasive health monitoring systems (PHMS). PHMS consisting of wearable body sensors encourage unsupervised long-term monitoring, reducing frequent hospital visits and nursing costs. Therefore, it is of utmost importance that the operation of PHMS be reliable and safe and that the systems have a long lifetime. Model-based automatic code generation provides state-of-the-art generation of sensor and smart phone code from a high-level specification of a PHMS. The code generator takes a meta-model of the PHMS specification as input, uses a codebase containing code templates and algorithms, and generates platform-specific code. Health-Dev, a framework for model-based development of PHMS, uses code generation to implement PHMS on sensors and smart phones. As a part of this thesis, model-based automatic code generation was evaluated and experimentally validated. The generated code was found to be safe, in that it contains no race conditions or array- or pointer-related errors, and more optimized than hand-written BSN benchmark code, in that it contains less unreachable code.

Contributors: Verma, Sunit (Author) / Gupta, Sandeep (Thesis advisor) / Tepedelenlioğlu, Cihan (Committee member) / Reisslein, Martin (Committee member) / Arizona State University (Publisher)

Created: 2013

Traffic characterization and modeling of H.264 scalable & multi view encoded video

Description

Present-day Internet Protocol (IP) based video transport and dissemination systems are heterogeneous in that they differ in network bandwidth, display resolutions and processing capabilities. One important objective in such an environment is the flexible adaptation of once-encoded content, and one popular method to achieve this is the scalable video coding (SVC) technique. The SVC extension of the H.264/AVC standard has higher compression efficiency when compared to the previous scalable video standards. The network transport of 3D video, which is obtained by superimposing two views of a video scene, poses significant challenges due to the increased video data compared to conventional single-view video. Addressing these challenges requires a thorough understanding of the traffic and multiplexing characteristics of the different representation formats of 3D video. In this study, H.264 quality scalability and multiview representation formats are examined. As H.264/AVC and its SVC and multiview extensions are expected to become widely adopted for the network transport of video, it is important to thoroughly study their network traffic characteristics, including the bit rate variability. The primary focus is on the SVC amendment of the H.264/AVC standard, in particular on Coarse-Grain Scalability (CGS) and Medium-Grain Scalability (MGS). We report on a large-scale study of the rate-distortion (RD) and rate variability-distortion (VD) characteristics of CGS and MGS. We also examine the RD and VD characteristics of three main multiview (3D) representation formats. Specifically, we compare multiview video (MV) representation and encoding, frame sequential (FS) representation, and side-by-side (SBS) representation, whereby conventional single-view encoding is employed for the FS and SBS representations. As a last step, we also examine video traffic modeling, which plays a major part in network traffic analysis. It is imperative to network design and simulation and to providing Quality of Service (QoS) to network applications, and it also provides insights into the coding process and structure of video sequences. We build our models on top of the recent unified traffic model developed by Dai et al. [1] for modeling MPEG-4 and H.264 VBR video traffic. We exploit the hierarchical prediction structure inherent in H.264 for intra-GoP (group of pictures) analysis.

Contributors: Pulipaka, Venkata Sai Akshay (Author) / Reisslein, Martin (Thesis advisor) / Karam, Lina (Thesis advisor) / Li, Baoxin (Committee member) / Seeling, Patrick (Committee member) / Arizona State University (Publisher)

Created: 2012

Exploring video denoising using matrix completion

Description

Video denoising has been an important task in many multimedia and computer vision applications. Recent developments in matrix completion theory and the emergence of new numerical methods that can efficiently solve the matrix completion problem have paved the way for exploration of new techniques for some classical image processing tasks. Recent literature shows that many computer vision and image processing problems can be solved by using matrix completion theory. This thesis explores the application of matrix completion in video denoising. A state-of-the-art video denoising algorithm in which the denoising task is modeled as a matrix completion problem is chosen for detailed study. The contribution of this thesis lies both in providing extensive analysis to bridge the gap in existing literature on the matrix completion framework for video denoising and in proposing some novel techniques to improve the performance of the chosen denoising algorithm. The chosen algorithm is implemented for thorough analysis. Experiments and discussions are presented to enable better understanding of the problem. Instability shown by the algorithm at some parameter values in a particular case of low levels of pure Gaussian noise is identified. Artifacts introduced in such cases are analyzed. A novel way of grouping structurally-relevant patches is proposed to improve the algorithm. Experiments show that this technique is useful, especially in videos containing high amounts of motion. Based on the observation that matrix completion is not suitable for denoising patches containing a relatively low amount of image detail, a framework is designed to separate patches corresponding to low-structured regions from a noisy image. Experiments are conducted in which such patches are not subjected to matrix completion but are instead denoised in a different way. The resulting improvement in performance suggests that denoising low-structured patches does not require a complex method like matrix completion and that it is in fact counter-productive to subject such patches to matrix completion. These results also indicate the inherent limitation of matrix completion in dealing with cases in which noise dominates the structural properties of an image. A novel method for introducing priorities to the ranked patches in matrix completion is also presented. Results show that this method yields improved performance in general. It is observed that the artifacts in the presence of low levels of pure Gaussian noise appear differently after introducing priorities to the patches and that the artifacts occur at a wider range of parameter values. Results and discussion suggesting future ways to explore this problem are also presented.

Contributors: Maguluri, Hima Bindu (Author) / Li, Baoxin (Thesis advisor) / Turaga, Pavan (Committee member) / Claveau, Claude (Committee member) / Arizona State University (Publisher)

Created: 2013

Distributed inference using bounded transmissions

Description

Distributed inference has applications in a wide range of fields such as source localization, target detection, environment monitoring, and healthcare. In this dissertation, distributed inference schemes which use bounded transmit power are considered. The performance of the proposed schemes is studied for a variety of inference problems. In the first part of the dissertation, a distributed detection scheme where the sensors transmit with constant modulus signals over a Gaussian multiple access channel is considered. The deflection coefficient of the proposed scheme is shown to depend on the characteristic function of the sensing noise, and the error exponent for the system is derived using large deviation theory. Optimization of the deflection coefficient and error exponent is considered with respect to a transmission phase parameter for a variety of sensing noise distributions, including impulsive ones. The proposed scheme is also favorably compared with existing amplify-and-forward (AF) and detect-and-forward (DF) schemes. The effect of fading is shown to be detrimental to the detection performance, and simulations are provided to corroborate the analytical results. The second part of the dissertation studies a distributed inference scheme which uses bounded transmission functions over a Gaussian multiple access channel. The conditions on the transmission functions under which consistent estimation and reliable detection are possible are characterized. For the distributed estimation problem, an estimation scheme that uses bounded transmission functions is proved to be strongly consistent provided that the variance of the noise samples is bounded and that the transmission function is one-to-one. The proposed estimation scheme is compared with the amplify-and-forward technique, and its robustness to impulsive sensing noise distributions is highlighted. It is also shown that bounded transmissions suffer from inconsistent estimates if the sensing noise variance goes to infinity. For the distributed detection problem, similar results are obtained by studying the deflection coefficient. Simulations corroborate our analytical results. In the third part of this dissertation, the problem of estimating the average of samples distributed at the nodes of a sensor network is considered. A distributed average consensus algorithm in which every sensor transmits with bounded peak power is proposed. In the presence of communication noise, it is shown that the nodes reach consensus asymptotically to a finite random variable whose expectation is the desired sample average of the initial observations, with a variance that depends on the step size of the algorithm and the variance of the communication noise. The asymptotic performance is characterized by deriving the asymptotic covariance matrix using results from stochastic approximation theory. It is shown that using bounded transmissions results in slower convergence compared to the linear consensus algorithm based on the Laplacian heuristic. Simulations corroborate our analytical findings. Finally, a robust distributed average consensus algorithm in which every sensor performs nonlinear processing at the receiver is proposed. It is shown that non-linearity at the receiver nodes makes the algorithm robust to a wide range of channel noise distributions, including impulsive ones. It is shown that the nodes reach consensus asymptotically, and results similar to those for the case of transmit non-linearity are obtained. Simulations corroborate our analytical findings and highlight the robustness of the proposed algorithm.
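
The idea of average consensus with bounded transmissions can be sketched as follows. This is a minimal illustration under stated assumptions, not the dissertation's algorithm: the bounded function is taken to be `tanh`, the step size is constant, and the update rule, function name, and parameters are all hypothetical choices made for this example.

```python
import math
import random


def bounded_consensus(initial, neighbors, step=0.2, iterations=2000,
                      noise_std=0.0, seed=0):
    """Iterate an average-consensus update in which nodes exchange tanh(state).

    initial:   list of initial node observations.
    neighbors: neighbors[i] lists the nodes adjacent to node i
               (assumed to describe an undirected, connected graph).
    noise_std: standard deviation of optional additive Gaussian
               communication noise on each received value.
    """
    rng = random.Random(seed)
    x = list(initial)
    for _ in range(iterations):
        # Every transmitted value lies in (-1, 1), so peak power is bounded.
        tx = [math.tanh(v) for v in x]
        new_x = []
        for i, xi in enumerate(x):
            received = 0.0
            for j in neighbors[i]:
                noise = rng.gauss(0.0, noise_std) if noise_std > 0 else 0.0
                received += tx[j] + noise
            # Move toward the neighbours as seen through the bounded function.
            new_x.append(xi + step * (received - len(neighbors[i]) * tx[i]))
        x = new_x
    return x
```

On an undirected graph without noise, the exchanged terms cancel pairwise, so the sum of the states (and hence their average) is preserved while the states are drawn together. Because the slope of `tanh` is below 1 away from zero, each step pulls states together less strongly than a linear exchange would, which fits the slower convergence noted in the abstract.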

Contributors: Dasarathan, Sivaraman (Author) / Tepedelenlioğlu, Cihan (Thesis advisor) / Papandreou-Suppappola, Antonia (Committee member) / Reisslein, Martin (Committee member) / Goryll, Michael (Committee member) / Arizona State University (Publisher)

Created: 2013