Matching Items (8)

135457-Thumbnail Image.png

Improved Finite Sample Estimate of A Nonparametric Divergence Measure

Description

This work details the bootstrap estimation of a nonparametric information divergence measure, the Dp divergence measure, using a power law model. To address the challenge posed by computing accurate divergence estimates given finite size data, the bootstrap approach is used

This work details the bootstrap estimation of a nonparametric information divergence measure, the Dp divergence measure, using a power law model. To address the challenge posed by computing accurate divergence estimates given finite size data, the bootstrap approach is used in conjunction with a power law curve to calculate an asymptotic value of the divergence estimator. Monte Carlo estimates of Dp are found for increasing values of sample size, and a power law fit is used to relate the divergence estimates as a function of sample size. The fit is also used to generate a confidence interval for the estimate to characterize the quality of the estimate. We compare the performance of this method with the other estimation methods. The calculated divergence is applied to the binary classification problem. Using the inherent relation between divergence measures and classification error rate, an analysis of the Bayes error rate of several data sets is conducted using the asymptotic divergence estimate.

Contributors

Agent

Created

Date Created
2016-05

135475-Thumbnail Image.png

A Non-Parametric Semi-Supervised f-Divergence

Description

Divergence functions are both highly useful and fundamental to many areas in information theory and machine learning, but require either parametric approaches or prior knowledge of labels on the full data set. This paper presents a method to estimate the

Divergence functions are both highly useful and fundamental to many areas in information theory and machine learning, but require either parametric approaches or prior knowledge of labels on the full data set. This paper presents a method to estimate the divergence between two data sets in the absence of fully labeled data. This semi-labeled case is common in many domains where labeling data by hand is expensive or time-consuming, or wherever large data sets are present. The theory derived in this paper is demonstrated on a simulated example, and then applied to a feature selection and classification problem from pathological speech analysis.

Contributors

Agent

Created

Date Created
2016-05

150108-Thumbnail Image.png

Directional information flow and applications

Description

In the late 1960s, Granger published a seminal study on causality in time series, using linear interdependencies and information transfer. Recent developments in the field of information theory have introduced new methods to investigate the transfer of information in dynamical

In the late 1960s, Granger published a seminal study on causality in time series, using linear interdependencies and information transfer. Recent developments in the field of information theory have introduced new methods to investigate the transfer of information in dynamical systems. Using concepts from Chaos and Markov theory, much of these methods have evolved to capture non-linear relations and information flow between coupled dynamical systems with applications to fields like biomedical signal processing. This thesis deals with the application of information theory to non-linear multivariate time series and develops measures of information flow to identify significant drivers and response (driven) components in networks of coupled sub-systems with variable coupling in strength and direction (uni- or bi-directional) for each connection. Transfer Entropy (TE) is used to quantify pairwise directional information. Four TE-based measures of information flow are proposed, namely TE Outflow (TEO), TE Inflow (TEI), TE Net flow (TEN), and Average TE flow (ATE). First, the reliability of the information flow measures on models, with and without noise, is evaluated. The driver and response sub-systems in these models are identified. Second, these measures are applied to electroencephalographic (EEG) data from two patients with focal epilepsy. The analysis showed dominant directions of information flow between brain sites and identified the epileptogenic focus as the system component typically with the highest value for the proposed measures (for example, ATE). Statistical tests between pre-seizure (preictal) and post-seizure (postictal) information flow also showed a breakage of the driving of the brain by the focus after seizure onset. The above findings shed light on the function of the epileptogenic focus and understanding of ictogenesis. It is expected that they will contribute to the diagnosis of epilepsy, for example by accurate identification of the epileptogenic focus from interictal periods, as well as the development of better seizure detection, prediction and control methods, for example by isolating pathologic areas of excessive information flow through electrical stimulation.

Contributors

Agent

Created

Date Created
2011

155255-Thumbnail Image.png

RF Convergence of Radar and Communications: Metrics, Bounds, and Systems

Description

RF convergence of radar and communications users is rapidly becoming an issue for a multitude of stakeholders. To hedge against growing spectral congestion, research into cooperative radar and communications systems has been identified as a critical necessity for the United

RF convergence of radar and communications users is rapidly becoming an issue for a multitude of stakeholders. To hedge against growing spectral congestion, research into cooperative radar and communications systems has been identified as a critical necessity for the United States and other countries. Further, the joint sensing-communicating paradigm appears imminent in several technological domains. In the pursuit of co-designing radar and communications systems that work cooperatively and benefit from each other's existence, joint radar-communications metrics are defined and bounded as a measure of performance. Estimation rate is introduced, a novel measure of radar estimation information as a function of time. Complementary to communications data rate, the two systems can now be compared on the same scale. An information-centric approach has a number of advantages, defining precisely what is gained through radar illumination and serves as a measure of spectral efficiency. Bounding radar estimation rate and communications data rate jointly, systems can be designed as a joint optimization problem.

Contributors

Agent

Created

Date Created
2017

157817-Thumbnail Image.png

Distributed Reception in the Presence of Gaussian Interference

Description

An analysis is presented of a network of distributed receivers encumbered by strong in-band interference. The structure of information present across such receivers and how they might collaborate to recover a signal of interest is studied. Unstructured (random coding) and

An analysis is presented of a network of distributed receivers encumbered by strong in-band interference. The structure of information present across such receivers and how they might collaborate to recover a signal of interest is studied. Unstructured (random coding) and structured (lattice coding) strategies are studied towards this purpose for a certain adaptable system model. Asymptotic performances of these strategies and algorithms to compute them are developed. A jointly-compressed lattice code with proper configuration performs best of all strategies investigated.

Contributors

Agent

Created

Date Created
2019

156751-Thumbnail Image.png

Data-Driven and Game-Theoretic Approaches for Privacy

Description

In the past few decades, there has been a remarkable shift in the boundary between public and private information. The application of information technology and electronic communications allow service providers (businesses) to collect a large amount of data. However, this

In the past few decades, there has been a remarkable shift in the boundary between public and private information. The application of information technology and electronic communications allow service providers (businesses) to collect a large amount of data. However, this ``data collection" process can put the privacy of users at risk and also lead to user reluctance in accepting services or sharing data. This dissertation first investigates privacy sensitive consumer-retailers/service providers interactions under different scenarios, and then focuses on a unified framework for various information-theoretic privacy and privacy mechanisms that can be learned directly from data.

Existing approaches such as differential privacy or information-theoretic privacy try to quantify privacy risk but do not capture the subjective experience and heterogeneous expression of privacy-sensitivity. The first part of this dissertation introduces models to study consumer-retailer interaction problems and to better understand how retailers/service providers can balance their revenue objectives while being sensitive to user privacy concerns. This dissertation considers the following three scenarios: (i) the consumer-retailer interaction via personalized advertisements; (ii) incentive mechanisms that electrical utility providers need to offer for privacy sensitive consumers with alternative energy sources; (iii) the market viability of offering privacy guaranteed free online services. We use game-theoretic models to capture the behaviors of both consumers and retailers, and provide insights for retailers to maximize their profits when interacting with privacy sensitive consumers.

Preserving the utility of published datasets while simultaneously providing provable privacy guarantees is a well-known challenge. In the second part, a novel context-aware privacy framework called generative adversarial privacy (GAP) is introduced. Inspired by recent advancements in generative adversarial networks, GAP allows the data holder to learn the privatization mechanism directly from the data. Under GAP, finding the optimal privacy mechanism is formulated as a constrained minimax game between a privatizer and an adversary. For appropriately chosen adversarial loss functions, GAP provides privacy guarantees against strong information-theoretic adversaries. Both synthetic and real-world datasets are used to show that GAP can greatly reduce the adversary's capability of inferring private information at a small cost of distorting the data.

Contributors

Agent

Created

Date Created
2018

156145-Thumbnail Image.png

Fundamental Limits on Performance for Cooperative Radar-Communications Coexistence

Description

Spectral congestion is quickly becoming a problem for the telecommunications sector. In order to alleviate spectral congestion and achieve electromagnetic radio frequency (RF) convergence, communications and radar systems are increasingly encouraged to share bandwidth. In direct opposition to the traditional

Spectral congestion is quickly becoming a problem for the telecommunications sector. In order to alleviate spectral congestion and achieve electromagnetic radio frequency (RF) convergence, communications and radar systems are increasingly encouraged to share bandwidth. In direct opposition to the traditional spectrum sharing approach between radar and communications systems of complete isolation (temporal, spectral or spatial), both systems can be jointly co-designed from the ground up to maximize their joint performance for mutual benefit. In order to properly characterize and understand cooperative spectrum sharing between radar and communications systems, the fundamental limits on performance of a cooperative radar-communications system are investigated. To facilitate this investigation, performance metrics are chosen in this dissertation that allow radar and communications to be compared on the same scale. To that effect, information is chosen as the performance metric and an information theoretic radar performance metric compatible with the communications data rate, the radar estimation rate, is developed. The estimation rate measures the amount of information learned by illuminating a target. With the development of the estimation rate, standard multi-user communications performance bounds are extended with joint radar-communications users to produce bounds on the performance of a joint radar-communications system. System performance for variations of the standard spectrum sharing problem defined in this dissertation are investigated, and inner bounds on performance are extended to account for the effect of continuous radar waveform optimization, multiple radar targets, clutter, phase noise, and radar detection. A detailed interpretation of the estimation rate and a brief discussion on how to use these performance bounds to select an optimal operating point and achieve RF convergence are provided.

Contributors

Agent

Created

Date Created
2018

154587-Thumbnail Image.png

Graph-based estimation of information divergence functions

Description

Information divergence functions, such as the Kullback-Leibler divergence or the Hellinger distance, play a critical role in statistical signal processing and information theory; however estimating them can be challenge. Most often, parametric assumptions are made about the two distributions to

Information divergence functions, such as the Kullback-Leibler divergence or the Hellinger distance, play a critical role in statistical signal processing and information theory; however estimating them can be challenge. Most often, parametric assumptions are made about the two distributions to estimate the divergence of interest. In cases where no parametric model fits the data, non-parametric density estimation is used. In statistical signal processing applications, Gaussianity is usually assumed since closed-form expressions for common divergence measures have been derived for this family of distributions. Parametric assumptions are preferred when it is known that the data follows the model, however this is rarely the case in real-word scenarios. Non-parametric density estimators are characterized by a very large number of parameters that have to be tuned with costly cross-validation. In this dissertation we focus on a specific family of non-parametric estimators, called direct estimators, that bypass density estimation completely and directly estimate the quantity of interest from the data. We introduce a new divergence measure, the $D_p$-divergence, that can be estimated directly from samples without parametric assumptions on the distribution. We show that the $D_p$-divergence bounds the binary, cross-domain, and multi-class Bayes error rates and, in certain cases, provides provably tighter bounds than the Hellinger divergence. In addition, we also propose a new methodology that allows the experimenter to construct direct estimators for existing divergence measures or to construct new divergence measures with custom properties that are tailored to the application. To examine the practical efficacy of these new methods, we evaluate them in a statistical learning framework on a series of real-world data science problems involving speech-based monitoring of neuro-motor disorders.

Contributors

Agent

Created

Date Created
2017