Matching Items (163)
Description
Sparsity has become an important modeling tool in areas such as genetics, signal and audio processing, and medical image processing. Through ℓ1-norm-based regularization, structured sparse learning algorithms can produce highly accurate models while imposing various predefined structures on the data, such as feature groups or graphs. In this thesis, I first propose to solve a sparse learning model with a general group structure, where the predefined groups may overlap with each other. Then, I present three real-world applications that can benefit from the group-structured sparse learning technique. In the first application, I study the Alzheimer's Disease diagnosis problem using multi-modality neuroimaging data. In this dataset, not every subject has all data sources available, exhibiting a unique and challenging block-wise missing pattern. In the second application, I study the automatic annotation and retrieval of fruit-fly gene expression pattern images. Combined with the spatial information, sparse learning techniques can be used to construct effective representations of the expression images. In the third application, I present a new computational approach to annotate the developmental stage of Drosophila embryos in the gene expression images. In addition, it provides a stage score that enables one to more finely annotate each embryo so that they are divided into early and late periods of development within standard stage demarcations. Stage scores help us to illuminate global gene activities and changes much better, and more refined stage annotations improve our ability to interpret results when expression pattern matches are discovered between genes.
Contributors: Yuan, Lei (Author) / Ye, Jieping (Thesis advisor) / Wang, Yalin (Committee member) / Xue, Guoliang (Committee member) / Kumar, Sudhir (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Practical communication systems are subject to errors due to imperfect time alignment among the communicating nodes. Timing errors can occur in different forms depending on the underlying communication scenario. This doctoral study considers two different classes of asynchronous systems: point-to-point (P2P) communication systems with synchronization errors, and asynchronous cooperative systems. In particular, the focus is on an information-theoretic analysis of P2P systems with synchronization errors and on developing new signaling solutions for several asynchronous cooperative communication systems. The first part of the dissertation presents several bounds on the capacity of P2P systems with synchronization errors. First, binary insertion and deletion channels are considered, where lower bounds on the mutual information between the input and output sequences are computed for independent uniformly distributed (i.u.d.) inputs. Then, a channel suffering from both synchronization errors and additive noise is considered as a serial concatenation of a synchronization error-only channel and an additive noise channel. It is proved that the capacity of the original channel is lower bounded in terms of the synchronization error-only channel capacity and the parameters of both channels. On a different front, to better characterize the deletion channel capacity, the capacities of three independent deletion channels with different deletion probabilities are related through an inequality, resulting in the tightest upper bound on the deletion channel capacity for deletion probabilities larger than 0.65. Furthermore, the first non-trivial upper bound on the 2K-ary input deletion channel capacity is provided by relating the 2K-ary input deletion channel capacity to the binary deletion channel capacity through an inequality. The second part of the dissertation develops two new relaying schemes to alleviate asynchronism issues in cooperative communications.
The first one is a single carrier (SC)-based scheme providing a spectrally efficient Alamouti code structure at the receiver under flat fading channel conditions by reducing the overhead needed to overcome the asynchronism and obtain spatial diversity. The second one is an orthogonal frequency division multiplexing (OFDM)-based approach useful for asynchronous cooperative systems experiencing excessive relative delays among the relays under frequency-selective channel conditions to achieve a delay diversity structure at the receiver and extract spatial diversity.
Contributors: Rahmati, Mojtaba (Author) / Duman, Tolga M. (Thesis advisor) / Zhang, Junshan (Committee member) / Tepedelenlioğlu, Cihan (Committee member) / Reisslein, Martin (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
The rapid escalation of technology and the widespread emergence of modern technological equipment have resulted in the generation of enormous amounts of digital data (in the form of images, videos and text). This has expanded the possibility of solving real-world problems using computational learning frameworks. However, while gathering a large amount of data is cheap and easy, annotating it with class labels is an expensive process in terms of time, labor and human expertise. This has paved the way for research in the field of active learning. Such algorithms automatically select the salient and exemplar instances from large quantities of unlabeled data and are effective in reducing human labeling effort in inducing classification models. To utilize the possible presence of multiple labeling agents, there have been attempts towards a batch mode form of active learning, where a batch of data instances is selected simultaneously for manual annotation. This dissertation is aimed at the development of novel batch mode active learning algorithms to reduce manual effort in training classification models in real-world multimedia pattern recognition applications.
Four major contributions are proposed in this work: (i) a framework for dynamic batch mode active learning, where the batch size and the specific data instances to be queried are selected adaptively through a single formulation, based on the complexity of the data stream in question; (ii) a batch mode active learning strategy for fuzzy label classification problems, where there is an inherent imprecision and vagueness in the class label definitions; (iii) batch mode active learning algorithms based on convex relaxations of an NP-hard integer quadratic programming (IQP) problem, with guaranteed bounds on the solution quality; and (iv) an active matrix completion algorithm and its application to solve several variants of the active learning problem (transductive active learning, multi-label active learning, active feature acquisition and active learning for regression). These contributions are validated on the face recognition and facial expression recognition problems (which are commonly encountered in real-world applications like robotics, security and assistive technology for the blind and the visually impaired) and also on collaborative filtering applications like movie recommendation.
Contributors: Chakraborty, Shayok (Author) / Panchanathan, Sethuraman (Thesis advisor) / Balasubramanian, Vineeth N. (Committee member) / Li, Baoxin (Committee member) / Mittelmann, Hans (Committee member) / Ye, Jieping (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Distributed estimation uses many inexpensive sensors to compose an accurate estimate of a given parameter. It is frequently implemented using wireless sensor networks. There have been several studies on optimizing power allocation in wireless sensor networks used for distributed estimation, the vast majority of which assume linear radio-frequency amplifiers. Linear amplifiers are inherently inefficient, so in this dissertation nonlinear amplifiers are examined to gain efficiency while operating distributed sensor networks. This research presents a method to boost efficiency by operating the amplifiers in the nonlinear region of operation. Operating amplifiers nonlinearly presents new challenges. First, nonlinear amplifier characteristics change across manufacturing process variation, temperature, operating voltage, and aging. Second, the equations conventionally used for estimators and performance expectations in linear amplify-and-forward systems fail. To compensate for the first challenge, predistortion is utilized not to linearize amplifiers but rather to force them to fit a common nonlinear limiting amplifier model close to the inherent amplifier performance. This minimizes the power impact and the training requirements for predistortion. To address the second challenge, new estimators are required that account for transmitter nonlinearity. This research derives analytically and confirms via simulation new estimators and performance expectation equations for use in nonlinear distributed estimation. An additional complication when operating nonlinear amplifiers in a wireless environment is the influence of varied and potentially unknown channel gains. The impact of these varied gains and of both measurement and channel noise sources on estimation performance is analyzed in this dissertation. Techniques for minimizing the estimate variance are developed.
It is shown that optimizing transmitter power allocation to minimize estimate variance for the most-compressed parameter measurement is equivalent to the problem for linear sensors. Finally, a method for operating distributed estimation in a multipath environment is presented that is capable of developing robust estimates for a wide range of Rician K-factors. This dissertation demonstrates that implementing distributed estimation using nonlinear sensors can boost system efficiency and is compatible with existing techniques from the literature for boosting efficiency at the system level via sensor power allocation. Nonlinear transmitters work best when channel gains are known and channel noise and receiver noise levels are low.
Contributors: Santucci, Robert (Author) / Spanias, Andreas (Thesis advisor) / Tepedelenlioğlu, Cihan (Committee member) / Bakkaloglu, Bertan (Committee member) / Tsakalis, Kostas (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Radio frequency (RF) transceivers require a disproportionately high effort in terms of test development time, test equipment cost, and test time. The relatively high test cost stems from two contributing factors. First, RF transceivers require the measurement of a diverse set of specifications, requiring multiple test set-ups and long test times, which complicates load-board design, debug, and diagnosis. Second, high-frequency operation necessitates the use of expensive equipment, resulting in a higher per-second test time cost compared with mixed-signal or digital circuits. Moreover, in terms of the non-recurring engineering cost, the need to measure complex specifications complicates the test development process and necessitates a long learning process for test engineers. Test time is dominated by the changing and settling time for each test set-up. Thus, single set-up test solutions are desirable. A loop-back configuration, in which the transmitter output is connected to the receiver input, is a desirable test set-up for RF transceivers, since it eliminates the reliance on expensive instrumentation for RF signal analysis and enables measuring multiple parameters at once. In-phase and Quadrature (IQ) imbalance, non-linearity, DC offset and IQ time skews are some of the most detrimental imperfections in transceiver performance. Measurement of these parameters in the loop-back mode is challenging due to the coupling between the receiver (RX) and transmitter (TX) parameters. Loop-back based solutions are proposed in this work to resolve this issue. A calibration algorithm for a subset of the above-mentioned impairments is also presented. Error Vector Magnitude (EVM) is a system-level parameter that is specified for most advanced communication standards. EVM measurement often takes extensive test development efforts, tester resources, and long test times. EVM is analytically related to system impairments, which are typically measured in a production test environment.
Thus, the EVM test can be eliminated from the test list if the relations between EVM and system impairments are derived independent of the circuit implementation and manufacturing process. In this work, the focus is on the WLAN standard and on deriving the relations between EVM and three of the most detrimental impairments for QAM/OFDM based systems (IQ imbalance, non-linearity, and noise). Low-cost test techniques for measuring RF transceiver imperfections, combined with the ability to analytically compute EVM from the measured parameters, constitute a complete test solution for RF transceivers. These techniques, along with the proposed calibration method, can be used to improve the yield by widening the pass/fail boundaries for transceiver imperfections. For all of the proposed methods, simulation and hardware measurements show that the proposed techniques provide accurate characterization of RF transceivers.
Contributors: Nassery, Afsaneh (Author) / Ozev, Sule (Thesis advisor) / Bakkaloglu, Bertan (Committee member) / Kiaei, Sayfe (Committee member) / Kitchen, Jennifer (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
The rapid advancement of wireless technology has instigated the broad deployment of wireless networks. Different types of networks have been developed, including wireless sensor networks, mobile ad hoc networks (MANETs), wireless local area networks, and cellular networks. These networks have different structures and applications, and require different control algorithms. The focus of this thesis is to design scheduling and power control algorithms for wireless networks and to analyze their performance. In this thesis, we first study the multicast capacity of wireless ad hoc networks. Gupta and Kumar studied the scaling law of the unicast capacity of wireless ad hoc networks. They derived the order of the unicast throughput as the number of nodes in the network goes to infinity. In our work, we characterize the scaling of the multicast capacity of large-scale MANETs under a delay constraint D. We first derive an upper bound on the multicast throughput, and then establish a lower bound on the multicast capacity by proposing a joint coding-scheduling algorithm that achieves a throughput within a logarithmic factor of the upper bound. We then study the power control problem in ad hoc wireless networks. We propose a distributed power control algorithm based on the Gibbs sampler, and prove that the algorithm is throughput optimal. Finally, we consider the scheduling algorithm in collocated wireless networks with flow-level dynamics. Specifically, we study the delay performance of a workload-based scheduling algorithm with SRPT as a tie-breaking rule. We demonstrate the superior flow-level delay performance of the proposed algorithm using simulations.
Contributors: Zhou, Shan (Author) / Ying, Lei (Thesis advisor) / Zhang, Yanchao (Committee member) / Zhang, Junshan (Committee member) / Xue, Guoliang (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Autonomous vehicle control systems utilize real-time kinematic Global Navigation Satellite Systems (GNSS) receivers to provide a position within two centimeters of truth. GNSS receivers utilize the satellite signal time of arrival estimates to solve for position, and multipath corrupts the time of arrival estimates with a time-varying bias. Time of arrival estimates are based upon accurate direct sequence spread spectrum (DSSS) code and carrier phase tracking. Current multipath-mitigating GNSS solutions include fixed radiation pattern antennas and windowed delay-lock loop code phase discriminators. A new multipath-mitigating code tracking algorithm is introduced that utilizes a non-symmetric correlation kernel to reject multipath. Independent parameters provide a means to trade off code tracking discriminant gain against multipath mitigation performance. The algorithm performance is characterized in terms of multipath phase error bias, phase error estimation variance, tracking range, tracking ambiguity and implementation complexity. The algorithm is suitable for modernized GNSS signals including Binary Phase Shift Keyed (BPSK) and a variety of Binary Offset Carrier (BOC) signals. The algorithm compensates for unbalanced code sequences to ensure a code tracking bias does not result from the use of asymmetric correlation kernels. The algorithm does not require explicit knowledge of the propagation channel model. Design recommendations for selecting the algorithm parameters to mitigate precorrelation filter distortion are also provided.
Contributors: Miller, Steven (Author) / Spanias, Andreas (Thesis advisor) / Tepedelenlioğlu, Cihan (Committee member) / Tsakalis, Konstantinos (Committee member) / Zhang, Junshan (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Automating aspects of biocuration through biomedical information extraction could significantly impact biomedical research by enabling greater biocuration throughput and improving the feasibility of a wider scope. An important step in biomedical information extraction systems is named entity recognition (NER), where mentions of entities such as proteins and diseases are located within natural-language text and their semantic type is determined. This step is critical for later tasks in an information extraction pipeline, including normalization and relationship extraction. BANNER is a benchmark biomedical NER system using linear-chain conditional random fields and the rich feature set approach. A case study with BANNER locating genes and proteins in biomedical literature is described. The first corpus for disease NER adequate for use as training data is introduced, and employed in a case study of disease NER. The first corpus locating adverse drug reactions (ADRs) in user posts to a health-related social website is also described, and a system to locate and identify ADRs in social media text is created and evaluated. The rich feature set approach to creating NER feature sets is argued to be subject to diminishing returns, implying that additional improvements may require more sophisticated methods for creating the feature set. This motivates the first application of multivariate feature selection with filters and false discovery rate analysis to biomedical NER, resulting in a feature set at least 3 orders of magnitude smaller than the set created by the rich feature set approach. Finally, two novel approaches to NER by modeling the semantics of token sequences are introduced. The first method focuses on the sequence content by using language models to determine whether a sequence resembles entries in a lexicon of entity names or text from an unlabeled corpus more closely. 
The second method models the distributional semantics of token sequences, determining the similarity between a potential mention and the token sequences from the training data by analyzing the contexts where each sequence appears in a large unlabeled corpus. The second method is shown to improve the performance of BANNER on multiple data sets.
Contributors: Leaman, James Robert (Author) / Gonzalez, Graciela (Thesis advisor) / Baral, Chitta (Thesis advisor) / Cohen, Kevin B (Committee member) / Liu, Huan (Committee member) / Ye, Jieping (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
The rapid advances in wireless communications and networking have given rise to a number of emerging heterogeneous wireless and mobile networks along with novel networking paradigms, including wireless sensor networks, mobile crowdsourcing, and mobile social networking. While offering promising solutions to a wide range of new applications, their widespread adoption and large-scale deployment are often hindered by people's concerns about security, user privacy, or both. In this dissertation, we aim to address a number of challenging security and privacy issues in heterogeneous wireless and mobile networks in an attempt to foster their widespread adoption. Our contributions are mainly fivefold. First, we introduce a novel secure and loss-resilient code dissemination scheme for wireless sensor networks deployed in hostile and harsh environments. Second, we devise a novel scheme to enable mobile users to detect any inauthentic or unsound location-based top-k query result returned by an untrusted location-based service provider. Third, we develop a novel verifiable privacy-preserving aggregation scheme for people-centric mobile sensing systems. Fourth, we present a suite of privacy-preserving profile matching protocols for proximity-based mobile social networking, which can support a wide range of matching metrics with different privacy levels. Last, we present a secure combination scheme for crowdsourcing-based cooperative spectrum sensing systems that can enable robust primary user detection even when malicious cognitive radio users constitute the majority.
Contributors: Zhang, Rui (Author) / Zhang, Yanchao (Thesis advisor) / Duman, Tolga Mete (Committee member) / Xue, Guoliang (Committee member) / Zhang, Junshan (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Under the framework of intelligent management of power grids by leveraging advanced information, communication and control technologies, a primary objective of this study is to develop novel data mining and data processing schemes for several critical applications that can enhance the reliability of power systems. Specifically, this study is broadly organized into the following two parts: I) spatio-temporal wind power analysis for wind generation forecast and integration, and II) data mining and information fusion of synchrophasor measurements toward secure power grids. Part I is centered around wind power generation forecast and integration. First, a spatio-temporal analysis approach for short-term wind farm generation forecasting is proposed. Specifically, using extensive measurement data from an actual wind farm, the probability distribution and the level crossing rate of wind farm generation are characterized using tools from graphical learning and time-series analysis. Built on these spatial and temporal characterizations, finite state Markov chain models are developed, and a point forecast of wind farm generation is derived using the Markov chains. Then, multi-timescale scheduling and dispatch with stochastic wind generation and opportunistic demand response is investigated. Part II focuses on incorporating the emerging synchrophasor technology into the security assessment and the post-disturbance fault diagnosis of power systems. First, a data-mining framework is developed for on-line dynamic security assessment (DSA) by using adaptive ensemble decision tree learning of real-time synchrophasor measurements. Under this framework, novel on-line DSA schemes are devised, aiming to handle various factors (including variations of operating conditions, forced system topology change, and loss of critical synchrophasor measurements) that can have a significant impact on the performance of conventional data-mining based on-line DSA schemes.
Then, in the context of post-disturbance analysis, fault detection and localization of line outages are investigated using a dependency graph approach. It is shown that a dependency graph for voltage phase angles can be built according to the interconnection structure of the power system, and line outage events can be detected and localized through networked data fusion of the synchrophasor measurements collected from multiple locations in the power grid. Along a more practical avenue, a decentralized networked data fusion scheme is proposed for efficient fault detection and localization.
Contributors: He, Miao (Author) / Zhang, Junshan (Thesis advisor) / Vittal, Vijay (Thesis advisor) / Hedman, Kory (Committee member) / Si, Jennie (Committee member) / Ye, Jieping (Committee member) / Arizona State University (Publisher)
Created: 2013