Search Content

Structured sparse learning and its applications to biomedical and biological data

Description

Sparsity has become an important modeling tool in areas such as genetics, signal and audio processing, medical image processing, etc. Via the penalization of l-1 norm based regularization, the structured sparse learning algorithms can produce highly accurate models while imposing various predefined structures on the data, such as feature groups…

Sparsity has become an important modeling tool in areas such as genetics, signal and audio processing, medical image processing, etc. Via the penalization of l-1 norm based regularization, the structured sparse learning algorithms can produce highly accurate models while imposing various predefined structures on the data, such as feature groups or graphs. In this thesis, I first propose to solve a sparse learning model with a general group structure, where the predefined groups may overlap with each other. Then, I present three real world applications which can benefit from the group structured sparse learning technique. In the first application, I study the Alzheimer's Disease diagnosis problem using multi-modality neuroimaging data. In this dataset, not every subject has all data sources available, exhibiting an unique and challenging block-wise missing pattern. In the second application, I study the automatic annotation and retrieval of fruit-fly gene expression pattern images. Combined with the spatial information, sparse learning techniques can be used to construct effective representation of the expression images. In the third application, I present a new computational approach to annotate developmental stage for Drosophila embryos in the gene expression images. In addition, it provides a stage score that enables one to more finely annotate each embryo so that they are divided into early and late periods of development within standard stage demarcations. Stage scores help us to illuminate global gene activities and changes much better, and more refined stage annotations improve our ability to better interpret results when expression pattern matches are discovered between genes.

ContributorsYuan, Lei (Author) / Ye, Jieping (Thesis advisor) / Wang, Yalin (Committee member) / Xue, Guoliang (Committee member) / Kumar, Sudhir (Committee member) / Arizona State University (Publisher)

Created2013

Sparse learning package with stability selection and application to alzheimer's disease

Description

Sparse learning is a technique in machine learning for feature selection and dimensionality reduction, to find a sparse set of the most relevant features. In any machine learning problem, there is a considerable amount of irrelevant information, and separating relevant information from the irrelevant information has been a topic of…

Sparse learning is a technique in machine learning for feature selection and dimensionality reduction, to find a sparse set of the most relevant features. In any machine learning problem, there is a considerable amount of irrelevant information, and separating relevant information from the irrelevant information has been a topic of focus. In supervised learning like regression, the data consists of many features and only a subset of the features may be responsible for the result. Also, the features might require special structural requirements, which introduces additional complexity for feature selection. The sparse learning package, provides a set of algorithms for learning a sparse set of the most relevant features for both regression and classification problems. Structural dependencies among features which introduce additional requirements are also provided as part of the package. The features may be grouped together, and there may exist hierarchies and over- lapping groups among these, and there may be requirements for selecting the most relevant groups among them. In spite of getting sparse solutions, the solutions are not guaranteed to be robust. For the selection to be robust, there are certain techniques which provide theoretical justification of why certain features are selected. The stability selection, is a method for feature selection which allows the use of existing sparse learning methods to select the stable set of features for a given training sample. This is done by assigning probabilities for the features: by sub-sampling the training data and using a specific sparse learning technique to learn the relevant features, and repeating this a large number of times, and counting the probability as the number of times a feature is selected. Cross-validation which is used to determine the best parameter value over a range of values, further allows to select the best parameter value. This is done by selecting the parameter value which gives the maximum accuracy score. With such a combination of algorithms, with good convergence guarantees, stable feature selection properties and the inclusion of various structural dependencies among features, the sparse learning package will be a powerful tool for machine learning research. Modular structure, C implementation, ATLAS integration for fast linear algebraic subroutines, make it one of the best tool for a large sparse setting. The varied collection of algorithms, support for group sparsity, batch algorithms, are a few of the notable functionality of the SLEP package, and these features can be used in a variety of fields to infer relevant elements. The Alzheimer Disease(AD) is a neurodegenerative disease, which gradually leads to dementia. The SLEP package is used for feature selection for getting the most relevant biomarkers from the available AD dataset, and the results show that, indeed, only a subset of the features are required to gain valuable insights.

ContributorsThulasiram, Ramesh (Author) / Ye, Jieping (Thesis advisor) / Xue, Guoliang (Committee member) / Sen, Arunabha (Committee member) / Arizona State University (Publisher)

Created2011

Using Games to Explore Collective Action on International Scales

Description

One of the salient challenges of sustainability is the Tragedy of the Commons, where individuals acting independently and rationally deplete a common resource despite their understanding that it is not in the group's long term best interest to do so. Hardin presents this dilemma as nearly intractable and solvable only…

One of the salient challenges of sustainability is the Tragedy of the Commons, where individuals acting independently and rationally deplete a common resource despite their understanding that it is not in the group's long term best interest to do so. Hardin presents this dilemma as nearly intractable and solvable only by drastic, government-mandated social reforms, while Ostrom's empirical work demonstrates that community-scale collaboration can circumvent tragedy without any elaborate outside intervention. Though more optimistic, Ostrom's work provides scant insight into larger-scale dilemmas such as climate change. Consequently, it remains unclear if the sustainable management of global resources is possible without significant government mediation. To investigate, we conducted two game theoretic experiments that challenged students in different countries to collaborate digitally and manage a hypothetical common resource. One experiment involved students attending Arizona State University and the Rochester Institute of Technology in the US and Mountains of the Moon University in Uganda, while the other included students at Arizona State and the Management Development Institute in India. In both experiments, students were randomly assigned to one of three production roles: Luxury, Intermediate, and Subsistence. Students then made individual decisions about how many units of goods they wished to produce up to a set maximum per production class. Luxury players gain the most profit (i.e. grade points) per unit produced, but they also emit the most externalities, or social costs, which directly subtract from the profit of everybody else in the game; Intermediate players produce a medium amount of profit and externalities per unit, and Subsistence players produce a low amount of profit and externalities per unit. Variables influencing and/or inhibiting collaboration were studied using pre- and post-game surveys. This research sought to answer three questions: 1) Are international groups capable of self-organizing in a way that promotes sustainable resource management?, 2) What are the key factors that inhibit or foster collective action among international groups?, and 3) How well do Hardin's theories and Ostrom's empirical models predict the observed behavior of students in the game? The results of gameplay suggest that international cooperation is possible, though likely sub-optimal. Statistical analysis of survey data revealed that heterogeneity and levels of trust significantly influenced game behavior. Specific traits of heterogeneity among students found to be significant were income, education, assigned production role, number of people in one's household, college class, college major, and military service. Additionally, it was found that Ostrom's collective action framework was a better predictor of game outcome than Hardin's theories. Overall, this research lends credence to the plausibility of international cooperation in tragedy of the commons scenarios such as climate change, though much work remains to be done.

ContributorsStanton, Albert Grayson (Author) / Clark, Susan Spierre (Thesis director) / Seager, Thomas (Committee member) / Civil, Environmental and Sustainable Engineering Programs (Contributor) / Barrett, The Honors College (Contributor)

Created2014-12

Structured sparse methods for imaging genetics

Description

Imaging genetics is an emerging and promising technique that investigates how genetic variations affect brain development, structure, and function. By exploiting disorder-related neuroimaging phenotypes, this class of studies provides a novel direction to reveal and understand the complex genetic mechanisms. Oftentimes, imaging genetics studies are challenging due to the relatively…

Imaging genetics is an emerging and promising technique that investigates how genetic variations affect brain development, structure, and function. By exploiting disorder-related neuroimaging phenotypes, this class of studies provides a novel direction to reveal and understand the complex genetic mechanisms. Oftentimes, imaging genetics studies are challenging due to the relatively small number of subjects but extremely high-dimensionality of both imaging data and genomic data. In this dissertation, I carry on my research on imaging genetics with particular focuses on two tasks---building predictive models between neuroimaging data and genomic data, and identifying disorder-related genetic risk factors through image-based biomarkers. To this end, I consider a suite of structured sparse methods---that can produce interpretable models and are robust to overfitting---for imaging genetics. With carefully-designed sparse-inducing regularizers, different biological priors are incorporated into learning models. More specifically, in the Allen brain image--gene expression study, I adopt an advanced sparse coding approach for image feature extraction and employ a multi-task learning approach for multi-class annotation. Moreover, I propose a label structured-based two-stage learning framework, which utilizes the hierarchical structure among labels, for multi-label annotation. In the Alzheimer's disease neuroimaging initiative (ADNI) imaging genetics study, I employ Lasso together with EDPP (enhanced dual polytope projections) screening rules to fast identify Alzheimer's disease risk SNPs. I also adopt the tree-structured group Lasso with MLFre (multi-layer feature reduction) screening rules to incorporate linkage disequilibrium information into modeling. Moreover, I propose a novel absolute fused Lasso model for ADNI imaging genetics. This method utilizes SNP spatial structure and is robust to the choice of reference alleles of genotype coding. In addition, I propose a two-level structured sparse model that incorporates gene-level networks through a graph penalty into SNP-level model construction. Lastly, I explore a convolutional neural network approach for accurate predicting Alzheimer's disease related imaging phenotypes. Experimental results on real-world imaging genetics applications demonstrate the efficiency and effectiveness of the proposed structured sparse methods.

ContributorsYang, Tao (Author) / Ye, Jieping (Thesis advisor) / Xue, Guoliang (Thesis advisor) / He, Jingrui (Committee member) / Li, Baoxin (Committee member) / Li, Jing (Committee member) / Arizona State University (Publisher)

Created2017

Coping with selfish behavior in networks using game theory

Description

While network problems have been addressed using a central administrative domain with a single objective, the devices in most networks are actually not owned by a single entity but by many individual entities. These entities make their decisions independently and selfishly, and maybe cooperate with a small group of other…

While network problems have been addressed using a central administrative domain with a single objective, the devices in most networks are actually not owned by a single entity but by many individual entities. These entities make their decisions independently and selfishly, and maybe cooperate with a small group of other entities only when this form of coalition yields a better return. The interaction among multiple independent decision-makers necessitates the use of game theory, including economic notions related to markets and incentives. In this dissertation, we are interested in modeling, analyzing, addressing network problems caused by the selfish behavior of network entities. First, we study how the selfish behavior of network entities affects the system performance while users are competing for limited resource. For this resource allocation domain, we aim to study the selfish routing problem in networks with fair queuing on links, the relay assignment problem in cooperative networks, and the channel allocation problem in wireless networks. Another important aspect of this dissertation is the study of designing efficient mechanisms to incentivize network entities to achieve certain system objective. For this incentive mechanism domain, we aim to motivate wireless devices to serve as relays for cooperative communication, and to recruit smartphones for crowdsourcing. In addition, we apply different game theoretic approaches to problems in security and privacy domain. For this domain, we aim to analyze how a user could defend against a smart jammer, who can quickly learn about the user's transmission power. We also design mechanisms to encourage mobile phone users to participate in location privacy protection, in order to achieve k-anonymity.

ContributorsYang, Dejun (Author) / Xue, Guoliang (Thesis advisor) / Richa, Andrea (Committee member) / Sen, Arunabha (Committee member) / Zhang, Junshan (Committee member) / Arizona State University (Publisher)

Created2013

The What, When, and How of Strategic Movement in Adversarial Settings: A Syncretic View of AI and Security

Description

The field of cyber-defenses has played catch-up in the cat-and-mouse game of finding vulnerabilities followed by the invention of patches to defend against them. With the complexity and scale of modern-day software, it is difficult to ensure that all known vulnerabilities are patched; moreover, the attacker, with reconnaissance on their…

The field of cyber-defenses has played catch-up in the cat-and-mouse game of finding vulnerabilities followed by the invention of patches to defend against them. With the complexity and scale of modern-day software, it is difficult to ensure that all known vulnerabilities are patched; moreover, the attacker, with reconnaissance on their side, will eventually discover and leverage them. To take away the attacker's inherent advantage of reconnaissance, researchers have proposed the notion of proactive defenses such as Moving Target Defense (MTD) in cyber-security. In this thesis, I make three key contributions that help to improve the effectiveness of MTD.

First, I argue that naive movement strategies for MTD systems, designed based on intuition, are detrimental to both security and performance. To answer the question of how to move, I (1) model MTD as a leader-follower game and formally characterize the notion of optimal movement strategies, (2) leverage expert-curated public data and formal representation methods used in cyber-security to obtain parameters of the game, and (3) propose optimization methods to infer strategies at Strong Stackelberg Equilibrium, addressing issues pertaining to scalability and switching costs. Second, when one cannot readily obtain the parameters of the game-theoretic model but can interact with a system, I propose a novel multi-agent reinforcement learning approach that finds the optimal movement strategy. Third, I investigate the novel use of MTD in three domains-- cyber-deception, machine learning, and critical infrastructure networks. I show that the question of what to move poses non-trivial challenges in these domains. To address them, I propose methods for patch-set selection in the deployment of honey-patches, characterize the notion of differential immunity in deep neural networks, and develop optimization problems that guarantee differential immunity for dynamic sensor placement in power-networks.

ContributorsSengupta, Sailik (Author) / Kambhampati, Subbarao (Thesis advisor) / Bao, Tiffany (Youzhi) (Committee member) / Huang, Dijiang (Committee member) / Xue, Guoliang (Committee member) / Arizona State University (Publisher)

Created2020

Filtering by

Structured sparse learning and its applications to biomedical and biological data

Sparse learning package with stability selection and application to alzheimer's disease

Using Games to Explore Collective Action on International Scales

Structured sparse methods for imaging genetics

Coping with selfish behavior in networks using game theory

The What, When, and How of Strategic Movement in Adversarial Settings: A Syncretic View of AI and Security