Matching Items (359)
Description
Advances in data collection technologies have made it cost-effective to obtain heterogeneous data from multiple data sources. Very often the data are of very high dimension, and feature selection is preferred in order to reduce noise, save computational cost, and learn interpretable models. Due to the multi-modal nature of heterogeneous data, it is interesting to design efficient machine learning models that are capable of performing variable selection and feature group (data source) selection simultaneously (a.k.a. bi-level selection). In this thesis, I carry out research along this direction with a particular focus on designing efficient optimization algorithms. I start with a unified bi-level learning model that contains several existing feature selection models as special cases. The proposed model is then further extended to tackle block-wise missing data, one of the major challenges in the diagnosis of Alzheimer's Disease (AD). Moreover, I propose a novel interpretable sparse group feature selection model that greatly facilitates parameter tuning and model selection. Last but not least, I show that by solving the sparse group hard thresholding problem directly, the sparse group feature selection model can be further improved in terms of both algorithmic complexity and efficiency. Promising results are demonstrated in extensive evaluations on multiple real-world data sets.
ContributorsXiang, Shuo (Author) / Ye, Jieping (Thesis advisor) / Mittelmann, Hans D (Committee member) / Davulcu, Hasan (Committee member) / He, Jingrui (Committee member) / Arizona State University (Publisher)
Created2014
Description
This thesis studies recommendation systems and considers joint sampling and learning. Sampling in recommendation systems means obtaining users' ratings on specific items chosen by the recommendation platform, and learning means inferring users' unknown ratings of items from the existing data. In this thesis, the problem is formulated as an adaptive matrix completion problem in which sampling reveals the unknown entries of a $U\times M$ matrix, where $U$ is the number of users, $M$ is the number of items, and each entry of the matrix represents a user's rating of an item. In the literature, this matrix completion problem has been studied under a static setting, i.e., recovering the matrix from a fixed set of partial ratings. This thesis considers both sampling and learning, and proposes an adaptive algorithm that adapts its sampling and learning to the existing data. The idea is to sample items that reveal more information based on the previous sampling results, and then to learn based on clustering. The performance of the proposed algorithm has been evaluated using simulations.
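The learning step can be illustrated with a toy sketch (all names hypothetical, not the thesis implementation): predict a hidden rating by borrowing from users whose already-revealed ratings agree with the target user's, a crude stand-in for clustering-based inference.

```python
import numpy as np

def predict_by_similarity(R, user, item):
    """Predict a missing rating by averaging ratings from users whose
    observed ratings agree with the target user's on all shared items.
    R uses np.nan for entries that have not been revealed yet."""
    mask = ~np.isnan(R)
    preds = []
    for other in range(R.shape[0]):
        if other == user or np.isnan(R[other, item]):
            continue
        shared = mask[user] & mask[other]  # items both users have rated
        # treat the users as belonging to the same "cluster" if they
        # agree exactly on every shared item (a deliberately crude rule)
        if shared.any() and np.allclose(R[user, shared], R[other, shared]):
            preds.append(R[other, item])
    return float(np.mean(preds)) if preds else np.nan

nan = np.nan
R = np.array([[5., 1., nan],
              [5., 1., 4.],
              [1., 5., 2.]])
# user 0 matches user 1 on items 0 and 1, so we borrow user 1's item-2 rating
pred = predict_by_similarity(R, 0, 2)  # -> 4.0
```

An adaptive sampler would then choose which entry to reveal next based on how much it disambiguates these candidate clusters.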
ContributorsZhu, Lingfang (Author) / Xue, Guoliang (Thesis advisor) / He, Jingrui (Committee member) / Tong, Hanghang (Committee member) / Arizona State University (Publisher)
Created2015
Description
Diffusion processes in networks can be used to model many real-world processes, such as the propagation of a rumor on social networks and cascading failures on power networks. Analysis of diffusion processes in networks can help us answer important questions, such as the role and importance of each node in spreading the diffusion and how to stop or contain a cascading failure in the network. This dissertation consists of three parts.

In the first part, we study the problem of locating multiple diffusion sources in networks under the Susceptible-Infected-Recovered (SIR) model. Given a complete snapshot of the network, we developed a sample-path-based algorithm, named clustering and localization, and proved that for regular trees, the estimators produced by the proposed algorithm are within a constant distance from the real sources with high probability. Then, we considered the case in which only a partial snapshot is observed and proposed a new algorithm, named Optimal-Jordan-Cover (OJC). The algorithm first extracts a subgraph using a candidate selection algorithm that selects source candidates based on the number of observed infected nodes in their neighborhoods. Then, in the extracted subgraph, OJC finds a set of nodes that "cover" all observed infected nodes with the minimum radius. This set of nodes is called the Jordan cover and is regarded as the set of diffusion sources. We proved that OJC can locate all sources with probability one asymptotically from partial observations in the Erdos-Renyi (ER) random graph. Experiments on multiple networks show that our algorithms outperform existing ones.
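As a rough illustration of the Jordan-cover idea in its simplest single-source form (a toy interface, not the OJC implementation): pick the node whose maximum BFS distance to every observed infected node is smallest.

```python
from collections import deque

def jordan_center(adj, infected):
    """Return the node minimizing the maximum shortest-path distance to
    all observed infected nodes. adj maps each node to its neighbor list."""
    def bfs_dist(src):
        dist = {src: 0}
        q = deque([src])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        return dist

    best, best_ecc = None, float("inf")
    for node in adj:
        dist = bfs_dist(node)
        # eccentricity of this node with respect to the infected set
        ecc = max(dist.get(i, float("inf")) for i in infected)
        if ecc < best_ecc:
            best, best_ecc = node, ecc
    return best

# path graph 0-1-2-3-4 with infected endpoints {0, 4}: the middle node wins
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
center = jordan_center(adj, {0, 4})  # -> 2
```

OJC generalizes this to multiple sources by seeking a small *set* of nodes whose balls of minimum common radius jointly cover the observed infected nodes.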

In the second part, we tackle the problem of reconstructing the diffusion history from partial observations. We formulated the diffusion history reconstruction problem as a maximum a posteriori (MAP) problem and proved that the problem is NP-hard. We then proposed a step-by-step reconstruction algorithm, which always produces a diffusion history that is consistent with the partial observations. Our experimental results based on synthetic and real networks show that the algorithm significantly outperforms existing methods.

In the third part, we consider the problem of improving the robustness of an interdependent network by rewiring a small number of links during a cascading attack. We formulated the problem as a Markov decision process (MDP) problem. While the problem is NP-hard, we developed an effective and efficient algorithm, RealWire, to robustify the network and to mitigate the damage during the attack. Extensive experimental results show that our algorithm outperforms other algorithms on most of the robustness metrics.
ContributorsChen, Zhen (Author) / Ying, Lei (Thesis advisor) / Tong, Hanghang (Thesis advisor) / Zhang, Junshan (Committee member) / He, Jingrui (Committee member) / Arizona State University (Publisher)
Created2018
Description
Network mining has been attracting a lot of research attention because of the prevalence of networks. As the world is becoming increasingly connected and correlated, networks arising from inter-dependent application domains are often collected from different sources, forming so-called multi-sourced networks. Examples of such multi-sourced networks include critical infrastructure networks, multi-platform social networks, cross-domain collaboration networks, and many more. Compared with single-sourced networks, multi-sourced networks bear more complex structures and therefore potentially contain more valuable information.

This thesis proposes a multi-layered HITS (Hyperlink-Induced Topic Search) algorithm to perform the ranking task on multi-sourced networks. Specifically, each node in the network receives an authority score and a hub score, evaluating the value of the node itself and the value of its outgoing links, respectively. Based on a recent multi-layered network model, which allows a more flexible dependency structure across different sources (i.e., layers), the proposed algorithm leverages both within-layer smoothness and cross-layer consistency, allowing nodes from different layers to be ranked consistently. The multi-layered HITS is formulated as a regularized optimization problem with non-negativity constraints and solved by an iterative update process. Extensive experimental evaluations demonstrate the effectiveness and explainability of the proposed algorithm.
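For readers unfamiliar with HITS, the classical single-layer authority/hub updates that the multi-layered formulation generalizes look roughly like this (a minimal sketch, not the thesis algorithm, which adds smoothness and consistency regularizers across layers):

```python
import numpy as np

def hits(adj, iters=50):
    """Iterative HITS updates on one layer: alternate between
    authority scores (who is linked to by good hubs) and
    hub scores (who links to good authorities), normalizing each step."""
    n = adj.shape[0]
    auth = np.ones(n)
    hub = np.ones(n)
    for _ in range(iters):
        auth = adj.T @ hub              # authoritative if good hubs point here
        auth /= np.linalg.norm(auth)
        hub = adj @ auth                # a good hub points at authorities
        hub /= np.linalg.norm(hub)
    return auth, hub

# tiny directed graph: node 0 -> {1, 2}, node 1 -> {2}
adj = np.array([[0., 1., 1.],
                [0., 0., 1.],
                [0., 0., 0.]])
auth, hub = hits(adj)
# node 2 receives the most incoming links, so it has the top authority score;
# node 0 points at both other nodes, so it has the top hub score
```

The multi-layered version couples several such update systems, one per layer, through cross-layer dependency links.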
ContributorsYu, Haichao (Author) / Tong, Hanghang (Thesis advisor) / He, Jingrui (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)
Created2018
Description
Unsupervised learning of time series data, also known as temporal clustering, is a challenging problem in machine learning. This thesis presents a novel algorithm, Deep Temporal Clustering (DTC), which naturally integrates dimensionality reduction and temporal clustering into a single, fully unsupervised, end-to-end learning framework. The algorithm utilizes an autoencoder for temporal dimensionality reduction and a novel temporal clustering layer for cluster assignment. It then jointly optimizes the clustering objective and the dimensionality reduction objective. Depending on the requirements of the application, the temporal clustering layer can be customized with any temporal similarity metric. Several similarity metrics and state-of-the-art algorithms are considered and compared. To gain insight into the temporal features that the network has learned for its clustering, a visualization method is applied that generates a region-of-interest heatmap for the time series. The viability of the algorithm is demonstrated using time series data from diverse domains, ranging from earthquakes to spacecraft sensor data. In each case, the proposed algorithm outperforms traditional methods. The superior performance is attributed to the fully integrated temporal dimensionality reduction and clustering criterion.
ContributorsMadiraju, NaveenSai (Author) / Liang, Jianming (Thesis advisor) / Wang, Yalin (Thesis advisor) / He, Jingrui (Committee member) / Arizona State University (Publisher)
Created2018
Description
This thesis presents a family of adaptive curvature methods for gradient-based stochastic optimization. In particular, a general algorithmic framework is introduced along with a practical implementation that yields an efficient, adaptive curvature gradient descent algorithm. To this end, a theoretical and practical link between curvature matrix estimation and shrinkage methods for covariance matrices is established. The use of shrinkage improves the estimation accuracy of the curvature matrix when data samples are scarce. This thesis also introduces several insights that result in data- and computation-efficient update equations. Empirical results suggest that the proposed method compares favorably with existing second-order techniques based on the Fisher information or Gauss-Newton matrix, and with adaptive stochastic gradient descent methods, on both supervised and reinforcement learning tasks.
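The covariance-shrinkage idea the thesis builds on can be sketched as blending a noisy sample covariance with a scaled-identity target (the fixed weight `alpha` below is purely illustrative; the thesis derives data-dependent shrinkage for curvature matrices):

```python
import numpy as np

def shrink_covariance(samples, alpha=0.3):
    """Shrinkage estimator: convex combination of the sample covariance
    and a scaled identity target, which keeps the estimate well-conditioned
    when the number of samples is small relative to the dimension."""
    S = np.cov(samples, rowvar=False)    # sample covariance (noisy, possibly rank-deficient)
    mu = np.trace(S) / S.shape[0]        # average variance sets the target's scale
    return (1 - alpha) * S + alpha * mu * np.eye(S.shape[0])

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 10))             # few samples, many dimensions
S_hat = shrink_covariance(X)
# the raw 10x10 sample covariance from 5 samples is singular,
# but the shrunk estimate is positive definite and invertible
```

Applied to a curvature matrix instead of a covariance, the same blending regularizes second-order updates when minibatch estimates are scarce.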
ContributorsBarron, Trevor (Author) / Ben Amor, Heni (Thesis advisor) / He, Jingrui (Committee member) / Levihn, Martin (Committee member) / Arizona State University (Publisher)
Created2019
Description
Spike sorting is a critical step in single-unit analysis of neural activities recorded extracellularly and simultaneously with multi-channel electrodes. When dealing with recordings from very large numbers of neurons, existing methods, which are mostly semiautomatic in nature, become inadequate.

This dissertation aims at automating the spike sorting process. A high-performance, automatic, and computationally efficient spike detection and clustering system, the M-Sorter2, is presented. The M-Sorter2 employs the modified multiscale correlation of wavelet coefficients (MCWC) for neural spike detection. At the center of the proposed M-Sorter2 are two automatic spike clustering methods. They share a common hierarchical agglomerative modeling (HAM) model search procedure to strategically form a sequence of mixture models, and a new model selection criterion called difference of model evidence (DoME) to automatically determine the number of clusters. The two methods differ in how they infer model parameters: one uses robust variational Bayes (RVB) and the other robust expectation-maximization (REM) for Student's 𝑡-mixture modeling. The M-Sorter2 is thus a significantly improved, fully automatic approach to spike sorting.

M-Sorter2 was evaluated and benchmarked against popular algorithms using simulated, artificial, and real data sets with ground truth that are openly available to researchers. Simulated datasets with known statistical distributions were first used to illustrate how the clustering algorithms, namely REMHAM and RVBHAM, provide robust clustering results under commonly experienced performance-degrading conditions, such as random initialization of parameters, high dimensionality of data, low signal-to-noise ratio (SNR), ambiguous clusters, and asymmetry in cluster sizes. For the artificial dataset from single-channel recordings, the proposed sorter outperformed Wave_Clus, Plexon's Offline Sorter, and Klusta in most of the comparison cases. For the real datasets from multi-channel electrodes, tetrodes, and polytrodes, the proposed sorter outperformed all comparison algorithms in terms of false positive and false negative rates. The software package presented in this dissertation is available for open access.
ContributorsMa, Weichao (Author) / Si, Jennie (Thesis advisor) / Papandreou-Suppappola, Antonia (Committee member) / He, Jingrui (Committee member) / Helms Tillery, Stephen (Committee member) / Arizona State University (Publisher)
Created2019
Description
This study asks the question: does gender-based discrimination exist within Arizona State University's Army Reserve Officer Training Corps (ROTC), and if so, what are the effects of such discrimination? Within this study, discrimination is defined as: the treatment or consideration of, or making a distinction in favor of or against, a person or thing based on the group, class, or category to which that person or thing belongs, rather than on individual merit. The researcher predicted that this study would show that gender-based discrimination operates within the masculine military culture of Army ROTC at ASU, resulting from women's hyper-visibility and evidenced by their lack of positive recognition and disbelief in having a voice in the program. These expectations were based on background research claiming that the token status of women in military roles causes them to be more heavily scrutinized, and that they consequentially try to attain success by adapting to the masculine military culture by which they are constantly measured. For the purposes of this study, success is defined as: the attainment of wealth, favor, or eminence. This study relies on exploratory interviews and an online survey conducted with male and female Army ROTC cadets of all grade levels at Arizona State University. The interviews and survey collected demographic information and perspectives on individual experiences to establish an understanding of privilege and marginalization within the program. The results support the prediction that women in Army ROTC at ASU face discrimination based on their unique visibility and their lack of positive recognition and voice in the program. Likewise, the survey results indicate that race also has a significant impact on one's experience in Army ROTC, which is discussed later in this study in regard to needs for future research. ASU Army ROTC includes approximately 100 cadets, and approximately 30-40 of those cadets participated in this study.
Additionally, the University of Arizona and Northern Arizona University Army ROTC programs, which would have offered a larger sample population, were invited to participate in this study but declined to do so. Nonetheless, the results of this research will be useful for analysis and further discussion of gender equality in Army ROTC at Arizona State University.
ContributorsAllemang, Lindsey Ann (Author) / Wood, Reed (Thesis director) / Switzer, Heather (Committee member) / School of Politics and Global Studies (Contributor) / School of Social Transformation (Contributor) / Barrett, The Honors College (Contributor)
Created2018-05
Description
The number of undergraduate students participating in short-term experiences in global health (STEGHs) abroad has increased dramatically in recent years (Eyler 2002, Drain et al. 2007). These experiences, in tandem with classroom learning, are designed to help students master skills related to global health competencies, including cultural humility and sensitivity, collaborating with community partners, and sociocultural and political awareness. Although STEGHs offer potential benefits both to students and to sending institutions, these experiences can sometimes be problematic and raise ethical challenges. As the number of students engaged in STEGHs continues to increase, it is important to better understand the impact of these programs on student learning. Current ethical and best practice guidelines for STEGHs state that programs should establish evaluation methods to solicit feedback from students both during and on completion of the program (Crump et al. 2010). However, there is currently no established method for gathering this feedback because of the many different global health competency frameworks, types and durations of programs, and models of student engagement in such programs. Assessing the quality of a STEGH is a profoundly important and difficult question that cannot be answered as succinctly and quantitatively as classroom performance, which has more standard and established assessment metrics. The goal of this project is to identify the most appropriate and useful assessment metric(s) for determining educational quality and impact for STEGHs at ASU by comparing a typical quantitative evaluation tool (a pre-post survey with brief open-ended questions) to a more in-depth qualitative method (key informant interviews). In performing my analysis, I seek to examine whether the latter can produce a richer narrative of student experiences to inform ongoing program evaluations. My research questions are: 1. What are the current qualitative and quantitative evaluation methods available to assess student learning during short-term experiences in global health? 2. How can current methodology for assessing student experiences with short-term experiences in global health be adapted to collect the most information from students? 3. How do student knowledge and attitudes change before and after their short-term experience in global health, and why is understanding those changes important for adapting programs? My end goal is to use these assessment methods for gathering student perspectives and experiences to adapt pre-departure trainings and post-experience debriefings for study abroad programs, both of which I believe will lead to more sustainable partnerships and a healthier understanding of global health work for students.
ContributorsHale, Brittany Ann (Author) / Jehn, Megan (Thesis director) / Wutich, Amber (Committee member) / School of Human Evolution and Social Change (Contributor) / School of Social Transformation (Contributor) / Barrett, The Honors College (Contributor)
Created2018-05
Description
Genocide studies have traditionally focused on the perpetrator’s intent to eradicate a particular identity-based group, using the Holocaust as their model and point of comparison. Although some aspects of the Holocaust were undoubtedly unique, recent scholars have sought to challenge the notion that it was a singular phenomenon. Instead, they draw attention to a recurring pattern of genocidal events throughout history by shifting the focus from intent to structure. One particular branch of scholars seeks to connect the ideology and tactics of imperialism with certain genocidal events. These anti-imperialist genocide scholars concede that their model cannot account for all genocides, but still claim that it creates meaningful connections between genocides committed by Western colonialist powers and those that have occurred in a neoimperialist world order shaped according to Western interests. The latter includes genocides in postcolonial states, which these scholars believe were shaped by the scars of their colonial past, as well as genocides in which imperial hegemons assisted local perpetrators. Imperialist and former colonial powers have contributed meaningfully to all of these kinds of genocides, yet their contributions have largely been ignored due to their own influence on the creation of the current international order. Incorporating the anti-imperialist perspective into the core doctrine of genocide studies may lead to breakthroughs in areas of related policy and practice, such as prevention and accountability.
ContributorsParker, Ashleigh Mae (Author) / Thies, Cameron (Thesis director) / Sivak, Henry (Committee member) / School of Politics and Global Studies (Contributor) / School of Social Transformation (Contributor) / Department of English (Contributor) / Barrett, The Honors College (Contributor)
Created2020-05