Search Content

New and Provable Results for Network Inference Problems and Multi-agent Optimization Algorithms

Description

Our ability to understand networks is important to many applications, from the analysis and modeling of biological networks to analyzing social networks. Unveiling network dynamics allows us to make predictions and decisions. Moreover, network dynamics models have inspired new ideas for computational methods involving multi-agent cooperation, offering effective solutions for…

Our ability to understand networks is important to many applications, from the analysis and modeling of biological networks to analyzing social networks. Unveiling network dynamics allows us to make predictions and decisions. Moreover, network dynamics models have inspired new ideas for computational methods involving multi-agent cooperation, offering effective solutions for optimization tasks. This dissertation presents new theoretical results on network inference and multi-agent optimization, split into two parts -

The first part deals with modeling and identification of network dynamics. I study two types of network dynamics arising from social and gene networks. Based on the network dynamics, the proposed network identification method works like a `network RADAR', meaning that interaction strengths between agents are inferred by injecting `signal' into the network and observing the resultant reverberation. In social networks, this is accomplished by stubborn agents whose opinions do not change throughout a discussion. In gene networks, genes are suppressed to create desired perturbations. The steady-states under these perturbations are characterized. In contrast to the common assumption of full rank input, I take a laxer assumption where low-rank input is used, to better model the empirical network data. Importantly, a network is proven to be identifiable from low rank data of rank that grows proportional to the network's sparsity. The proposed method is applied to synthetic and empirical data, and is shown to offer superior performance compared to prior work. The second part is concerned with algorithms on networks. I develop three consensus-based algorithms for multi-agent optimization. The first method is a decentralized Frank-Wolfe (DeFW) algorithm. The main advantage of DeFW lies on its projection-free nature, where we can replace the costly projection step in traditional algorithms by a low-cost linear optimization step. I prove the convergence rates of DeFW for convex and non-convex problems. I also develop two consensus-based alternating optimization algorithms --- one for least square problems and one for non-convex problems. These algorithms exploit the problem structure for faster convergence and their efficacy is demonstrated by numerical simulations.

I conclude this dissertation by describing future research directions.

ContributorsWai, Hoi To (Author) / Scaglione, Anna (Thesis advisor) / Berisha, Visar (Committee member) / Nedich, Angelia (Committee member) / Ying, Lei (Committee member) / Arizona State University (Publisher)

Created2017

Signal Processing and Machine Learning Techniques Towards Various Real-World Applications

Description

Machine learning (ML) has played an important role in several modern technological innovations and has become an important tool for researchers in various fields of interest. Besides engineering, ML techniques have started to spread across various departments of study, like health-care, medicine, diagnostics, social science, finance, economics etc. These techniques…

Machine learning (ML) has played an important role in several modern technological innovations and has become an important tool for researchers in various fields of interest. Besides engineering, ML techniques have started to spread across various departments of study, like health-care, medicine, diagnostics, social science, finance, economics etc. These techniques require data to train the algorithms and model a complex system and make predictions based on that model. Due to development of sophisticated sensors it has become easier to collect large volumes of data which is used to make necessary hypotheses using ML. The promising results obtained using ML have opened up new opportunities of research across various departments and this dissertation is a manifestation of it. Here, some unique studies have been presented, from which valuable inference have been drawn for a real-world complex system. Each study has its own unique sets of motivation and relevance to the real world. An ensemble of signal processing (SP) and ML techniques have been explored in each study. This dissertation provides the detailed systematic approach and discusses the results achieved in each study. Valuable inferences drawn from each study play a vital role in areas of science and technology, and it is worth further investigation. This dissertation also provides a set of useful SP and ML tools for researchers in various fields of interest.

ContributorsDutta, Arindam (Author) / Bliss, Daniel W (Thesis advisor) / Berisha, Visar (Committee member) / Richmond, Christ (Committee member) / Corman, Steven (Committee member) / Arizona State University (Publisher)

Created2018

Sensitivity Analysis of a Spatiotemporal Correlation Based Seizure Prediction Algorithm

Description

Epilepsy affects numerous people around the world and is characterized by recurring seizures, prompting the ability to predict them so precautionary measures may be employed. One promising algorithm extracts spatiotemporal correlation based features from intracranial electroencephalography signals for use with support vector machines. The robustness of this methodology is tested…

Epilepsy affects numerous people around the world and is characterized by recurring seizures, prompting the ability to predict them so precautionary measures may be employed. One promising algorithm extracts spatiotemporal correlation based features from intracranial electroencephalography signals for use with support vector machines. The robustness of this methodology is tested through a sensitivity analysis. Doing so also provides insight about how to construct more effective feature vectors.

ContributorsMa, Owen (Author) / Bliss, Daniel (Thesis director) / Berisha, Visar (Committee member) / Barrett, The Honors College (Contributor) / Electrical Engineering Program (Contributor)

Created2015-05

Predicting the Outcome of UFC Fights Using Machine Learning Models

Description

Abstract: The Ultimate Fighting Championship or UFC as it is commonly known, was founded in 1993 and has quickly built itself into the world's foremost authority on all things MMA (mixed martial arts) related. With pay-per-view and cable television deals in hand, the UFC has become a huge competitor in…

Abstract: The Ultimate Fighting Championship or UFC as it is commonly known, was founded in 1993 and has quickly built itself into the world's foremost authority on all things MMA (mixed martial arts) related. With pay-per-view and cable television deals in hand, the UFC has become a huge competitor in the sports market, rivaling the popularity of boxing for almost a decade. As with most other sports, the UFC has seen an influx of various analytics and data science over the past five to seven years. We see this revolution in football with the broadcast first down markers, basketball with tracking player movement, and baseball with locating pitches for strikes and balls, and now the UFC has partnered with statistics company Fightmetric, to provide in-depth statistical analysis of its fights. ESPN has their win probability metrics, and statistical predictive modeling has begun to spread throughout sports. All these stats were made to showcase the information about a fighter that one wouldn't typically know, giving insight into how the fight might go. But, can these fights be predicted? Based off of the research of prior individuals and combining the thought processes of relevant research into other sports leagues, I sought to use the arsenal of statistical analyses done by Fightmetric, along with the official UFC fighter database to answer the question of whether UFC fights could be predicted. Specifically, by using only data that would be known about a fighter prior to stepping into the cage, could I predict with any degree of certainty who was going to win the fight?

ContributorsMoorman, Taylor D. (Author) / Simon, Alan (Thesis director) / Simon, Phil (Committee member) / W.P. Carey School of Business (Contributor) / Department of Information Systems (Contributor) / Department of Management and Entrepreneurship (Contributor) / Barrett, The Honors College (Contributor)

Created2018-05

Graph-based estimation of information divergence functions

Description

Information divergence functions, such as the Kullback-Leibler divergence or the Hellinger distance, play a critical role in statistical signal processing and information theory; however estimating them can be challenge. Most often, parametric assumptions are made about the two distributions to estimate the divergence of interest. In cases where no parametric…

Information divergence functions, such as the Kullback-Leibler divergence or the Hellinger distance, play a critical role in statistical signal processing and information theory; however estimating them can be challenge. Most often, parametric assumptions are made about the two distributions to estimate the divergence of interest. In cases where no parametric model fits the data, non-parametric density estimation is used. In statistical signal processing applications, Gaussianity is usually assumed since closed-form expressions for common divergence measures have been derived for this family of distributions. Parametric assumptions are preferred when it is known that the data follows the model, however this is rarely the case in real-word scenarios. Non-parametric density estimators are characterized by a very large number of parameters that have to be tuned with costly cross-validation. In this dissertation we focus on a specific family of non-parametric estimators, called direct estimators, that bypass density estimation completely and directly estimate the quantity of interest from the data. We introduce a new divergence measure, the $D_p$-divergence, that can be estimated directly from samples without parametric assumptions on the distribution. We show that the $D_p$-divergence bounds the binary, cross-domain, and multi-class Bayes error rates and, in certain cases, provides provably tighter bounds than the Hellinger divergence. In addition, we also propose a new methodology that allows the experimenter to construct direct estimators for existing divergence measures or to construct new divergence measures with custom properties that are tailored to the application. To examine the practical efficacy of these new methods, we evaluate them in a statistical learning framework on a series of real-world data science problems involving speech-based monitoring of neuro-motor disorders.

ContributorsWisler, Alan (Author) / Berisha, Visar (Thesis advisor) / Spanias, Andreas (Thesis advisor) / Liss, Julie (Committee member) / Bliss, Daniel (Committee member) / Arizona State University (Publisher)

Created2017

Improved Finite Sample Estimate of A Nonparametric Divergence Measure

Description

This work details the bootstrap estimation of a nonparametric information divergence measure, the Dp divergence measure, using a power law model. To address the challenge posed by computing accurate divergence estimates given finite size data, the bootstrap approach is used in conjunction with a power law curve to calculate an…

This work details the bootstrap estimation of a nonparametric information divergence measure, the Dp divergence measure, using a power law model. To address the challenge posed by computing accurate divergence estimates given finite size data, the bootstrap approach is used in conjunction with a power law curve to calculate an asymptotic value of the divergence estimator. Monte Carlo estimates of Dp are found for increasing values of sample size, and a power law fit is used to relate the divergence estimates as a function of sample size. The fit is also used to generate a confidence interval for the estimate to characterize the quality of the estimate. We compare the performance of this method with the other estimation methods. The calculated divergence is applied to the binary classification problem. Using the inherent relation between divergence measures and classification error rate, an analysis of the Bayes error rate of several data sets is conducted using the asymptotic divergence estimate.

ContributorsKadambi, Pradyumna Sanjay (Author) / Berisha, Visar (Thesis director) / Bliss, Daniel (Committee member) / Electrical Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created2016-05

A Non-Parametric Semi-Supervised f-Divergence

Description

Divergence functions are both highly useful and fundamental to many areas in information theory and machine learning, but require either parametric approaches or prior knowledge of labels on the full data set. This paper presents a method to estimate the divergence between two data sets in the absence of fully…

Divergence functions are both highly useful and fundamental to many areas in information theory and machine learning, but require either parametric approaches or prior knowledge of labels on the full data set. This paper presents a method to estimate the divergence between two data sets in the absence of fully labeled data. This semi-labeled case is common in many domains where labeling data by hand is expensive or time-consuming, or wherever large data sets are present. The theory derived in this paper is demonstrated on a simulated example, and then applied to a feature selection and classification problem from pathological speech analysis.

ContributorsGilton, Davis Leland (Author) / Berisha, Visar (Thesis director) / Cochran, Douglas (Committee member) / Electrical Engineering Program (Contributor) / Barrett, The Honors College (Contributor)

Created2016-05

Using ML to Predict Online Course Ratings

Description

The pandemic that hit in 2020 has boosted the growth of online learning that involves the booming of Massive Open Online Course (MOOC). To support this situation, it will be helpful to have tools that can help students in choosing between the different courses and can help instructors to understand…

The pandemic that hit in 2020 has boosted the growth of online learning that involves the booming of Massive Open Online Course (MOOC). To support this situation, it will be helpful to have tools that can help students in choosing between the different courses and can help instructors to understand what the students need. One of those tools is an online course ratings predictor. Using the predictor, online course instructors can learn the qualities that majority course takers deem as important, and thus they can adjust their lesson plans to fit those qualities. Meanwhile, students will be able to use it to help them in choosing the course to take by comparing the ratings. This research aims to find the best way to predict the rating of online courses using machine learning (ML). To create the ML model, different combinations of the length of the course, the number of materials it contains, the price of the course, the number of students taking the course, the course’s difficulty level, the usage of jargons or technical terms in the course description, the course’s instructors’ rating, the number of reviews the instructors got, and the number of classes the instructors have created on the same platform are used as the inputs. Meanwhile, the output of the model would be the average rating of a course. Data from 350 courses are used for this model, where 280 of them are used for training, 35 for testing, and the last 35 for validation. After trying out different machine learning models, wide neural networks model constantly gives the best training results while the medium tree model gives the best testing results. However, further research needs to be conducted as none of the results are not accurate, with 0.51 R-squared test result for the tree model.

ContributorsWidodo, Herlina (Author) / VanLehn, Kurt (Thesis director) / Craig, Scotty (Committee member) / Barrett, The Honors College (Contributor) / Department of Management and Entrepreneurship (Contributor) / Computer Science and Engineering Program (Contributor)

Created2021-12

Machine Learning for the Design of Screening Tests: General Principles and Applications in Criminology and Digital Medicine

Description

This dissertation explores applications of machine learning methods in service of the design of screening tests, which are ubiquitous in applications from social work, to criminology, to healthcare. In the first part, a novel Bayesian decision theory framework is presented for designing tree-based adaptive tests. On an application to youth…

This dissertation explores applications of machine learning methods in service of the design of screening tests, which are ubiquitous in applications from social work, to criminology, to healthcare. In the first part, a novel Bayesian decision theory framework is presented for designing tree-based adaptive tests. On an application to youth delinquency in Honduras, the method produces a 15-item instrument that is almost as accurate as a full-length 150+ item test. The framework includes specific considerations for the context in which the test will be administered, and provides uncertainty quantification around the trade-offs of shortening lengthy tests. In the second part, classification complexity is explored via theoretical and empirical results from statistical learning theory, information theory, and empirical data complexity measures. A simulation study that explicitly controls two key aspects of classification complexity is performed to relate the theoretical and empirical approaches. Throughout, a unified language and notation that formalizes classification complexity is developed; this same notation is used in subsequent chapters to discuss classification complexity in the context of a speech-based screening test. In the final part, the relative merits of task and feature engineering when designing a speech-based cognitive screening test are explored. Through an extensive classification analysis on a clinical speech dataset from patients with normal cognition and Alzheimer’s disease, the speech elicitation task is shown to have a large impact on test accuracy; carefully performed task and feature engineering are required for best results. A new framework for objectively quantifying speech elicitation tasks is introduced, and two methods are proposed for automatically extracting insights into the aspects of the speech elicitation task that are driving classification performance. The dissertation closes with recommendations for how to evaluate the obtained insights and use them to guide future design of speech-based screening tests.

ContributorsKrantsevich, Chelsea (Author) / Hahn, P. Richard (Thesis advisor) / Berisha, Visar (Committee member) / Lopes, Hedibert (Committee member) / Renaut, Rosemary (Committee member) / Zheng, Yi (Committee member) / Arizona State University (Publisher)

Created2023

A Study on the Relationship between Data Leakage and Overoptimistic Estimates of the Performance of Machine Learning Models

Description

In this thesis, six experiments which were computer simulations were conducted in order to replicate the negative association between sample size and accuracy that is repeatedly found in ML literature by accounting for data leakage and publication bias. The reason why it is critical to understand why this negative association…

In this thesis, six experiments which were computer simulations were conducted in order to replicate the negative association between sample size and accuracy that is repeatedly found in ML literature by accounting for data leakage and publication bias. The reason why it is critical to understand why this negative association is occurring is that in published studies, there have been multiple reports that the accuracies in ML models are overoptimistic leading to cases where the results are irreproducible despite conducting multiple trials and experiments. Additionally, after replicating the negative association between sample size and accuracy, parametric curves (learning curves with the parametric function) were fitted along the empirical learning curves in order to evaluate the performance. It was found that there is a significant variance in accuracies when the sample size is small, but little to no variation when the sample size is large. In other words, the empirical learning curves with data leakage and publication bias were able to achieve the same accuracy as the learning curve without data leakage at a large sample size.

ContributorsKottooru, Rishab (Author) / Berisha, Visar (Thesis director) / Dasarathy, Gautam (Committee member) / Saidi, Pouria (Committee member) / Barrett, The Honors College (Contributor) / Chemical Engineering Program (Contributor)

Created2023-05

Filtering by