Interpretable Question Answering using Deep Embedded Knowledge Reasoning to Solve Qualitative Word Problems

Description

One measure of a system's intelligence is Question Answering, as it requires a system to comprehend a question and reason using its knowledge base to answer it accurately. Qualitative word problems are an important subset of such problems, as they require a system to recognize and reason with qualitative knowledge expressed in natural language. Traditional approaches in this domain use multiple modules to parse a given problem and to perform the required reasoning. Recent approaches use large pre-trained language models, such as Bidirectional Encoder Representations from Transformers (BERT), for downstream question answering tasks through supervision. These approaches, however, either suffer from errors propagated between multiple modules or are not interpretable with respect to the reasoning process employed. The solution proposed in this work aims to overcome these drawbacks through a single end-to-end trainable model that performs both the required parsing and reasoning. The parsing is achieved through an attention mechanism, whereas the reasoning is performed in vector space using soft logic operations. The model also enforces constraints in the form of auxiliary loss terms to increase the interpretability of the underlying reasoning process. The work achieves state-of-the-art accuracy on the QuaRel dataset and matches the state of the art on the QuaRTz dataset, with additional interpretability.
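
As a rough illustration of what "soft logic operations" can look like, the sketch below uses the product t-norm family on truth values in [0, 1]. This is an assumption made for illustration only: the abstract does not say which soft operators the thesis uses, and the function names are hypothetical.

```python
# Illustrative only: product-style soft logic on truth values in [0, 1].
# The choice of operators is an assumption, not the thesis's stated method.

def soft_not(a: float) -> float:
    """Soft negation: 1 - a."""
    return 1.0 - a


def soft_and(a: float, b: float) -> float:
    """Soft conjunction (product t-norm): a * b."""
    return a * b


def soft_or(a: float, b: float) -> float:
    """Soft disjunction (probabilistic sum): a + b - a * b."""
    return a + b - a * b
```

On crisp inputs (0.0 or 1.0) these reduce to ordinary Boolean logic, while staying differentiable in between, which is what lets such operations sit inside an end-to-end trainable model.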

Contributors: Narayana, Sanjay (Author) / Baral, Chitta (Thesis advisor) / Mitra, Arindam (Committee member) / Anwar, Saadat (Committee member) / Arizona State University (Publisher)

Created: 2020

Hidden Fear: Evaluating the Effectiveness of Messages on Social Media

Description

The development of the internet provided new means for people to communicate effectively and share their ideas. In recent years, consumption has shifted away from newspapers and traditional broadcast media toward online social media. Social media has been introduced as a new way of increasing democratic discussion of political and social matters. Among social media platforms, Twitter is widely used by politicians, government officials, communities, and parties to make announcements and reach their followers. This greatly broadens the reach of the medium.

The use of social media during social and political campaigns has been the subject of many social science studies, including studies of the Occupy Wall Street movement, the Arab Spring, the United States (US) election, and, more recently, the Brexit campaign. The widespread use of social media in this space and the active participation of people in online discussions have made this communication channel a suitable place for spreading propaganda to alter public opinion.

An interesting feature of Twitter is the ease with which bots can be programmed to operate on the platform. Social media bots are automated agents engineered to emulate human activity by tweeting specific content, replying to users, and amplifying certain topics by retweeting them. Networks of these bots are called botnets, a term that describes connected computers running programs that communicate across multiple devices to perform some task.

In this thesis, I study how bots can influence opinion, identify which parameters play a role in shrinking or coalescing communities, and finally logically prove the effectiveness of each of the hypotheses.

Contributors: Ahmadi, Mohsen (Author) / Davulcu, Hasan (Thesis advisor) / Sen, Arunabha (Committee member) / Li, Baoxin (Committee member) / Arizona State University (Publisher)

Created: 2020

Towards Building an Intelligent Tutor for Gestural Languages using Concept Level Explainable AI

Description

Languages, especially gestural and sign languages, are best learned in immersive environments with rich feedback. Computer-Aided Language Learning (CALL) solutions for spoken languages have successfully incorporated some feedback mechanisms, but no such solution exists for signed languages. Computer-Aided Sign Language Learning (CASLL) is a recent and promising field of research made feasible by advances in Computer Vision and Sign Language Recognition (SLR). Leveraging existing SLR systems for feedback-based learning is not feasible because their decision processes are not human-interpretable and do not facilitate conceptual feedback to learners. Thus, fundamental research is needed on designing systems that are modular and explainable. The explanations from these systems can then be used to produce feedback that aids the learning process.

In this work, I present novel approaches for recognizing location, movement, and handshape, which are components of American Sign Language (ASL), using both wrist-worn sensors and webcams. I then present Learn2Sign (L2S), a chatbot-based AI tutor that can provide fine-grained conceptual feedback to learners of ASL using the modular recognition approaches. L2S is designed to provide feedback directly related to the fundamental concepts of ASL using explainable AI. I present system performance results in terms of precision, recall, and F-1 scores, as well as validation results on the learning outcomes of users. Results of both retention and execution tests are presented for 26 participants and 14 different ASL words learned using Learn2Sign. Finally, I present the results of a post-usage usability survey for all participants. I found that learners who received live feedback on their executions improved both their execution and their retention performance. The average increase in execution performance was 28 percentage points, and that in retention was 4 percentage points.

Contributors: Paudyal, Prajwal (Author) / Gupta, Sandeep (Thesis advisor) / Banerjee, Ayan (Committee member) / Hsiao, Ihan (Committee member) / Azuma, Tamiko (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)

Created: 2020

Kitsune: Structurally-Aware and Adaptable Plagiarism Detection

Description

Plagiarism is a huge problem in a learning environment. In programming classes especially, plagiarism can be hard to detect, as the appearance of source code can easily be modified without changing its intent through simple formatting changes or refactoring. A number of plagiarism detection tools attempt to encode knowledge about the programming languages they support in order to better detect obscured duplicates. Many such tools do not support a large number of languages because doing so requires too much code and therefore too much maintenance. It is also difficult to add support for new languages because each language is vastly different syntactically. Tools that are more extensible often achieve this by reducing the features of a language that are encoded, and they end up closer to text comparison tools than to structurally-aware program analysis tools.

Kitsune attempts to remedy these issues by tying itself to Antlr, a pre-existing language recognition tool with over 200 currently supported languages. In addition, it provides an interface through which generic manipulations can be applied to the parse tree generated by Antlr. As Kitsune relies on language-agnostic structure modifications, it can be adapted with minimal effort to provide plagiarism detection for new languages. Kitsune has been evaluated for 10 of the languages in the Antlr grammar repository with success and could easily be extended to support all of the grammars currently developed by Antlr or future grammars which are developed as new languages are written.

Contributors: Monroe, Zachary Lynn (Author) / Bansal, Ajay (Thesis advisor) / Lindquist, Timothy (Committee member) / Acuna, Ruben (Committee member) / Arizona State University (Publisher)

Created: 2020

Identification of Compromised Nodes in Collaborative Intrusion Detection Systems for Large Scale Networks Due to Insider Attacks

Description

Large organizations have multiple networks that are subject to attacks, which can be detected by continuously monitoring and analyzing network traffic with Intrusion Detection Systems. Collaborative Intrusion Detection Systems (CIDS) are used for efficient detection of distributed attacks by maintaining a global view of the traffic events in large networks. However, CIDS are vulnerable to internal attacks, and these attacks decrease the mutual trust among CIDS nodes that is required for sharing critical and sensitive alert data. Without data sharing, the nodes of a CIDS cannot collaborate efficiently to form a comprehensive view of events in the monitored networks and detect distributed attacks. Compromised nodes further decrease the accuracy of a CIDS by generating false positives and false negatives in traffic event classifications. In this thesis, an approach based on a trust score system is presented to detect and suspend compromised nodes in a CIDS and thereby improve trust among the nodes for efficient collaboration. This trust score-based approach is implemented as a consensus model on a private blockchain, because a private blockchain has the features needed to address the accountability, integrity, and privacy requirements of CIDS. In this approach, the trust score of a malicious node is decreased with every reported false negative or false positive in its traffic event classifications. When the trust score of any node falls below a threshold, the node is identified as compromised and suspended. The approach is evaluated for its accuracy in identifying malicious nodes in CIDS.
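
The threshold rule described above can be sketched in a few lines. This is a minimal sketch only: the class name, the integer score scale, the penalty, and the threshold are all hypothetical values chosen for illustration, and the blockchain consensus layer described in the abstract is not modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class TrustLedger:
    """Hypothetical trust-score tracker; all numeric values are illustrative."""
    initial_score: int = 100   # assumed starting score for every node
    penalty: int = 20          # assumed deduction per reported misclassification
    threshold: int = 50        # assumed suspension threshold
    scores: dict = field(default_factory=dict)
    suspended: set = field(default_factory=set)

    def register(self, node_id: str) -> None:
        """Add a node with the initial trust score."""
        self.scores[node_id] = self.initial_score

    def report_misclassification(self, node_id: str) -> bool:
        """Lower a node's score after a reported false positive or false
        negative. Returns True if the node is suspended afterwards."""
        if node_id in self.suspended:
            return True
        self.scores[node_id] -= self.penalty
        if self.scores[node_id] < self.threshold:
            self.suspended.add(node_id)
        return node_id in self.suspended
```

With these assumed values, a node is suspended on its third reported misclassification (100, 80, 60, then 40, which is below 50).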

Contributors: Yenugunti, Chandralekha (Author) / Yau, Stephen S. (Thesis advisor) / Yang, Yezhou (Committee member) / Zou, Jia (Committee member) / Arizona State University (Publisher)

Created: 2020

Understanding Disinformation: Learning with Weak Social Supervision

Description

Social media has become an important means of user-centered information sharing and communication in a gamut of domains, including news consumption, entertainment, marketing, public relations, and many more. The low cost, easy access, and rapid dissemination of information on social media draw a large audience but also exacerbate the wide propagation of disinformation, including fake news, i.e., news with intentionally false information. Disinformation on social media is growing fast in volume and can have detrimental societal effects. Despite the importance of this problem, our understanding of disinformation in social media is still limited. Recent advances in computational approaches to detecting disinformation and fake news have shown some promising early results. Novel challenges are still abundant due to the complexity, diversity, dynamics, and multi-modality of disinformation and the costs of fact-checking or annotation.

Social media data opens the door to interdisciplinary research and allows one to collectively study large-scale human behaviors that would otherwise be impossible to observe. For example, user engagements with information such as news articles, including posting about, commenting on, or recommending the news on social media, contain abundant rich information. Since social media data is big, incomplete, noisy, unstructured, and rich in social relations, relying solely on user engagements can be sensitive to noisy user feedback. To alleviate the problem of limited labeled data, it is important to combine content with this new (but weak) type of information as supervision signals, i.e., weak social supervision, to advance fake news detection.

The goal of this dissertation is to understand disinformation by proposing and exploiting weak social supervision for learning with little labeled data and for effectively detecting disinformation via innovative research and novel computational methods. In particular, I investigate learning with weak social supervision for understanding disinformation through the following computational tasks: bringing in the heterogeneous social context as auxiliary information for effective fake news detection; discovering explanations of fake news from social media for explainable fake news detection; modeling multiple sources of weak social supervision for early fake news detection; and transferring knowledge across domains with adversarial machine learning for cross-domain fake news detection. The findings of the dissertation significantly expand the boundaries of disinformation research and establish a novel paradigm of learning with weak social supervision that has important implications for broad applications in social media.

Contributors: Shu, Kai (Author) / Liu, Huan (Thesis advisor) / Bernard, H. Russell (Committee member) / Maciejewski, Ross (Committee member) / Xue, Guoliang (Committee member) / Arizona State University (Publisher)

Created: 2020

Structural Decomposition Methods for Sparse Large-Scale Optimization

Description

This dissertation focuses on three large-scale optimization problems and on devising algorithms to solve them. In addition to the societal impact of each problem’s solution, this dissertation contributes to the optimization literature a set of decomposition algorithms for problems whose optimal solution is sparse. These algorithms exploit problem-specific properties and use tailored strategies based on iterative refinement (outer-approximations). The proposed algorithms are not rooted in duality theory, providing an alternative to existing methods based on linear programming relaxations. However, it is possible to embed existing decomposition methods into the proposed framework. These general decomposition principles extend to other combinatorial optimization problems.

The first problem is a route assignment and scheduling problem in which a set of vehicles needs to traverse a directed network while maintaining a minimum inter-vehicle distance at all times. This problem is inspired by applications in hazmat logistics and the coordination of autonomous agents. The proposed approach includes realistic features such as continuous-time vehicle scheduling, heterogeneous speeds, and minimum and maximum waiting times at any node, among others.

The second problem is a fixed-charge network design, which aims to find a minimum-cost plan to transport a target amount of a commodity between known origins and destinations. In addition to the typical flow decisions, the model chooses the capacity of each arc and selects sources and sinks. The proposed algorithms admit any nondecreasing piecewise linear cost structure. This model is applied to the Carbon Capture and Storage (CCS) problem, which is to design a minimum-cost pipeline network to transport CO2 between industrial sources and geologic reservoirs for long-term storage.

The third problem extends the proposed decomposition framework to a special case of joint chance-constrained programming with independent random variables. This model is applied to the probabilistic transportation problem, where demands are assumed to be stochastic and independent. Using an empirical probability distribution, this problem is formulated as an integer program with the goal of finding a minimum-cost distribution plan that satisfies all the demands with a given minimum probability. The proposed scalable algorithm is based on a concave envelope approximation of the empirical probability function, which is iteratively refined as needed.

Contributors: Matin Moghaddam, Navid (Author) / Sefair, Jorge (Thesis advisor) / Mirchandani, Pitu (Committee member) / Escobedo, Adolfo (Committee member) / Grubesic, Anthony (Committee member) / Arizona State University (Publisher)

Created: 2020

Improved Bi-criteria Approximation for the All-or-Nothing Multicommodity Flow Problem in Arbitrary Networks

Description

This thesis addresses the following fundamental maximum throughput routing problem: Given an arbitrary edge-capacitated n-node directed network and a set of k commodities, with source-destination pairs (s_i, t_i) and demands d_i > 0, admit and route the largest possible number of commodities -- i.e., the maximum throughput -- to satisfy their demands.

The main contributions of this thesis are three-fold. First, a bi-criteria approximation algorithm is presented for this all-or-nothing multicommodity flow (ANF) problem. This algorithm is the first to achieve a constant approximation of the maximum throughput with an edge capacity violation ratio that is at most logarithmic in n, with high probability. The approach is based on a version of randomized rounding that keeps splittable flows, rather than approximating them via a non-splittable path for each commodity. This allows it to work for arbitrary directed edge-capacitated graphs, unlike most of the prior work on the ANF problem. The algorithm also works if a weighted throughput is considered, where the benefit gained by fully satisfying the demand for commodity i is determined by a given weight w_i > 0. Second, a derandomization of the algorithm is presented that maintains the same approximation bounds, using novel pessimistic estimators for Bernstein's inequality. In addition, it is shown how the framework can be adapted to achieve a polylogarithmic fraction of the maximum throughput while maintaining a constant edge capacity violation, if the network capacity is large enough. Lastly, one important aspect of the randomized and derandomized algorithms is their simplicity, which lends itself to efficient implementations in practice. Implementations of both the randomized rounding and the derandomized algorithms for the ANF problem are presented and shown to be efficient in practice.
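
To make the idea of randomized rounding with splittable flows concrete, here is a simplified sketch. It assumes a fractional solution is already available (for example, from a linear programming relaxation) in which commodity i is admitted to a fraction x_i and routed along a split flow. The function name, the input layout, and the plain 1/x_i scaling are illustrative assumptions, not the thesis's exact procedure, and the sketch does not reproduce its approximation guarantees.

```python
import random


def randomized_rounding(fractions, edge_flows, rng):
    """Admit commodity i with probability fractions[i]. An admitted
    commodity keeps its split (multi-path) flow, scaled by 1 / x_i so that
    it carries the full demand. Hypothetical input layout:
      fractions:  {commodity: x_i in [0, 1]}
      edge_flows: {commodity: {edge: flow carried under the fractional solution}}
    """
    routed = {}
    for commodity, x in fractions.items():
        if x > 0 and rng.random() < x:
            routed[commodity] = {
                edge: flow / x for edge, flow in edge_flows[commodity].items()
            }
    return routed
```

Passing a seeded generator such as `random.Random(0)` as `rng` makes a run reproducible. Because admitted flows are scaled up, edge capacities may be exceeded, which is why the guarantee stated in the abstract is bi-criteria (throughput and capacity violation).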

Contributors: Chaturvedi, Anya (Author) / Richa, Andréa W. (Thesis advisor) / Sen, Arunabha (Committee member) / Schmid, Stefan (Committee member) / Arizona State University (Publisher)

Created: 2020

Towards Advanced Malware Classification: A Reused Code Analysis of Mirai Bonnet and Ransomware

Description

As dependency on computers and databases increases, so does the damage caused by malicious code. Moreover, the gravity and magnitude of malicious attacks by hackers are growing at an unprecedented rate. A key challenge lies in detecting such malicious attacks and code in real time using existing methods, such as signature-based detection. To this end, computer scientists have attempted to classify heterogeneous types of malware on the basis of their observable characteristics. Existing literature focuses on classifying binary code, because malware binaries are more accessible than source code. In addition, machine learning-based approaches are widely used for their improved speed and scalability. Despite such merits, the machine learning-based approach critically lacks interpretability of its outcome, which restricts understanding of why a given piece of code belongs to a particular type of malware and, importantly, why some portions of code are reused very often by hackers. In this light, this study aims to enhance the understanding of malware by directly investigating reused code and uncovering its characteristics.

To examine reused code in malware, both malware with source code and malware with binary code are considered in this thesis. For malware with source code, reused code chunks in the Mirai botnet are examined. This study lists frequently reused code chunks and analyzes the characteristics and location of the code. For malware with binary code, this study performs reverse engineering on the binary code so that human readers can comprehend it, visually inspects reused code in binary ransomware code, and illustrates the functionality of the reused code on the basis of similar behaviors and tactics.

This study makes a novel contribution to the literature by directly investigating the characteristics of reused code in malware. The findings of the study can help cybersecurity practitioners and scholars increase the performance of malware classification.

Contributors: Lee, Yeonjung (Author) / Bao, Youzhi (Thesis advisor) / Doupe, Adam (Committee member) / Shoshitaishvili, Yan (Committee member) / Arizona State University (Publisher)

Created: 2020

Poincare Embeddings for Visualizing Eigenvector Centrality

Description

Hyperbolic geometry, the geometry of hyperbolic space, has recently caught the attention of parts of the machine learning community. Lauded for its ability to capture strong clustering as well as latent hierarchies in complex and social networks, hyperbolic geometry proved to be an enduring presence in the network science community throughout the 2010s, with no signs of fading into obscurity anytime soon. Hyperbolic embeddings, which map a given graph to hyperbolic space, have proven to be a particularly powerful and dynamic tool for studying complex networks. Hyperbolic embeddings are exploited in this thesis to illustrate centrality in a graph. In network science, centrality quantifies the influence of individual nodes in a graph. Eigenvector centrality is one such measure; it assigns an influence weight to each node in a graph by solving an eigenvector equation. A procedure is defined to embed a given network in a model of hyperbolic space, known as the Poincare disk, according to the influence weights computed by three eigenvector centrality measures: the PageRank algorithm, the Hyperlink-Induced Topic Search (HITS) algorithm, and the Pinski-Narin algorithm. The resulting embeddings are shown to accurately and meaningfully reflect each node's influence and proximity to influential nodes.
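
For readers unfamiliar with eigenvector centrality, the sketch below computes PageRank, one of the three measures named above, by power iteration on a small directed graph. It is a textbook-style illustration, not the thesis's code: the function name, the damping value of 0.85, and the uniform handling of dangling nodes are assumptions, and the embedding into the Poincare disk is not shown.

```python
def pagerank(adjacency, damping=0.85, iterations=100):
    """Power iteration for PageRank.
    adjacency: {node: [out-neighbors]}; every neighbor must also be a key.
    Returns {node: rank}, with ranks summing to 1."""
    nodes = list(adjacency)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Every node receives the uniform "teleport" share first.
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            # A node with no out-links spreads its rank over all nodes.
            targets = adjacency[v] or nodes
            share = damping * rank[v] / len(targets)
            for u in targets:
                new_rank[u] += share
        rank = new_rank
    return rank


# Example: "c" is linked to by both "a" and "b", so it ranks highest.
example_ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
```

The returned vector approximates the dominant eigenvector of the damped link matrix, which is the sense in which PageRank "solves an eigenvector equation".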

Contributors: Chang, Alena (Author) / Xue, Guoliang (Thesis advisor) / Yang, Dejun (Committee member) / Yang, Yezhou (Committee member) / Arizona State University (Publisher)

Created: 2020