Search Content

RAProp: ranking tweets by exploiting the tweet/user/web ecosystem

Description

The increasing popularity of Twitter renders improved trustworthiness and relevance assessment of tweets much more important for search. However, given the limitations on the size of tweets, it is hard to extract measures for ranking from the tweet's content alone. I propose a method of ranking tweets by generating a…

The increasing popularity of Twitter renders improved trustworthiness and relevance assessment of tweets much more important for search. However, given the limitations on the size of tweets, it is hard to extract measures for ranking from the tweet's content alone. I propose a method of ranking tweets by generating a reputation score for each tweet that is based not just on content, but also additional information from the Twitter ecosystem that consists of users, tweets, and the web pages that tweets link to. This information is obtained by modeling the Twitter ecosystem as a three-layer graph. The reputation score is used to power two novel methods of ranking tweets by propagating the reputation over an agreement graph based on tweets' content similarity. Additionally, I show how the agreement graph helps counter tweet spam. An evaluation of my method on 16~million tweets from the TREC 2011 Microblog Dataset shows that it doubles the precision over baseline Twitter Search and achieves higher precision than current state of the art method. I present a detailed internal empirical evaluation of RAProp in comparison to several alternative approaches proposed by me, as well as external evaluation in comparison to the current state of the art method.

ContributorsRavikumar, Srijith (Author) / Kambhampati, Subbarao (Thesis advisor) / Davulcu, Hasan (Committee member) / Liu, Huan (Committee member) / Arizona State University (Publisher)

Created2013

An adaptive approach to securing ubiquitous smart devices in IoT environment with probabilistic user behavior prediction

Description

Cyber systems, including IoT (Internet of Things), are increasingly being used ubiquitously to vastly improve the efficiency and reduce the cost of critical application areas, such as finance, transportation, defense, and healthcare. Over the past two decades, computing efficiency and hardware cost have dramatically been improved. These improvements have made…

Cyber systems, including IoT (Internet of Things), are increasingly being used ubiquitously to vastly improve the efficiency and reduce the cost of critical application areas, such as finance, transportation, defense, and healthcare. Over the past two decades, computing efficiency and hardware cost have dramatically been improved. These improvements have made cyber systems omnipotent, and control many aspects of human lives. Emerging trends in successful cyber system breaches have shown increasing sophistication in attacks and that attackers are no longer limited by resources, including human and computing power. Most existing cyber defense systems for IoT systems have two major issues: (1) they do not incorporate human user behavior(s) and preferences in their approaches, and (2) they do not continuously learn from dynamic environment and effectively adapt to thwart sophisticated cyber-attacks. Consequently, the security solutions generated may not be usable or implementable by the user(s) thereby drastically reducing the effectiveness of these security solutions.

In order to address these major issues, a comprehensive approach to securing ubiquitous smart devices in IoT environment by incorporating probabilistic human user behavioral inputs is presented. The approach will include techniques to (1) protect the controller device(s) [smart phone or tablet] by continuously learning and authenticating the legitimate user based on the touch screen finger gestures in the background, without requiring users’ to provide their finger gesture inputs intentionally for training purposes, and (2) efficiently configure IoT devices through controller device(s), in conformance with the probabilistic human user behavior(s) and preferences, to effectively adapt IoT devices to the changing environment. The effectiveness of the approach will be demonstrated with experiments that are based on collected user behavioral data and simulations.

ContributorsBuduru, Arun Balaji (Author) / Yau, Sik-Sang (Thesis advisor) / Ahn, Gail-Joon (Committee member) / Davulcu, Hasan (Committee member) / Zhang, Yanchao (Committee member) / Arizona State University (Publisher)

Created2016

BioEve: user interface framework bridging IE and IR

Description

Continuous advancements in biomedical research have resulted in the production of vast amounts of scientific data and literature discussing them. The ultimate goal of computational biology is to translate these large amounts of data into actual knowledge of the complex biological processes and accurate life science models. The ability to…

Continuous advancements in biomedical research have resulted in the production of vast amounts of scientific data and literature discussing them. The ultimate goal of computational biology is to translate these large amounts of data into actual knowledge of the complex biological processes and accurate life science models. The ability to rapidly and effectively survey the literature is necessary for the creation of large scale models of the relationships among biomedical entities as well as hypothesis generation to guide biomedical research. To reduce the effort and time spent in performing these activities, an intelligent search system is required. Even though many systems aid in navigating through this wide collection of documents, the vastness and depth of this information overload can be overwhelming. An automated extraction system coupled with a cognitive search and navigation service over these document collections would not only save time and effort, but also facilitate discovery of the unknown information implicitly conveyed in the texts. This thesis presents the different approaches used for large scale biomedical named entity recognition, and the challenges faced in each. It also proposes BioEve: an integrative framework to fuse a faceted search with information extraction to provide a search service that addresses the user's desire for "completeness" of the query results, not just the top-ranked ones. This information extraction system enables discovery of important semantic relationships between entities such as genes, diseases, drugs, and cell lines and events from biomedical text on MEDLINE, which is the largest publicly available database of the world's biomedical journal literature. It is an innovative search and discovery service that makes it easier to search
avigate and discover knowledge hidden in life sciences literature. To demonstrate the utility of this system, this thesis also details a prototype enterprise quality search and discovery service that helps researchers with a guided step-by-step query refinement, by suggesting concepts enriched in intermediate results, and thereby facilitating the "discover more as you search" paradigm.

ContributorsKanwar, Pradeep (Author) / Davulcu, Hasan (Thesis advisor) / Dinu, Valentin (Committee member) / Li, Baoxin (Committee member) / Arizona State University (Publisher)

Created2010

Kitsune: Structurally-Aware and Adaptable Plagiarism Detection

Description

Plagiarism is a huge problem in a learning environment. In programming classes especially, plagiarism can be hard to detect as source codes' appearance can be easily modified without changing the intent through simple formatting changes or refactoring. There are a number of plagiarism detection tools that attempt to encode knowledge…

Plagiarism is a huge problem in a learning environment. In programming classes especially, plagiarism can be hard to detect as source codes' appearance can be easily modified without changing the intent through simple formatting changes or refactoring. There are a number of plagiarism detection tools that attempt to encode knowledge about the programming languages they support in order to better detect obscured duplicates. Many such tools do not support a large number of languages because doing so requires too much code and therefore too much maintenance. It is also difficult to add support for new languages because each language is vastly different syntactically. Tools that are more extensible often do so by reducing the features of a language that are encoded and end up closer to text comparison tools than structurally-aware program analysis tools.

Kitsune attempts to remedy these issues by tying itself to Antlr, a pre-existing language recognition tool with over 200 currently supported languages. In addition, it provides an interface through which generic manipulations can be applied to the parse tree generated by Antlr. As Kitsune relies on language-agnostic structure modifications, it can be adapted with minimal effort to provide plagiarism detection for new languages. Kitsune has been evaluated for 10 of the languages in the Antlr grammar repository with success and could easily be extended to support all of the grammars currently developed by Antlr or future grammars which are developed as new languages are written.

ContributorsMonroe, Zachary Lynn (Author) / Bansal, Ajay (Thesis advisor) / Lindquist, Timothy (Committee member) / Acuna, Ruben (Committee member) / Arizona State University (Publisher)

Created2020

Filtering by

RAProp: ranking tweets by exploiting the tweet/user/web ecosystem

An adaptive approach to securing ubiquitous smart devices in IoT environment with probabilistic user behavior prediction

BioEve: user interface framework bridging IE and IR

Kitsune: Structurally-Aware and Adaptable Plagiarism Detection