Search Content

RAProp: ranking tweets by exploiting the tweet/user/web ecosystem

Description

The increasing popularity of Twitter renders improved trustworthiness and relevance assessment of tweets much more important for search. However, given the limitations on the size of tweets, it is hard to extract measures for ranking from the tweet's content alone. I propose a method of ranking tweets by generating a…

The increasing popularity of Twitter renders improved trustworthiness and relevance assessment of tweets much more important for search. However, given the limitations on the size of tweets, it is hard to extract measures for ranking from the tweet's content alone. I propose a method of ranking tweets by generating a reputation score for each tweet that is based not just on content, but also additional information from the Twitter ecosystem that consists of users, tweets, and the web pages that tweets link to. This information is obtained by modeling the Twitter ecosystem as a three-layer graph. The reputation score is used to power two novel methods of ranking tweets by propagating the reputation over an agreement graph based on tweets' content similarity. Additionally, I show how the agreement graph helps counter tweet spam. An evaluation of my method on 16~million tweets from the TREC 2011 Microblog Dataset shows that it doubles the precision over baseline Twitter Search and achieves higher precision than current state of the art method. I present a detailed internal empirical evaluation of RAProp in comparison to several alternative approaches proposed by me, as well as external evaluation in comparison to the current state of the art method.

ContributorsRavikumar, Srijith (Author) / Kambhampati, Subbarao (Thesis advisor) / Davulcu, Hasan (Committee member) / Liu, Huan (Committee member) / Arizona State University (Publisher)

Created2013

Answer set programming and other computing paradigms

Description

Answer Set Programming (ASP) is one of the most prominent and successful knowledge representation paradigms. The success of ASP is due to its expressive non-monotonic modeling language and its efficient computational methods originating from building propositional satisfiability solvers. The wide adoption of ASP has motivated several extensions to its modeling…

Answer Set Programming (ASP) is one of the most prominent and successful knowledge representation paradigms. The success of ASP is due to its expressive non-monotonic modeling language and its efficient computational methods originating from building propositional satisfiability solvers. The wide adoption of ASP has motivated several extensions to its modeling language in order to enhance expressivity, such as incorporating aggregates and interfaces with ontologies. Also, in order to overcome the grounding bottleneck of computation in ASP, there are increasing interests in integrating ASP with other computing paradigms, such as Constraint Programming (CP) and Satisfiability Modulo Theories (SMT). Due to the non-monotonic nature of the ASP semantics, such enhancements turned out to be non-trivial and the existing extensions are not fully satisfactory. We observe that one main reason for the difficulties rooted in the propositional semantics of ASP, which is limited in handling first-order constructs (such as aggregates and ontologies) and functions (such as constraint variables in CP and SMT) in natural ways. This dissertation presents a unifying view on these extensions by viewing them as instances of formulas with generalized quantifiers and intensional functions. We extend the first-order stable model semantics by by Ferraris, Lee, and Lifschitz to allow generalized quantifiers, which cover aggregate, DL-atoms, constraints and SMT theory atoms as special cases. Using this unifying framework, we study and relate different extensions of ASP. We also present a tight integration of ASP with SMT, based on which we enhance action language C+ to handle reasoning about continuous changes. Our framework yields a systematic approach to study and extend non-monotonic languages.

ContributorsMeng, Yunsong (Author) / Lee, Joohyung (Thesis advisor) / Ahn, Gail-Joon (Committee member) / Baral, Chitta (Committee member) / Fainekos, Georgios (Committee member) / Lifschitz, Vladimir (Committee member) / Arizona State University (Publisher)

Created2013

Connecting users with similar interests for group understanding

Description

In most social networking websites, users are allowed to perform interactive activities. One of the fundamental features that these sites provide is to connecting with users of their kind. On one hand, this activity makes online connections visible and tangible; on the other hand, it enables the exploration of our…

In most social networking websites, users are allowed to perform interactive activities. One of the fundamental features that these sites provide is to connecting with users of their kind. On one hand, this activity makes online connections visible and tangible; on the other hand, it enables the exploration of our connections and the expansion of our social networks easier. The aggregation of people who share common interests forms social groups, which are fundamental parts of our social lives. Social behavioral analysis at a group level is an active research area and attracts many interests from the industry. Challenges of my work mainly arise from the scale and complexity of user generated behavioral data. The multiple types of interactions, highly dynamic nature of social networking and the volatile user behavior suggest that these data are complex and big in general. Effective and efficient approaches are required to analyze and interpret such data. My work provide effective channels to help connect the like-minded and, furthermore, understand user behavior at a group level. The contributions of this dissertation are in threefold: (1) proposing novel representation of collective tagging knowledge via tag networks; (2) proposing the new information spreader identification problem in egocentric soical networks; (3) defining group profiling as a systematic approach to understanding social groups. In sum, the research proposes novel concepts and approaches for connecting the like-minded, enables the understanding of user groups, and exposes interesting research opportunities.

ContributorsWang, Xufei (Author) / Liu, Huan (Thesis advisor) / Kambhampati, Subbarao (Committee member) / Sundaram, Hari (Committee member) / Ye, Jieping (Committee member) / Arizona State University (Publisher)

Created2013

When is temporal planning really temporal

Description

In this dissertation I develop a deep theory of temporal planning well-suited to analyzing, understanding, and improving the state of the art implementations (as of 2012). At face-value the work is strictly theoretical; nonetheless its impact is entirely real and practical. The easiest portion of that impact to highlight concerns…

In this dissertation I develop a deep theory of temporal planning well-suited to analyzing, understanding, and improving the state of the art implementations (as of 2012). At face-value the work is strictly theoretical; nonetheless its impact is entirely real and practical. The easiest portion of that impact to highlight concerns the notable improvements to the format of the temporal fragment of the International Planning Competitions (IPCs). Particularly: the theory I expound upon here is the primary cause of--and justification for--the altered (i) selection of benchmark problems, and (ii) notion of "winning temporal planner". For higher level motivation: robotics, web service composition, industrial manufacturing, business process management, cybersecurity, space exploration, deep ocean exploration, and logistics all benefit from applying domain-independent automated planning technique. Naturally, actually carrying out such case studies has much to offer. For example, we may extract the lesson that reasoning carefully about deadlines is rather crucial to planning in practice. More generally, effectively automating specifically temporal planning is well-motivated from applications. Entirely abstractly, the aim is to improve the theory of automated temporal planning by distilling from its practice. My thesis is that the key feature of computational interest is concurrency. To support, I demonstrate by way of compilation methods, worst-case counting arguments, and analysis of algorithmic properties such as completeness that the more immediately pressing computational obstacles (facing would-be temporal generalizations of classical planning systems) can be dealt with in theoretically efficient manner. So more accurately the technical contribution here is to demonstrate: The computationally significant obstacle to automated temporal planning that remains is just concurrency.

ContributorsCushing, William Albemarle (Author) / Kambhampati, Subbarao (Thesis advisor) / Weld, Daniel S. (Committee member) / Smith, David E. (Committee member) / Baral, Chitta (Committee member) / Davalcu, Hasan (Committee member) / Arizona State University (Publisher)

Created2012

Automated event-driven security assessment

Description

With the growth of IT products and sophisticated software in various operating systems, I observe that security risks in systems are skyrocketing constantly. Consequently, Security Assessment is now considered as one of primary security mechanisms to measure assurance of systems since systems that are not compliant with security requirements may…

With the growth of IT products and sophisticated software in various operating systems, I observe that security risks in systems are skyrocketing constantly. Consequently, Security Assessment is now considered as one of primary security mechanisms to measure assurance of systems since systems that are not compliant with security requirements may lead adversaries to access critical information by circumventing security practices. In order to ensure security, considerable efforts have been spent to develop security regulations by facilitating security best-practices. Applying shared security standards to the system is critical to understand vulnerabilities and prevent well-known threats from exploiting vulnerabilities. However, many end users tend to change configurations of their systems without paying attention to the security. Hence, it is not straightforward to protect systems from being changed by unconscious users in a timely manner. Detecting the installation of harmful applications is not sufficient since attackers may exploit risky software as well as commonly used software. In addition, checking the assurance of security configurations periodically is disadvantageous in terms of time and cost due to zero-day attacks and the timing attacks that can leverage the window between each security checks. Therefore, event-driven monitoring approach is critical to continuously assess security of a target system without ignoring a particular window between security checks and lessen the burden of exhausted task to inspect the entire configurations in the system. Furthermore, the system should be able to generate a vulnerability report for any change initiated by a user if such changes refer to the requirements in the standards and turn out to be vulnerable. Assessing various systems in distributed environments also requires to consistently applying standards to each environment. Such a uniformed consistent assessment is important because the way of assessment approach for detecting security vulnerabilities may vary across applications and operating systems. In this thesis, I introduce an automated event-driven security assessment framework to overcome and accommodate the aforementioned issues. I also discuss the implementation details that are based on the commercial-off-the-self technologies and testbed being established to evaluate approach. Besides, I describe evaluation results that demonstrate the effectiveness and practicality of the approaches.

ContributorsSeo, Jeong-Jin (Author) / Ahn, Gail-Joon (Thesis advisor) / Yau, Stephen S. (Committee member) / Lee, Joohyung (Committee member) / Arizona State University (Publisher)

Created2014

Automated testing for RBAC policies

Description

Access control is necessary for information assurance in many of today's applications such as banking and electronic health record. Access control breaches are critical security problems that can result from unintended and improper implementation of security policies. Security testing can help identify security vulnerabilities early and avoid unexpected expensive cost…

Access control is necessary for information assurance in many of today's applications such as banking and electronic health record. Access control breaches are critical security problems that can result from unintended and improper implementation of security policies. Security testing can help identify security vulnerabilities early and avoid unexpected expensive cost in handling breaches for security architects and security engineers. The process of security testing which involves creating tests that effectively examine vulnerabilities is a challenging task. Role-Based Access Control (RBAC) has been widely adopted to support fine-grained access control. However, in practice, due to its complexity including role management, role hierarchy with hundreds of roles, and their associated privileges and users, systematically testing RBAC systems is crucial to ensure the security in various domains ranging from cyber-infrastructure to mission-critical applications. In this thesis, we introduce i) a security testing technique for RBAC systems considering the principle of maximum privileges, the structure of the role hierarchy, and a new security test coverage criterion; ii) a MTBDD (Multi-Terminal Binary Decision Diagram) based representation of RBAC security policy including RHMTBDD (Role Hierarchy MTBDD) to efficiently generate effective positive and negative security test cases; and iii) a security testing framework which takes an XACML-based RBAC security policy as an input, parses it into a RHMTBDD representation and then generates positive and negative test cases. We also demonstrate the efficacy of our approach through case studies.

ContributorsGupta, Poonam (Author) / Ahn, Gail-Joon (Thesis advisor) / Collofello, James (Committee member) / Huang, Dijiang (Committee member) / Arizona State University (Publisher)

Created2014

Utility of considering multiple alternative rectifications in data cleaning

Description

Most data cleaning systems aim to go from a given deterministic dirty database to another deterministic but clean database. Such an enterprise pre–supposes that it is in fact possible for the cleaning process to uniquely recover the clean versions of each dirty data tuple. This is not possible in many…

Most data cleaning systems aim to go from a given deterministic dirty database to another deterministic but clean database. Such an enterprise pre–supposes that it is in fact possible for the cleaning process to uniquely recover the clean versions of each dirty data tuple. This is not possible in many cases, where the most a cleaning system can do is to generate a (hopefully small) set of clean candidates for each dirty tuple. When the cleaning system is required to output a deterministic database, it is forced to pick one clean candidate (say the "most likely" candidate) per tuple. Such an approach can lead to loss of information. For example, consider a situation where there are three equally likely clean candidates of a dirty tuple. An appealing alternative that avoids such an information loss is to abandon the requirement that the output database be deterministic. In other words, even though the input (dirty) database is deterministic, I allow the reconstructed database to be probabilistic. Although such an approach does avoid the information loss, it also brings forth several challenges. For example, how many alternatives should be kept per tuple in the reconstructed database? Maintaining too many alternatives increases the size of the reconstructed database, and hence the query processing time. Second, while processing queries on the probabilistic database may well increase recall, how would they affect the precision of the query processing? In this thesis, I investigate these questions. My investigation is done in the context of a data cleaning system called BayesWipe that has the capability of producing multiple clean candidates per each dirty tuple, along with the probability that they are the correct cleaned version. I represent these alternatives as tuples in a tuple disjoint probabilistic database, and use the Mystiq system to process queries on it. This probabilistic reconstruction (called BayesWipe–PDB) is compared to a deterministic reconstruction (called BayesWipe–DET)—where the most likely clean candidate for each tuple is chosen, and the rest of the alternatives discarded.

ContributorsRihan, Preet Inder Singh (Author) / Kambhampati, Subbarao (Thesis advisor) / Liu, Huan (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)

Created2013

A framework for extended acquisition and uniform representation of forensic email evidence

Description

The digital forensics community has neglected email forensics as a process, despite the fact that email remains an important tool in the commission of crime. Current forensic practices focus mostly on that of disk forensics, while email forensics is left as an analysis task stemming from that practice. As there…

The digital forensics community has neglected email forensics as a process, despite the fact that email remains an important tool in the commission of crime. Current forensic practices focus mostly on that of disk forensics, while email forensics is left as an analysis task stemming from that practice. As there is no well-defined process to be used for email forensics the comprehensiveness, extensibility of tools, uniformity of evidence, usefulness in collaborative/distributed environments, and consistency of investigations are hindered. At present, there exists little support for discovering, acquiring, and representing web-based email, despite its widespread use. To remedy this, a systematic process which includes discovering, acquiring, and representing web-based email for email forensics which is integrated into the normal forensic analysis workflow, and which accommodates the distinct characteristics of email evidence will be presented. This process focuses on detecting the presence of non-obvious artifacts related to email accounts, retrieving the data from the service provider, and representing email in a well-structured format based on existing standards. As a result, developers and organizations can collaboratively create and use analysis tools that can analyze email evidence from any source in the same fashion and the examiner can access additional data relevant to their forensic cases. Following, an extensible framework implementing this novel process-driven approach has been implemented in an attempt to address the problems of comprehensiveness, extensibility, uniformity, collaboration/distribution, and consistency within forensic investigations involving email evidence.

ContributorsPaglierani, Justin W (Author) / Ahn, Gail-Joon (Thesis advisor) / Yau, Stephen S. (Committee member) / Santanam, Raghu T (Committee member) / Arizona State University (Publisher)

Created2013

An ontology-based approach to attribute management in ABAC environment

Description

Attribute Based Access Control (ABAC) mechanisms have been attracting a lot of interest from the research community in recent times. This is especially because of the flexibility and extensibility it provides by using attributes assigned to subjects as the basis for access control. ABAC enables an administrator of a server…

Attribute Based Access Control (ABAC) mechanisms have been attracting a lot of interest from the research community in recent times. This is especially because of the flexibility and extensibility it provides by using attributes assigned to subjects as the basis for access control. ABAC enables an administrator of a server to enforce access policies on the data, services and other such resources fairly easily. It also accommodates new policies and changes to existing policies gracefully, thereby making it a potentially good mechanism for implementing access control in large systems, particularly in today's age of Cloud Computing. However management of the attributes in ABAC environment is an area that has been little touched upon. Having a mechanism to allow multiple ABAC based systems to share data and resources can go a long way in making ABAC scalable. At the same time each system should be able to specify their own attribute sets independently. In the research presented in this document a new mechanism is proposed that would enable users to share resources and data in a cloud environment using ABAC techniques in a distributed manner. The focus is mainly on decentralizing the access policy specifications for the shared data so that each data owner can specify the access policy independent of others. The concept of ontologies and semantic web is introduced in the ABAC paradigm that would help in giving a scalable structure to the attributes and also allow systems having different sets of attributes to communicate and share resources.

ContributorsPrabhu Verleker, Ashwin Narayan (Author) / Huang, Dijiang (Thesis advisor) / Ahn, Gail-Joon (Committee member) / Dasgupta, Partha (Committee member) / Arizona State University (Publisher)

Created2014

Detection of advanced bots in smartphones through user profiling

Description

This thesis addresses the ever increasing threat of botnets in the smartphone domain and focuses on the Android platform and the botnets using Online Social Networks (OSNs) as Command and Control (C&C;) medium. With any botnet, C&C; is one of the components on which the survival of botnet depends. Individual…

This thesis addresses the ever increasing threat of botnets in the smartphone domain and focuses on the Android platform and the botnets using Online Social Networks (OSNs) as Command and Control (C&C;) medium. With any botnet, C&C; is one of the components on which the survival of botnet depends. Individual bots use the C&C; channel to receive commands and send the data. This thesis develops active host based approach for identifying the presence of bot based on the anomalies in the usage patterns of the user before and after the bot is installed on the user smartphone and alerting the user to the presence of the bot. A profile is constructed for each user based on the regular web usage patterns (achieved by intercepting the http(s) traffic) and implementing machine learning techniques to continuously learn the user's behavior and changes in the behavior and all the while looking for any anomalies in the user behavior above a threshold which will cause the user to be notified of the anomalous traffic. A prototype bot which uses OSN s as C&C; channel is constructed and used for testing. Users are given smartphones(Nexus 4 and Galaxy Nexus) running Application proxy which intercepts http(s) traffic and relay it to a server which uses the traffic and constructs the model for a particular user and look for any signs of anomalies. This approach lays the groundwork for the future host-based counter measures for smartphone botnets using OSN s as C&C; channel.

ContributorsKilari, Vishnu Teja (Author) / Xue, Guoliang (Thesis advisor) / Ahn, Gail-Joon (Committee member) / Dasgupta, Partha (Committee member) / Arizona State University (Publisher)

Created2013

Filtering by