Search Content

Tragedy Plus Time: Capturing Human Unintended Activities from Weakly-Labeled Videos

Description

In videos that contain actions performed unintentionally, agents do not achieve their desired goals. In such videos, it is challenging for computer vision systems to understand high-level concepts such as goal-directed behavior. On the other hand, from a very early age, humans are able to understand the relation between an…

In videos that contain actions performed unintentionally, agents do not achieve their desired goals. In such videos, it is challenging for computer vision systems to understand high-level concepts such as goal-directed behavior. On the other hand, from a very early age, humans are able to understand the relation between an agent and their ultimate goal even if the action gets disrupted or unintentional effects occur. Inculcating this ability in artificially intelligent agents would make them better social learners by not just learning from their own mistakes, i.e, reinforcement learning, but also learning from other's mistakes. For example, this could greatly reduce the search space for artificially intelligent agents for finding the correct action sequence when trying to achieve a new goal, since they would be able to learn from others what not to do as well as how/when actions result in undesired outcomes.To validate this ability of deep learning models to perform this task, the Weakly Augmented Oops (W-Oops) dataset is proposed, built upon the Oops dataset. W-Oops consists of 2,100 unintentional human action videos, with 44 goal-directed and 33 unintentional video-level activity labels collected through human annotations. Inspired by previous methods on tasks such as weakly supervised action localization which show promise for achieving good localization results without ground truth segment annotations, this paper proposes a weakly supervised algorithm for localizing the goal-directed as well as the unintentional temporal region of a video using only video-level labels. In particular, an attention mechanism based strategy is employed that predicts the temporal regions which contributes the most to a classification task, leveraging solely video-level labels. Meanwhile, our designed overlap regularization allows the model to focus on distinct portions of the video for inferring the goal-directed and unintentional activity, while guaranteeing their temporal ordering. Extensive quantitative experiments verify the validity of our localization method.

ContributorsChakravarthy, Arnav (Author) / Yang, Yezhou (Thesis advisor) / Davulcu, Hasan (Committee member) / Pavlic, Theodore (Committee member) / Arizona State University (Publisher)

Created2021

Visual analytics for spatiotemporal cluster analysis

Description

Traditionally, visualization is one of the most important and commonly used methods of generating insight into large scale data. Particularly for spatiotemporal data, the translation of such data into a visual form allows users to quickly see patterns, explore summaries and relate domain knowledge about underlying geographical phenomena that would…

Traditionally, visualization is one of the most important and commonly used methods of generating insight into large scale data. Particularly for spatiotemporal data, the translation of such data into a visual form allows users to quickly see patterns, explore summaries and relate domain knowledge about underlying geographical phenomena that would not be apparent in tabular form. However, several critical challenges arise when visualizing and exploring these large spatiotemporal datasets. While, the underlying geographical component of the data lends itself well to univariate visualization in the form of traditional cartographic representations (e.g., choropleth, isopleth, dasymetric maps), as the data becomes multivariate, cartographic representations become more complex. To simplify the visual representations, analytical methods such as clustering and feature extraction are often applied as part of the classification phase. The automatic classification can then be rendered onto a map; however, one common issue in data classification is that items near a classification boundary are often mislabeled.

This thesis explores methods to augment the automated spatial classification by utilizing interactive machine learning as part of the cluster creation step. First, this thesis explores the design space for spatiotemporal analysis through the development of a comprehensive data wrangling and exploratory data analysis platform. Second, this system is augmented with a novel method for evaluating the visual impact of edge cases for multivariate geographic projections. Finally, system features and functionality are demonstrated through a series of case studies, with key features including similarity analysis, multivariate clustering, and novel visual support for cluster comparison.

ContributorsZhang, Yifan (Author) / Maciejewski, Ross (Thesis advisor) / Mack, Elizabeth (Committee member) / Liu, Huan (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)

Created2016

Improved, scalable, and personalized context recovery system: E-TweetSense

Description

Browsing Twitter users, or browsers, often find it increasingly cumbersome to attach meaning to tweets that are displayed on their timeline as they follow more and more users or pages. The tweets being browsed are created by Twitter users called originators, and are of some significance to the browser who…

Browsing Twitter users, or browsers, often find it increasingly cumbersome to attach meaning to tweets that are displayed on their timeline as they follow more and more users or pages. The tweets being browsed are created by Twitter users called originators, and are of some significance to the browser who has chosen to subscribe to the tweets from the originator by following the originator. Although, hashtags are used to tag tweets in an effort to attach context to the tweets, many tweets do not have a hashtag. Such tweets are called orphan tweets and they adversely affect the experience of a browser.

A hashtag is a type of label or meta-data tag used in social networks and micro-blogging services which makes it easier for users to find messages with a specific theme or content. The context of a tweet can be defined as a set of one or more hashtags. Users often do not use hashtags to tag their tweets. This leads to the problem of missing context for tweets. To address the problem of missing hashtags, a statistical method was proposed which predicts most likely hashtags based on the social circle of an originator.

In this thesis, we propose to improve on the existing context recovery system by selectively limiting the candidate set of hashtags to be derived from the intimate circle of the originator rather than from every user in the social network of the originator. This helps in reducing the computation, increasing speed of prediction, scaling the system to originators with large social networks while still preserving most of the accuracy of the predictions. We also propose to not only derive the candidate hashtags from the social network of the originator but also derive the candidate hashtags based on the content of the tweet. We further propose to learn personalized statistical models according to the adoption patterns of different originators. This helps in not only identifying the personalized candidate set of hashtags based on the social circle and content of the tweets but also in customizing the hashtag adoption pattern to the originator of the tweet.

ContributorsMallapura Umamaheshwar, Tejas (Author) / Kambhampati, Subbarao (Thesis advisor) / Liu, Huan (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)

Created2015

Real Time Cross Platform Collaboration Between Virtual Reality & Mixed Reality

Description

Virtual Reality (hereafter VR) and Mixed Reality (hereafter MR) have opened a new line of applications and possibilities. Amidst a vast network of potential applications, little research has been done to provide real time collaboration capability between users of VR and MR. The idea of this thesis study is to…

Virtual Reality (hereafter VR) and Mixed Reality (hereafter MR) have opened a new line of applications and possibilities. Amidst a vast network of potential applications, little research has been done to provide real time collaboration capability between users of VR and MR. The idea of this thesis study is to develop and test a real time collaboration system between VR and MR. The system works similar to a Google document where two or more users can see what others are doing i.e. writing, modifying, viewing, etc. Similarly, the system developed during this study will enable users in VR and MR to collaborate in real time.

The study of developing a real-time cross-platform collaboration system between VR and MR takes into consideration a scenario in which multiple device users are connected to a multiplayer network where they are guided to perform various tasks concurrently.

Usability testing was conducted to evaluate participant perceptions of the system. Users were required to assemble a chair in alternating turns; thereafter users were required to fill a survey and give an audio interview. Results collected from the participants showed positive feedback towards using VR and MR for collaboration. However, there are several limitations with the current generation of devices that hinder mass adoption. Devices with better performance factors will lead to wider adoption.

ContributorsSeth, Nayan Sateesh (Author) / Nelson, Brian (Thesis advisor) / Walker, Erin (Committee member) / Atkinson, Robert (Committee member) / Arizona State University (Publisher)

Created2017

Predicting demographic and financial attributes in a bank marketing dataset

Description

Bank institutions employ several marketing strategies to maximize new customer acquisition as well as current customer retention. Telemarketing is one such approach taken where individual customers are contacted by bank representatives with offers. These telemarketing strategies can be improved in combination with data mining techniques that allow predictability…

Bank institutions employ several marketing strategies to maximize new customer acquisition as well as current customer retention. Telemarketing is one such approach taken where individual customers are contacted by bank representatives with offers. These telemarketing strategies can be improved in combination with data mining techniques that allow predictability of customer information and interests. In this thesis, bank telemarketing data from a Portuguese banking institution were analyzed to determine predictability of several client demographic and financial attributes and find most contributing factors in each. Data were preprocessed to ensure quality, and then data mining models were generated for the attributes with logistic regression, support vector machine (SVM) and random forest using Orange as the data mining tool. Results were analyzed using precision, recall and F1 score.

ContributorsEjaz, Samira (Author) / Davulcu, Hasan (Thesis advisor) / Balasooriya, Janaka (Committee member) / Candan, Kasim (Committee member) / Arizona State University (Publisher)

Created2016

Visual Event Cueing in Linked Spatiotemporal Data

Description

The media disperses a large amount of information daily pertaining to political events social movements, and societal conflicts. Media pertaining to these topics, no matter the format of publication used, are framed a particular way. Framing is used not for just guiding audiences to desired beliefs, but also to fuel…

The media disperses a large amount of information daily pertaining to political events social movements, and societal conflicts. Media pertaining to these topics, no matter the format of publication used, are framed a particular way. Framing is used not for just guiding audiences to desired beliefs, but also to fuel societal change or legitimize/delegitimize social movements. For this reason, tools that can help to clarify when changes in social discourse occur and identify their causes are of great use. This thesis presents a visual analytics framework that allows for the exploration and visualization of changes that occur in social climate with respect to space and time. Focusing on the links between data from the Armed Conflict Location and Event Data Project (ACLED) and a streaming RSS news data set, users can be cued into interesting events enabling them to form and explore hypothesis. This visual analytics framework also focuses on improving intervention detection, allowing users to hypothesize about correlations between events and happiness levels, and supports collaborative analysis.

ContributorsSteptoe, Michael (Author) / Maciejewski, Ross (Thesis advisor) / Davulcu, Hasan (Committee member) / Corman, Steven (Committee member) / Arizona State University (Publisher)

Created2017

Efficient Node Proximity and Node Significance Computations in Graphs

Description

Node proximity measures are commonly used for quantifying how nearby or otherwise related to two or more nodes in a graph are. Node significance measures are mainly used to find how much nodes are important in a graph. The measures of node proximity/significance have been highly effective in many predictions…

Node proximity measures are commonly used for quantifying how nearby or otherwise related to two or more nodes in a graph are. Node significance measures are mainly used to find how much nodes are important in a graph. The measures of node proximity/significance have been highly effective in many predictions and applications. Despite their effectiveness, however, there are various shortcomings. One such shortcoming is a scalability problem due to their high computation costs on large size graphs and another problem on the measures is low accuracy when the significance of node and its degree in the graph are not related. The other problem is that their effectiveness is less when information for a graph is uncertain. For an uncertain graph, they require exponential computation costs to calculate ranking scores with considering all possible worlds.

In this thesis, I first introduce Locality-sensitive, Re-use promoting, approximate Personalized PageRank (LR-PPR) which is an approximate personalized PageRank calculating node rankings for the locality information for seeds without calculating the entire graph and reusing the precomputed locality information for different locality combinations. For the identification of locality information, I present Impact Neighborhood Indexing (INI) to find impact neighborhoods with nodes' fingerprints propagation on the network. For the accuracy challenge, I introduce Degree Decoupled PageRank (D2PR) technique to improve the effectiveness of PageRank based knowledge discovery, especially considering the significance of neighbors and degree of a given node. To tackle the uncertain challenge, I introduce Uncertain Personalized PageRank (UPPR) to approximately compute personalized PageRank values on uncertainties of edge existence and Interval Personalized PageRank with Integration (IPPR-I) and Interval Personalized PageRank with Mean (IPPR-M) to compute ranking scores for the case when uncertainty exists on edge weights as interval values.

ContributorsKim, Jung Hyun (Author) / Candan, K. Selcuk (Thesis advisor) / Davulcu, Hasan (Committee member) / Tong, Hanghang (Committee member) / Sapino, Maria Luisa (Committee member) / Arizona State University (Publisher)

Created2017

Story detection using generalized concepts

Description

A major challenge in automated text analysis is that different words are used for related concepts. Analyzing text at the surface level would treat related concepts (i.e. actors, actions, targets, and victims) as different objects, potentially missing common narrative patterns. Generalized concepts are used to overcome this problem. Generalization may…

A major challenge in automated text analysis is that different words are used for related concepts. Analyzing text at the surface level would treat related concepts (i.e. actors, actions, targets, and victims) as different objects, potentially missing common narrative patterns. Generalized concepts are used to overcome this problem. Generalization may result into word sense disambiguation failing to find similarity. This is addressed by taking into account contextual synonyms. Concept discovery based on contextual synonyms reveal information about the semantic roles of the words leading to concepts. Merger engine generalize the concepts so that it can be used as features in learning algorithms.

ContributorsKedia, Nitesh (Author) / Davulcu, Hasan (Thesis advisor) / Corman, Steve R (Committee member) / Li, Baoxin (Committee member) / Arizona State University (Publisher)

Created2015

Filtering by