Filtering by
- All Subjects: Machine Learning
- Resource Type: Text
This thesis develops a unique type of textual features that generalize triplets extracted from text, by clustering them into high-level concepts. These concepts are utilized as features to detect frames in text. Compared to uni-gram and bi-gram based models, classification and clustering using generalized concepts yield better discriminating features and a higher classification accuracy with a 12% boost (i.e. from 74% to 83% F-measure) and 0.91 clustering purity for Frame/Non-Frame detection.
The automatic discovery of complex causal chains among interlinked events and their participating actors has not yet been thoroughly studied. Previous studies related to extracting causal relationships from text were based on laborious and incomplete hand-developed lists of explicit causal verbs, such as “causes" and “results in." Such approaches result in limited recall because standard causal verbs may not generalize well to accommodate surface variations in texts when different keywords and phrases are used to express similar causal effects. Therefore, I present a system that utilizes generalized concepts to extract causal relationships. The proposed algorithms overcome surface variations in written expressions of causal relationships and discover the domino effects between climate events and human security. This semi-supervised approach alleviates the need for labor intensive keyword list development and annotated datasets. Experimental evaluations by domain experts achieve an average precision of 82%. Qualitative assessments of causal chains show that results are consistent with the 2014 IPCC report illuminating causal mechanisms underlying the linkages between climatic stresses and social instability.
Despite this uniqueness, almost no scientific work has been performed on this public social network. Thus, it is unclear what user interaction features present on other social networks exist on Twitch. Investigating the interactions between users and identifying which, if any, of the common user behaviors on social network exist on Twitch is an important step in understanding how Twitch fits in to the social media ecosystem. For example, there are users that have large followings on Twitch and amass a large number of viewers, but do those users exert influence over the behavior of other user the way that popular users on Twitter do?
This task, however, will not be trivial. The same hyper-focus on live content that makes Twitch unique in the social network space invalidates many of the traditional approaches to social network analysis. Thus, new algorithms and techniques must be developed in order to tap this data source. In this thesis, a novel algorithm for finding games whose releases have made a significant impact on the network is described as well as a novel algorithm for detecting and identifying influential players of games. In addition, the Twitch network is described in detail along with the data that was collected in order to power the two previously described algorithms.
Social media data opens the door to interdisciplinary research and allows one to collectively study large-scale human behaviors otherwise impossible. For example, user engagements over information such as news articles, including posting about, commenting on, or recommending the news on social media, contain abundant rich information. Since social media data is big, incomplete, noisy, unstructured, with abundant social relations, solely relying on user engagements can be sensitive to noisy user feedback. To alleviate the problem of limited labeled data, it is important to combine contents and this new (but weak) type of information as supervision signals, i.e., weak social supervision, to advance fake news detection.
The goal of this dissertation is to understand disinformation by proposing and exploiting weak social supervision for learning with little labeled data and effectively detect disinformation via innovative research and novel computational methods. In particular, I investigate learning with weak social supervision for understanding disinformation with the following computational tasks: bringing the heterogeneous social context as auxiliary information for effective fake news detection; discovering explanations of fake news from social media for explainable fake news detection; modeling multi-source of weak social supervision for early fake news detection; and transferring knowledge across domains with adversarial machine learning for cross-domain fake news detection. The findings of the dissertation significantly expand the boundaries of disinformation research and establish a novel paradigm of learning with weak social supervision that has important implications in broad applications in social media.
The focus of my honors thesis is to find ways to use deep learning in tandem with tools in statistical mechanics to derive new ways to solve problems in biophysics. More specifically, I’ve been interested in finding transition pathways between two known states of a biomolecule. This is because understanding the mechanisms in which proteins fold and ligands bind is crucial to creating new medicines and understanding biological processes. In this thesis, I work with individuals in the Singharoy lab to develop a formulation to utilize reinforcement learning and sampling-based robotics planning to derive low free energy transition pathways between two known states. Our formulation uses Jarzynski’s equality and the stiff-spring approximation to obtain point estimates of energy, and construct an informed path search with atomistic resolution. At the core of this framework, is our first ever attempt we use a policy driven adaptive steered molecular dynamics (SMD) to control our molecular dynamics simulations. We show that both the reinforcement learning (RL) and robotics planning realization of the RL-guided framework can solve for pathways on toy analytical surfaces and alanine dipeptide.