Filtering by
- All Subjects: Machine Learning
- All Subjects: Cyberbullying
- All Subjects: Semantic computing
Background: Cyberbullying and cyber-victimization are rising problems and are associated with increased risk for mental health problems in children. Methods for addressing cyberbullying are limited, however, interventions focused on promoting appropriate parental mediation strategies are a promising solution supported by evidence and by guided by the Theory of Parenting Styles.
Objective: To provide an educational session to parents of middle school students that promotes effective methods of preventing and addressing cyberbullying incidents. Design: The educational sessions were provided to eight parents middle school student. Surveys to assess parent perception of and planned response to cyberbullying incidents and Parent Adolescent Communication Scale (PACS) scores were collected pre-presentation, post-presentation, and at one-month follow up.
Results: Data analysis of pre- and post-presentation PACS using a Wilcoxon test found no significant difference (Z = -.405, p >.05). There was not enough response to the 1-month follow-up to perform a data analysis on follow-up data.
Conclusions: Due to low attendance and participation in the follow-up survey the results of this project are limited. However, parents did appear to benefit from communicating concerns about cyberbullying with school officials. Future studies should examine if a school-wide anti-cyberbullying program that actively involves parents effects parental response to cyberbullying.
The main objective of this dissertation is to provide a systematic study of misinformation detection in social media. To tackle the challenges of adversarial attacks, I propose adaptive detection algorithms to deal with the active manipulations of misinformation spreaders via content and networks. To facilitate content-based approaches, I analyze the contextual data of misinformation and propose to incorporate the specific contextual patterns of misinformation into a principled detection framework. Considering its rapidly growing nature, I study how misinformation can be detected at an early stage. In particular, I focus on the challenge of data scarcity and propose a novel framework to enable historical data to be utilized for emerging incidents that are seemingly irrelevant. With misinformation being viral, applications that rely on social media data face the challenge of corrupted data. To this end, I present robust statistical relational learning and personalization algorithms to minimize the negative effect of misinformation.
Many video feature extraction algorithms have been purposed, such as STIP, HOG3D, and Dense Trajectories. These algorithms are often referred to as “handcrafted” features as they were deliberately designed based on some reasonable considerations. However, these algorithms may fail when dealing with high-level tasks or complex scene videos. Due to the success of using deep convolution neural networks (CNNs) to extract global representations for static images, researchers have been using similar techniques to tackle video contents. Typical techniques first extract spatial features by processing raw images using deep convolution architectures designed for static image classifications. Then simple average, concatenation or classifier-based fusion/pooling methods are applied to the extracted features. I argue that features extracted in such ways do not acquire enough representative information since videos, unlike images, should be characterized as a temporal sequence of semantically coherent visual contents and thus need to be represented in a manner considering both semantic and spatio-temporal information.
In this thesis, I propose a novel architecture to learn semantic spatio-temporal embedding for videos to support high-level video analysis. The proposed method encodes video spatial and temporal information separately by employing a deep architecture consisting of two channels of convolutional neural networks (capturing appearance and local motion) followed by their corresponding Fully Connected Gated Recurrent Unit (FC-GRU) encoders for capturing longer-term temporal structure of the CNN features. The resultant spatio-temporal representation (a vector) is used to learn a mapping via a Fully Connected Multilayer Perceptron (FC-MLP) to the word2vec semantic embedding space, leading to a semantic interpretation of the video vector that supports high-level analysis. I evaluate the usefulness and effectiveness of this new video representation by conducting experiments on action recognition, zero-shot video classification, and semantic video retrieval (word-to-video) retrieval, using the UCF101 action recognition dataset.
received increasing attention in recent years. The availability of sheer amounts of
user-generated data presents data scientists both opportunities and challenges. Opportunities are presented with additional data sources. The abundant link information
in social networks could provide another rich source in deriving implicit information
for social data mining. However, the vast majority of existing studies overwhelmingly
focus on positive links between users while negative links are also prevailing in real-
world social networks such as distrust relations in Epinions and foe links in Slashdot.
Though recent studies show that negative links have some added value over positive
links, it is dicult to directly employ them because of its distinct characteristics from
positive interactions. Another challenge is that label information is rather limited
in social media as the labeling process requires human attention and may be very
expensive. Hence, alternative criteria are needed to guide the learning process for
many tasks such as feature selection and sentiment analysis.
To address above-mentioned issues, I study two novel problems for signed social
networks mining, (1) unsupervised feature selection in signed social networks; and
(2) unsupervised sentiment analysis with signed social networks. To tackle the first problem, I propose a novel unsupervised feature selection framework SignedFS. In
particular, I model positive and negative links simultaneously for user preference
learning, and then embed the user preference learning into feature selection. To study the second problem, I incorporate explicit sentiment signals in textual terms and
implicit sentiment signals from signed social networks into a coherent model Signed-
Senti. Empirical experiments on real-world datasets corroborate the effectiveness of
these two frameworks on the tasks of feature selection and sentiment analysis.
Protecting users' privacy while preserving utility for user-generated data is a challenging task. The reason is that users generate different types of data such as Web browsing histories, user-item interactions, and textual information. This data is heterogeneous, unstructured, noisy, and inherently different from relational and tabular data and thus requires quantifying users' privacy and utility in each context separately. In this dissertation, I investigate four aspects of protecting user privacy for user-generated data. First, a novel adversarial technique is introduced to assay privacy risks in heterogeneous user-generated data. Second, a novel framework is proposed to boost users' privacy while retaining high utility for Web browsing histories. Third, a privacy-aware recommendation system is developed to protect privacy w.r.t. the rich user-item interaction data by recommending relevant and privacy-preserving items. Fourth, a privacy-preserving framework for text representation learning is presented to safeguard user-generated textual data as it can reveal private information.
To address the above mentioned challenges, in this dissertation I investigate the propagation of online malicious information from two broad perspectives: (1) content posted by users and (2) information cascades formed by resharing mechanisms in social media. More specifically, first, non-parametric and semi-supervised learning algorithms are introduced to discern potential patterns of human trafficking activities that are of high interest to law enforcement. Second, a time-decay causality-based framework is introduced for early detection of “Pathogenic Social Media (PSM)” accounts (e.g., terrorist supporters). Third, due to the lack of sufficient annotated data for training PSM detection approaches, a semi-supervised causal framework is proposed that utilizes causal-related attributes from unlabeled instances to compensate for the lack of enough labeled data. Fourth, a feature-driven approach for PSM detection is introduced that leverages different sets of attributes from users’ causal activities, account-level and content-related information as well as those from URLs shared by users.