Matching Items (5)

155252-Thumbnail Image.png

Mining signed social networks using unsupervised learning algorithms

Description

Due to vast resources brought by social media services, social data mining has

received increasing attention in recent years. The availability of sheer amounts of

user-generated data presents data scientists both opportunities

Due to vast resources brought by social media services, social data mining has

received increasing attention in recent years. The availability of sheer amounts of

user-generated data presents data scientists both opportunities and challenges. Opportunities are presented with additional data sources. The abundant link information

in social networks could provide another rich source in deriving implicit information

for social data mining. However, the vast majority of existing studies overwhelmingly

focus on positive links between users while negative links are also prevailing in real-

world social networks such as distrust relations in Epinions and foe links in Slashdot.

Though recent studies show that negative links have some added value over positive

links, it is dicult to directly employ them because of its distinct characteristics from

positive interactions. Another challenge is that label information is rather limited

in social media as the labeling process requires human attention and may be very

expensive. Hence, alternative criteria are needed to guide the learning process for

many tasks such as feature selection and sentiment analysis.

To address above-mentioned issues, I study two novel problems for signed social

networks mining, (1) unsupervised feature selection in signed social networks; and

(2) unsupervised sentiment analysis with signed social networks. To tackle the first problem, I propose a novel unsupervised feature selection framework SignedFS. In

particular, I model positive and negative links simultaneously for user preference

learning, and then embed the user preference learning into feature selection. To study the second problem, I incorporate explicit sentiment signals in textual terms and

implicit sentiment signals from signed social networks into a coherent model Signed-

Senti. Empirical experiments on real-world datasets corroborate the effectiveness of

these two frameworks on the tasks of feature selection and sentiment analysis.

Contributors

Agent

Created

Date Created
  • 2017

157810-Thumbnail Image.png

Three Facets of Online Political Networks: Communities, Antagonisms, and Polarization

Description

Millions of users leave digital traces of their political engagements on social media platforms every day. Users form networks of interactions, produce textual content, like and share each others' content.

Millions of users leave digital traces of their political engagements on social media platforms every day. Users form networks of interactions, produce textual content, like and share each others' content. This creates an invaluable opportunity to better understand the political engagements of internet users. In this proposal, I present three algorithmic solutions to three facets of online political networks; namely, detection of communities, antagonisms and the impact of certain types of accounts on political polarization. First, I develop a multi-view community detection algorithm to find politically pure communities. I find that word usage among other content types (i.e. hashtags, URLs) complement user interactions the best in accurately detecting communities.

Second, I focus on detecting negative linkages between politically motivated social media users. Major social media platforms do not facilitate their users with built-in negative interaction options. However, many political network analysis tasks rely on not only positive but also negative linkages. Here, I present the SocLSFact framework to detect negative linkages among social media users. It utilizes three pieces of information; sentiment cues of textual interactions, positive interactions, and socially balanced triads. I evaluate the contribution of each three aspects in negative link detection performance on multiple tasks.

Third, I propose an experimental setup that quantifies the polarization impact of automated accounts on Twitter retweet networks. I focus on a dataset of tragic Parkland shooting event and its aftermath. I show that when automated accounts are removed from the retweet network the network polarization decrease significantly, while a same number of accounts to the automated accounts are removed randomly the difference is not significant. I also find that prominent predictors of engagement of automatically generated content is not very different than what previous studies point out in general engaging content on social media. Last but not least, I identify accounts which self-disclose their automated nature in their profile by using expressions such as bot, chat-bot, or robot. I find that human engagement to self-disclosing accounts compared to non-disclosing automated accounts is much smaller. This observational finding can motivate further efforts into automated account detection research to prevent their unintended impact.

Contributors

Agent

Created

Date Created
  • 2019

157308-Thumbnail Image.png

Image-based process monitoring via generative adversarial autoencoder with applications to rolling defect detection

Description

Image-based process monitoring has recently attracted increasing attention due to the advancement of the sensing technologies. However, existing process monitoring methods fail to fully utilize the spatial information of images

Image-based process monitoring has recently attracted increasing attention due to the advancement of the sensing technologies. However, existing process monitoring methods fail to fully utilize the spatial information of images due to their complex characteristics including the high dimensionality and complex spatial structures. Recent advancement of the unsupervised deep models such as a generative adversarial network (GAN) and generative adversarial autoencoder (AAE) has enabled to learn the complex spatial structures automatically. Inspired by this advancement, we propose an anomaly detection framework based on the AAE for unsupervised anomaly detection for images. AAE combines the power of GAN with the variational autoencoder, which serves as a nonlinear dimension reduction technique with regularization from the discriminator. Based on this, we propose a monitoring statistic efficiently capturing the change of the image data. The performance of the proposed AAE-based anomaly detection algorithm is validated through a simulation study and real case study for rolling defect detection.

Contributors

Agent

Created

Date Created
  • 2019

157818-Thumbnail Image.png

Unsupervised Attributed Graph Learning: Models and Applications

Description

Graph is a ubiquitous data structure, which appears in a broad range of real-world scenarios. Accordingly, there has been a surge of research to represent and learn from graphs in

Graph is a ubiquitous data structure, which appears in a broad range of real-world scenarios. Accordingly, there has been a surge of research to represent and learn from graphs in order to accomplish various machine learning and graph analysis tasks. However, most of these efforts only utilize the graph structure while nodes in real-world graphs usually come with a rich set of attributes. Typical examples of such nodes and their attributes are users and their profiles in social networks, scientific articles and their content in citation networks, protein molecules and their gene sets in biological networks as well as web pages and their content on the Web. Utilizing node features in such graphs---attributed graphs---can alleviate the graph sparsity problem and help explain various phenomena (e.g., the motives behind the formation of communities in social networks). Therefore, further study of attributed graphs is required to take full advantage of node attributes.

In the wild, attributed graphs are usually unlabeled. Moreover, annotating data is an expensive and time-consuming process, which suffers from many limitations such as annotators’ subjectivity, reproducibility, and consistency. The challenges of data annotation and the growing increase of unlabeled attributed graphs in various real-world applications significantly demand unsupervised learning for attributed graphs.

In this dissertation, I propose a set of novel models to learn from attributed graphs in an unsupervised manner. To better understand and represent nodes and communities in attributed graphs, I present different models in node and community levels. In node level, I utilize node features as well as the graph structure in attributed graphs to learn distributed representations of nodes, which can be useful in a variety of downstream machine learning applications. In community level, with a focus on social media, I take advantage of both node attributes and the graph structure to discover not only communities but also their sentiment-driven profiles and inter-community relations (i.e., alliance, antagonism, or no relation). The discovered community profiles and relations help to better understand the structure and dynamics of social media.

Contributors

Agent

Created

Date Created
  • 2019

154762-Thumbnail Image.png

Continuous assessment in agile learning using visualizations and clustering of activity data to analyze student behavior

Description

Software engineering education today is a technologically advanced and rapidly evolving discipline. Being a discipline where students not only design but also build new technology, it is important that they

Software engineering education today is a technologically advanced and rapidly evolving discipline. Being a discipline where students not only design but also build new technology, it is important that they receive a hands on learning experience in the form of project based courses. To maximize the learning benefit, students must conduct project-based learning activities in a consistent rhythm, or cadence. Project-based courses that are augmented with a system of frequent, formative feedback helps students constantly evaluate their progress and leads them away from a deadline driven approach to learning.

One aspect of this research is focused on evaluating the use of a tool that tracks student activity as a means of providing frequent, formative feedback. This thesis measures the impact of the tool on student compliance to the learning process. A personalized dashboard with quasi real time visual reports and notifications are provided to undergraduate and graduate software engineering students. The impact of these visual reports on compliance is measured using the log traces of dashboard activity and a survey instrument given multiple times during the course.

A second aspect of this research is the application of learning analytics to understand patterns of student compliance. This research employs unsupervised machine learning algorithms to identify unique patterns of student behavior observed in the context of a project-based course. Analyzing and labeling these unique patterns of behavior can help instructors understand typical student characteristics. Further, understanding these behavioral patterns can assist an instructor in making timely, targeted interventions. In this research, datasets comprising of student’s daily activity and graded scores from an under graduate software engineering course is utilized for the purpose of identifying unique patterns of student behavior.

Contributors

Agent

Created

Date Created
  • 2016