Matching Items (9)
Filtering by

Clear all filters

151605-Thumbnail Image.png
Description
In most social networking websites, users are allowed to perform interactive activities. One of the fundamental features that these sites provide is to connecting with users of their kind. On one hand, this activity makes online connections visible and tangible; on the other hand, it enables the exploration of our

In most social networking websites, users are allowed to perform interactive activities. One of the fundamental features that these sites provide is to connecting with users of their kind. On one hand, this activity makes online connections visible and tangible; on the other hand, it enables the exploration of our connections and the expansion of our social networks easier. The aggregation of people who share common interests forms social groups, which are fundamental parts of our social lives. Social behavioral analysis at a group level is an active research area and attracts many interests from the industry. Challenges of my work mainly arise from the scale and complexity of user generated behavioral data. The multiple types of interactions, highly dynamic nature of social networking and the volatile user behavior suggest that these data are complex and big in general. Effective and efficient approaches are required to analyze and interpret such data. My work provide effective channels to help connect the like-minded and, furthermore, understand user behavior at a group level. The contributions of this dissertation are in threefold: (1) proposing novel representation of collective tagging knowledge via tag networks; (2) proposing the new information spreader identification problem in egocentric soical networks; (3) defining group profiling as a systematic approach to understanding social groups. In sum, the research proposes novel concepts and approaches for connecting the like-minded, enables the understanding of user groups, and exposes interesting research opportunities.
ContributorsWang, Xufei (Author) / Liu, Huan (Thesis advisor) / Kambhampati, Subbarao (Committee member) / Sundaram, Hari (Committee member) / Ye, Jieping (Committee member) / Arizona State University (Publisher)
Created2013
153339-Thumbnail Image.png
Description
A myriad of social media services are emerging in recent years that allow people to communicate and express themselves conveniently and easily. The pervasive use of social media generates massive data at an unprecedented rate. It becomes increasingly difficult for online users to find relevant information or, in other words,

A myriad of social media services are emerging in recent years that allow people to communicate and express themselves conveniently and easily. The pervasive use of social media generates massive data at an unprecedented rate. It becomes increasingly difficult for online users to find relevant information or, in other words, exacerbates the information overload problem. Meanwhile, users in social media can be both passive content consumers and active content producers, causing the quality of user-generated content can vary dramatically from excellence to abuse or spam, which results in a problem of information credibility. Trust, providing evidence about with whom users can trust to share information and from whom users can accept information without additional verification, plays a crucial role in helping online users collect relevant and reliable information. It has been proven to be an effective way to mitigate information overload and credibility problems and has attracted increasing attention.

As the conceptual counterpart of trust, distrust could be as important as trust and its value has been widely recognized by social sciences in the physical world. However, little attention is paid on distrust in social media. Social media differs from the physical world - (1) its data is passively observed, large-scale, incomplete, noisy and embedded with rich heterogeneous sources; and (2) distrust is generally unavailable in social media. These unique properties of social media present novel challenges for computing distrust in social media: (1) passively observed social media data does not provide necessary information social scientists use to understand distrust, how can I understand distrust in social media? (2) distrust is usually invisible in social media, how can I make invisible distrust visible by leveraging unique properties of social media data? and (3) little is known about distrust and its role in social media applications, how can distrust help make difference in social media applications?

The chief objective of this dissertation is to figure out solutions to these challenges via innovative research and novel methods. In particular, computational tasks are designed to {\it understand distrust}, a innovative task, i.e., {\it predicting distrust} is proposed with novel frameworks to make invisible distrust visible, and principled approaches are develop to {\it apply distrust} in social media applications. Since distrust is a special type of negative links, I demonstrate the generalization of properties and algorithms of distrust to negative links, i.e., {\it generalizing findings of distrust}, which greatly expands the boundaries of research of distrust and largely broadens its applications in social media.
ContributorsTang, Jiliang (Author) / Liu, Huan (Thesis advisor) / Xue, Guoliang (Committee member) / Ye, Jieping (Committee member) / Aggarwal, Charu (Committee member) / Arizona State University (Publisher)
Created2015
153478-Thumbnail Image.png
Description
US Senate is the venue of political debates where the federal bills are formed and voted. Senators show their support/opposition along the bills with their votes. This information makes it possible to extract the polarity of the senators. Similarly, blogosphere plays an increasingly important role as a forum for public

US Senate is the venue of political debates where the federal bills are formed and voted. Senators show their support/opposition along the bills with their votes. This information makes it possible to extract the polarity of the senators. Similarly, blogosphere plays an increasingly important role as a forum for public debate. Authors display sentiment toward issues, organizations or people using a natural language.

In this research, given a mixed set of senators/blogs debating on a set of political issues from opposing camps, I use signed bipartite graphs for modeling debates, and I propose an algorithm for partitioning both the opinion holders (senators or blogs) and the issues (bills or topics) comprising the debate into binary opposing camps. Simultaneously, my algorithm scales the entities on a univariate scale. Using this scale, a researcher can identify moderate and extreme senators/blogs within each camp, and polarizing versus unifying issues. Through performance evaluations I show that my proposed algorithm provides an effective solution to the problem, and performs much better than existing baseline algorithms adapted to solve this new problem. In my experiments, I used both real data from political blogosphere and US Congress records, as well as synthetic data which were obtained by varying polarization and degree distribution of the vertices of the graph to show the robustness of my algorithm.

I also applied my algorithm on all the terms of the US Senate to the date for longitudinal analysis and developed a web based interactive user interface www.PartisanScale.com to visualize the analysis.

US politics is most often polarized with respect to the left/right alignment of the entities. However, certain issues do not reflect the polarization due to political parties, but observe a split correlating to the demographics of the senators, or simply receive consensus. I propose a hierarchical clustering algorithm that identifies groups of bills that share the same polarization characteristics. I developed a web based interactive user interface www.ControversyAnalysis.com to visualize the clusters while providing a synopsis through distribution charts, word clouds, and heat maps.
ContributorsGokalp, Sedat (Author) / Davulcu, Hasan (Thesis advisor) / Sen, Arunabha (Committee member) / Liu, Huan (Committee member) / Woodward, Mark (Committee member) / Arizona State University (Publisher)
Created2015
153140-Thumbnail Image.png
Description
The rapid urban expansion has greatly extended the physical boundary of our living area, along with a large number of POIs (points of interest) being developed. A POI is a specific location (e.g., hotel, restaurant, theater, mall) that a user may find useful or interesting. When exploring the city and

The rapid urban expansion has greatly extended the physical boundary of our living area, along with a large number of POIs (points of interest) being developed. A POI is a specific location (e.g., hotel, restaurant, theater, mall) that a user may find useful or interesting. When exploring the city and neighborhood, the increasing number of POIs could enrich people's daily life, providing them with more choices of life experience than before, while at the same time also brings the problem of "curse of choices", resulting in the difficulty for a user to make a satisfied decision on "where to go" in an efficient way. Personalized POI recommendation is a task proposed on purpose of helping users filter out uninteresting POIs and reduce time in decision making, which could also benefit virtual marketing.

Developing POI recommender systems requires observation of human mobility w.r.t. real-world POIs, which is infeasible with traditional mobile data. However, the recent development of location-based social networks (LBSNs) provides such observation. Typical location-based social networking sites allow users to "check in" at POIs with smartphones, leave tips and share that experience with their online friends. The increasing number of LBSN users has generated large amounts of LBSN data, providing an unprecedented opportunity to study human mobility for personalized POI recommendation in spatial, temporal, social, and content aspects.

Different from recommender systems in other categories, e.g., movie recommendation in NetFlix, friend recommendation in dating websites, item recommendation in online shopping sites, personalized POI recommendation on LBSNs has its unique challenges due to the stochastic property of human mobility and the mobile behavior indications provided by LBSN information layout. The strong correlations between geographical POI information and other LBSN information result in three major human mobile properties, i.e., geo-social correlations, geo-temporal patterns, and geo-content indications, which are neither observed in other recommender systems, nor exploited in current POI recommendation. In this dissertation, we investigate these properties on LBSNs, and propose personalized POI recommendation models accordingly. The performance evaluated on real-world LBSN datasets validates the power of these properties in capturing user mobility, and demonstrates the ability of our models for personalized POI recommendation.
ContributorsGao, Huiji (Author) / Liu, Huan (Thesis advisor) / Xue, Guoliang (Committee member) / Ye, Jieping (Committee member) / Caverlee, James (Committee member) / Arizona State University (Publisher)
Created2014
153269-Thumbnail Image.png
Description
Social media platforms such as Twitter, Facebook, and blogs have emerged as valuable

- in fact, the de facto - virtual town halls for people to discover, report, share and

communicate with others about various types of events. These events range from

widely-known events such as the U.S Presidential debate to smaller scale,

Social media platforms such as Twitter, Facebook, and blogs have emerged as valuable

- in fact, the de facto - virtual town halls for people to discover, report, share and

communicate with others about various types of events. These events range from

widely-known events such as the U.S Presidential debate to smaller scale, local events

such as a local Halloween block party. During these events, we often witness a large

amount of commentary contributed by crowds on social media. This burst of social

media responses surges with the "second-screen" behavior and greatly enriches the

user experience when interacting with the event and people's awareness of an event.

Monitoring and analyzing this rich and continuous flow of user-generated content can

yield unprecedentedly valuable information about the event, since these responses

usually offer far more rich and powerful views about the event that mainstream news

simply could not achieve. Despite these benefits, social media also tends to be noisy,

chaotic, and overwhelming, posing challenges to users in seeking and distilling high

quality content from that noise.

In this dissertation, I explore ways to leverage social media as a source of information and analyze events based on their social media responses collectively. I develop, implement and evaluate EventRadar, an event analysis toolbox which is able to identify, enrich, and characterize events using the massive amounts of social media responses. EventRadar contains three automated, scalable tools to handle three core event analysis tasks: Event Characterization, Event Recognition, and Event Enrichment. More specifically, I develop ET-LDA, a Bayesian model and SocSent, a matrix factorization framework for handling the Event Characterization task, i.e., modeling characterizing an event in terms of its topics and its audience's response behavior (via ET-LDA), and the sentiments regarding its topics (via SocSent). I also develop DeMa, an unsupervised event detection algorithm for handling the Event Recognition task, i.e., detecting trending events from a stream of noisy social media posts. Last, I develop CrowdX, a spatial crowdsourcing system for handling the Event Enrichment task, i.e., gathering additional first hand information (e.g., photos) from the field to enrich the given event's context.

Enabled by EventRadar, it is more feasible to uncover patterns that have not been

explored previously and re-validating existing social theories with new evidence. As a

result, I am able to gain deep insights into how people respond to the event that they

are engaged in. The results reveal several key insights into people's various responding

behavior over the event's timeline such the topical context of people's tweets does not

always correlate with the timeline of the event. In addition, I also explore the factors

that affect a person's engagement with real-world events on Twitter and find that

people engage in an event because they are interested in the topics pertaining to

that event; and while engaging, their engagement is largely affected by their friends'

behavior.
ContributorsHu, Yuheng (Author) / Kambhampati, Subbarao (Thesis advisor) / Horvitz, Eric (Committee member) / Krumm, John (Committee member) / Liu, Huan (Committee member) / Sundaram, Hari (Committee member) / Arizona State University (Publisher)
Created2014
149714-Thumbnail Image.png
Description
This thesis deals with the analysis of interpersonal communication dynamics in online social networks and social media. Our central hypothesis is that communication dynamics between individuals manifest themselves via three key aspects: the information that is the content of communication, the social engagement i.e. the sociological framework emergent of the

This thesis deals with the analysis of interpersonal communication dynamics in online social networks and social media. Our central hypothesis is that communication dynamics between individuals manifest themselves via three key aspects: the information that is the content of communication, the social engagement i.e. the sociological framework emergent of the communication process, and the channel i.e. the media via which communication takes place. Communication dynamics have been of interest to researchers from multi-faceted domains over the past several decades. However, today we are faced with several modern capabilities encompassing a host of social media websites. These sites feature variegated interactional affordances, ranging from blogging, micro-blogging, sharing media elements as well as a rich set of social actions such as tagging, voting, commenting and so on. Consequently, these communication tools have begun to redefine the ways in which we exchange information, our modes of social engagement, and mechanisms of how the media characteristics impact our interactional behavior. The outcomes of this research are manifold. We present our contributions in three parts, corresponding to the three key organizing ideas. First, we have observed that user context is key to characterizing communication between a pair of individuals. However interestingly, the probability of future communication seems to be more sensitive to the context compared to the delay, which appears to be rather habitual. Further, we observe that diffusion of social actions in a network can be indicative of future information cascades; that might be attributed to social influence or homophily depending on the nature of the social action. Second, we have observed that different modes of social engagement lead to evolution of groups that have considerable predictive capability in characterizing external-world temporal occurrences, such as stock market dynamics as well as collective political sentiments. Finally, characterization of communication on rich media sites have shown that conversations that are deemed "interesting" appear to have consequential impact on the properties of the social network they are associated with: in terms of degree of participation of the individuals in future conversations, thematic diffusion as well as emergent cohesiveness in activity among the concerned participants in the network. Based on all these outcomes, we believe that this research can make significant contribution into a better understanding of how we communicate online and how it is redefining our collective sociological behavior.
ContributorsDe Choudhury, Munmun (Author) / Sundaram, Hari (Thesis advisor) / Candan, K. Selcuk (Committee member) / Liu, Huan (Committee member) / Watts, Duncan J. (Committee member) / Seligmann, Doree D. (Committee member) / Arizona State University (Publisher)
Created2011
156735-Thumbnail Image.png
Description
The popularity of social media has generated abundant large-scale social networks, which advances research on network analytics. Good representations of nodes in a network can facilitate many network mining tasks. The goal of network representation learning (network embedding) is to learn low-dimensional vector representations of social network nodes that capture

The popularity of social media has generated abundant large-scale social networks, which advances research on network analytics. Good representations of nodes in a network can facilitate many network mining tasks. The goal of network representation learning (network embedding) is to learn low-dimensional vector representations of social network nodes that capture certain properties of the networks. With the learned node representations, machine learning and data mining algorithms can be applied for network mining tasks such as link prediction and node classification. Because of its ability to learn good node representations, network representation learning is attracting increasing attention and various network embedding algorithms are proposed.

Despite the success of these network embedding methods, the majority of them are dedicated to static plain networks, i.e., networks with fixed nodes and links only; while in social media, networks can present in various formats, such as attributed networks, signed networks, dynamic networks and heterogeneous networks. These social networks contain abundant rich information to alleviate the network sparsity problem and can help learn a better network representation; while plain network embedding approaches cannot tackle such networks. For example, signed social networks can have both positive and negative links. Recent study on signed networks shows that negative links have added value in addition to positive links for many tasks such as link prediction and node classification. However, the existence of negative links challenges the principles used for plain network embedding. Thus, it is important to study signed network embedding. Furthermore, social networks can be dynamic, where new nodes and links can be introduced anytime. Dynamic networks can reveal the concept drift of a user and require efficiently updating the representation when new links or users are introduced. However, static network embedding algorithms cannot deal with dynamic networks. Therefore, it is important and challenging to propose novel algorithms for tackling different types of social networks.

In this dissertation, we investigate network representation learning in social media. In particular, we study representative social networks, which includes attributed network, signed networks, dynamic networks and document networks. We propose novel frameworks to tackle the challenges of these networks and learn representations that not only capture the network structure but also the unique properties of these social networks.
ContributorsWang, Suhang (Author) / Liu, Huan (Thesis advisor) / Aggarwal, Charu (Committee member) / Sen, Arunabha (Committee member) / Tong, Hanghang (Committee member) / Arizona State University (Publisher)
Created2018
157057-Thumbnail Image.png
Description
The pervasive use of social media gives it a crucial role in helping the public perceive reliable information. Meanwhile, the openness and timeliness of social networking sites also allow for the rapid creation and dissemination of misinformation. It becomes increasingly difficult for online users to find accurate and trustworthy information.

The pervasive use of social media gives it a crucial role in helping the public perceive reliable information. Meanwhile, the openness and timeliness of social networking sites also allow for the rapid creation and dissemination of misinformation. It becomes increasingly difficult for online users to find accurate and trustworthy information. As witnessed in recent incidents of misinformation, it escalates quickly and can impact social media users with undesirable consequences and wreak havoc instantaneously. Different from some existing research in psychology and social sciences about misinformation, social media platforms pose unprecedented challenges for misinformation detection. First, intentional spreaders of misinformation will actively disguise themselves. Second, content of misinformation may be manipulated to avoid being detected, while abundant contextual information may play a vital role in detecting it. Third, not only accuracy, earliness of a detection method is also important in containing misinformation from being viral. Fourth, social media platforms have been used as a fundamental data source for various disciplines, and these research may have been conducted in the presence of misinformation. To tackle the challenges, we focus on developing machine learning algorithms that are robust to adversarial manipulation and data scarcity.

The main objective of this dissertation is to provide a systematic study of misinformation detection in social media. To tackle the challenges of adversarial attacks, I propose adaptive detection algorithms to deal with the active manipulations of misinformation spreaders via content and networks. To facilitate content-based approaches, I analyze the contextual data of misinformation and propose to incorporate the specific contextual patterns of misinformation into a principled detection framework. Considering its rapidly growing nature, I study how misinformation can be detected at an early stage. In particular, I focus on the challenge of data scarcity and propose a novel framework to enable historical data to be utilized for emerging incidents that are seemingly irrelevant. With misinformation being viral, applications that rely on social media data face the challenge of corrupted data. To this end, I present robust statistical relational learning and personalization algorithms to minimize the negative effect of misinformation.
ContributorsWu, Liang (Author) / Liu, Huan (Thesis advisor) / Tong, Hanghang (Committee member) / Doupe, Adam (Committee member) / Davison, Brian D. (Committee member) / Arizona State University (Publisher)
Created2019
155252-Thumbnail Image.png
Description
Due to vast resources brought by social media services, social data mining has

received increasing attention in recent years. The availability of sheer amounts of

user-generated data presents data scientists both opportunities and challenges. Opportunities are presented with additional data sources. The abundant link information

in social networks could provide another rich source

Due to vast resources brought by social media services, social data mining has

received increasing attention in recent years. The availability of sheer amounts of

user-generated data presents data scientists both opportunities and challenges. Opportunities are presented with additional data sources. The abundant link information

in social networks could provide another rich source in deriving implicit information

for social data mining. However, the vast majority of existing studies overwhelmingly

focus on positive links between users while negative links are also prevailing in real-

world social networks such as distrust relations in Epinions and foe links in Slashdot.

Though recent studies show that negative links have some added value over positive

links, it is dicult to directly employ them because of its distinct characteristics from

positive interactions. Another challenge is that label information is rather limited

in social media as the labeling process requires human attention and may be very

expensive. Hence, alternative criteria are needed to guide the learning process for

many tasks such as feature selection and sentiment analysis.

To address above-mentioned issues, I study two novel problems for signed social

networks mining, (1) unsupervised feature selection in signed social networks; and

(2) unsupervised sentiment analysis with signed social networks. To tackle the first problem, I propose a novel unsupervised feature selection framework SignedFS. In

particular, I model positive and negative links simultaneously for user preference

learning, and then embed the user preference learning into feature selection. To study the second problem, I incorporate explicit sentiment signals in textual terms and

implicit sentiment signals from signed social networks into a coherent model Signed-

Senti. Empirical experiments on real-world datasets corroborate the effectiveness of

these two frameworks on the tasks of feature selection and sentiment analysis.
ContributorsCheng, Kewei (Author) / Liu, Huan (Thesis advisor) / Tong, Hanghang (Committee member) / Baral, Chitta (Committee member) / Arizona State University (Publisher)
Created2017