Matching Items (7)
Description
The increasing popularity of Twitter makes improved assessment of the trustworthiness and relevance of tweets much more important for search. However, given the limits on tweet length, it is hard to extract ranking measures from a tweet's content alone. I propose a method of ranking tweets by generating a reputation score for each tweet that is based not just on content, but also on additional information from the Twitter ecosystem, which consists of users, tweets, and the web pages that tweets link to. This information is obtained by modeling the Twitter ecosystem as a three-layer graph. The reputation score is used to power two novel methods of ranking tweets by propagating the reputation over an agreement graph based on tweets' content similarity. Additionally, I show how the agreement graph helps counter tweet spam. An evaluation of my method on 16 million tweets from the TREC 2011 Microblog Dataset shows that it doubles the precision over baseline Twitter Search and achieves higher precision than the current state-of-the-art method. I present a detailed internal empirical evaluation of RAProp in comparison to several alternative approaches proposed by me, as well as an external evaluation in comparison to the current state-of-the-art method.
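The idea of propagating reputation over an agreement graph can be illustrated with a minimal sketch. This is not the thesis's implementation: the bag-of-words cosine similarity, the single propagation step, the `threshold` parameter, and the function names are all assumptions made for illustration. The initial reputation values stand in for scores that the thesis derives from the three-layer graph.

```python
from collections import Counter
from math import sqrt


def cosine(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def propagate(tweets, reputation, threshold=0.3):
    """One propagation step over a hypothetical agreement graph.

    Two tweets are linked when their content similarity reaches `threshold`.
    Each tweet keeps its own reputation and gains the similarity-weighted
    reputation of the tweets that agree with it.
    """
    scores = []
    for i, text in enumerate(tweets):
        score = reputation[i]
        for j, other in enumerate(tweets):
            if i == j:
                continue
            sim = cosine(text, other)
            if sim >= threshold:
                score += sim * reputation[j]
        scores.append(score)
    return scores


def rank(tweets, reputation, threshold=0.3):
    """Return tweet indices ordered by propagated score, best first."""
    scores = propagate(tweets, reputation, threshold)
    return sorted(range(len(tweets)), key=lambda i: scores[i], reverse=True)
```

In this sketch, a tweet that no other tweet agrees with receives no propagated reputation, so an isolated message with a high initial score can still fall below two mutually corroborating tweets. That is one simple way an agreement graph could dampen spam.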
Contributors: Ravikumar, Srijith (Author) / Kambhampati, Subbarao (Thesis advisor) / Davulcu, Hasan (Committee member) / Liu, Huan (Committee member) / Arizona State University (Publisher)
Created: 2013
Description
Twitter is a micro-blogging platform where users can be social, informational, or both. In certain cases, users generate tweets that have no "hashtags" or "@mentions"; we call such a tweet an orphaned tweet. A user who sees an orphaned tweet will want more "context" for it, presumably to engage with his/her friend on that topic. Finding context for an orphaned tweet manually is challenging because of the large social graph of a user, the enormous volume of tweets generated per second, topic diversity, and the limited information in a tweet of at most 140 characters. To help the user get the context of an orphaned tweet, this thesis builds a hashtag recommendation system called TweetSense, which suggests hashtags as context or metadata for orphaned tweets. This in turn would increase users' social engagement and help Twitter maintain its monthly active users. In contrast to existing systems, this hashtag recommendation system recommends personalized hashtags by exploiting the social signals of users on Twitter. The novelty of the system is its emphasis on selecting a suitable candidate set of hashtags from the related tweets in the user's social graph (timeline). The system then ranks the candidates based on a combination of feature scores computed from tweet-related and user-related features. It is evaluated on its ability to predict suitable hashtags for a random sample of tweets whose existing hashtags are deliberately removed for evaluation. I present a detailed internal empirical evaluation of TweetSense, as well as an external evaluation in comparison with the current state-of-the-art method.
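The two-stage idea described above (collect candidate hashtags from timeline tweets, then rank them by combined tweet and user features) can be sketched as follows. This is an illustrative sketch only, not TweetSense itself: the Jaccard text similarity, the single `affinity` value standing in for user-related features, and the weights `w_text` and `w_user` are all hypothetical.

```python
from collections import defaultdict


def jaccard(a: str, b: str) -> float:
    """Word-overlap (Jaccard) similarity between two texts."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def recommend(orphan, timeline, k=3, w_text=0.7, w_user=0.3):
    """Suggest up to k hashtags for an orphaned tweet.

    `timeline` is a list of (text, hashtags, affinity) tuples, where
    `affinity` in [0, 1] is a hypothetical stand-in for user-related
    features such as how often the user interacts with the author.
    """
    scores = defaultdict(float)
    for text, hashtags, affinity in timeline:
        # Combine a tweet-related feature with a user-related feature.
        score = w_text * jaccard(orphan, text) + w_user * affinity
        for tag in hashtags:
            # A hashtag keeps the best score of any tweet that carries it.
            scores[tag] = max(scores[tag], score)
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]
```

A real system would learn the feature weights from data rather than fix them by hand; the fixed weights here only keep the example short.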
Contributors: Vijayakumar, Manikandan (Author) / Kambhampati, Subbarao (Thesis advisor) / Liu, Huan (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)
Created: 2014
Description
The use of blogging tools in the second language classroom has been investigated from a variety of theoretical and methodological perspectives (Alm, 2009; Armstrong & Retterer, 2008; Dippold, 2009; Ducate & Lomicka, 2008; Elola & Oskoz, 2008; Jauregi & Banados, 2008; Lee, 2009; Petersen, Divitini, & Chabert, 2008; Pinkman, 2005; Raith, 2009; Soares, 2008; Sun, 2009, 2012; Vurdien, 2011; Yang, 2009) and a growing number of studies examine the use of microblogging tools for language learning (Antenos-Conforti, 2009; Borau, Ullrich, Feng, & Shen, 2009; Lomicka & Lord, 2011; Perifanou, 2009). Grounded in Cultural Historical Activity Theory (Engestrom, 1987), the present study explores the outcomes of a semester-long project based on the Bridging Activities framework (Thorne & Reinhardt, 2008) and implemented in an intermediate hybrid Spanish-language course at a large public university in Arizona, in which students used microblogging and blogging tools to collect digital texts, analyze perspectives of the target culture, and participate as part of an online community of language learners with a broader audience of native speakers. The research questions are: (1) What technology is used by the students, with what frequency and for what purposes in both English and Spanish prior to beginning the project?, (2) What are students' values and attitudes toward using Twitter and Blogger as tools for learning Spanish and how do they change over time through their use in the project during the semester course?, and (3) What tensions emerge in the activity systems of the intermediate Spanish-language students throughout the process of using Twitter and Blogger for the project? What are the underlying reasons for the tensions? How are they resolved? The data was collected using pre-, post-, and periodic surveys, which included Likert and open-ended questions, as well as the participants' microblog and blog posts. 
The quantitative data was analyzed using descriptive statistics and the qualitative data was analyzed to identify emerging themes following the Constant Comparative Method (Glaser & Strauss, 1967). Finally, three participant outliers were selected as case studies for activity theoretical analysis in order to identify tensions and, through their resolution, evidence of expansive learning.
Contributors: Alvarado, Margaret (Author) / Lafford, Barbara (Thesis advisor) / González, Verónica (Committee member) / Cerron-Palomino, Alvaro (Committee member) / Arizona State University (Publisher)
Created: 2015
Description
Micro-blogging platforms like Twitter have become some of the most popular sites for people to share and express their views and opinions about public events such as debates, sports events, or news articles. These social updates complement the written news articles or event transcripts by conveying popular public opinion about the events, so it would be useful to annotate a transcript with tweets. The technical challenge is to align the tweets with the correct segment of the transcript. ET-LDA by Hu et al. [9] addresses this issue by modeling the whole process with an LDA-based graphical model. The system segments the transcript into coherent and meaningful parts and also determines whether a tweet is a general tweet about the event or refers to a particular segment of the transcript. One characteristic of Hu et al.'s model is that it expects all the data to be available upfront and uses a batch inference procedure. In many cases, however, the data is not available beforehand and often arrives as a stream. In such cases it is infeasible to repeatedly run the batch inference algorithm. My thesis presents an online inference algorithm for the ET-LDA model that handles a continuous stream of tweet data, and compares its runtime and performance to existing algorithms.
Contributors: Acharya, Anirudh (Author) / Kambhampati, Subbarao (Thesis advisor) / Davulcu, Hasan (Committee member) / Tong, Hanghang (Committee member) / Arizona State University (Publisher)
Created: 2015
Description
Browsing Twitter users, or browsers, often find it increasingly cumbersome to attach meaning to tweets that are displayed on their timeline as they follow more and more users or pages. The tweets being browsed are created by Twitter users called originators, and are of some significance to the browser, who has chosen to subscribe to the originator's tweets by following the originator. Although hashtags are used to tag tweets in an effort to attach context to them, many tweets do not have a hashtag. Such tweets are called orphan tweets, and they adversely affect the experience of a browser.

A hashtag is a type of label or meta-data tag used in social networks and micro-blogging services which makes it easier for users to find messages with a specific theme or content. The context of a tweet can be defined as a set of one or more hashtags. Users often do not use hashtags to tag their tweets. This leads to the problem of missing context for tweets. To address the problem of missing hashtags, a statistical method was proposed which predicts most likely hashtags based on the social circle of an originator.

In this thesis, we propose to improve on the existing context recovery system by selectively limiting the candidate set of hashtags to be derived from the intimate circle of the originator rather than from every user in the social network of the originator. This helps in reducing the computation, increasing speed of prediction, scaling the system to originators with large social networks while still preserving most of the accuracy of the predictions. We also propose to not only derive the candidate hashtags from the social network of the originator but also derive the candidate hashtags based on the content of the tweet. We further propose to learn personalized statistical models according to the adoption patterns of different originators. This helps in not only identifying the personalized candidate set of hashtags based on the social circle and content of the tweets but also in customizing the hashtag adoption pattern to the originator of the tweet.
Contributors: Mallapura Umamaheshwar, Tejas (Author) / Kambhampati, Subbarao (Thesis advisor) / Liu, Huan (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)
Created: 2015
Description
In supervised learning, machine learning techniques can be applied to learn a model on a small set of labeled documents, which can then be used to classify a larger set of unknown documents. Machine learning techniques can be used to analyze a political scenario in a given society. A lot of research has been going on in this field to understand the interactions of various people in the society in response to actions taken by their organizations.

This paper talks about understanding the Russian influence on people in Latvia. This is done by building an effective model learned on an initial set of documents containing a combination of official party web pages and important political leaders' social networking sites. Since Twitter is a micro-blogging site that allows people to post their opinions on any topic, the model is used to estimate the tweets supporting the Russian and Latvian political organizations in Latvia. All the documents collected for analysis are in the Latvian and Russian languages, which are rich in vocabulary, resulting in a huge number of features. Hence, feature selection techniques can be used to reduce the vocabulary set relevant to the classification model. This thesis provides a comparative analysis of traditional feature selection techniques and an implementation of a new iterative feature selection method using EM and cross-domain training, along with a supporting visualization tool. This method outperformed other feature selection methods by reducing the number of features by up to 50% while retaining good model accuracy. The results from the classification are used to interpret user behavior and political influence patterns across organizations in Latvia using an interactive dashboard that combines powerful widgets.
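As an illustration of what a traditional feature selection technique looks like, the sketch below scores each vocabulary term with the chi-square statistic for a two-class problem and keeps the top k terms. The choice of chi-square is an assumption for illustration, since the abstract does not name the traditional techniques it compares, and this is not the thesis's iterative EM-based method. The function names and the representation of documents as token sets are likewise hypothetical.

```python
def chi_square(docs, labels, term, positive):
    """Chi-square statistic of `term` against the class `positive`.

    `docs` is a list of token sets and `labels` the matching class labels.
    """
    a = b = c = d = 0
    for doc, label in zip(docs, labels):
        has_term = term in doc
        is_pos = label == positive
        if has_term and is_pos:
            a += 1  # term present, positive class
        elif has_term:
            b += 1  # term present, other class
        elif is_pos:
            c += 1  # term absent, positive class
        else:
            d += 1  # term absent, other class
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def select_features(docs, labels, positive, k):
    """Keep the k terms with the highest chi-square score."""
    vocab = sorted(set().union(*docs))
    # sorted() is stable, so equal scores stay in alphabetical order.
    return sorted(vocab, key=lambda t: -chi_square(docs, labels, t, positive))[:k]
```

Terms that appear in every document score zero, because they carry no information about the class, which is how this kind of filter shrinks a large vocabulary.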
Contributors: Bollapragada, Lakshmi Gayatri Niharika (Author) / Davulcu, Hasan (Thesis advisor) / Sen, Arunabha (Committee member) / Hsiao, Ihan (Committee member) / Arizona State University (Publisher)
Created: 2016
Description
Online social networks, including Twitter, have expanded in both scale and diversity of content, which has created significant challenges for the average user. These challenges include finding relevant information on a topic and building social ties with like-minded individuals. The fundamental question addressed by this thesis is whether an individual can leverage a social network to search for information that is relevant to him or her. We propose to answer this question by developing computational algorithms that analyze a user's social network. The features of the social network we analyze include the network topology and the member communications of a specific user's social network. Determining the "social value" of one's contacts is a valuable outcome of this research. The algorithms we developed were tested on Twitter, which is an extremely popular social network. Twitter was chosen because of its popularity and because a majority of the communication artifacts on Twitter are publicly available. In this work, the social network of a user refers to the "following relationship" social network. Our algorithm is not specific to Twitter and is applicable to other social networks where the network topology and communications are accessible. My approaches are as follows. For a user interested in using the system, I first determine the immediate social network of the user as well as the social contacts of each person in this network. Afterwards, I establish and extend the social network for each user. For each member of the social network, their tweet data are analyzed and represented using a word distribution. To accomplish this, I use WordNet, a popular lexical database, to determine the semantic similarity between two words. My search mechanism combines the communication distance between two users with their social relationships to determine the search results. Additionally, I developed a search interface where a user can interactively query the system.
I conducted a preliminary user study to evaluate the quality and utility of my method and system against several baseline methods, including the default Twitter search. The experimental results from the user study indicate that my method is able to find relevant people and identify valuable contacts in one's social circle based on the query. The proposed system outperforms the baseline methods in terms of standard information retrieval metrics.
Contributors: Xu, Ke (Author) / Sundaram, Hari (Thesis advisor) / Ye, Jieping (Committee member) / Kelliher, Aisling (Committee member) / Arizona State University (Publisher)
Created: 2010