Improved, scalable, and personalized context recovery system: E-TweetSense

Description

Browsing Twitter users, or browsers, often find it increasingly cumbersome to attach meaning to the tweets displayed on their timeline as they follow more and more users or pages. The tweets being browsed are created by Twitter users called originators, and are of some significance to the browser, who has chosen to subscribe to the originator's tweets by following the originator. Although hashtags are used to tag tweets in an effort to attach context to them, many tweets do not have a hashtag. Such tweets are called orphan tweets, and they adversely affect the experience of a browser.

A hashtag is a type of label or metadata tag used in social networks and micro-blogging services that makes it easier for users to find messages with a specific theme or content. The context of a tweet can be defined as a set of one or more hashtags. Users often do not tag their tweets with hashtags, which leads to the problem of missing context for tweets. To address the problem of missing hashtags, a statistical method was proposed that predicts the most likely hashtags based on the social circle of an originator.

In this thesis, we propose to improve on the existing context recovery system by selectively limiting the candidate set of hashtags to those derived from the intimate circle of the originator rather than from every user in the originator's social network. This reduces computation, increases prediction speed, and scales the system to originators with large social networks while still preserving most of the accuracy of the predictions. We also propose to derive candidate hashtags not only from the social network of the originator but also from the content of the tweet. We further propose to learn personalized statistical models according to the adoption patterns of different originators. This helps not only in identifying a personalized candidate set of hashtags based on the social circle and the content of the tweets, but also in customizing the hashtag adoption pattern to the originator of the tweet.
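The idea of restricting candidates to an intimate circle can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say how the intimate circle is defined or how candidates are scored, so the sketch simply takes the `k` users with the highest interaction counts and ranks their hashtags by frequency. The function names and the toy data are hypothetical.

```python
from collections import Counter

def intimate_circle(interactions, k):
    """Return the k users with the highest interaction counts.

    `interactions` maps user name -> interaction count (a hypothetical
    proxy for closeness). Ties are broken alphabetically.
    """
    ranked = sorted(interactions.items(), key=lambda kv: (-kv[1], kv[0]))
    return [user for user, _ in ranked[:k]]

def candidate_hashtags(interactions, hashtags_by_user, k, top_n):
    """Rank hashtags used by the intimate circle by how often they occur."""
    counts = Counter()
    for user in intimate_circle(interactions, k):
        counts.update(hashtags_by_user.get(user, []))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:top_n]]

# Toy data (hypothetical): interaction counts and recent hashtags per user.
interactions = {"alice": 12, "bob": 7, "carol": 1, "dave": 0}
hashtags_by_user = {
    "alice": ["#nlp", "#ml"],
    "bob": ["#ml", "#asu"],
    "carol": ["#food"],
    "dave": ["#travel"],
}

result = candidate_hashtags(interactions, hashtags_by_user, k=2, top_n=2)
print(result)
```

Only the hashtags of the two closest users are counted, so the work grows with `k` rather than with the size of the full social network, which is the scaling argument the abstract makes.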

Contributors: Mallapura Umamaheshwar, Tejas (Author) / Kambhampati, Subbarao (Thesis advisor) / Liu, Huan (Committee member) / Davulcu, Hasan (Committee member) / Arizona State University (Publisher)

Created: 2015

Improving solar PV scheduling using statistical techniques

Description

The inherent intermittency in solar energy resources poses challenges to scheduling generation, transmission, and distribution systems. Energy storage devices are often used to mitigate variability in renewable asset generation and provide a mechanism to shift renewable power between periods of the day. In the absence of storage, however, time series forecasting techniques can be used to estimate future solar resource availability to improve the accuracy of solar generator scheduling. Knowledge of future solar availability helps schedule solar generation at high penetration levels, and assists with the selection and scheduling of spinning reserves. This study employs statistical techniques to improve the accuracy of solar resource forecasts that are in turn used to estimate solar photovoltaic (PV) power generation. The first part of the study involves time series forecasting of the global horizontal irradiation (GHI) in Phoenix, Arizona using Seasonal Autoregressive Integrated Moving Average (SARIMA) models. A comparative study is completed for time series forecasting models developed with different time step resolutions, forecasting start times, forecasting time horizons, training data, and transformations for data measured at Phoenix, Arizona. Approximately 3,000 models were generated and evaluated across the entire study. One major finding is that values forecasted one day ahead are near repeats of the preceding day, due to the 24-hour seasonal differencing, indicating that use of statistical forecasting over multiple days creates a repeating pattern. Logarithmically transformed data were found to perform poorly in nearly all cases relative to untransformed or square-root-transformed data when forecasting out to four days. Forecasts using a logarithmic transform followed a profile similar to that of the immediately preceding day, whereas forecasts using untransformed and square-root-transformed data had smoother daily solar profiles that better represented the average intraday profile. Error values were generally lower during mornings and evenings and higher during midday. For forecasting horizons of one day or shorter, the logarithmic transformation performed better than untransformed and square-root-transformed data, regardless of the specific horizon, for data resolutions of 1 hour, 30 minutes, and 15 minutes.
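The finding about repeating daily patterns follows from how seasonal differencing is inverted, and a small sketch can make that concrete. This is not the SARIMA model from the study: it isolates only the 24-hour seasonal differencing step and assumes, for illustration, that the forecast of the differenced series is zero at every step. The function names and the synthetic hourly profile are hypothetical.

```python
def seasonal_difference(series, period):
    """Return y[t] - y[t - period] for every t where both values exist."""
    return [series[t] - series[t - period] for t in range(period, len(series))]

def invert_seasonal_difference(history, diff_forecast, period):
    """Turn forecasts of the differenced series back into level forecasts.

    Each forecast is the value one period earlier plus the forecast
    difference, so zero differences reproduce the previous period.
    """
    extended = list(history)
    for d in diff_forecast:
        extended.append(extended[-period] + d)
    return extended[len(history):]

PERIOD = 24  # hourly data with a daily season

# Synthetic hourly irradiance-like profile (hypothetical values).
day1 = [max(0, 100 - 20 * abs(hour - 12)) for hour in range(PERIOD)]
day2 = [max(0, value - 5) for value in day1]  # a slightly dimmer day
history = day1 + day2

diffs = seasonal_difference(history, PERIOD)

# Assumption: the differenced series is forecast as zero for two days ahead.
forecast = invert_seasonal_difference(history, [0] * (2 * PERIOD), PERIOD)
print(forecast[:PERIOD] == day2, forecast[PERIOD:] == day2)
```

Under that assumption both forecast days equal the last observed day, which matches the repeating pattern described in the abstract for multi-day horizons.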

Contributors: Soundiah Regunathan Rajasekaran, Dhiwaakar Purusothaman (Author) / Johnson, Nathan G (Thesis advisor) / Karady, George G. (Thesis advisor) / Ayyanar, Raja (Committee member) / Arizona State University (Publisher)

Created: 2016

Predicting demographic and financial attributes in a bank marketing dataset

Description

Banking institutions employ several marketing strategies to maximize new customer acquisition as well as current customer retention. Telemarketing is one such approach, in which individual customers are contacted by bank representatives with offers. These telemarketing strategies can be improved in combination with data mining techniques that predict customer information and interests. In this thesis, bank telemarketing data from a Portuguese banking institution were analyzed to determine the predictability of several client demographic and financial attributes and to find the most contributing factors for each. Data were preprocessed to ensure quality, and then data mining models were generated for the attributes with logistic regression, support vector machine (SVM), and random forest, using Orange as the data mining tool. Results were analyzed using precision, recall, and F1 score.
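The three evaluation metrics named above have standard definitions, sketched here for a binary attribute. The thesis used Orange for modeling; this plain-Python version is only an illustration of the metric arithmetic, and the function name and the toy labels are hypothetical.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for one positive class.

    precision = TP / (TP + FP), recall = TP / (TP + FN),
    F1 = harmonic mean of precision and recall.
    A metric is reported as 0.0 when its denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1

# Toy labels (hypothetical): 1 = attribute present, 0 = absent.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

In this toy case there are three true positives, one false positive, and one false negative, so all three metrics come out to 0.75.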

Contributors: Ejaz, Samira (Author) / Davulcu, Hasan (Thesis advisor) / Balasooriya, Janaka (Committee member) / Candan, Kasim (Committee member) / Arizona State University (Publisher)

Created: 2016

Visual Event Cueing in Linked Spatiotemporal Data

Description

The media disperses a large amount of information daily pertaining to political events, social movements, and societal conflicts. Media pertaining to these topics, no matter the format of publication used, are framed in a particular way. Framing is used not just to guide audiences to desired beliefs, but also to fuel societal change or to legitimize or delegitimize social movements. For this reason, tools that can help clarify when changes in social discourse occur and identify their causes are of great use. This thesis presents a visual analytics framework that allows for the exploration and visualization of changes that occur in social climate with respect to space and time. By focusing on the links between data from the Armed Conflict Location and Event Data Project (ACLED) and a streaming RSS news data set, the framework cues users into interesting events, enabling them to form and explore hypotheses. This visual analytics framework also focuses on improving intervention detection, allows users to hypothesize about correlations between events and happiness levels, and supports collaborative analysis.

Contributors: Steptoe, Michael (Author) / Maciejewski, Ross (Thesis advisor) / Davulcu, Hasan (Committee member) / Corman, Steven (Committee member) / Arizona State University (Publisher)

Created: 2017

The effect of racial microaggressions on Latinas: student perceptions, reactions, and coping mechanisms

Description

Interpersonal racial discrimination is positively associated with poor mental health outcomes in a number of marginalized groups across the United States (Brondolo et al., 2008). This paper examines how racial discrimination affects the self-esteem, self-worth, and racial pride of Latinas using interview data from a purposive sample of students. The objectives of this study are: (a) to better understand the effects of racial microaggressions on young Latinas’ construction of self, (b) to explicate how these self-perceptions influence deviant behavior and maladaptive thought processes, drawing on strain and discrimination literatures, and (c) to examine the protective mechanisms Latinas employ with friends and family as a response to racial discrimination. Findings indicated that respondents experienced racial discrimination through a variety of channels, from negative stereotypes to feeling a distinct prejudice in academic settings. Participants utilized numerous coping mechanisms to deal with such encounters, most of which emphasized the importance of drawing strength from Hispanic values, culture, and language during times of adversity.

Contributors: Barstow, Callie Elizabeth (Author) / Burt, Callie (Thesis advisor) / Decker, Scott (Committee member) / Wang, Xia (Committee member) / Arizona State University (Publisher)

Created: 2015

Story detection using generalized concepts

Description

A major challenge in automated text analysis is that different words are used for related concepts. Analyzing text at the surface level would treat related concepts (i.e., actors, actions, targets, and victims) as different objects, potentially missing common narrative patterns. Generalized concepts are used to overcome this problem. Generalization may result in word sense disambiguation failing to find similarity. This is addressed by taking contextual synonyms into account. Concept discovery based on contextual synonyms reveals information about the semantic roles of the words leading to concepts. A merger engine generalizes the concepts so that they can be used as features in learning algorithms.

Contributors: Kedia, Nitesh (Author) / Davulcu, Hasan (Thesis advisor) / Corman, Steve R (Committee member) / Li, Baoxin (Committee member) / Arizona State University (Publisher)

Created: 2015
