Matching Items (10)

Study of an epidemic multiple behavior diffusion model in a resource constrained social network

Description

In contemporary society, sustainability and public well-being are pressing challenges. Two important questions are: how can sustainable practices, such as reducing carbon emissions, be encouraged, and how can a healthy lifestyle be maintained? Even though individuals are interested, they are unable to adopt these behaviors due to resource constraints. Developing a framework that enables cooperative behavior adoption and sustains it over a long period of time is a major challenge. As part of developing this framework, I focus on methods to understand behavior diffusion over time. Facilitating behavior diffusion under resource constraints in a large population is qualitatively different from promoting cooperation in small groups. Previous work in the social sciences has derived conditions for sustainable cooperative behavior in small homogeneous groups. However, how groups of individuals with resource constraints cooperate over extended periods of time is not well understood, and is the focus of my thesis. I develop models to analyze behavior diffusion over time through the lens of epidemic models, under the condition that individuals have resource constraints. I introduce an epidemic model, SVRS (Susceptible-Volatile-Recovered-Susceptible), to accommodate multiple behavior adoption. I investigate the longitudinal effects of behavior diffusion by varying properties of an individual such as resources, threshold, and cost of behavior adoption. I also consider how an individual's behavior adoption varies with her knowledge of global adoption. I evaluate my models on several synthetic topologies, such as the complete regular graph, preferential attachment, and small-world networks, and make some interesting observations. Periodic injection of early adopters can help boost the spread of behaviors and sustain it for a longer period of time. Also, behavior propagation for the classical epidemic model SIRS (Susceptible-Infected-Recovered-Susceptible) does not continue for an infinite period of time as per conventional wisdom. One interesting future direction is to investigate how behavior adoption is affected when the number of individuals in a network changes. The effects on behavior adoption when the availability of a behavior changes with time can also be examined.
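The classical SIRS dynamics mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's SVRS model and it ignores resource constraints, thresholds, and adoption costs. The function name `simulate_sirs`, the parameters `beta`, `gamma`, and `xi` (transmission, recovery, and loss-of-immunity probabilities per step), and the 20-node complete graph are all assumptions made for illustration.

```python
import random

def simulate_sirs(neighbors, initial_infected, beta, gamma, xi, steps, seed=0):
    """Discrete-time SIRS on a graph; returns the infected count after each step."""
    rng = random.Random(seed)
    state = {node: "S" for node in neighbors}
    for node in initial_infected:
        state[node] = "I"
    history = []
    for _ in range(steps):
        new_state = dict(state)  # synchronous update: read old state, write new state
        for node, status in state.items():
            if status == "S":
                # Each infected neighbor independently transmits with probability beta.
                infected_nbrs = sum(1 for nb in neighbors[node] if state[nb] == "I")
                if rng.random() < 1 - (1 - beta) ** infected_nbrs:
                    new_state[node] = "I"
            elif status == "I":
                if rng.random() < gamma:  # recover
                    new_state[node] = "R"
            else:
                if rng.random() < xi:  # lose immunity, become susceptible again
                    new_state[node] = "S"
        state = new_state
        history.append(sum(1 for s in state.values() if s == "I"))
    return history

# Hypothetical example: a complete graph on 20 nodes with one initial adopter.
n = 20
complete = {i: [j for j in range(n) if j != i] for i in range(n)}
history = simulate_sirs(complete, initial_infected=[0], beta=0.05,
                        gamma=0.3, xi=0.1, steps=200)
```

Inspecting `history` for the first step at which the count reaches zero is one way to examine, under these assumed parameters, whether propagation in a finite population eventually stops.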

Date Created
2013

A semantic triplet based story classifier

Description

Text classification, in the artificial intelligence domain, is an activity in which text documents are automatically classified into predefined categories using machine learning techniques. An example is classifying uncategorized news articles into predefined categories such as "Business", "Politics", "Education", "Technology", etc. In this thesis, a supervised machine learning approach is followed, in which a model is first trained with pre-classified training data and then the class of test data is predicted. Good feature extraction is an important step in the machine learning approach; hence the main component of this text classifier is semantic triplet based features, used in addition to traditional features such as standard keyword based features and statistical features based on shallow parsing (such as the density of POS tags and named entities). A triplet {Subject, Verb, Object} in a sentence is defined as a relation between subject and object, the relation being the predicate (verb). Triplet extraction is a five-step process that takes as its input corpus one or more web text documents, each consisting of one or many paragraphs, drawn from sources ranging from RSS feeds to lists of extremist websites. The input corpus feeds into the "Pronoun Resolution" step, which uses a heuristic approach to identify the noun phrases referenced by the pronouns. The next step, "SRL Parser", is a shallow semantic parser that converts the incoming pronoun-resolved paragraphs into an annotated predicate-argument format. The output of the SRL parser is processed by the "Triplet Extractor" algorithm, which forms triplets of the form {Subject, Verb, Object}. Generalization and reduction of triplet features is the next step. A reduced feature representation lowers computing time, yields better discriminatory behavior, and mitigates the curse of dimensionality. For training and testing, a ten-fold cross-validation approach is followed. In each round, an SVM classifier is trained with 90% of the labeled (training) data and, in the testing phase, the classes of the remaining 10% unlabeled (testing) data are predicted. In conclusion, this paper proposes a model with semantic triplet based features for story classification. The effectiveness of the model is demonstrated against other traditional features used in the literature for text classification tasks.
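The ten-fold protocol described in this abstract can be sketched as follows. This shows only the 90%/10% index split, not the SVM or the triplet features. The helper name `k_fold_indices`, the shuffling seed, and the 100-item example are assumptions made for illustration.

```python
import random

def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# Hypothetical corpus of 100 labeled documents: each round trains on 90, tests on 10.
splits = list(k_fold_indices(100, k=10))
```

In each round a classifier would be fitted on the `train` indices and evaluated on the `test` indices, and every document is used for testing exactly once across the ten rounds.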

Date Created
2013

The interpersonal determinants of green purchasing: an assessment of the empirical record

Description

This study investigates how well prominent behavioral theories from social psychology explain green purchasing behavior (GPB). I assess three prominent theories in terms of their suitability for GPB research, their attractiveness to GPB empiricists, and the strength of their empirical evidence when applied to GPB. First, a qualitative assessment of the Theory of Planned Behavior (TPB), Norm Activation Theory (NAT), and Value-Belief-Norm Theory (VBN) is conducted to evaluate a) how well the phenomenon and concepts in each theory match the characteristics of pro-environmental behavior and b) how well the assumptions made in each theory match common assumptions made in purchasing theory. Second, a quantitative assessment of these three theories is conducted in which r² values and methodological parameters (e.g., sample size) are collected from a sample of 21 empirical studies on GPB to evaluate the accuracy and generalizability of the empirical evidence. In the qualitative assessment, the results show that each theory has its advantages and disadvantages. The results also provide a theoretically grounded roadmap for modifying each theory to be more suitable for GPB research. In the quantitative assessment, the TPB outperforms the other two theories in every aspect taken into consideration. It proves to 1) create the most accurate models, 2) be supported by the most generalizable empirical evidence, and 3) be the most attractive theory to empiricists. Although the TPB establishes itself as the best foundational theory for an empiricist to start from, it is clear that a more comprehensive model is needed to achieve consistent results and improve our understanding of GPB. NAT and the Theory of Interpersonal Behavior (TIB) offer pathways to extend the TPB. The TIB seems particularly apt for this endeavor, while VBN does not appear to have much to offer. Overall, the TPB has already proven to hold a relatively high predictive value. But with the state of ecosystem services continuing to decline on a global scale, it is important for models of GPB to become more accurate and reliable. Better models have the capacity to help marketing professionals, product developers, and policy makers develop strategies for encouraging consumers to buy green products.

Date Created
2012

Story detection using generalized concepts

Description

A major challenge in automated text analysis is that different words are used for related concepts. Analyzing text at the surface level would treat related concepts (i.e., actors, actions, targets, and victims) as different objects, potentially missing common narrative patterns. Generalized concepts are used to overcome this problem. Generalization may cause word sense disambiguation to fail to find similarity. This is addressed by taking contextual synonyms into account. Concept discovery based on contextual synonyms reveals information about the semantic roles of the words leading to concepts. A merger engine generalizes the concepts so that they can be used as features in learning algorithms.

Date Created
2015

A timeline extraction approach to derive drug usage patterns in pregnant women using social media

Description

Proliferation of social media websites and discussion forums in the last decade has resulted in social media mining emerging as an effective mechanism to extract consumer patterns. Most research on social media and pharmacovigilance has concentrated on Adverse Drug Reaction (ADR) identification. Such methods employ a drug search step followed by classification of the associated text as containing an ADR or not. Although this method works efficiently for ADR classification, if ADR evidence is present in users' posts over time, drug mentions fail to capture such ADRs. It also fails to record additional user information that may provide an opportunity to perform an in-depth analysis of lifestyle habits and possible reasons for any medical problems.

Pre-market clinical trials for drugs generally do not include pregnant women, and so their effects on pregnancy outcomes are not discovered early. This thesis presents a thorough, alternative strategy for assessing the safety profiles of drugs during pregnancy by utilizing user timelines from social media. I explore the use of a variety of state-of-the-art social media mining techniques, including rule-based and machine learning techniques, to identify pregnant women, monitor their drug usage patterns, categorize their birth outcomes, and attempt to discover associations between drugs and bad birth outcomes.

The technique models user timelines as longitudinal patient networks, which provide a variety of key information about pregnancy, drug usage, and post-birth reactions. I evaluate the distinct parts of the pipeline separately, validating the usefulness of each step. The approach of using user timelines in this fashion has produced very encouraging results and can be employed for a range of other important tasks in which users or patients must be followed over time to derive population-based measures.

Date Created
2016

Sentiment analysis for long-term stock prediction

Description

There has been extensive research into how news and Twitter feeds can affect the outcome of a given stock. However, the majority of this research has studied the short-term effects of sentiment on a given stock price. In this research, I studied the long-term effects on a given stock price using fundamental analysis techniques. I collected both sentiment data and fundamental data for Apple Inc., Microsoft Corp., and Peabody Energy Corp. Using a neural network algorithm, I found that sentiment does have an effect on the annual growth of these companies, but the fundamentals are more relevant when determining overall growth. For stocks that show more consistent growth, the previous year's stock price carries more importance, whereas companies with less consistent growth showed more reliance on revenue growth and on sentiment toward the overall company and its CEO. I discuss how I collected my research data and used a multilayer perceptron to predict a threshold growth of a given stock. The threshold used for this particular research was 10%. I then show the prediction of this threshold using my perceptron and afterwards perform an ANOVA F-test on my choice of features. The results showed the fundamentals to be the better predictor of stock information, but sentiment came in a close second in several cases, showing that sentiment does have an effect on long-term growth.
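The 10% growth threshold can be illustrated with a small labeling sketch. The abstract does not say how the prediction target was encoded, so the function name `growth_labels`, the use of year-over-year price change, the inclusive comparison, and the price series are all assumptions, not the author's procedure.

```python
def growth_labels(yearly_prices, threshold=0.10):
    """Label each year 1 if price growth over the previous year meets the threshold, else 0."""
    labels = []
    for prev, curr in zip(yearly_prices, yearly_prices[1:]):
        growth = (curr - prev) / prev  # fractional year-over-year change
        labels.append(1 if growth >= threshold else 0)
    return labels

# Hypothetical year-end prices, not data from the thesis.
prices = [100.0, 115.0, 120.0, 150.0]
labels = growth_labels(prices)  # growth of 15%, about 4.3%, and 25%
```

Binary labels of this kind are the sort of target a perceptron could be trained to predict from sentiment and fundamental features.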

Date Created
2016

Detecting Political Framing Shifts and the Adversarial Phrases within Rival Factions and Ranking Temporal Snapshot Contents in Social Media

Description

Social Computing is an area of computer science concerned with the dynamics of communities and cultures created through computer-mediated social interaction. Various social media platforms, such as social network services and microblogging, enable users to come together and create social movements expressing their opinions on diverse sets of issues, events, complaints, grievances, and goals. Methods for monitoring and summarizing these types of sociopolitical trends, their leaders and followers, messages, and dynamics are needed. In this dissertation, a framework comprising community and content-based computational methods is presented to provide insights into multilingual and noisy political social media content. First, a model is developed to predict the emergence of viral hashtag breakouts using network features. Next, another model is developed to detect and compare individual and organizational accounts using a set of domain- and language-independent features. The third model exposes contentious issues driving reactionary dynamics between opposing camps. The fourth model develops community detection and visualization methods to reveal underlying dynamics and the key messages that drive them. The final model presents a use case methodology for detecting and monitoring foreign influence, wherein a state actor and news media under its control attempt to shift public opinion by framing information to support multiple adversarial narratives that facilitate their goals. In each case, a discussion of novel aspects and contributions of the models is presented, as well as quantitative and qualitative evaluations. An analysis of multiple conflict situations will be conducted, covering areas in the UK, Bangladesh, Libya, and Ukraine where adversarial framing led to polarization, declines in social cohesion, social unrest, and even civil wars (e.g., Libya and Ukraine).

Date Created
2018

Automatic text summarization using importance of sentences for email corpus

Description

With the advent of the Internet, the data being added online is increasing at an enormous rate. Though search engines use IR techniques to serve search requests from users, the results are often not effective for the user's search query. The search engine user has to go through several webpages before arriving at the webpage he or she wanted. This problem of Information Overload can be addressed using Automatic Text Summarization. Summarization is the process of obtaining an abridged version of documents so that the user can quickly understand what a document is about. Email threads from W3C are used in this system. Apart from common IR features like Term Frequency and Inverse Document Frequency, Term Rank, a variation of PageRank based on a graph model that can cluster words with respect to word ambiguity, is implemented. Term Rank also considers the possibility of co-occurrence of words within the corpus and evaluates the rank of each word accordingly. Sentences of email threads are ranked according to these features, and summaries are generated. The system implements the concept of pyramid evaluation in content selection. The system can be considered a framework for unsupervised learning in text summarization.
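Feature-based sentence ranking of the kind described in this abstract can be sketched with Term Frequency and Inverse Document Frequency alone. Term Rank and pyramid evaluation are not reproduced here. Treating each sentence as its own "document" for the IDF count, the tokenizer, and the function name `rank_sentences` are assumptions made for illustration.

```python
import math
import re
from collections import Counter

def rank_sentences(sentences):
    """Rank sentences by summed TF-IDF weight (TF normalized by sentence length), highest first."""
    tokenized = [re.findall(r"[a-z0-9]+", s.lower()) for s in sentences]
    n = len(tokenized)
    # Document frequency: number of sentences that contain each word.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    scores = []
    for tokens in tokenized:
        if not tokens:
            scores.append(0.0)
            continue
        tf = Counter(tokens)
        score = sum((count / len(tokens)) * math.log(n / df[word])
                    for word, count in tf.items())
        scores.append(score)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return [(sentences[i], scores[i]) for i in order]

# Hypothetical email sentences, not taken from the W3C corpus.
ranked = rank_sentences([
    "The meeting is on Monday.",
    "The agenda for the meeting is attached.",
    "Please review the draft charter before Monday.",
])
```

A summary would then take the top few entries of `ranked`. Words that occur in every sentence get an IDF of zero and so contribute nothing to the score.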

Date Created
2015

Detecting Frames and Causal Relationships in Climate Change Related Text Databases Based on Semantic Features

Description

The subliminal impact of framing of social, political, and environmental issues such as climate change has been studied for decades in political science and communications research. Media framing offers an "interpretative package" for average citizens on how to make sense of climate change and its consequences for their livelihoods, how to deal with its negative impacts, and which mitigation or adaptation policies to support. A line of related work has used bag-of-words and word-level features to detect frames automatically in text. Such work faces limitations because standard keyword based features may not generalize well to accommodate surface variations in text when different keywords are used for similar concepts.

This thesis develops a unique type of textual feature that generalizes triplets extracted from text by clustering them into high-level concepts. These concepts are utilized as features to detect frames in text. Compared to unigram and bigram based models, classification and clustering using generalized concepts yield better discriminating features and higher classification accuracy, with a 12% relative boost (i.e., from 74% to 83% F-measure) and 0.91 clustering purity for Frame/Non-Frame detection.
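The two figures reported here can be made concrete with a short sketch. It assumes the standard definition of clustering purity (the fraction of items that carry the majority label of their cluster), which the abstract does not spell out, and it reads the 12% boost as a relative gain. The function names and the toy cluster assignments are hypothetical.

```python
from collections import Counter

def clustering_purity(cluster_ids, true_labels):
    """Fraction of items whose true label is the majority label of their cluster."""
    by_cluster = {}
    for cluster, label in zip(cluster_ids, true_labels):
        by_cluster.setdefault(cluster, Counter())[label] += 1
    majority_total = sum(counts.most_common(1)[0][1] for counts in by_cluster.values())
    return majority_total / len(true_labels)

def relative_gain(before, after):
    """Relative improvement of `after` over `before`."""
    return (after - before) / before

# Toy assignment, not data from the thesis: two clusters of three items each.
clusters = [0, 0, 0, 1, 1, 1]
labels = ["frame", "frame", "non-frame", "non-frame", "non-frame", "frame"]
purity = clustering_purity(clusters, labels)  # (2 + 2) / 6

gain = relative_gain(0.74, 0.83)  # 9 points absolute, about 12% relative
```

Under this reading, the move from 74% to 83% F-measure is a gain of 9 percentage points, which is roughly 12% relative to the 74% baseline.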

The automatic discovery of complex causal chains among interlinked events and their participating actors has not yet been thoroughly studied. Previous studies on extracting causal relationships from text were based on laborious and incomplete hand-developed lists of explicit causal verbs, such as "causes" and "results in." Such approaches result in limited recall because standard causal verbs may not generalize well to accommodate surface variations in texts when different keywords and phrases are used to express similar causal effects. Therefore, I present a system that utilizes generalized concepts to extract causal relationships. The proposed algorithms overcome surface variations in written expressions of causal relationships and discover the domino effects between climate events and human security. This semi-supervised approach alleviates the need for labor-intensive keyword list development and annotated datasets. Experimental evaluations by domain experts achieve an average precision of 82%. Qualitative assessments of causal chains show that the results are consistent with the 2014 IPCC report illuminating causal mechanisms underlying the linkages between climatic stresses and social instability.

Date Created
2018

memeBot: Automatic Image Meme Generation for Online Social Interaction

Description

Internet memes have become a widespread tool used by people for interacting and exchanging ideas over social media, blogs, and open messengers. Internet memes most commonly take the form of an image that combines image, text, and humor, making them a powerful tool to deliver information. Image memes are used in viral marketing and mass advertising to propagate ideas ranging from simple commercials to those that can cause changes and development in social structures, such as countering hate speech.

This work proposes to treat automatic image meme generation as a translation process, and further presents an end-to-end neural and probabilistic approach to generate an image-based meme for any given sentence using an encoder-decoder architecture. For a given input sentence, a meme is generated by combining a meme template image and a text caption, where the meme template image is selected from a set of popular candidates using a selection module and the meme caption is generated by an encoder-decoder model. An encoder is used to map the selected meme template and the input sentence into a meme embedding space, and then a decoder is used to decode the meme caption from the meme embedding space. The generated natural language caption is conditioned on the input sentence and the selected meme template.

The model learns the dependencies between the meme captions and the meme template images and generates new memes using the learned dependencies. The quality of the generated captions and the generated memes is evaluated through both automated metrics and human evaluation. An experiment is designed to score how well the generated memes can represent popular tweets from Twitter conversations. Experiments on Twitter data show the efficacy of the model in generating memes capable of representing a sentence in online social interaction.

Date Created
2020