The US Senate is the venue for political debates in which federal bills are drafted and voted on. Senators show their support for or opposition to bills through their votes, which makes it possible to extract the senators’ polarity. Similarly, the blogosphere plays an increasingly important role as a forum for public debate, where authors express sentiment toward issues, organizations, or people in natural language.
In this research, given a mixed set of senators or blogs debating a set of political issues from opposing camps, I model debates with signed bipartite graphs and propose an algorithm that partitions both the opinion holders (senators or blogs) and the issues (bills or topics) comprising the debate into two opposing camps. Simultaneously, my algorithm places the entities on a univariate scale. Using this scale, a researcher can identify moderate and extreme senators or blogs within each camp, and polarizing versus unifying issues. Through performance evaluations I show that my proposed algorithm provides an effective solution to the problem and performs much better than existing baseline algorithms adapted to solve this new problem. In my experiments, I used real data from the political blogosphere and US Congress records, as well as synthetic data generated by varying the polarization and the degree distribution of the graph’s vertices to show the robustness of my algorithm.
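To make the idea of a signed bipartite graph concrete, the following is a minimal sketch that is not the algorithm proposed in the thesis. It assumes a generic spectral heuristic instead: votes are stored as a matrix of +1 (support), -1 (opposition), and 0 (no vote), and the leading singular vectors are approximated by power iteration. The sign of each entry is read as the camp and the magnitude as a position on a one-dimensional scale. All names and the toy vote matrix are hypothetical.

```python
import math

def leading_singular_vectors(votes, iterations=100):
    """Approximate the top left/right singular vectors of a signed matrix.

    votes[i][j] is +1, -1, or 0 for opinion holder i and issue j.
    Returns (holder_scale, issue_scale); signs indicate camps.
    """
    n_rows, n_cols = len(votes), len(votes[0])
    # Deterministic, slightly uneven start vector (an arbitrary choice).
    v = [1.0 + 0.01 * j for j in range(n_cols)]
    u = [0.0] * n_rows
    for _ in range(iterations):
        # u <- A v, then normalize
        u = [sum(votes[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        norm_u = math.sqrt(sum(x * x for x in u)) or 1.0
        u = [x / norm_u for x in u]
        # v <- A^T u, then normalize
        v = [sum(votes[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm_v = math.sqrt(sum(x * x for x in v)) or 1.0
        v = [x / norm_v for x in v]
    return u, v

# Hypothetical toy data: 4 senators (rows) voting on 3 bills (columns).
votes = [
    [+1, +1, -1],
    [+1, +1, -1],
    [-1, -1, +1],
    [-1,  0, +1],  # abstains on the second bill
]
senator_scale, bill_scale = leading_singular_vectors(votes)
camps = ["A" if s > 0 else "B" for s in senator_scale]
print(camps)
```

In this toy example the first two senators land in one camp and the last two in the other, and the senator who abstained sits closer to the center of the scale than the others.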
I also applied my algorithm to all terms of the US Senate to date for longitudinal analysis and developed a web-based interactive user interface, www.PartisanScale.com, to visualize the analysis.
US politics is most often polarized along the left/right alignment of the entities. However, certain issues do not reflect party polarization; they either show a split that correlates with the demographics of the senators or simply receive consensus. I propose a hierarchical clustering algorithm that identifies groups of bills sharing the same polarization characteristics. I developed a web-based interactive user interface, www.ControversyAnalysis.com, to visualize the clusters while providing a synopsis through distribution charts, word clouds, and heat maps.
Today’s web browsers are capable of warning users of a potential phishing attack. Browsers identify phishing websites by consulting blacklists of reported phishing websites maintained by trusted organizations such as Google and PhishTank. On identifying a Uniform Resource Locator (URL) requested by a user as a reported phishing URL, browsers like Mozilla Firefox and Google Chrome display an 'active' warning message in an attempt to stop the user from making the potentially dangerous decision of visiting the website and sharing confidential information such as usernames and passwords, credit card information, or social security numbers.
However, these warnings are not always successful at safeguarding the user from a phishing attack. On several occasions, users ignore these warnings and 'click through' them, eventually landing on the potentially dangerous website and giving away confidential information. Failure to understand the warning, failure to differentiate between types of browser warnings, and diminishing trust in browser warnings due to repeated encounters are some of the reasons that users ignore these warnings. It is important to address these factors in order to improve users’ reactions to these warnings.
In this thesis, I propose a novel design to improve the effectiveness and reliability of phishing warning messages. This design uses the name of the target website that a fake website is mimicking to display a simple, easy-to-understand, and interactive warning message, with the primary objective of keeping the user away from a potentially spoofed website.
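The blacklist lookup and target-naming idea described above can be sketched as follows. This is only an illustration under stated assumptions: the blacklist contents, the host names, the function name, and the wording of the message are all hypothetical, and real browsers consult remotely maintained lists rather than a local dictionary.

```python
from urllib.parse import urlsplit

# Hypothetical blacklist: reported phishing host -> name of the site it imitates.
BLACKLIST = {
    "secure-bank.example": "Example Bank",
    "mail-login.example": "Example Mail",
}

def phishing_warning(url, blacklist=BLACKLIST):
    """Return a warning that names the imitated site, or None if the host is not listed."""
    host = (urlsplit(url).hostname or "").lower()
    target = blacklist.get(host)
    if target is None:
        return None
    return (
        f"This page is not {target}. "
        f"It has been reported as a fake site imitating {target}."
    )

print(phishing_warning("https://secure-bank.example/login"))
print(phishing_warning("https://example.org/"))
```

Naming the imitated site in the message is the point of the sketch: the user is told which legitimate site they probably meant to reach, rather than only that the page is dangerous.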
Audio signals, such as speech and ambient sounds, convey rich information pertaining to a user’s activity, mood, or intent. Enabling machines to understand this contextual information is necessary to bridge the gap in human-machine interaction. This is challenging because such information is subjective in nature and therefore requires sophisticated techniques. This dissertation presents a set of computational methods that generalize well across different conditions, for speech-based applications involving emotion recognition and keyword detection, and for ambient-sound-based applications such as lifelogging.
The expression and perception of emotions vary across speakers and cultures, so features and classification methods that generalize well to different conditions are strongly desired. A method based on latent topic models is proposed to learn supra-segmental features from low-level acoustic descriptors. The derived features outperform state-of-the-art approaches over multiple databases. Cross-corpus studies are conducted to determine the ability of these features to generalize across different databases. The proposed method is also applied to derive features from facial expressions; multi-modal fusion overcomes the deficiencies of a speech-only approach and further improves recognition performance.
Besides affecting the acoustic properties of speech, emotions have a strong influence on speech articulation kinematics. A learning approach is proposed that constrains a classifier trained on acoustic descriptors to also model articulatory data. This method requires articulatory information only during the training stage, thus overcoming the challenges inherent to large-scale data collection while exploiting the correlations between articulation kinematics and acoustic descriptors to improve the accuracy of emotion recognition systems.
Identifying context from ambient sounds in a lifelogging scenario requires feature extraction, segmentation, and annotation techniques capable of efficiently handling long-duration audio recordings; a complete framework for such applications is presented. Its performance is evaluated on real-world data and accompanied by a prototype Android-based user interface.
The proposed methods are also assessed in terms of computational and implementation complexity. Software and field-programmable gate array (FPGA) based implementations are considered for emotion recognition, while virtual platforms are used to model the complexities of lifelogging. The derived metrics are used to determine the feasibility of these methods for applications requiring real-time capabilities and low power consumption.
This study investigated whether a test taker’s non-verbal cues during online assessments can be related to probable cheating incidents. Specifically, it focused on the role of time delay, head pose, and affective state in detecting cheating incidents in a lab-based online testing session. The analysis indicated that time delay, the variation of a student’s head pose relative to the computer screen, and confusion had a statistically significant relation to cheating behaviors. Additionally, time delay, head pose relative to the computer screen, confusion, and the interaction term of confusion and time delay served as predictors in a support vector machine that predicted cheating with an average accuracy of 70.7%. The current algorithm could automatically flag suspicious student behavior for proctors in large-scale online courses during remotely administered exams.
Recent years have witnessed the rapid development of mobile and smart devices. As more and more people become involved in the online environment, privacy issues are becoming increasingly important. People’s privacy is much easier to leak in the digital world than in the real world, because every action people take online leaves a trail of information that can be recorded, collected, and used by malicious attackers. In addition, service providers might collect and analyze users’ information, which also leads to privacy breaches. Therefore, preserving people’s privacy is very important in the online environment.
In this dissertation, I study the problems of preserving people’s identity privacy and location privacy in the online environment. Specifically, I study four topics: identity privacy in online social networks (OSNs), identity privacy in anonymous message submission, location privacy in location-based social networks (LBSNs), and location privacy in location-based reminders. In the first topic, I propose a system that hides users’ identities and data from the untrusted storage site where the OSN provider puts users’ data. I also design a fine-grained access control mechanism that prevents unauthorized users from accessing the data. In the second topic, anonymous message submission, I construct a shuffle protocol based on a secret sharing scheme that disconnects members’ identities from their submitted messages. Each message is encrypted on the member side and decrypted on the message collector side. The collector eventually gets all of the messages but does not know who submitted which message. In the third topic, I propose a framework that hides users’ check-in information from the LBSN. Considering the limited computational resources of smart devices, I propose a delegatable pseudorandom function to outsource computations to the much more powerful server while preserving privacy. I also implement efficient revocation. In the fourth topic, location privacy in location-based reminders, I propose a system that hides users’ reminder locations from an untrusted cloud server. I propose a cross-based approach and an improved bar-based approach to represent a reminder area. The reminder location and reminder message are encrypted before being uploaded to the cloud server, which can then determine whether the distance between the user’s current location and the reminder location is within the reminder distance without learning anything about the user’s location or the content of the reminder message.
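The abstract does not say which secret sharing scheme underlies the shuffle protocol. As a hedged illustration of the building block only, the sketch below assumes simple additive secret sharing over a prime field: a value is split into random shares that individually reveal nothing, and only the sum of all shares recovers it. The modulus, the function names, and the example value are assumptions made for illustration; the shuffle protocol itself is not reproduced here.

```python
import secrets

# Assumed field modulus: the Mersenne prime 2**61 - 1.
PRIME = 2**61 - 1

def split(secret, n):
    """Split an integer secret (0 <= secret < PRIME) into n additive shares."""
    if n < 2:
        raise ValueError("need at least two shares")
    shares = [secrets.randbelow(PRIME) for _ in range(n - 1)]
    # The last share is chosen so that all shares sum to the secret mod PRIME.
    last = (secret - sum(shares)) % PRIME
    return shares + [last]

def reconstruct(shares):
    """Recover the secret by summing every share modulo PRIME."""
    return sum(shares) % PRIME

message = 123456789  # hypothetical message encoded as an integer
shares = split(message, 5)
print(reconstruct(shares) == message)
```

Any proper subset of the shares is uniformly random, so no group smaller than all share holders learns the message in this simple variant.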
With the advent of the Internet, the amount of data being added online is increasing at an enormous rate. Although search engines use information retrieval (IR) techniques to serve users’ search requests, the results often do not match the user’s query well, and the user has to go through several webpages before reaching the one he or she wanted. This problem of information overload can be addressed with automatic text summarization. Summarization is the process of obtaining an abridged version of a document so that the user can quickly understand what the document is about. This system uses email threads from W3C. Apart from common IR features such as Term Frequency and Inverse Document Frequency, the system implements Term Rank, a variation of PageRank based on a graph model that can cluster words with respect to word ambiguity. Term Rank also considers the co-occurrence of words in the corpus and ranks each word accordingly. Sentences of email threads are ranked according to these features, and summaries are generated. The system implements the concept of pyramid evaluation in content selection. It can be considered a framework for unsupervised learning in text summarization.
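The feature-based sentence ranking described above can be sketched with the two standard features it names, Term Frequency and Inverse Document Frequency. This is a minimal illustration under stated assumptions: each sentence is treated as its own document, the tokenizer is a simple regular expression, and Term Rank and pyramid evaluation are not reproduced. Function names and the example sentences are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def rank_sentences(sentences):
    """Score each sentence by summed TF-IDF and return (sentence, score), best first."""
    docs = [tokenize(s) for s in sentences]
    n = len(docs)
    # Document frequency: number of sentences containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = sum(
            (count / len(doc)) * math.log(n / df[term])
            for term, count in tf.items()
        )
        scores.append(score)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return [(sentences[i], scores[i]) for i in order]

def summarize(sentences, k=1):
    """Keep the k highest-scoring sentences, in their original order."""
    top = {s for s, _ in rank_sentences(sentences)[:k]}
    return [s for s in sentences if s in top]

thread = ["the the", "the cat", "the dog barks"]
print(summarize(thread, k=1))
```

A term that appears in every sentence contributes nothing to the score, so sentences made up only of common words sink to the bottom of the ranking.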
Most embedded applications are constructed with multiple threads to handle concurrent events. To optimize and debug such programs, dynamic program analysis is widely used to collect execution information while the program is running. Unfortunately, the non-deterministic behavior of multithreaded embedded software makes dynamic analysis difficult. In addition, the instrumentation overhead of gathering execution information may change the execution of a program and lead to distorted analysis results, i.e., the probe effect. This thesis presents a framework that tackles the non-determinism and probe effect incurred in dynamic analysis of embedded software. The thesis consists of three main parts. First, it discusses a deterministic replay framework that provides reproducible execution. Once a program execution is recorded, software instrumentation can be safely applied during replay without probe effect. Second, it discusses the probe effect and proposes a simulation-based analysis to detect execution changes caused by instrumentation overhead. The simulation-based analysis examines whether the recording instrumentation changes the original program execution. Lastly, the thesis discusses data race detection algorithms that help remove data races, which is required for the correctness of the replay and of the simulation-based analysis. The focus is on making the detection efficient for C/C++ programs and on increasing the scalability of the detection on multi-core machines.
Cisco estimates that by 2020, 50 billion devices will be connected to the Internet, but 99% of things today remain isolated and unconnected. Different connectivity protocols, proprietary access, varied device characteristics, and security concerns are the main reasons for this isolation. This project aims to design and build a prototype gateway that exposes a simple and intuitive RESTful HTTP interface for accessing and manipulating devices and the data they produce, while addressing most of the issues listed above. Along with manipulating devices, the framework exposes sensor data in such a way that it can be used to create applications, such as rules or events, that make the home smarter. It also allows the user to represent high-level knowledge by aggregating low-level sensor data. This high-level representation can be considered a property of the environment or object rather than of the sensor itself, which makes interpreting the values more intuitive and accessible.
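The idea of aggregating low-level sensor data into a high-level property of the environment can be illustrated with a small sketch. Everything here is hypothetical: the resource path `/rooms/<name>`, the sensor types, the readings, and the aggregation rules are assumptions for illustration, and the function only mimics how a GET request might be resolved rather than running an actual HTTP server.

```python
from statistics import mean

# Hypothetical raw readings, keyed by (room, sensor type).
READINGS = {
    ("kitchen", "temperature_c"): [21.0, 23.0],
    ("kitchen", "motion"): [0, 1],
    ("garage", "temperature_c"): [12.0],
    ("garage", "motion"): [0],
}

def room_state(room, readings=READINGS):
    """Aggregate per-sensor readings into properties of the room itself."""
    temps = readings.get((room, "temperature_c"), [])
    motion = readings.get((room, "motion"), [])
    return {
        "room": room,
        "temperature_c": mean(temps) if temps else None,
        "occupied": any(motion),
    }

def handle_get(path, readings=READINGS):
    """Resolve a GET on /rooms/<name> to a (status code, body) pair."""
    parts = path.strip("/").split("/")
    known_rooms = {room for room, _ in readings}
    if len(parts) == 2 and parts[0] == "rooms" and parts[1] in known_rooms:
        return 200, room_state(parts[1], readings)
    return 404, {"error": "not found"}

print(handle_get("/rooms/kitchen"))
```

A client asking for the kitchen receives the room's temperature and occupancy, not the identifiers of individual sensors, which is the sense in which the value becomes a property of the environment.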
There has been a vast increase in applications of Unmanned Aerial Vehicles (UAVs) in civilian domains. To operate in civilian airspace, a UAV must be able to sense and avoid both static and moving obstacles for flight safety. While indoor and low-altitude environments are mainly occupied by static obstacles, risks at higher altitudes come primarily from moving obstacles such as other aircraft or flying vehicles in the airspace. Therefore, the ability to avoid moving obstacles becomes a necessity for UAVs.
Towards enabling a UAV to autonomously sense and avoid moving obstacles, this thesis makes the following contributions. First, an image-based reactive motion planner is developed for a quadrotor to avoid a fast-approaching obstacle. Second, a geometric method based on Dubins curves is developed as a global path planner for a fixed-wing UAV to avoid collisions with aircraft. The image-based method is unable to produce an optimal path, and the geometric method uses a simplified UAV model. To compensate for these two disadvantages, a series of algorithms built upon the Closed-Loop Rapidly-exploring Random Tree (CL-RRT) are developed as global path planners to generate collision avoidance paths in real time. The algorithms are validated in Software-In-the-Loop (SITL) and Hardware-In-the-Loop (HIL) simulations using a fixed-wing UAV model and in real flight experiments using quadrotors. It is observed that the algorithms enable a UAV to avoid moving obstacles approaching it from different directions and at different speeds.