Matching Items (37)

Filtering by

Clear all filters

Enhancing Student Learning Through Adaptive Sentence Generation

Description

Education of any skill based subject, such as mathematics or language, involves a significant amount of repetition and pratice. According to the National Survey of Student Engagements, students spend on average 17 hours per week reviewing and practicing material previously

Education of any skill based subject, such as mathematics or language, involves a significant amount of repetition and pratice. According to the National Survey of Student Engagements, students spend on average 17 hours per week reviewing and practicing material previously learned in a classroom, with higher performing students showing a tendency to spend more time practicing. As such, learning software has emerged in the past several decades focusing on providing a wide range of examples, practice problems, and situations for users to exercise their skills. Notably, math students have benefited from software that procedurally generates a virtually infinite number of practice problems and their corresponding solutions. This allows for instantaneous feedback and automatic generation of tests and quizzes. Of course, this is only possible because software is capable of generating and verifying a virtually endless supply of sample problems across a wide range of topics within mathematics. While English learning software has progressed in a similar manner, it faces a series of hurdles distinctly different from those of mathematics. In particular, there is a wide range of exception cases present in English grammar. Some words have unique spellings for their plural forms, some words have identical spelling for plural forms, and some words are conjugated differently for only one particular tense or person-of-speech. These issues combined make the problem of generating grammatically correct sentences complicated. To compound to this problem, the grammar rules in English are vast, and often depend on the context in which they are used. Verb-tense agreement (e.g. "I eat" vs "he eats"), and conjugation of irregular verbs (e.g. swim -> swam) are common examples. This thesis presents an algorithm designed to randomly generate a virtually infinite number of practice problems for students of English as a second language. This approach differs from other generation approaches by generating based on a context set by educators, so that problems can be generated in the context of what students are currently learning. The algorithm is validated through a study in which over 35 000 sentences generated by the algorithm are verified by multiple grammar checking algorithms, and a subset of the sentences are validated against 3 education standards by a subject matter expert in the field. The study found that this approach has a significantly reduced grammar error ratio compared to other generation algorithms, and shows potential where context specification is concerned.

Contributors

Agent

Created

Date Created
2016-05

153901-Thumbnail Image.png

Online ET-LDA - joint modeling of events and their related tweets with online streaming data

Description

Micro-blogging platforms like Twitter have become some of the most popular sites for people to share and express their views and opinions about public events like debates, sports events or other news articles. These social updates by people complement the

Micro-blogging platforms like Twitter have become some of the most popular sites for people to share and express their views and opinions about public events like debates, sports events or other news articles. These social updates by people complement the written news articles or transcripts of events in giving the popular public opinion about these events. So it would be useful to annotate the transcript with tweets. The technical challenge is to align the tweets with the correct segment of the transcript. ET-LDA by Hu et al [9] addresses this issue by modeling the whole process with an LDA-based graphical model. The system segments the transcript into coherent and meaningful parts and also determines if a tweet is a general tweet about the event or it refers to a particular segment of the transcript. One characteristic of the Hu et al’s model is that it expects all the data to be available upfront and uses batch inference procedure. But in many cases we find that data is not available beforehand, and it is often streaming. In such cases it is infeasible to repeatedly run the batch inference algorithm. My thesis presents an online inference algorithm for the ET-LDA model, with a continuous stream of tweet data and compare their runtime and performance to existing algorithms.

Contributors

Agent

Created

Date Created
2015

153832-Thumbnail Image.png

Secure Mobile SDN

Description

The increasing usage of smart-phones and mobile devices in work environment and IT

industry has brought about unique set of challenges and opportunities. ARM architecture

in particular has evolved to a point where it supports implementations across wide spectrum

of performance points and

The increasing usage of smart-phones and mobile devices in work environment and IT

industry has brought about unique set of challenges and opportunities. ARM architecture

in particular has evolved to a point where it supports implementations across wide spectrum

of performance points and ARM based tablets and smart-phones are in demand. The

enhancements to basic ARM RISC architecture allow ARM to have high performance,

small code size, low power consumption and small silicon area. Users want their devices to

perform many tasks such as read email, play games, and run other online applications and

organizations no longer desire to provision and maintain individual’s IT equipment. The

term BYOD (Bring Your Own Device) has come into being from demand of such a work

setup and is one of the motivation of this research work. It brings many opportunities such

as increased productivity and reduced costs and challenges such as secured data access,

data leakage and amount of control by the organization.

To provision such a framework we need to bridge the gap from both organizations side

and individuals point of view. Mobile device users face issue of application delivery on

multiple platforms. For instance having purchased many applications from one proprietary

application store, individuals may want to move them to a different platform/device but

currently this is not possible. Organizations face security issues in providing such a solution

as there are many potential threats from allowing BYOD work-style such as unauthorized

access to data, attacks from the devices within and outside the network.

ARM based Secure Mobile SDN framework will resolve these issues and enable employees

to consolidate both personal and business calls and mobile data access on a single device.

To address application delivery issue we are introducing KVM based virtualization that

will allow host OS to run multiple guest OS. To address the security problem we introduce

SDN environment where host would be running bridged network of guest OS using Open

vSwitch . This would allow a remote controller to monitor the state of guest OS for making

important control and traffic flow decisions based on the situation.

Contributors

Agent

Created

Date Created
2015

153574-Thumbnail Image.png

Breaking hash-tag detection algorithm for social media (Twitter)

Description

In trading, volume is a measure of how much stock has been exchanged in a given period of time. Since every stock is distinctive and has an alternate measure of shares, volume can be contrasted with historical volume inside a

In trading, volume is a measure of how much stock has been exchanged in a given period of time. Since every stock is distinctive and has an alternate measure of shares, volume can be contrasted with historical volume inside a stock to spot changes. It is likewise used to affirm value patterns, breakouts, and spot potential reversals. In my thesis, I hypothesize that the concept of trading volume can be extrapolated to social media (Twitter).

The ubiquity of social media, especially Twitter, in financial market has been overly resonant in the past couple of years. With the growth of its (Twitter) usage by news channels, financial experts and pandits, the global economy does seem to hinge on 140 characters. By analyzing the number of tweets hash tagged to a stock, a strong relation can be established between the number of people talking about it, to the trading volume of the stock.

In my work, I overt this relation and find a state of the breakout when the volume goes beyond a characterized support or resistance level.

Contributors

Agent

Created

Date Created
2015

154769-Thumbnail Image.png

Directional prediction of stock prices using breaking news on Twitter

Description

Stock market news and investing tips are popular topics in Twitter. In this dissertation, first I utilize a 5-year financial news corpus comprising over 50,000 articles collected from the NASDAQ website matching the 30 stock symbols in Dow Jones Index

Stock market news and investing tips are popular topics in Twitter. In this dissertation, first I utilize a 5-year financial news corpus comprising over 50,000 articles collected from the NASDAQ website matching the 30 stock symbols in Dow Jones Index (DJI) to train a directional stock price prediction system based on news content. Next, I proceed to show that information in articles indicated by breaking Tweet volumes leads to a statistically significant boost in the hourly directional prediction accuracies for the DJI stock prices mentioned in these articles. Secondly, I show that using document-level sentiment extraction does not yield a statistically significant boost in the directional predictive accuracies in the presence of other 1-gram keyword features. Thirdly I test the performance of the system on several time-frames and identify the 4 hour time-frame for both the price charts and for Tweet breakout detection as the best time-frame combination. Finally, I develop a set of price momentum based trade exit rules to cut losing trades early and to allow the winning trades run longer. I show that the Tweet volume breakout based trading system with the price momentum based exit rules not only improves the winning accuracy and the return on investment, but it also lowers the maximum drawdown and achieves the highest overall return over maximum drawdown.

Contributors

Agent

Created

Date Created
2016

154545-Thumbnail Image.png

Automatic tracking of linguistic changes for monitoring cognitive-linguistic health

Description

Many neurological disorders, especially those that result in dementia, impact speech and language production. A number of studies have shown that there exist subtle changes in linguistic complexity in these individuals that precede disease onset. However, these studies are conducted

Many neurological disorders, especially those that result in dementia, impact speech and language production. A number of studies have shown that there exist subtle changes in linguistic complexity in these individuals that precede disease onset. However, these studies are conducted on controlled speech samples from a specific task. This thesis explores the possibility of using natural language processing in order to detect declining linguistic complexity from more natural discourse. We use existing data from public figures suspected (or at risk) of suffering from cognitive-linguistic decline, downloaded from the Internet, to detect changes in linguistic complexity. In particular, we focus on two case studies. The first case study analyzes President Ronald Reagan’s transcribed spontaneous speech samples during his presidency. President Reagan was diagnosed with Alzheimer’s disease in 1994, however my results showed declining linguistic complexity during the span of the 8 years he was in office. President George Herbert Walker Bush, who has no known diagnosis of Alzheimer’s disease, shows no decline in the same measures. In the second case study, we analyze transcribed spontaneous speech samples from the news conferences of 10 current NFL players and 18 non-player personnel since 2007. The non-player personnel have never played professional football. Longitudinal analysis of linguistic complexity showed contrasting patterns in the two groups. The majority (6 of 10) of current players showed decline in at least one measure of linguistic complexity over time. In contrast, the majority (11 out of 18) of non-player personnel showed an increase in at least one linguistic complexity measure.

Contributors

Agent

Created

Date Created
2016

156468-Thumbnail Image.png

Study of Knowledge Transfer Techniques For Deep Learning on Edge Devices

Description

With the emergence of edge computing paradigm, many applications such as image recognition and augmented reality require to perform machine learning (ML) and artificial intelligence (AI) tasks on edge devices. Most AI and ML models are large and computational heavy,

With the emergence of edge computing paradigm, many applications such as image recognition and augmented reality require to perform machine learning (ML) and artificial intelligence (AI) tasks on edge devices. Most AI and ML models are large and computational heavy, whereas edge devices are usually equipped with limited computational and storage resources. Such models can be compressed and reduced in order to be placed on edge devices, but they may loose their capability and may not generalize and perform well compared to large models. Recent works used knowledge transfer techniques to transfer information from a large network (termed teacher) to a small one (termed student) in order to improve the performance of the latter. This approach seems to be promising for learning on edge devices, but a thorough investigation on its effectiveness is lacking.

The purpose of this work is to provide an extensive study on the performance (both in terms of accuracy and convergence speed) of knowledge transfer, considering different student-teacher architectures, datasets and different techniques for transferring knowledge from teacher to student.

A good performance improvement is obtained by transferring knowledge from both the intermediate layers and last layer of the teacher to a shallower student. But other architectures and transfer techniques do not fare so well and some of them even lead to negative performance impact. For example, a smaller and shorter network, trained with knowledge transfer on Caltech 101 achieved a significant improvement of 7.36\% in the accuracy and converges 16 times faster compared to the same network trained without knowledge transfer. On the other hand, smaller network which is thinner than the teacher network performed worse with an accuracy drop of 9.48\% on Caltech 101, even with utilization of knowledge transfer.

Contributors

Agent

Created

Date Created
2018

155252-Thumbnail Image.png

Mining signed social networks using unsupervised learning algorithms

Description

Due to vast resources brought by social media services, social data mining has

received increasing attention in recent years. The availability of sheer amounts of

user-generated data presents data scientists both opportunities and challenges. Opportunities are presented with additional data sources. The

Due to vast resources brought by social media services, social data mining has

received increasing attention in recent years. The availability of sheer amounts of

user-generated data presents data scientists both opportunities and challenges. Opportunities are presented with additional data sources. The abundant link information

in social networks could provide another rich source in deriving implicit information

for social data mining. However, the vast majority of existing studies overwhelmingly

focus on positive links between users while negative links are also prevailing in real-

world social networks such as distrust relations in Epinions and foe links in Slashdot.

Though recent studies show that negative links have some added value over positive

links, it is dicult to directly employ them because of its distinct characteristics from

positive interactions. Another challenge is that label information is rather limited

in social media as the labeling process requires human attention and may be very

expensive. Hence, alternative criteria are needed to guide the learning process for

many tasks such as feature selection and sentiment analysis.

To address above-mentioned issues, I study two novel problems for signed social

networks mining, (1) unsupervised feature selection in signed social networks; and

(2) unsupervised sentiment analysis with signed social networks. To tackle the first problem, I propose a novel unsupervised feature selection framework SignedFS. In

particular, I model positive and negative links simultaneously for user preference

learning, and then embed the user preference learning into feature selection. To study the second problem, I incorporate explicit sentiment signals in textual terms and

implicit sentiment signals from signed social networks into a coherent model Signed-

Senti. Empirical experiments on real-world datasets corroborate the effectiveness of

these two frameworks on the tasks of feature selection and sentiment analysis.

Contributors

Agent

Created

Date Created
2017

155262-Thumbnail Image.png

Mason: Real-time NBA Matches Outcome Prediction

Description

The National Basketball Association (NBA) is the most popular basketball league in the world. The world-wide mighty high popularity to the league leads to large amount of interesting and challenging research problems. Among them, predicting the outcome of an upcoming

The National Basketball Association (NBA) is the most popular basketball league in the world. The world-wide mighty high popularity to the league leads to large amount of interesting and challenging research problems. Among them, predicting the outcome of an upcoming NBA match between two specific teams according to their historical data is especially attractive. With rapid development of machine learning techniques, it opens the door to examine the correlation between statistical data and outcome of matches. However, existing methods typically make predictions before game starts. In-game prediction, or real-time prediction, has not yet been sufficiently studied. During a match, data are cumulatively generated, and with the accumulation, data become more comprehensive and potentially embrace more predictive power, so that prediction accuracy may dynamically increase with a match goes on. In this study, I design game-level and player-level features based on realtime data of NBA matches and apply a machine learning model to investigate the possibility and characteristics of using real-time prediction in NBA matches.

Contributors

Agent

Created

Date Created
2017

157892-Thumbnail Image.png

Detecting Adversarial Examples by Measuring their Stress Response

Description

Machine learning (ML) and deep neural networks (DNNs) have achieved great success in a variety of application domains, however, despite significant effort to make these networks robust, they remain vulnerable to adversarial attacks in which input that is perceptually indistinguishable

Machine learning (ML) and deep neural networks (DNNs) have achieved great success in a variety of application domains, however, despite significant effort to make these networks robust, they remain vulnerable to adversarial attacks in which input that is perceptually indistinguishable from natural data can be erroneously classified with high prediction confidence. Works on defending against adversarial examples can be broadly classified as correcting or detecting, which aim, respectively at negating the effects of the attack and correctly classifying the input, or detecting and rejecting the input as adversarial. In this work, a new approach for detecting adversarial examples is proposed. The approach takes advantage of the robustness of natural images to noise. As noise is added to a natural image, the prediction probability of its true class drops, but the drop is not sudden or precipitous. The same seems to not hold for adversarial examples. In other word, the stress response profile for natural images seems different from that of adversarial examples, which could be detected by their stress response profile. An evaluation of this approach for detecting adversarial examples is performed on the MNIST, CIFAR-10 and ImageNet datasets. Experimental data shows that this approach is effective at detecting some adversarial examples on small scaled simple content images and with little sacrifice on benign accuracy.

Contributors

Agent

Created

Date Created
2019