Matching Items (40)
156622-Thumbnail Image.png
Description
Reasoning about the activities of cyber threat actors is critical to defending against cyber attacks. However, this task is difficult for a variety of reasons. In simple terms, it is difficult to determine who the attacker is, what the attacker’s goals are, and how they will carry out their attacks. These three questions essentially entail understanding the attacker’s use of deception, the capabilities available, and the intent of launching the attack. These three issues are highly inter-related. If an adversary can hide their intent, they can better deceive a defender. If an adversary’s capabilities are not well understood, then determining what their goals are becomes difficult, as the defender is uncertain whether they have the necessary tools to accomplish them. However, the understanding of these aspects is also mutually supportive. If we have a clear picture of capabilities, intent can be better deciphered. If we understand intent and capabilities, a defender may be able to see through deception schemes. In this dissertation, I present three pieces of work that tackle these questions to obtain a better understanding of cyber threats. First, we introduce a new reasoning framework to address deception. We evaluate the framework by building a dataset from the DEFCON capture-the-flag exercise to identify the person or group responsible for a cyber attack. We demonstrate that the framework not only handles cases of deception but also provides transparent decision making in identifying the threat actor. The second task uses a cognitive learning model to determine the intent, that is, the goals of the threat actor on the target system. The third task looks at understanding the capabilities of threat actors to target systems by identifying at-risk systems from hacker discussions on darkweb websites. To achieve this task, we gather discussions from more than 300 darkweb websites relating to malicious hacking.
Contributors: Nunes, Eric (Author) / Shakarian, Paulo (Thesis advisor) / Ahn, Gail-Joon (Committee member) / Baral, Chitta (Committee member) / Cooke, Nancy J. (Committee member) / Arizona State University (Publisher)
Created: 2018
136687-Thumbnail Image.png
Description
Mathematics is an increasingly critical subject, and the achievement of students in mathematics has been the focus of many recent reports and studies. However, few studies exist that both observe and discuss the specific teaching and assessment techniques employed in classrooms across multiple countries. The focus of this study is to look at classrooms and educators across six high-achieving countries to identify and compare the teaching strategies being used. In Finland, Hong Kong, Japan, New Zealand, Singapore, and Switzerland, twenty educators were interviewed and fourteen educators were observed teaching. Themes were first identified by comparing individual teacher responses within each country. These themes were then grouped together across countries, and eight emerging patterns were identified. These strategies include students' active involvement in the classroom, students being given written feedback on assessments, students' involvement in thoughtful discussion about mathematical concepts, students solving and explaining mathematics problems at the board, students exploring mathematical concepts either before or after being taught the material, students' engagement in practical applications, students making connections between concepts, and students having confidence in their ability to understand mathematics. The strategies identified across these six high-achieving countries can inform educators in their efforts to increase student understanding of mathematical concepts and lead to an improvement in mathematics performance.
Contributors: Anglin, Julia Mae (Author) / Middleton, James (Thesis director) / Vicich, James (Committee member) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor)
Created: 2014-12
136516-Thumbnail Image.png
Description
Bots tamper with social media networks by artificially inflating the popularity of certain topics. In this paper, we define what a bot is, detail different motivations for bots, describe previous work in bot detection and observation, and then perform bot detection of our own. For our bot detection, we are interested in bots on Twitter that tweet Arabic extremist-like phrases. A testing dataset is collected using the honeypot method, and five different heuristics are measured for their effectiveness in detecting bots. The model underperformed, but we have laid the groundwork for a largely untapped focus within bot detection: extremist ideal diffusion through bots.
Contributors: Karlsrud, Mark C. (Author) / Liu, Huan (Thesis director) / Morstatter, Fred (Committee member) / Barrett, The Honors College (Contributor) / Computing and Informatics Program (Contributor) / Computer Science and Engineering Program (Contributor) / School of Mathematical and Statistical Sciences (Contributor)
Created: 2015-05
136271-Thumbnail Image.png
Description
The OMFIT (One Modeling Framework for Integrated Tasks) modeling environment and the BRAINFUSE module have been deployed on the PPPL (Princeton Plasma Physics Laboratory) computing cluster with modifications that make it possible to apply artificial neural networks (NNs) to the TRANSP databases for the JET (Joint European Torus), TFTR (Tokamak Fusion Test Reactor), and NSTX (National Spherical Torus Experiment) devices. This development has facilitated the investigation of NNs for predicting heat transport profiles in JET, TFTR, and NSTX, and has promoted additional investigations to discover how else NNs may be of use to scientists at PPPL. In applying NNs to the aforementioned devices for predicting heat transport, the primary goal of this endeavor is to reproduce the success shown in Meneghini et al. in using NNs for heat transport prediction in DIII-D. Reproducing those results is important because it would provide scientists at PPPL with a quick and efficient toolset for reliably predicting heat transport profiles much faster than any existing computational methods allow. The progress towards this goal is outlined in this report, and potential additional applications of the NN framework are presented.
Contributors: Luna, Christopher Joseph (Author) / Tang, Wenbo (Thesis director) / Treacy, Michael (Committee member) / Orso, Meneghini (Committee member) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Department of Physics (Contributor)
Created: 2015-05
136409-Thumbnail Image.png
Description
Twitter, the microblogging platform, has grown in prominence to the point that the topics that trend on the network are often the subject of the news and other traditional media. By predicting trends on Twitter, it could be possible to predict the next major topic of interest to the public. With this motivation, this paper develops a model for trends, leveraging previous work with k-nearest-neighbors and dynamic time warping. The development of this model provides insight into the length and features of trends, and the model successfully generalizes to identify 74.3% of trends in the time period of interest. The model developed in this work provides insight into why particular words trend on Twitter.
Contributors: Marshall, Grant A (Author) / Liu, Huan (Thesis director) / Morstatter, Fred (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor)
Created: 2015-05
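The abstract above names k-nearest-neighbors combined with dynamic time warping (DTW) but gives no details of the model. The following is only a minimal sketch of that general pairing, not the thesis's implementation: the function names, the absolute-difference step cost, the choice of k, and the toy series are all assumptions made for illustration.

```python
from collections import Counter


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences,
    using absolute difference as the per-step cost (an assumed choice)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def knn_predict(train, query, k=3):
    """Label a query series by majority vote among its k nearest
    training series under DTW. `train` is a list of (series, label)."""
    nearest = sorted(train, key=lambda item: dtw_distance(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: word-frequency series with made-up labels.
train = [
    ([1, 2, 8, 9, 3], "trend"),
    ([0, 1, 7, 9, 2], "trend"),
    ([2, 2, 3, 2, 2], "no_trend"),
    ([1, 1, 2, 1, 1], "no_trend"),
]
print(knn_predict(train, [0, 1, 8, 9, 2], k=3))
```

DTW lets two series of different lengths or with shifted peaks be compared, which is why it is a common distance for nearest-neighbor classification of time series.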
136442-Thumbnail Image.png
Description
A model has been developed to modify Euler-Bernoulli beam theory for wooden beams, using visible properties of wood knot-defects. Treating knots in a beam as a system of two ellipses that change the local bending stiffness has been shown to improve the fit of a theoretical beam displacement function to edge-line deflection data extracted from digital imagery of experimentally loaded beams. In addition, an Ellipse Logistic Model (ELM) has been proposed, using L1-regularized logistic regression, to predict the impact of a knot on the displacement of a beam. By classifying a knot as severely positive or negative vs. mildly positive or negative, ELM can identify knots that lead to large changes in beam deflection, while not over-emphasizing knots that may not be a problem. Using ELM with a regression-fit Young's modulus on three-point bending of Douglas fir, it is possible to estimate the effects a knot will have on the shape of the resulting displacement curve.
Created: 2015-05
133339-Thumbnail Image.png
Description
Medical records are increasingly being recorded in the form of electronic health records (EHRs), with a significant amount of patient data recorded as unstructured natural language text. Consequently, being able to extract and utilize clinical data present within these records is an important step in furthering clinical care. One important aspect within these records is the presence of prescription information. Existing techniques for extracting prescription information — which includes medication names, dosages, frequencies, reasons for taking, and mode of administration — from unstructured text have focused on the application of rule- and classifier-based methods. While state-of-the-art systems can be effective in extracting many types of information, they require significant effort to develop hand-crafted rules and conduct effective feature engineering. This paper presents the use of a bidirectional LSTM with CRF tagging model initialized with precomputed word embeddings for extracting prescription information from sentences without requiring significant feature engineering. The experimental results, run on the i2b2 2009 dataset, achieve an F1 macro measure of 0.8562, and scores above 0.9449 on four of the six categories, indicating significant potential for this model.
Contributors: Rawal, Samarth Chetan (Author) / Baral, Chitta (Thesis director) / Anwar, Saadat (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor)
Created: 2018-05
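The abstract above reports an F1 macro measure across six categories. As a rough illustration of what a macro-averaged F1 is, the sketch below computes per-label F1 and averages the labels with equal weight. It scores individual tags, which is a simplifying assumption and may differ from the evaluation protocol actually used with the i2b2 2009 dataset; the label names and sample tags are hypothetical.

```python
def macro_f1(gold, predicted):
    """Macro-averaged F1: compute F1 separately for each label,
    then take the unweighted mean over labels."""
    labels = sorted(set(gold) | set(predicted))
    pairs = list(zip(gold, predicted))
    scores = []
    for label in labels:
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0  # label never correctly predicted
        scores.append(f1)
    return sum(scores) / len(scores)


# Hypothetical token-level tags, for illustration only.
gold = ["DRUG", "DOSE", "O", "DRUG"]
pred = ["DRUG", "O", "O", "DRUG"]
print(round(macro_f1(gold, pred), 4))
```

Because every label counts equally, a rare category that is tagged poorly pulls the macro score down more than it would a micro-averaged score.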
133482-Thumbnail Image.png
Description
Cryptocurrencies have become one of the most fascinating forms of currency and economics due to their fluctuating values and lack of centralization. This project attempts to use machine learning methods to effectively model in-sample data for Bitcoin and Ethereum using rule induction methods. The dataset is cleaned by removing entries with missing data. A new column is created to measure price difference, allowing a more accurate analysis of the change in price. Eight relevant variables are selected using cross validation: the total number of bitcoins, the total size of the blockchains, the hash rate, mining difficulty, revenue from mining, transaction fees, the cost of transactions, and the estimated transaction volume. The in-sample data is modeled using a simple tree fit, first with one variable and then with eight. Using all eight variables, the in-sample model and data have a correlation of 0.6822657. The in-sample model is improved by first applying bootstrap aggregation (also known as bagging) to fit 400 decision trees to the in-sample data using one variable. Then the random forests technique is applied to the data using all eight variables. This results in a correlation between the model and data of 9.9443413. The random forests technique is then applied to an Ethereum dataset, resulting in a correlation of 9.6904798. Finally, an out-of-sample model is created for Bitcoin and Ethereum using random forests, with a benchmark correlation of 0.03 for financial data. The correlation between the training model and the testing data for Bitcoin was 0.06957639, while for Ethereum the correlation was -0.171125. In conclusion, it is confirmed that cryptocurrencies can have accurate in-sample models by applying the random forests method to a dataset. Out-of-sample modeling is more difficult, although in some cases it performs better than is typical for financial data. It should also be noted that cryptocurrency data has similar properties to other related financial datasets, suggesting future potential for system modeling for cryptocurrency within the financial world.
Contributors: Browning, Jacob Christian (Author) / Meuth, Ryan (Thesis director) / Jones, Donald (Committee member) / McCulloch, Robert (Committee member) / Computer Science and Engineering Program (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Barrett, The Honors College (Contributor)
Created: 2018-05
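The abstract above describes bootstrap aggregation (bagging) of decision trees on one variable, with the fit judged by the correlation between model and data. A minimal sketch of that idea follows, under loud assumptions: each "tree" here is reduced to a one-split regression stump, the data are synthetic, and every function name is hypothetical rather than taken from the thesis.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression tree (a stump, an assumed simplification)
    on a single variable by minimizing squared error."""
    best = None
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if not left or not right:
            continue
        lm, rm = mean(left), mean(right)
        error = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or error < best[0]:
            best = (error, threshold, lm, rm)
    if best is None:  # all x values identical: always predict the mean
        m = mean(ys)
        return (float("inf"), m, m)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def bagged_fit(xs, ys, n_trees=400, seed=0):
    """Bagging: fit each stump on a bootstrap resample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps


def bagged_predict(stumps, x):
    """Average the predictions of all bootstrapped stumps."""
    return mean(predict_stump(s, x) for s in stumps)


def pearson(a, b):
    """Pearson correlation coefficient; its value always lies in [-1, 1]."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


# Synthetic one-variable data, for illustration only.
xs = list(range(20))
ys = [0.1 * x if x < 10 else 2.0 + 0.1 * x for x in xs]
stumps = bagged_fit(xs, ys, n_trees=50)
fitted = [bagged_predict(stumps, x) for x in xs]
print(pearson(fitted, ys))
```

Random forests extend this scheme by also sampling a random subset of variables at each split, which only matters once more than one variable is available.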
134011-Thumbnail Image.png
Description
The use of machine learning for analytics has increased exponentially in the past few years due to its ability to identify hidden insights in data. It also has a plethora of applications in healthcare, ranging from improving image recognition in CT scans to extracting semantic meaning from thousands of medical form PDFs. Currently in the BioElectrical Systems and Technology Lab, there is a biosensor in development that retrieves and analyzes data manually. In a proof of concept, this project uses a neural network architecture to automatically parse and classify a cardiac disease data set, as well as to explore health-related factors impacting cardiac disease in patients of all ages.
Created: 2018-05
135056-Thumbnail Image.png
Description
In this paper, I will show that news headlines of global events can predict changes in stock price by using machine learning and eight years of data from r/WorldNews, a popular forum on Reddit.com. My data is confined to the top 25 daily posts on the forum, and due to the implicit filtering mechanism in the online community, these 25 posts are representative of the most popular news headlines and influential global events of the day. Hence, these posts shine a light on how large-scale social and political events affect the stock market. Using a Logistic Regression and a Naive Bayes classifier, I am able to predict a binary change in stock price with approximately 85% accuracy, using term-feature vectors gathered from the news headlines. The accuracy, precision, and recall results closely rival the best models in this field of research. In addition to the results, I will also describe the mathematical underpinnings of the two models, preceded by a general investigation of the intersection between the multiple academic disciplines related to this project. These range from social to computer science and from statistics to philosophy. The goal of this additional discussion is to further illustrate the interdisciplinary nature of the research and hopefully inspire a non-monolithic mindset when further investigations are pursued.
Created: 2016-12
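The abstract above mentions a Naive Bayes classifier over term-feature vectors built from headlines. The sketch below shows one common form of that technique, a multinomial Naive Bayes with add-one smoothing over whitespace tokens. The tokenization, smoothing, labels, and headlines are all assumptions chosen for illustration and are not drawn from the paper.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """docs is a list of (headline, label) pairs.
    Returns (log priors, per-label term counts, vocabulary)."""
    label_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    total = sum(label_counts.values())
    priors = {label: math.log(c / total) for label, c in label_counts.items()}
    return priors, word_counts, vocab


def predict_nb(model, text):
    """Pick the label with the highest log posterior, using add-one
    (Laplace) smoothing; tokens outside the vocabulary are skipped."""
    priors, word_counts, vocab = model
    best_label, best_score = None, -math.inf
    for label, prior in priors.items():
        denom = sum(word_counts[label].values()) + len(vocab)
        score = prior
        for token in text.lower().split():
            if token in vocab:
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up headlines and binary price-change labels.
docs = [
    ("markets rally on trade deal", "up"),
    ("peace talks boost markets", "up"),
    ("war fears sink markets", "down"),
    ("sanctions and war escalate", "down"),
]
model = train_nb(docs)
print(predict_nb(model, "trade deal talks"))
```

Working in log space avoids numeric underflow when many term probabilities are multiplied together.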