Matching Items (32)
Filtering by

Clear all filters

136271-Thumbnail Image.png
Description
The OMFIT (One Modeling Framework for Integrated Tasks) modeling environment and the BRAINFUSE module have been deployed on the PPPL (Princeton Plasma Physics Laboratory) computing cluster with modifications that have rendered the application of artificial neural networks (NNs) to the TRANSP databases for the JET (Joint European Torus), TFTR (Tokamak

The OMFIT (One Modeling Framework for Integrated Tasks) modeling environment and the BRAINFUSE module have been deployed on the PPPL (Princeton Plasma Physics Laboratory) computing cluster with modifications that have rendered the application of artificial neural networks (NNs) to the TRANSP databases for the JET (Joint European Torus), TFTR (Tokamak Fusion Test Reactor), and NSTX (National Spherical Torus Experiment) devices possible through their use. This development has facilitated the investigation of NNs for predicting heat transport profiles in JET, TFTR, and NSTX, and has promoted additional investigations to discover how else NNs may be of use to scientists at PPPL. In applying NNs to the aforementioned devices for predicting heat transport, the primary goal of this endeavor is to reproduce the success shown in Meneghini et al. in using NNs for heat transport prediction in DIII-D. Being able to reproduce the results from is important because this in turn would provide scientists at PPPL with a quick and efficient toolset for reliably predicting heat transport profiles much faster than any existing computational methods allow; the progress towards this goal is outlined in this report, and potential additional applications of the NN framework are presented.
ContributorsLuna, Christopher Joseph (Author) / Tang, Wenbo (Thesis director) / Treacy, Michael (Committee member) / Orso, Meneghini (Committee member) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Department of Physics (Contributor)
Created2015-05
136409-Thumbnail Image.png
Description
Twitter, the microblogging platform, has grown in prominence to the point that the topics that trend on the network are often the subject of the news and other traditional media. By predicting trends on Twitter, it could be possible to predict the next major topic of interest to the public.

Twitter, the microblogging platform, has grown in prominence to the point that the topics that trend on the network are often the subject of the news and other traditional media. By predicting trends on Twitter, it could be possible to predict the next major topic of interest to the public. With this motivation, this paper develops a model for trends leveraging previous work with k-nearest-neighbors and dynamic time warping. The development of this model provides insight into the length and features of trends, and successfully generalizes to identify 74.3% of trends in the time period of interest. The model developed in this work provides understanding into why par- ticular words trend on Twitter.
ContributorsMarshall, Grant A (Author) / Liu, Huan (Thesis director) / Morstatter, Fred (Committee member) / Computer Science and Engineering Program (Contributor) / Barrett, The Honors College (Contributor) / School of Mathematical and Statistical Sciences (Contributor)
Created2015-05
136516-Thumbnail Image.png
Description
Bots tamper with social media networks by artificially inflating the popularity of certain topics. In this paper, we define what a bot is, we detail different motivations for bots, we describe previous work in bot detection and observation, and then we perform bot detection of our own. For our bot

Bots tamper with social media networks by artificially inflating the popularity of certain topics. In this paper, we define what a bot is, we detail different motivations for bots, we describe previous work in bot detection and observation, and then we perform bot detection of our own. For our bot detection, we are interested in bots on Twitter that tweet Arabic extremist-like phrases. A testing dataset is collected using the honeypot method, and five different heuristics are measured for their effectiveness in detecting bots. The model underperformed, but we have laid the ground-work for a vastly untapped focus on bot detection: extremist ideal diffusion through bots.
ContributorsKarlsrud, Mark C. (Author) / Liu, Huan (Thesis director) / Morstatter, Fred (Committee member) / Barrett, The Honors College (Contributor) / Computing and Informatics Program (Contributor) / Computer Science and Engineering Program (Contributor) / School of Mathematical and Statistical Sciences (Contributor)
Created2015-05
136442-Thumbnail Image.png
Description
A model has been developed to modify Euler-Bernoulli beam theory for wooden beams, using visible properties of wood knot-defects. Treating knots in a beam as a system of two ellipses that change the local bending stiffness has been shown to improve the fit of a theoretical beam displacement function to

A model has been developed to modify Euler-Bernoulli beam theory for wooden beams, using visible properties of wood knot-defects. Treating knots in a beam as a system of two ellipses that change the local bending stiffness has been shown to improve the fit of a theoretical beam displacement function to edge-line deflection data extracted from digital imagery of experimentally loaded beams. In addition, an Ellipse Logistic Model (ELM) has been proposed, using L1-regularized logistic regression, to predict the impact of a knot on the displacement of a beam. By classifying a knot as severely positive or negative, vs. mildly positive or negative, ELM can classify knots that lead to large changes to beam deflection, while not over-emphasizing knots that may not be a problem. Using ELM with a regression-fit Young's Modulus on three-point bending of Douglass Fir, it is possible estimate the effects a knot will have on the shape of the resulting displacement curve.
Created2015-05
131482-Thumbnail Image.png
Description
In shotgun proteomics, liquid chromatography coupled to tandem mass spectrometry
(LC-MS/MS) is used to identify and quantify peptides and proteins. LC-MS/MS produces mass spectra, which must be searched by one or more engines, which employ
algorithms to match spectra to theoretical spectra derived from a reference database.
These engines identify and characterize proteins

In shotgun proteomics, liquid chromatography coupled to tandem mass spectrometry
(LC-MS/MS) is used to identify and quantify peptides and proteins. LC-MS/MS produces mass spectra, which must be searched by one or more engines, which employ
algorithms to match spectra to theoretical spectra derived from a reference database.
These engines identify and characterize proteins and their component peptides. By
training a convolutional neural network on a dataset of over 6 million MS/MS spectra
derived from human proteins, we aim to create a tool that can quickly and effectively
identify spectra as peptides prior to database searching. This can significantly reduce search space and thus run time for database searches, thereby accelerating LCMS/MS-based proteomics data acquisition. Additionally, by training neural networks
on labels derived from the search results of three different database search engines, we
aim to examine and compare which features are best identified by individual search
engines, a neural network, or a combination of these.
ContributorsWhyte, Cameron Stafford (Author) / Suren, Jayasuriya (Thesis director) / Gil, Speyer (Committee member) / Patrick, Pirrotte (Committee member) / School of Mathematical and Statistical Sciences (Contributor, Contributor) / Barrett, The Honors College (Contributor)
Created2020-05
132368-Thumbnail Image.png
Description
A defense-by-randomization framework is proposed as an effective defense mechanism against different types of adversarial attacks on neural networks. Experiments were conducted by selecting a combination of differently constructed image classification neural networks to observe which combinations applied to this framework were most effective in maximizing classification accuracy. Furthermore, the

A defense-by-randomization framework is proposed as an effective defense mechanism against different types of adversarial attacks on neural networks. Experiments were conducted by selecting a combination of differently constructed image classification neural networks to observe which combinations applied to this framework were most effective in maximizing classification accuracy. Furthermore, the reasons why particular combinations were more effective than others is explored.
ContributorsMazboudi, Yassine Ahmad (Author) / Yang, Yezhou (Thesis director) / Ren, Yi (Committee member) / School of Mathematical and Statistical Sciences (Contributor) / Economics Program in CLAS (Contributor) / Barrett, The Honors College (Contributor)
Created2019-05
132267-Thumbnail Image.png
Description
AARP estimates that 90% of seniors wish to remain in their homes during retirement. Seniors need assistance as they age, historically they have received assistance from either family members, nursing homes, or Continuing Care Retirement Communities. For seniors not wanting any of these options, there has been very few alternatives.

AARP estimates that 90% of seniors wish to remain in their homes during retirement. Seniors need assistance as they age, historically they have received assistance from either family members, nursing homes, or Continuing Care Retirement Communities. For seniors not wanting any of these options, there has been very few alternatives. Now, the emergence of the continuing care at home program is providing hope for a different method of elder care moving forward. CCaH programs offer services such as: skilled nursing care, care coordination, emergency response systems, aid with personal and health care, and transportation. Such services allow seniors to continue to live in their own home with assistance as their health deteriorates over time. Currently, only 30 CCaH programs exist. With the growth of the elderly population in the coming years, this model seems poised for growth.
ContributorsSturm, Brendan (Author) / Milovanovic, Jelena (Thesis director) / Hassett, Matthew (Committee member) / School of Mathematical and Statistical Sciences (Contributor) / Economics Program in CLAS (Contributor) / Barrett, The Honors College (Contributor)
Created2019-05
133413-Thumbnail Image.png
Description
Catastrophe events occur rather infrequently, but upon their occurrence, can lead to colossal losses for insurance companies. Due to their size and volatility, catastrophe losses are often treated separately from other insurance losses. In fact, many property and casualty insurance companies feature a department or team which focuses solely on

Catastrophe events occur rather infrequently, but upon their occurrence, can lead to colossal losses for insurance companies. Due to their size and volatility, catastrophe losses are often treated separately from other insurance losses. In fact, many property and casualty insurance companies feature a department or team which focuses solely on modeling catastrophes. Setting reserves for catastrophe losses is difficult due to their unpredictable and often long-tailed nature. Determining loss development factors (LDFs) to estimate the ultimate loss amounts for catastrophe events is one method for setting reserves. In an attempt to aid Company XYZ set more accurate reserves, the research conducted focuses on estimating LDFs for catastrophes which have already occurred and have been settled. Furthermore, the research describes the process used to build a linear model in R to estimate LDFs for Company XYZ's closed catastrophe claims from 2001 \u2014 2016. This linear model was used to predict a catastrophe's LDFs based on the age in weeks of the catastrophe during the first year. Back testing was also performed, as was the comparison between the estimated ultimate losses and actual losses. Future research consideration was proposed.
ContributorsSwoverland, Robert Bo (Author) / Milovanovic, Jelena (Thesis director) / Zicarelli, John (Committee member) / School of Mathematical and Statistical Sciences (Contributor) / Barrett, The Honors College (Contributor)
Created2018-05
133482-Thumbnail Image.png
Description
Cryptocurrencies have become one of the most fascinating forms of currency and economics due to their fluctuating values and lack of centralization. This project attempts to use machine learning methods to effectively model in-sample data for Bitcoin and Ethereum using rule induction methods. The dataset is cleaned by removing entries

Cryptocurrencies have become one of the most fascinating forms of currency and economics due to their fluctuating values and lack of centralization. This project attempts to use machine learning methods to effectively model in-sample data for Bitcoin and Ethereum using rule induction methods. The dataset is cleaned by removing entries with missing data. The new column is created to measure price difference to create a more accurate analysis on the change in price. Eight relevant variables are selected using cross validation: the total number of bitcoins, the total size of the blockchains, the hash rate, mining difficulty, revenue from mining, transaction fees, the cost of transactions and the estimated transaction volume. The in-sample data is modeled using a simple tree fit, first with one variable and then with eight. Using all eight variables, the in-sample model and data have a correlation of 0.6822657. The in-sample model is improved by first applying bootstrap aggregation (also known as bagging) to fit 400 decision trees to the in-sample data using one variable. Then the random forests technique is applied to the data using all eight variables. This results in a correlation between the model and data of 9.9443413. The random forests technique is then applied to an Ethereum dataset, resulting in a correlation of 9.6904798. Finally, an out-of-sample model is created for Bitcoin and Ethereum using random forests, with a benchmark correlation of 0.03 for financial data. The correlation between the training model and the testing data for Bitcoin was 0.06957639, while for Ethereum the correlation was -0.171125. In conclusion, it is confirmed that cryptocurrencies can have accurate in-sample models by applying the random forests method to a dataset. However, out-of-sample modeling is more difficult, but in some cases better than typical forms of financial data. It should also be noted that cryptocurrency data has similar properties to other related financial datasets, realizing future potential for system modeling for cryptocurrency within the financial world.
ContributorsBrowning, Jacob Christian (Author) / Meuth, Ryan (Thesis director) / Jones, Donald (Committee member) / McCulloch, Robert (Committee member) / Computer Science and Engineering Program (Contributor) / School of Mathematical and Statistical Sciences (Contributor) / Barrett, The Honors College (Contributor)
Created2018-05
135056-Thumbnail Image.png
Description
In this paper, I will show that news headlines of global events can predict changes in stock price by using Machine Learning and eight years of data from r/WorldNews, a popular forum on Reddit.com. My data is confined to the top 25 daily posts on the forum, and due to

In this paper, I will show that news headlines of global events can predict changes in stock price by using Machine Learning and eight years of data from r/WorldNews, a popular forum on Reddit.com. My data is confined to the top 25 daily posts on the forum, and due to the implicit filtering mechanism in the online community, these 25 posts are representative of the most popular news headlines and influential global events of the day. Hence, these posts shine a light on how large-scale social and political events affect the stock market. Using a Logistic Regression and a Naive Bayes classifier, I am able to predict with approximately 85% accuracy a binary change in stock price using term-feature vectors gathered from the news headlines. The accuracy, precision and recall results closely rival the best models in this field of research. In addition to the results, I will also describe the mathematical underpinnings of the two models; preceded by a general investigation of the intersection between the multiple academic disciplines related to this project. These range from social to computer science and from statistics to philosophy. The goal of this additional discussion is to further illustrate the interdisciplinary nature of the research and hopefully inspire a non-monolithic mindset when further investigations are pursued.
Created2016-12