Filtering by
- All Subjects: Machine Learning
- Resource Type: Text
Classification in machine learning is quite crucial to solve many problems that the world is presented with today. Therefore, it is key to understand one’s problem and develop an efficient model to achieve a solution. One technique to achieve greater model selection and thus further ease in problem solving is estimation of the Bayes Error Rate. This paper provides the development and analysis of two methods used to estimate the Bayes Error Rate on a given set of data to evaluate performance. The first method takes a “global” approach, looking at the data as a whole, and the second is more “local”—partitioning the data at the outset and then building up to a Bayes Error Estimation of the whole. It is found that one of the methods provides an accurate estimation of the true Bayes Error Rate when the dataset is at high dimension, while the other method provides accurate estimation at large sample size. This second conclusion, in particular, can have significant ramifications on “big data” problems, as one would be able to clarify the distribution with an accurate estimation of the Bayes Error Rate by using this method.
This project aims to incorporate the aspect of sentiment analysis into traditional stock analysis to enhance stock rating predictions by applying a reliance on the opinion of various stocks from the Internet. Headlines from eight major news publications and conversations from Yahoo! Finance’s “Conversations” feature were parsed through the Valence Aware Dictionary for Sentiment Reasoning (VADER) natural language processing package to determine numerical polarities which represented positivity or negativity for a given stock ticker. These generated polarities were paired with stock metrics typically observed by stock analysts as the feature set for a Logistic Regression machine learning model. The model was trained on roughly 1500 major stocks to determine a binary classification between a “Buy” or “Not Buy” rating for each stock, and the results of the model were inserted into the back-end of the Agora Web UI which emulates search engine behavior specifically for stocks found in NYSE and NASDAQ. The model reported an accuracy of 82.5% and for most major stocks, the model’s prediction correlated with stock analysts’ ratings. Given the volatility of the stock market and the propensity for hive-mind behavior in online forums, the performance of the Logistic Regression model would benefit from incorporating historical stock data and more sources of opinion to balance any subjectivity in the model.
NASA has designed and publicized a standard benchmark problem for propulsion engine gas path diagnostic that enables comparisons among different engine diagnostic approaches. Some traditional model-based approaches and novel purely data-driven approaches such as machine learning, have been applied to this problem.
This study focuses on a different machine learning approach to the diagnostic problem. Some most common machine learning techniques, such as support vector machine, multi-layer perceptron, and self-organizing map are used to help gain insight into the different engine failure modes from the perspective of big data. They are organically integrated to achieve good performance based on a good understanding of the complex dataset.
The study presents a new hierarchical machine learning structure to enhance classification accuracy in NASA's engine diagnostic benchmark problem. The designed hierarchical structure produces an average diagnostic accuracy of 73.6%, which outperforms comparable studies that were most recently published.
This project considers the FPGA implementations of MLP and CNN feedforward. While FPGAs provide significant performance improvements, they come at a substantial financial cost. We explore the options of implementing these algorithms on a smaller budget. We successfully implement a multilayer perceptron that identifies handwritten digits from the MNIST dataset on a student-level DE10-Lite FPGA with a test accuracy of 91.99%. We also apply our trained network to external image data loaded through a webcam and a Raspberry Pi, but we observe lower test accuracy in these images. Later, we consider the requirements necessary to implement a more elaborate convolutional neural network on the same FPGA. The study deems the CNN implementation feasible in the criteria of memory requirements and basic architecture. We suggest the CNN implementation on the same FPGA to be worthy of further exploration.