Theses and Dissertations
Filtering by
- Creators: Electrical Engineering Program
Classification is central to many of the problems machine learning is applied to today, so it is important to understand the problem at hand and to develop an efficient model for solving it. One technique that supports better model selection, and thus easier problem solving, is estimation of the Bayes Error Rate. This paper presents the development and analysis of two methods for estimating the Bayes Error Rate of a given data set in order to evaluate performance. The first method takes a “global” approach, looking at the data as a whole; the second is more “local”, partitioning the data at the outset and then building up to a Bayes Error Rate estimate for the whole. One of the methods is found to estimate the true Bayes Error Rate accurately when the data set is high-dimensional, while the other is accurate at large sample sizes. This second conclusion, in particular, can have significant ramifications for “big data” problems, since this method would allow one to clarify the distribution through an accurate estimate of the Bayes Error Rate.
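To make the quantity being estimated concrete, the sketch below computes the Bayes Error Rate for a toy problem where it is known in closed form, then brackets it from data. It is not either of the two methods described in the abstract, whose details are not given here. It uses the classical asymptotic nearest-neighbour bounds of Cover and Hart as a stand-in estimator. The two-Gaussian setup, the sample size, the seed, and all function names are assumptions chosen for illustration.

```python
import math
import random


def true_bayes_error(mu: float) -> float:
    """Bayes error for equal-prior classes N(-mu, 1) and N(+mu, 1)."""
    # The optimal decision boundary is x = 0, so the error is Phi(-mu).
    return 0.5 * math.erfc(mu / math.sqrt(2.0))


def sample(n: int, mu: float, rng: random.Random):
    """Draw n labelled points: label 0 ~ N(-mu, 1), label 1 ~ N(+mu, 1)."""
    data = []
    for _ in range(n):
        label = rng.randrange(2)
        center = mu if label == 1 else -mu
        data.append((rng.gauss(center, 1.0), label))
    return data


def loo_1nn_error(data) -> float:
    """Leave-one-out 1-nearest-neighbour error rate for 1-D data."""
    pts = sorted(data)
    n = len(pts)
    errors = 0
    for i, (x, y) in enumerate(pts):
        # In sorted 1-D data the nearest neighbour is adjacent in the order.
        best = None
        for j in (i - 1, i + 1):
            if 0 <= j < n:
                d = abs(pts[j][0] - x)
                if best is None or d < best[0]:
                    best = (d, pts[j][1])
        if best[1] != y:
            errors += 1
    return errors / n


def bayes_bounds_from_1nn(r_nn: float):
    """Two-class Cover-Hart bounds: R* <= R_nn <= 2 R* (1 - R*)."""
    r = min(r_nn, 0.5)
    lower = 0.5 * (1.0 - math.sqrt(1.0 - 2.0 * r))
    return lower, r


rng = random.Random(0)  # fixed seed so the illustration is repeatable
r_nn = loo_1nn_error(sample(4000, 1.0, rng))
lower, upper = bayes_bounds_from_1nn(r_nn)
exact = true_bayes_error(1.0)
print(f"bounds from 1-NN: [{lower:.3f}, {upper:.3f}], exact: {exact:.3f}")
```

The bounds hold only asymptotically, so on a finite sample they are an approximation. Dedicated estimators such as those studied in the thesis aim for a tighter, more direct estimate.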
In this research, I surveyed existing methods for characterizing epilepsy from electroencephalogram (EEG) data, including the Random Forest algorithm, which many researchers claim is the most effective at detecting epileptic seizures [7]. I observed that although many papers claimed detection rates above 99% using Random Forest, they did not specify when within the 23.6-second interval of the seizure event the detection was declared. I therefore created a time-series procedure to detect the seizure as early as possible within the 23.6-second epileptic seizure window and found that detection is effective (> 92%) as early as the first few seconds of the epileptic episode. I intend to use this research as a stepping stone toward my upcoming Master's thesis research, in which I plan to extend the time-series detection mechanism to the pre-ictal stage, which will require a different dataset.
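The idea of asking when a detection is declared, rather than only whether one occurs, can be sketched as a scan over consecutive windows that reports the end time of the first flagged window. This is a minimal illustration under stated assumptions, not the procedure used in the research. The synthetic signal, the 100 Hz sampling rate, the 1-second window, the energy feature, and the threshold are all invented here, and the simple threshold stands in for a trained classifier such as a Random Forest.

```python
import math
import random
from typing import Callable, Optional, Sequence


def window_energy(window: Sequence[float]) -> float:
    """Mean squared amplitude of one window (a stand-in feature)."""
    return sum(v * v for v in window) / len(window)


def first_detection_time(
    signal: Sequence[float],
    sample_rate_hz: float,
    window_s: float,
    is_seizure: Callable[[Sequence[float]], bool],
) -> Optional[float]:
    """Return the earliest time in seconds at which a window is flagged.

    Windows are consecutive and non-overlapping. The reported time is the
    end of the first flagged window, i.e. when the decision could be made.
    """
    size = int(window_s * sample_rate_hz)
    for start in range(0, len(signal) - size + 1, size):
        if is_seizure(signal[start:start + size]):
            return (start + size) / sample_rate_hz
    return None


# Synthetic 23.6 s record: unit-variance noise, then a large 3 Hz
# oscillation starting at 5 s. Rate and signal are assumptions.
RATE_HZ = 100.0
rng = random.Random(0)
record = []
for i in range(int(23.6 * RATE_HZ)):
    t = i / RATE_HZ
    noise = rng.gauss(0.0, 1.0)
    burst = 10.0 * math.sin(2 * math.pi * 3.0 * t) if t >= 5.0 else 0.0
    record.append(noise + burst)


def detector(window: Sequence[float]) -> bool:
    """Hypothetical threshold rule standing in for a trained classifier."""
    return window_energy(window) > 10.0


onset_flag = first_detection_time(record, RATE_HZ, 1.0, detector)
print(f"first detection declared at {onset_flag} s")
```

In this toy record the oscillation begins at 5 s, so the first flagged window is the one ending at 6.0 s. With real EEG data, `detector` would wrap the classifier's prediction on features computed from each window, and the detection rate would be measured as a function of this declaration time.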