Description
Anomaly is a deviation from the normal behavior of the system and anomaly detection techniques try to identify unusual instances based on deviation from the normal data. In this work, I propose a machine-learning algorithm, referred to as Artificial Contrasts,

Anomaly is a deviation from the normal behavior of the system and anomaly detection techniques try to identify unusual instances based on deviation from the normal data. In this work, I propose a machine-learning algorithm, referred to as Artificial Contrasts, for anomaly detection in categorical data in which neither the dimension, the specific attributes involved, nor the form of the pattern is known a priori. I use RandomForest (RF) technique as an effective learner for artificial contrast. RF is a powerful algorithm that can handle relations of attributes in high dimensional data and detect anomalies while providing probability estimates for risk decisions.

I apply the model to two simulated data sets and one real data set. The model was able to detect anomalies with a very high accuracy. Finally, by comparing the proposed model with other models in the literature, I demonstrate superior performance of the proposed model.
Reuse Permissions
  • Downloads
    pdf (932.2 KB)

    Details

    Title
    • Anomaly detection in categorical datasets with artificial contrasts
    Contributors
    Date Created
    2016
    Resource Type
  • Text
  • Collections this item is in
    Note
    • Partial requirement for: M.S., Arizona State University, 2016
      Note type
      thesis
    • Includes bibliographical references (pages 51-53)
      Note type
      bibliography
    • Field of study: Industrial engineering

    Citation and reuse

    Statement of Responsibility

    by Seyyedehnasim Mousavi

    Machine-readable links