Description
In many classication problems data samples cannot be collected easily, example in drug trials, biological experiments and study on cancer patients. In many situations the data set size is small and there are many outliers. When classifying such data, example

In many classication problems data samples cannot be collected easily, example in drug trials, biological experiments and study on cancer patients. In many situations the data set size is small and there are many outliers. When classifying such data, example cancer vs normal patients the consequences of mis-classication are probably more important than any other data type, because the data point could be a cancer patient or the classication decision could help determine what gene might be over expressed and perhaps a cause of cancer. These mis-classications are typically higher in the presence of outlier data points. The aim of this thesis is to develop a maximum margin classier that is suited to address the lack of robustness of discriminant based classiers (like the Support Vector Machine (SVM)) to noise and outliers. The underlying notion is to adopt and develop a natural loss function that is more robust to outliers and more representative of the true loss function of the data. It is demonstrated experimentally that SVM's are indeed susceptible to outliers and that the new classier developed, here coined as Robust-SVM (RSVM), is superior to all studied classier on the synthetic datasets. It is superior to the SVM in both the synthetic and experimental data from biomedical studies and is competent to a classier derived on similar lines when real life data examples are considered.
Reuse Permissions
  • Included in this item (3)


    Details

    Title
    • Robust margin based classifiers for small sample data
    Contributors
    Agent
    Date Created
    2011
    Collections this item is in
    Note
    • Partial requirement for: M.S., Arizona State University, 2011
      Note type
      thesis
    • Includes bibliographical references (p. 34-35)
      Note type
      bibliography
    • Field of study: Computer science

    Citation and reuse

    Statement of Responsibility

    by Sidharth Gupta

    Machine-readable links