Theses and Dissertations
Description
Fisheye cameras have a much larger field of view than conventional cameras. The large field of view comes at the price of non-linear distortions introduced near the boundaries of the captured images. Despite this drawback, fisheye cameras are increasingly used in computer vision, robotics, reconnaissance, astrophotography, surveillance, and automotive applications. The images captured by such cameras can be corrected for distortion if the cameras are calibrated and the distortion function is determined. Calibration also allows fisheye cameras to be used in tasks involving metric scene measurement, metric scene reconstruction, and simultaneous localization and mapping (SLAM) algorithms.
This thesis presents a calibration toolbox (FisheyeCDC Toolbox) that implements a collection of some of the most widely used fisheye camera calibration techniques in one package. This enables an inexperienced user to calibrate their own camera without a theoretical understanding of computer vision and camera calibration. The thesis also explores applications of calibration such as distortion correction and 3D reconstruction.
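As a rough illustration of what correcting distortion with a known distortion function can look like, the sketch below assumes the equidistant fisheye model, r = f·θ, and maps each pixel to where an ideal perspective camera (r = f·tan θ) would have placed it. The abstract does not say which models the FisheyeCDC Toolbox implements, so the model choice, the function names, and the parameters `focal_px`, `cx`, and `cy` are all assumptions made for this example.

```python
import math


def equidistant_to_perspective_radius(r_fisheye, focal_px):
    """Map a radial distance under the equidistant model (r = f * theta)
    to the radius a perspective camera (r = f * tan(theta)) would give."""
    theta = r_fisheye / focal_px  # incidence angle in radians
    if theta >= math.pi / 2:
        # Rays at 90 degrees or more have no finite perspective projection.
        raise ValueError("ray lies outside the perspective field of view")
    return focal_px * math.tan(theta)


def undistort_point(x, y, cx, cy, focal_px):
    """Move a fisheye pixel (x, y) to its perspective location.

    (cx, cy) is the assumed distortion center and focal_px the assumed
    focal length in pixels; both would come from calibration.
    """
    dx, dy = x - cx, y - cy
    r = math.hypot(dx, dy)
    if r == 0.0:
        return (x, y)  # the center pixel is unaffected
    scale = equidistant_to_perspective_radius(r, focal_px) / r
    return (cx + dx * scale, cy + dy * scale)
```

Under this assumed model, points near the image center barely move, while points near the boundary are pushed outward strongly, which matches the abstract's remark that distortion is concentrated near the image boundaries.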
Contributors: Kashyap Takmul Purushothama Raju, Vinay (Author) / Karam, Lina (Thesis advisor) / Turaga, Pavan (Committee member) / Tepedelenlioğlu, Cihan (Committee member) / Arizona State University (Publisher)
Created: 2014
Description
In this thesis we consider the problem of facial expression recognition (FER) from video sequences. Our method is based on subspace representations and Grassmann manifold based learning. We use Local Binary Patterns (LBP) at the frame level to represent the facial features. Next, we develop a model that represents the video sequence in a lower-dimensional expression subspace and also as a linear dynamical system using an Autoregressive Moving Average (ARMA) model. As these subspaces lie on the Grassmann manifold, we use Grassmann manifold based learning techniques, such as kernel Fisher Discriminant Analysis (kernel-FDA) with Grassmann kernels, for classification. We consider six expressions for classification, namely Angry (An), Disgust (Di), Fear (Fe), Happy (Ha), Sadness (Sa), and Surprise (Su). We perform experiments on the extended Cohn-Kanade (CK+) facial expression database to evaluate the expression recognition performance. Our method demonstrates good expression recognition performance, outperforming other state-of-the-art FER algorithms. We achieve an average recognition accuracy of 97.41% using a method based on the expression subspace, kernel-FDA, and a Support Vector Machine (SVM) classifier. Using a simpler classifier, 1-Nearest Neighbor (1-NN), along with kernel-FDA, we achieve a recognition accuracy of 97.09%. We find that, when processing a group of 19 frames in a video sequence, LBP feature extraction accounts for the majority of the computation time (97%), which is about 1.662 seconds on an Intel Core i3 dual-core platform. However, when only 3 frames (onset, middle, and peak) of a video sequence are used, the computation time is reduced by about 83.75% to 260 milliseconds, at the expense of a drop in recognition accuracy to 92.88%.
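To make the frame-level feature step more concrete, the sketch below computes a basic 3x3, 8-neighbour LBP code for each interior pixel of a grayscale frame and collects the codes into a 256-bin histogram. The abstract does not state which LBP variant, neighbourhood, or block layout the thesis uses, so this particular form and the names `lbp_code` and `lbp_histogram` are assumptions for illustration only.

```python
# Neighbour offsets (row, col), clockwise from the top-left pixel.
# The starting point and direction are an arbitrary convention chosen here.
NEIGHBOUR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                     (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """Return the 8-bit LBP code of the interior pixel img[r][c].

    Each neighbour contributes one bit: 1 if it is at least as bright
    as the center pixel, 0 otherwise.
    """
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOUR_OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Histogram of LBP codes over all interior pixels of a 2D image,
    given as a list of equal-length rows of intensities."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist
```

Every interior pixel is visited once per frame, so the cost of this step grows with the number of frames processed, which is consistent with the abstract's observation that using 3 frames instead of 19 reduces the computation time substantially.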
Contributors: Yellamraju, Anirudh (Author) / Chakrabarti, Chaitali (Thesis advisor) / Turaga, Pavan (Thesis advisor) / Karam, Lina (Committee member) / Arizona State University (Publisher)
Created: 2014