Information forensics and security have come a long way in just a few years thanks to the recent advances in biometric recognition. The main challenge remains a proper design of a biometric modality that can be resilient to unconstrained conditions, such as quality distortions. This work presents a solution to face and ear recognition under unconstrained visual variations, with a main focus on recognition in the presence of blur, occlusion and additive noise distortions.
First, the dissertation addresses the problem of scene variations in the presence of blur, occlusion and additive noise distortions resulting from capture, processing and transmission. Despite their excellent performance, ’deep’ methods are susceptible to visual distortions, which significantly reduce their performance. Sparse representations, on the other hand, have shown huge potential capabilities in handling problems, such as occlusion and corruption. In this work, an augmented SRC (ASRC) framework is presented to improve the performance of the Spare Representation Classifier (SRC) in the presence of blur, additive noise and block occlusion, while preserving its robustness to scene dependent variations. Different feature types are considered in the performance evaluation including image raw pixels, HoG and deep learning VGG-Face. The proposed ASRC framework is shown to outperform the conventional SRC in terms of recognition accuracy, in addition to other existing sparse-based methods and blur invariant methods at medium to high levels of distortion, when particularly used with discriminative features.
In order to assess the quality of features in improving both the sparsity of the representation and the classification accuracy, a feature sparse coding and classification index (FSCCI) is proposed and used for feature ranking and selection within both the SRC and ASRC frameworks.
The second part of the dissertation presents a method for unconstrained ear recognition using deep learning features. The unconstrained ear recognition is performed using transfer learning with deep neural networks (DNNs) as a feature extractor followed by a shallow classifier. Data augmentation is used to improve the recognition performance by augmenting the training dataset with image transformations. The recognition performance of the feature extraction models is compared with an ensemble of fine-tuned networks. The results show that, in the case where long training time is not desirable or a large amount of data is not available, the features from pre-trained DNNs can be used with a shallow classifier to give a comparable recognition accuracy to the fine-tuned networks.