Nasir Ahmad, Department of Electronic and Electrical Engineering, Loughborough University, UK
Keywords: Face recognition, feature extraction, classification, depth images, pre-processing


Researchers have tried to improve the accuracy of face recognition by combining 2D and 3D images to overcome
the problems of illumination, pose variation and occlusion. Although combining 2D with 3D has shown better results
than using 2D images alone, the applicability of these methods in practice is limited by the high cost of 3D sensors;
we therefore use images acquired with the low-cost Kinect sensor. We perform face recognition on RGB images and on
depth images, and then combine the two, i.e. concatenate the different modalities, to improve recognition accuracy.
Depth maps contain holes and noise induced by the camera sensor, so we pre-process them to remove these distortions
before applying the face recognition algorithm. Experimental results reveal that recognition accuracy can be increased
by combining RGB and depth images and by pre-processing the depth maps, which mitigates the effects of covariates
such as holes and noise in the depth maps.
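The pipeline described above has two algorithmic steps: filling holes and noise in the depth map, and concatenating the RGB and depth modalities into a single feature vector. The following is a minimal illustrative sketch, not the authors' exact method; it assumes zero-valued pixels mark Kinect depth holes and uses a simple diffusion-style fill (the mean of valid 4-neighbours, iterated) in place of whatever filter the paper actually applies.

```python
import numpy as np

def fill_depth_holes(depth, max_iters=100):
    """Fill zero-valued holes in a depth map by repeatedly replacing
    each hole pixel with the mean of its valid 4-neighbours.
    A toy stand-in for the paper's depth pre-processing step."""
    d = depth.astype(float).copy()
    for _ in range(max_iters):
        holes = d == 0
        if not holes.any():
            break
        # pad with edge values so every pixel has 4 neighbours
        p = np.pad(d, 1, mode="edge")
        neigh = np.stack([p[:-2, 1:-1], p[2:, 1:-1],
                          p[1:-1, :-2], p[1:-1, 2:]])
        valid = neigh > 0                     # neighbours that hold real depth
        counts = valid.sum(axis=0)
        sums = (neigh * valid).sum(axis=0)
        fillable = holes & (counts > 0)       # holes with at least one valid neighbour
        d[fillable] = sums[fillable] / counts[fillable]
    return d

def fuse_features(rgb, depth):
    """Concatenate flattened RGB and depth images into one feature
    vector, i.e. the modality fusion described in the abstract."""
    return np.concatenate([rgb.ravel().astype(float),
                           depth.ravel().astype(float)])

# toy example: a 4x4 depth map with one hole (0 = missing reading)
depth = np.array([[5, 5, 5, 5],
                  [5, 0, 5, 5],
                  [5, 5, 5, 5],
                  [5, 5, 5, 5]], dtype=float)
filled = fill_depth_holes(depth)
rgb = np.ones((4, 4, 3))          # dummy RGB face crop
feat = fuse_features(rgb, filled)  # 48 RGB values + 16 depth values
```

In a real system the fused vector would then be passed to the classifier; here it simply illustrates that, after hole filling, the two modalities reduce to one combined feature representation.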

