Face recognition: a novel deep learning approach
Publication in Journal of Optical Technology
Sh. Ch. Pang, Zh. Zh. Yu. Face recognition: a novel deep learning approach [in English] // Opticheskii Zhurnal. 2015. V. 82. № 4. P. 54–65.
Sh. Ch. Pang and Zh. Zh. Yu, "Face recognition: a novel deep learning approach," Journal of Optical Technology 82(4), 237–245 (2015). https://doi.org/10.1364/JOT.82.000237
We propose a novel, robust deep learning method for face recognition that handles big data by using image representations learned automatically. In real-time applications, the deep learning architecture works in two stages. First, in an offline training procedure, we train a stacked denoising autoencoder to learn generic image features from the 80 million images of the Tiny Images Dataset, which serve as auxiliary training data. Second, in a supervised object recognition procedure, we construct a five-layer feature extractor that produces an image representation, followed by an additional classification layer; online training on the target objects then further tunes the generic image features to the specific recognition task. Comparison with state-of-the-art face recognition methods shows that our deep learning algorithm is more accurate and is well suited to big data problems.
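The two-stage scheme described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the layer sizes, the masking-noise level, the learning rates, the tied encoder/decoder weights, and the random toy data are all assumptions made for illustration, and the function names (`train_dae`, `pretrain_stack`, `extract`, `train_softmax`) are hypothetical. For brevity the sketch trains only the added classification layer on frozen features, whereas the abstract describes further tuning the learned features as well.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_dae(X, n_hidden, noise=0.3, lr=0.5, epochs=50):
    """Train one denoising autoencoder layer (tied weights, squared error)."""
    n_in = X.shape[1]
    W = rng.normal(0.0, 0.1, size=(n_in, n_hidden))
    b = np.zeros(n_hidden)  # encoder bias
    c = np.zeros(n_in)      # decoder bias
    for _ in range(epochs):
        Xc = X * (rng.random(X.shape) > noise)  # masking noise: zero random inputs
        H = sigmoid(Xc @ W + b)                 # encode the corrupted input
        R = sigmoid(H @ W.T + c)                # reconstruct
        dR = (R - X) * R * (1 - R)              # error is measured against the CLEAN input
        dH = (dR @ W) * H * (1 - H)
        gW = Xc.T @ dH + dR.T @ H               # W appears in both encoder and decoder
        W -= lr * gW / len(X)
        b -= lr * dH.mean(axis=0)
        c -= lr * dR.mean(axis=0)
    return W, b

def pretrain_stack(X, layer_sizes):
    """Stage 1 (offline): greedy layer-wise pretraining on unlabeled data."""
    params, H = [], X
    for n_hidden in layer_sizes:
        W, b = train_dae(H, n_hidden)
        params.append((W, b))
        H = sigmoid(H @ W + b)  # clean features feed the next layer
    return params

def extract(X, params):
    """Run the stacked encoders as a feature extractor."""
    H = X
    for W, b in params:
        H = sigmoid(H @ W + b)
    return H

def train_softmax(F, y, n_classes, lr=0.5, epochs=200):
    """Stage 2 (supervised): softmax classification layer on top of the features."""
    V = np.zeros((F.shape[1], n_classes))
    d = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]  # one-hot labels
    for _ in range(epochs):
        logits = F @ V + d
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)
        G = (P - Y) / len(F)  # gradient of mean cross-entropy w.r.t. logits
        V -= lr * F.T @ G
        d -= lr * G.sum(axis=0)
    return V, d

# Toy stand-ins: random "auxiliary images" and three noisy class prototypes.
X_aux = rng.random((200, 64))
params = pretrain_stack(X_aux, [32, 16])

protos = rng.random((3, 64))
y = rng.integers(0, 3, size=90)
X_lab = np.clip(protos[y] + 0.05 * rng.normal(size=(90, 64)), 0.0, 1.0)

F = extract(X_lab, params)
V, d = train_softmax(F, y, n_classes=3)
pred = np.argmax(F @ V + d, axis=1)
```

The split mirrors the abstract: `pretrain_stack` needs no labels and can run once on a large auxiliary set, while `train_softmax` uses only the small labeled set for the target objects.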
Keywords: big data, face recognition, deep learning, feature extraction, feature learning
Acknowledgements: This work was supported by the Specialized Research Fund for the Doctoral Program of Higher Education of China (Grant No. 20120061110045) and by the Project of Science and Technology Development Plan of Jilin Province, China (Grant No. 20150204007GX).
OCIS codes: 100.0100, 100.3008, 100.4996