ITMO

ISSN: 1023-5086

Scientific and technical

Opticheskii Zhurnal

A full-text English translation of the journal is published by Optica Publishing Group under the title “Journal of Optical Technology”

UDC: 004.93'1, 004.932.2

Convolutional deep-learning artificial neural networks

For Russian citation (Opticheskii Zhurnal):

Луцив В.Р. Сверточные искусственные нейронные сети глубокого обучения // Оптический журнал. 2015. Т. 82. № 8. С. 11–23.

Lutsiv V.R. Convolutional deep-learning artificial neural networks [in Russian] // Opticheskii Zhurnal. 2015. V. 82. № 8. P. 11–23.

For citation (Journal of Optical Technology):

V. R. Lutsiv, "Convolutional deep-learning artificial neural networks," Journal of Optical Technology 82(8), 499-508 (2015). https://doi.org/10.1364/JOT.82.000499

Abstract:

This paper reviews the origin and development of convolutional artificial neural networks, which, trained by back-propagation of the error signal, have become one of the most efficient tools for automatic image classification. In addition to the capabilities of modern convolutional neural networks in classifying objects by shape, the paper analyzes their use in processing information at other hierarchical levels, from texture classification to the structural decomposition of images based on the formation of attention zones. Convolutional networks are considered in close connection with their natural analogs, found in the neural ensembles of living visual systems.

Keywords:

artificial neural network, convolutional network, deep-learning network, image classification, structural analysis

Acknowledgements:

This work was carried out with the support of the Ministry of Education and Science of the Russian Federation and with the partial state support of the leading universities of the Russian Federation (Subsidy 074-U01).

OCIS codes: 100.2960, 100.3005, 100.3008, 100.5010, 110.2960, 150.1135
