
ISSN: 1023-5086


Opticheskii Zhurnal (scientific and technical journal)

A full-text English translation of the journal is published by Optica Publishing Group under the title “Journal of Optical Technology”


DOI: 10.17586/1023-5086-2019-86-11-05-13

UDC: 004.93'1, 004.832.2

Principle of least action in dynamically configured image analysis systems

For Russian citation (Opticheskii Zhurnal):

Малашин Р.О. Принцип наименьшего действия в динамически конфигурируемых системах анализа изображений // Оптический журнал. 2019. Т. 86. № 11. С. 5–13. http://doi.org/10.17586/1023-5086-2019-86-11-05-13

 

Malashin R.O. Principle of least action in dynamically configured image analysis systems [in Russian] // Opticheskii Zhurnal. 2019. V. 86. № 11. P. 5–13. http://doi.org/10.17586/1023-5086-2019-86-11-05-13

For citation (Journal of Optical Technology):

R. O. Malashin, "Principle of least action in dynamically configured image analysis systems," Journal of Optical Technology. 86(11), 678-685 (2019). https://doi.org/10.1364/JOT.86.000678

Abstract:

We propose and justify the development of neural network architectures for training image analysis systems with a dynamically configurable computation structure. The proposed systems, trained in accordance with the principle of least action, increase the processing speed of large amounts of data and can help overcome other shortcomings of deep architectures. The image classification problem is considered in detail; its solution reduces to training a network agent that operates in an environment of classifiers and perceives images indirectly through them. The proposed approach can be used to create algorithms in automatic image analysis systems where the critical characteristic is the average processing time per frame, for example, in systems for content-based image indexing.
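The idea of "least action" as applied to computation can be illustrated with a toy cascade: cheap classifiers are consulted first, and more expensive ones are invoked only when confidence is insufficient, so that easy images incur minimal cost. The sketch below is purely illustrative and is not the paper's method: the "agent" is a fixed cheapest-first policy rather than a trained network, and the classifiers (`make_classifier`, `classify_with_least_action`, the cost and sharpness values) are stubs invented here for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_classifier(cost, sharpness, n_classes=10, true_class=3):
    """Stub classifier: returns a probability vector and its compute cost.

    Higher `sharpness` mimics a stronger (and more expensive) model that
    separates the true class from the rest more decisively.
    """
    def classify(image):
        logits = rng.normal(0.0, 1.0, n_classes)
        logits[true_class] += sharpness
        p = np.exp(logits - logits.max())
        return p / p.sum(), cost
    return classify

# Pool ordered by computational cost (cheapest first).
pool = [make_classifier(cost=c, sharpness=s)
        for c, s in [(1.0, 1.5), (3.0, 3.0), (10.0, 6.0)]]

def classify_with_least_action(image, threshold=0.8):
    """Consult classifiers cheapest-first; stop once one is confident.

    The accumulated cost plays the role of the 'action' being minimized:
    easy inputs terminate after the cheap model, hard ones escalate.
    """
    total_cost = 0.0
    for classify in pool:
        probs, cost = classify(image)
        total_cost += cost
        if probs.max() >= threshold:
            break
    return int(probs.argmax()), total_cost

label, cost = classify_with_least_action(image=None)
```

In the paper's setting, the stopping policy itself is learned, so the system adapts the depth of its computation to the difficulty of each frame; the fixed threshold here merely stands in for that decision.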

Keywords:

dynamically configurable calculations, principle of least action, image analysis

Acknowledgements:

The research was supported by the Russian Science Foundation (19-71-00146).

OCIS codes: 150.1135, 100.4996

References:

1. L. S. Polak, Variational Principles of Mechanics (Fizmatlit, Moscow, 1959).
2. B. P. Babkin, The Experience of a Systematic Study of Complex-Nervous (Mental) Phenomena in a Dog (VMA, St. Petersburg, 1904).
3. Yu. E. Shelepin and N. N. Krasil’nikov, “Principle of least action, physiology of vision and conditioned reflex theory,” Ross. Fiziol. Zh. im. I. M. Sechenova 89(6), 725–730 (2003).
4. H. Hosseini, B. Xiao, and R. Poovendran, “Google’s cloud vision API is not robust to noise,” arXiv:1704.05051 (2017).
5. A. A. Kharkevich, Selected Works in 3 Volumes, Volume 3, Information Theory, Image Recognition (Nauka, Moscow, 1973).
6. P. Rosenbloom, “A cognitive odyssey: from the power law of practice to a general learning mechanism and beyond,” Tutorials Quant. Methods Psychol. 2, 43–51 (2006).
7. R. Hu, J. Andreas, M. Rohrbach, T. Darrell, and K. Saenko, “Learning to reason: end-to-end module networks for visual question answering,” in International Conference on Computer Vision (2017), pp. 804–813.
8. X. Wang, F. Yu, Z.-Y. Dou, T. Darrell, and J. E. Gonzalez, “SkipNet: learning dynamic routing in convolutional networks,” arXiv:1711.09485 (2017).
9. A. Mosca and G. Magoulas, “Deep incremental boosting,” arXiv:1708.03704 (2017).
10. S. Lee, T. Chen, L. Yu, and C. Lai, “Image classification based on the boost convolutional neural network,” IEEE Access 6, 12755–12766 (2018).
11. P. Viola and M. J. Jones, “Robust real-time face detection,” Int. J. Comput. Vis. 57(2), 137–154 (2004).
12. K. Murphy, Machine Learning: A Probabilistic Perspective (MIT Press, Cambridge, Massachusetts, 2012).
13. H. Larochelle and I. Murray, “The neural autoregressive distribution estimator,” in International Conference on Artificial Intelligence and Statistics (2011), pp. 29–37.
14. S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Comput. 9(8), 1735–1780 (1997).
15. N. Hai, L. Anh, and M. Nakagawa, “Combination of LSTM and CNN for recognizing mathematical symbols,” in Proceedings of the 17th Information-Based Induction Sciences Workshop (2014).
16. T. N. Sainath, O. Vinyals, and H. Sak, “Convolutional, long short-term memory, fully connected deep neural networks,” in International Conference on Acoustics, Speech, and Signal Processing (2015), pp. 4580–4584.
17. V. Mnih, N. Heess, A. Graves, and K. Kavukcuoglu, “Recurrent models of visual attention,” in Proceedings of Neural Information Processing Systems (2014).
18. K. Yu. Shelepin, G. E. Trufanov, V. A. Fokin, P. P. Vasil’ev, and A. V. Sokolov, “Digital visualization of the activity of neural networks of the human brain before, during, and after insight when images are being recognized,” J. Opt. Technol. 85(8), 468–475 (2018) [Opt. Zh. 85(8), 29–38 (2018)].