DOI: 10.17586/1023-5086-2021-88-12-36-41
UDC: 51-76, 004.032.26, 004.932.1, 004.8
Representation of categories through prototypes formed based on coordinated activity of units in convolutional neural networks
Publication in Journal of Optical Technology
Malakhova E.Yu. Representation of categories through prototypes formed based on coordinated activity of units in convolutional neural networks [in Russian] // Opticheskii Zhurnal. 2021. V. 88. № 12. P. 36–41. http://doi.org/10.17586/1023-5086-2021-88-12-36-41
E. Yu. Malakhova, "Representation of categories through prototypes formed based on coordinated activity of units in convolutional neural networks," Journal of Optical Technology 88(12), 706–709 (2021). https://doi.org/10.1364/JOT.88.000706
Various techniques aim to explain how an image or a category concept is represented within convolutional neural networks. It is commonly assumed that a single latent neuron can act as a detector for a category or its dominant features. Analysis of the collective activity of neurons in hidden layers shows, however, that the representation of most categories is complex and distributed, with as many as 93% of units participating in the encoding. To account for this complexity, this study proposes representing a category by a prototype constructed as the covariance matrix of the activations of a layer's neurons. The approach takes into account the coordinated response of the population of artificial neurons and preserves the complexity of the distributed activation pattern at different stages of processing, from low-level image statistics to abstract and semantic features.
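As a rough illustration of the idea of a covariance-based prototype, the following minimal sketch computes a channel-by-channel covariance matrix for one layer's response to each image and averages these matrices over the images of a category. Several choices here are assumptions made for illustration and are not taken from the paper: treating spatial positions as samples, averaging per-image matrices, comparing prototypes with a Frobenius distance, and all function names. The activations are synthetic stand-ins, not outputs of a real network.

```python
import numpy as np


def layer_covariance(activations: np.ndarray) -> np.ndarray:
    """Covariance between the units (channels) of one layer for one image.

    activations: array of shape (channels, height, width).
    Spatial positions are treated as samples (an assumption of this sketch).
    Returns a (channels, channels) matrix.
    """
    channels = activations.shape[0]
    flat = activations.reshape(channels, -1)  # one row per unit
    return np.cov(flat)  # rows are variables, columns are observations


def category_prototype(batch: np.ndarray) -> np.ndarray:
    """Hypothetical prototype: mean of per-image covariance matrices.

    batch: array of shape (n_images, channels, height, width).
    """
    return np.mean([layer_covariance(a) for a in batch], axis=0)


def prototype_distance(p: np.ndarray, q: np.ndarray) -> float:
    """Frobenius distance between two prototypes (illustrative choice)."""
    return float(np.linalg.norm(p - q, ord="fro"))


rng = np.random.default_rng(0)

# Synthetic stand-ins for layer activations: 8 images, 16 units, 7x7 maps.
cat_a = rng.normal(size=(8, 16, 7, 7))

# In the second "category", unit 1 closely follows unit 0, so the two
# units respond in a coordinated way.
cat_b = rng.normal(size=(8, 16, 7, 7))
cat_b[:, 1] = cat_b[:, 0] + 0.1 * rng.normal(size=(8, 7, 7))

proto_a = category_prototype(cat_a)
proto_b = category_prototype(cat_b)

print(proto_a.shape)
print(prototype_distance(proto_a, proto_b))
```

In this toy setup the off-diagonal entry for units 0 and 1 is large only in the second prototype, which is the kind of coordinated response that a single-unit "detector" view would miss.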
Keywords: convolutional neural networks, information representation, interpretable deep learning, vision system model
OCIS codes: 200.4260, 330.4060