DOI: 10.17586/1023-5086-2022-89-02-25-35
UDC: 004.932.4
Study of the ability of neural networks to extract and use semantic information when they are trained to reconstruct noisy images
Titarenko M.A., Malashin R.O. Study of the ability of neural networks to extract and use semantic information when they are trained to reconstruct noisy images [in Russian] // Opticheskii Zhurnal. 2022. V. 89. № 2. P. 25–35. http://doi.org/10.17586/1023-5086-2022-89-02-25-35
Mikhail Alekseevich Titarenko and Roman Olegovich Malashin, "Study of the ability of neural networks to extract and use semantic information when they are trained to reconstruct noisy images," Journal of Optical Technology 89(2), 81–88 (2022). https://doi.org/10.1364/JOT.89.000081
Subject of study. This paper discusses how deep convolutional neural networks can be used to improve images obtained under noisy conditions when supplementary information about the objects in the image is supplied in the form of segmentation masks. Two ways of using semantic information during the operation of the network are studied: first, feeding a mask along with the image at the input of the network; second, constructing the loss function from the segmentation mask.

Method. Several series of experiments were carried out with and without various types of semantic information, at several noise intensities. Both the reconstruction quality of the image as a whole and the reconstruction quality of the image regions corresponding to a single class are analyzed. Road signs were chosen as the target class because they are less variable than many other classes, which should give the network an advantage when reconstructing them with semantic information compared with reconstructing them without it. The COCO dataset with annotated segmentation masks was used in the study. A test environment for image reconstruction, with visualization of the test results, was developed to analyze the semantic properties of all the object classes in the COCO set; it made it possible to draw useful conclusions about how various properties of the objects influence the accuracy with which they are reconstructed.

Main results. We show that, under very noisy conditions, a reconstruction network trained with supplementary information in the form of segmentation masks reconstructs the objects that correspond to the masks better (by 3.5%), while its ability to reconstruct the image as a whole is reduced only insignificantly (by 0.4%). However, no such quality gain is obtained for weak and moderate noise.

Practical significance. Our goal in this paper was not to create a finished algorithm and neural-network architecture, but to study the possible properties of such algorithms; we therefore supplied reference segmentation masks to the input of the neural networks. In developing such a method further, a segmenter network could be added to extract the semantic information from the noisy image automatically (the process can then be made iterative: the image is improved after segmentation, and a refined segmentation mask is constructed from the improved image).
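The two ways of injecting semantic information described above can be illustrated with a short sketch. The following fragment is a minimal PyTorch illustration, not the implementation used in the paper: the network DenoiserUNet, the channel counts, and the weighting factor alpha are assumptions introduced for the example. It shows, first, the segmentation mask concatenated with the noisy image along the channel dimension at the network input and, second, a reconstruction loss in which pixels covered by the target-class mask (here, road signs) receive a higher weight.

import torch
import torch.nn.functional as F

# Hypothetical image-to-image network; any U-Net-like denoiser with
# in_channels=4 (RGB + mask) and out_channels=3 fits this sketch.
# net = DenoiserUNet(in_channels=4, out_channels=3)

def forward_with_mask(net, noisy_rgb, class_mask):
    # Method 1: feed the segmentation mask as an extra input channel.
    # noisy_rgb:  (B, 3, H, W) noisy image
    # class_mask: (B, 1, H, W) binary mask of the target class
    x = torch.cat([noisy_rgb, class_mask], dim=1)  # (B, 4, H, W)
    return net(x)

def mask_weighted_loss(restored, clean, class_mask, alpha=3.0):
    # Method 2: construct the loss function from the segmentation mask.
    # Pixels inside the target-class mask get weight (1 + alpha), so
    # errors on road signs cost more than errors elsewhere; alpha is
    # an illustrative value, not taken from the paper.
    per_pixel = F.mse_loss(restored, clean, reduction="none")  # (B, 3, H, W)
    weights = 1.0 + alpha * class_mask  # broadcasts over the channel axis
    return (weights * per_pixel).mean()

In the iterative variant suggested under "Practical significance", the mask passed to forward_with_mask would come from a segmenter applied to the current restored image rather than from reference annotations, and the two steps would be repeated.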
Keywords: image enhancement, image segmentation, deep neural networks
OCIS codes: 150.1135, 100.2980
References:
1. K. Zhang, W. Zuo, Y. Chen, D. Meng, and L. Zhang, "Beyond a Gaussian denoiser: residual learning of deep CNN for image denoising," IEEE Trans. Image Process. 26, 3142–3155 (2017).
2. X. J. Mao, C. Shen, and Y.-B. Yang, “Image restoration using very deep convolutional encoder–decoder networks with symmetric skip connections,” in Proceedings of the 30th International Conference on Neural Information Processing Systems (2016), pp. 2810–2818.
3. S. Anwar and N. Barnes, “Real image denoising with feature attention,” in IEEE International Conference on Computer Vision (2019).
4. T. Remez, O. Litany, and R. Giryes, “Deep class aware denoising,” in International Conference on Sampling Theory and Application (2017), pp. 138–142.
5. A. Lucas, S. Lopez-Tapia, R. Molina, and A. K. Katsaggelos, "Generative adversarial networks and perceptual losses for video super-resolution," IEEE Trans. Image Process. 28, 3312–3327 (2019).
6. M. A. Titarenko and R. O. Malashin, “Image enhancement by deep neural networks using high-level information,” J. Opt. Technol. 87(10), 604–610 (2020).
7. L. Jin, W. Zhang, G. Ma, and E. Song, "Learning deep CNNs for impulse noise removal in images," J. Vis. Commun. Image Represent. 62(7), 193–205 (2019).
8. S. Zhou, Y. Hu, and H. Jiang, “Multi-view image denoising using convolutional neural network,” Sensors (Basel) 19, 2597 (2019).
9. K. Zhang, Y. Li, W. Zuo, L. Zhang, L. V. Gool, and R. Timofte, “Plug-and-play image restoration with deep denoiser prior,” IEEE Trans. Pattern Anal. Mach. Intell. (to be published).
10. M. P. Heinrich, M. Stille, and T. M. Buzug, “Residual U-Net convolutional neural network architecture for low-dose CT denoising,” Curr. Dir. Biomed. Eng. 4(1), 297–300 (2018).
11. M. Reymann, T. Würfl, P. Ritt, B. Stimpel, M. Cachovan, A. H. Vija, and A. Maier, “U-Net for SPECT image denoising,” in IEEE Nuclear Science Symposium and Medical Imaging Conference (2019).
12. J. M. Crespo, V. Moreno, J. R. Rabuñal, A. Pazos, and M. C. Carbia, “Fringe pattern denoising using U-Net based neural network,” EPJ Web Conf. 238, 06009 (2020).
13. Z. Feng, Z. Li, A. Cai, L. Li, B. Yan, and L. Tong, “A preliminary study on projection denoising for low-dose CT imaging using modified dual-domain U-net,” in Third International Conference on Artificial Intelligence and Big Data, Chengdu, China, 28–31 May 2020, pp. 223–226.
14. T. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, P. Dollár, D. Ramanan, and C. L. Zitnick, “Microsoft COCO: Common Objects in Context,” in European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014, pp. 740–755.
15. D. Kingma and J. Ba, “Adam: a method for stochastic optimization,” https://arxiv.org/pdf/1412.6980.pdf.