Opticheskii Zhurnal (scientific and technical journal)

ISSN: 1023-5086

A full-text English translation of the journal is published by Optica Publishing Group under the title “Journal of Optical Technology”.

UDC: 004.93'1

Methods for the construction of image descriptors for the global visual localization problem

For Russian citation (Opticheskii Zhurnal):

Недошивина Л.С., Петерсон М.В. Исследование методов построения дескрипторов изображений применительно к задаче глобальной визуальной локализации // Оптический журнал. 2017. Т. 84. № 6. С. 21–29.

Nedoshivina L.S., Peterson M.V. Methods for the construction of image descriptors for the global visual localization problem [in Russian] // Opticheskii Zhurnal. 2017. V. 84. № 6. P. 21–29.

For citation (Journal of Optical Technology):

L. S. Nedoshivina and M. V. Peterson, "Methods for the construction of image descriptors for the global visual localization problem," Journal of Optical Technology 84(6), 377–383 (2017). https://doi.org/10.1364/JOT.84.000377

Abstract:

We examine methods for constructing image descriptors for the global visual indoor localization problem, based on the aggregation of local features and on deep convolutional neural networks. We propose a criterion for estimating the quality of image-matching results and a technique for selecting reference frames from a test sample of images obtained by sequential camera motion. We also evaluate the efficiency and speed of the methods considered: methods based on convolutional neural networks provide more reliable image matching than techniques based on local features, but at a lower processing speed.
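
To make the CNN-based descriptor family discussed above concrete, the Python sketch below (an illustration only, not the authors' code) shows the common pattern behind such global localization pipelines: one global descriptor per frame is obtained by aggregating convolutional activations (here, sum pooling, in the spirit of SPoC-style aggregation), descriptors are L2-normalized, and a query frame is matched to the most similar reference frame by cosine similarity. The network (torchvision's AlexNet), the layer, and the pooling choice are stand-in assumptions; the paper's exact pipeline and its proposed matching-quality criterion are not given on this page.

    # A minimal sketch, assuming a torchvision AlexNet as a stand-in for the
    # networks the paper evaluates. Global descriptor = sum-pooled conv
    # activations, L2-normalized; matching = cosine similarity.
    import numpy as np
    import torch
    from PIL import Image
    from torchvision import models, transforms

    model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1).eval()

    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

    def global_descriptor(path: str) -> np.ndarray:
        """Sum-pool the last conv feature map into one vector, then L2-normalize."""
        x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
        with torch.no_grad():
            fmap = model.features(x)          # (1, C, H, W) conv activations
        d = fmap.sum(dim=(2, 3)).squeeze(0)   # aggregate over spatial positions
        return (d / d.norm()).numpy()

    def localize(query: str, reference_frames: list[str]):
        """Return the reference frame most similar to the query image."""
        q = global_descriptor(query)
        refs = np.stack([global_descriptor(p) for p in reference_frames])
        sims = refs @ q                       # cosine similarity (unit vectors)
        best = int(np.argmax(sims))
        return reference_frames[best], float(sims[best])

Under the same assumptions, a natural extension is to accept a match only when its best similarity exceeds a threshold; this is one simple stand-in for the kind of reliability decision that the paper's proposed matching-quality criterion addresses.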

Keywords:

feature description, image matching, visual localization, convolutional neural networks

Acknowledgements:
The research was supported by the Ministry of Education and Science of the Russian Federation (Minobrnauka) (2323); Government of the Russian Federation (074-U01).

OCIS codes: 150.5758, 100.4996, 150.1135
