DOI: 10.17586/1023-5086-2020-87-12-57-66

Real-time and efficient eyes and mouth state detection: an artificial intelligence application based on embedded systems

Full text on elibrary.ru

Published in Journal of Optical Technology

Citation:

Fei Liu, Changcheng Qin, Hongliu Yu. Real-time and efficient eyes and mouth state detection: an artificial intelligence application based on embedded systems [in English] // Opticheskii Zhurnal. 2020. V. 87. No. 12. P. 57–66. DOI: 10.17586/1023-5086-2020-87-12-57-66

Citation of the English-language version:

Fei Liu, Changcheng Qin, and Hongliu Yu, "Real-time and efficient eyes and mouth state detection: an artificial intelligence application based on embedded systems," Journal of Optical Technology 87(12), 742-749 (2020) https://doi.org/10.1364/JOT.87.000742

Abstract:

Many road accidents are known to be caused by driver fatigue. Analysis of the state of the eyes and mouth is often used to detect fatigue. However, traditional image processing methods do not achieve satisfactory accuracy because of changes in illumination, head pose, and other real-world factors. Although deep-learning-based methods have achieved adequate object detection accuracy, they require powerful computing hardware to run in real time. Achieving satisfactory real-time eye and mouth detection accuracy on embedded platforms such as the NVIDIA Jetson TX2 requires improving deep-learning-based object detection algorithms. In this work, building on the original YOLOv3-Tiny architecture, we not only added deep residual learning but also computed six new anchor boxes from the training sets using the K-means algorithm. Six data augmentation methods were also applied to the training data to improve detection accuracy. The improved algorithm proposed in this work delivers satisfactory detection accuracy and speed on a TX2-class embedded platform, confirming the effectiveness of the proposed solutions.
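The anchor-box step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which distance metric or initialization the authors used, so this example assumes the IoU-based distance commonly used for YOLO-family anchor clustering; the function names `iou_wh` and `kmeans_anchors` and the synthetic box sizes are hypothetical and are not taken from the paper.

```python
import numpy as np


def iou_wh(boxes, centroids):
    """IoU between (N, 2) box sizes and (K, 2) centroid sizes.

    Only width and height are compared, i.e. all boxes are treated
    as if they shared the same top-left corner.
    """
    inter_w = np.minimum(boxes[:, None, 0], centroids[None, :, 0])
    inter_h = np.minimum(boxes[:, None, 1], centroids[None, :, 1])
    inter = inter_w * inter_h
    area_boxes = boxes[:, 0] * boxes[:, 1]
    area_centroids = centroids[:, 0] * centroids[:, 1]
    union = area_boxes[:, None] + area_centroids[None, :] - inter
    return inter / union


def kmeans_anchors(boxes, k=6, iters=100, seed=0):
    """Cluster (N, 2) width/height pairs into k anchors.

    Assumption: distance is 1 - IoU, so each box is assigned to the
    centroid with the highest IoU. Returns anchors sorted by area.
    """
    boxes = np.asarray(boxes, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize centroids from k distinct training boxes.
    centroids = boxes[rng.choice(len(boxes), size=k, replace=False)].copy()
    for _ in range(iters):
        assign = np.argmax(iou_wh(boxes, centroids), axis=1)
        updated = centroids.copy()
        for j in range(k):
            members = boxes[assign == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    order = np.argsort(centroids[:, 0] * centroids[:, 1])
    return centroids[order]


# Synthetic ground-truth box sizes in pixels, for illustration only.
demo_rng = np.random.default_rng(1)
demo_boxes = demo_rng.uniform(10.0, 100.0, size=(200, 2))
demo_anchors = kmeans_anchors(demo_boxes, k=6)
```

In a YOLOv3-Tiny style configuration the six resulting width/height pairs would typically be split between the two detection scales, with the smaller anchors assigned to the finer scale; how the authors made that assignment is not stated in the abstract.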

Keywords:

eye state, mouth state, YOLOv3-Tiny architecture, deep learning, object detection, embedded real-time systems

OCIS codes: 100.3008
