MRCNNAM: Mask Region Convolutional Neural Network Model Based On Attention Mechanism And Gabor Feature For Pedestrian Detection

Ye Wang

doi:10.6180/jase.202311_26(11).0005

Computer Science and Information Engineering

Figure: Flow chart of the improved Mask RCNN

Ye Wang

Xiangyang Auto Vocational Technical College, Xiangyang 441100, Hubei Province, China

Received: November 4, 2022
Accepted: November 8, 2022
Publication Date: March 9, 2023

Copyright The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are cited.

ABSTRACT

A pedestrian detection algorithm based on an improved Mask RCNN framework and Gabor features is proposed to address poor pedestrian detection performance in complex scenes. First, the k-means algorithm is used to cluster the target boxes of the pedestrian data set to obtain an appropriate aspect ratio. Second, a fully convolutional network (FCN) is used to segment the foreground object, and the local mask of each pedestrian is obtained by pixel-wise prediction. Finally, the overall pedestrian mask is obtained by learning the local features of the pedestrian with an attention mechanism and Gabor features. To verify its effectiveness, the improved algorithm is compared with representative object detection methods (such as Faster RCNN, YOLOv2, and RFCN) on the same data set. The experimental results show that the improved algorithm increases both the speed and the accuracy of pedestrian detection (accuracy above 84%) and reduces the false detection rate.
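
The first step, clustering the target boxes to find suitable aspect ratios, can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not state the distance measure, the number of clusters, or the initialisation, so the example assumes plain one-dimensional k-means on height/width ratios with a deterministic start. The function name `kmeans_aspect_ratios` and the `(width, height)` input format are hypothetical.

```python
from typing import List, Sequence, Tuple


def kmeans_aspect_ratios(
    boxes: Sequence[Tuple[float, float]], k: int = 3, iters: int = 50
) -> List[float]:
    """Cluster box aspect ratios (height / width) with plain 1-D k-means.

    `boxes` holds hypothetical (width, height) pairs of ground-truth boxes.
    Returns the k cluster centres in ascending order.
    """
    ratios = sorted(h / w for w, h in boxes if w > 0 and h > 0)
    if len(ratios) < k:
        raise ValueError("need at least k valid boxes")

    # Assumed initialisation: spread the centres evenly over the sorted ratios.
    centres = [ratios[(2 * i + 1) * len(ratios) // (2 * k)] for i in range(k)]

    for _ in range(iters):
        # Assignment step: each ratio joins its nearest centre.
        groups: List[List[float]] = [[] for _ in range(k)]
        for r in ratios:
            nearest = min(range(k), key=lambda j: abs(r - centres[j]))
            groups[nearest].append(r)

        # Update step: move each centre to the mean of its group
        # (an empty group keeps its previous centre).
        updated = [
            sum(g) / len(g) if g else c for g, c in zip(groups, centres)
        ]
        if updated == centres:
            break
        centres = updated

    return sorted(centres)
```

The returned centres could then serve as candidate anchor aspect ratios for a region proposal stage. Clustering on an IoU-based distance over (width, height) pairs is a common alternative; whether the paper uses it is not stated in the abstract.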

Keywords: Pedestrian detection; Mask RCNN; FCN; Attention mechanism; Gabor feature
