Crowd Density Estimation Based On Multi-scale Information Fusion And Matching Network In Scenic Spots

Xiaojun Wang

doi:10.6180/jase.202306_26(6).0012

Computer Science and Information Engineering

Xiaojun Wang¹

¹College of Management, Dalian University of Finance and Economics, Dalian 116622, Liaoning, China

Received: July 12, 2022
Accepted: August 11, 2022
Publication Date: September 20, 2022

Copyright The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are cited.

ABSTRACT

Crowd density estimation has important applications in intelligent security, traffic safety, and safety management at tourist attractions. Its task is to estimate the crowd density distribution by extracting and analyzing crowd features. Traditional crowd density estimation methods are sensitive to perspective changes in 2D images, which leads to loss of spatial feature information and makes scale features and crowd features difficult to extract. In this paper, we propose a novel crowd density estimation method for scenic spots based on multi-scale information fusion and a matching network. A multi-scale feature extraction module extracts multi-scale features from different convolutional layers of the matching network. Combining multi-scale asymmetric convolution with dilated convolution at different dilation rates enhances the expressive power of the extracted semantic and scale information. Finally, in the multi-scale information fusion network, a semantic embedding method introduces spatial information into the high-level semantic features and high-level semantic information into the low-level spatial features to enhance feature expression. The scale information is then integrated with the global spatial context information to obtain a high-quality density map and predict the crowd count more accurately. Experiments on public datasets show that the proposed model adapts well to scenic spots with large differences in crowd distribution. The average MSE of the proposed method is below 15, the lowest value in the experiments, and the model can extract features from different scenes to estimate the density distribution and count crowds accurately.

Keywords: crowd density estimation; multi-scale information fusion; matching network; semantic embedding
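
The abstract combines asymmetric convolution with dilated convolution at several dilation rates. The following is a minimal pure-Python sketch of that general idea, not the network described in the paper: the function names, the all-ones kernels, and the dilation rates (1, 2, 3) are illustrative assumptions. It shows how a dilated kernel of size k covers a span of k + (k - 1)(d - 1) pixels at dilation rate d, and how the same routine handles asymmetric 1×k and k×1 kernels.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv2d(image, kernel, dilation=1):
    """Valid (no padding, stride 1) 2-D cross-correlation with a dilated kernel.

    `image` and `kernel` are lists of lists of numbers. The kernel may be
    square (k x k) or asymmetric (1 x k or k x 1).
    """
    kh, kw = len(kernel), len(kernel[0])
    eff_h = effective_kernel_size(kh, dilation)
    eff_w = effective_kernel_size(kw, dilation)
    out_h = len(image) - eff_h + 1
    out_w = len(image[0]) - eff_w + 1
    if out_h <= 0 or out_w <= 0:
        raise ValueError("image is smaller than the dilated kernel")
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            acc = 0.0
            for u in range(kh):
                for v in range(kw):
                    # Kernel taps are spaced `dilation` pixels apart.
                    acc += kernel[u][v] * image[i + u * dilation][j + v * dilation]
            out[i][j] = acc
    return out


def multi_scale_features(image, kernel, dilations=(1, 2, 3)):
    """Apply one kernel at several dilation rates (rates are assumed values)."""
    return {d: dilated_conv2d(image, kernel, d) for d in dilations}


# Illustrative input: a 7 x 7 image of ones and all-ones kernels.
image = [[1.0] * 7 for _ in range(7)]
square = [[1.0] * 3 for _ in range(3)]   # 3 x 3 kernel
row = [[1.0, 1.0, 1.0]]                  # asymmetric 1 x 3 kernel
col = [[1.0], [1.0], [1.0]]              # asymmetric 3 x 1 kernel

features = multi_scale_features(image, square)
# Applying 1 x 3 and then 3 x 1 covers the same 3 x 3 neighbourhood
# with 6 weights instead of 9.
asymmetric = dilated_conv2d(dilated_conv2d(image, row), col)
```

With a fixed 3×3 kernel, the output shrinks as the dilation rate grows because each output value draws on a wider neighbourhood. This is why stacking branches with different rates is a common way to capture people at different apparent scales without adding weights.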
