Class Imbalance Alleviation in Object Detection via YOLOv11-based Deep Dynamic Feature Fusion

2026

2026-03-15

Qi Yang¹, Bingkun Jiang¹, Jiatong Tang¹, Jianxi Huang², and Minghao Li¹

¹Shenyang Ligong University

²Fuzhou University

Received: December 31, 2025
Accepted: January 12, 2026
Publication Date: March 15, 2026

Comparison of feature fusion network architectures. (A) FPN with a top-down pathway. (B) PANet adds a bottom-up pathway. (C) Our proposed SAF-Neck, which employs dynamic convolution to generate input-adaptive fusion strategies, illustrated by two different scenarios for coal and gangue class features.

Copyright The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are cited.

DOI: https://doi.org/10.6180/jase.202608_31.019


Object detection models often suffer from severe sample imbalance, background interference, and target occlusion, problems that are particularly prevalent in complex industrial and medical imaging domains. Existing solutions typically handle imbalance through data resampling or loss function re-weighting, and often overlook a fundamental bottleneck within the network architecture itself. Traditional feature fusion necks, such as FPN and PANet, rely on static convolutions that inevitably become biased towards the majority class during training, leading to the marginalization or loss of minority-class features. To address this issue at the feature-fusion level, we propose the Semantic-Aware Fusion Neck (SAF-Neck), which replaces the static fusion paradigm with a dynamic, input-adaptive mechanism. By generating content-aware convolutional kernels for each input, SAF-Neck adaptively enhances the discriminative features of minority-class samples, preventing them from being suppressed by the majority class. We integrate this core innovation into a synergistic architecture with a Lightweight Probabilistic Spatial Attention-HGNetv2 (LPSA-HGNetv2) backbone and an imbalance-robust loss function, forming a comprehensive “front-end feature enhancement and back-end optimization” pipeline. We validate our model, SAF-YOLOv11, on a highly challenging industrial task of coal and gangue classification, characterized by a severe class imbalance ratio of up to 1:22. Experimental results show that our model achieves a 90.4% F1-score with a computational load of only 5.7 GFLOPs, outperforming the baseline by 4.1% in F1-score while being 13.6% more computationally efficient.
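The contrast between static and input-adaptive fusion can be illustrated with a small sketch. This is not the SAF-Neck implementation, whose details are not given in the abstract; it is a generic mixture-of-kernels dynamic convolution, assuming a 1x1 fusion over two concatenated feature maps. The names `dynamic_fusion`, `kernels`, and `router` are hypothetical and chosen only for illustration.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - z.max())
    return e / e.sum()


def dynamic_fusion(low, high, kernels, router):
    """Fuse two (C, H, W) feature maps with an input-dependent 1x1 kernel.

    kernels: (K, C_out, 2C) bank of candidate 1x1 fusion kernels (assumed).
    router:  (K, 2C) linear map from pooled features to K mixing logits (assumed).
    """
    x = np.concatenate([low, high], axis=0)         # (2C, H, W)
    pooled = x.mean(axis=(1, 2))                    # global average pool -> (2C,)
    alpha = softmax(router @ pooled)                # per-input mixing weights (K,)
    kernel = np.tensordot(alpha, kernels, axes=1)   # blended kernel (C_out, 2C)
    fused = np.einsum("oc,chw->ohw", kernel, x)     # apply as a 1x1 convolution
    return fused, alpha


rng = np.random.default_rng(0)
C, H, W, K = 4, 8, 8, 3
kernels = rng.normal(size=(K, C, 2 * C))
router = rng.normal(size=(K, 2 * C))

# Two toy inputs with different content: activity in one branch or the other.
ones, zeros = np.ones((C, H, W)), np.zeros((C, H, W))
fused_a, alpha_a = dynamic_fusion(ones, zeros, kernels, router)
fused_b, alpha_b = dynamic_fusion(zeros, ones, kernels, router)
```

A static neck corresponds to fixing `alpha` for every input. Here the two toy inputs receive different mixing weights and therefore different effective kernels, which is the property the abstract relies on to keep minority-class features from being averaged away.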

Keywords: Coal and Gangue Detection; Semantic-Aware Fusion; Lightweight Network; Class Imbalance; YOLOv11n
