- [1] Y. Guo, Y. Liu, T. Georgiou, and M. S. Lew, (2018) “A review of semantic segmentation using deep neural networks" International journal of multimedia in formation retrieval 7: 87–93. DOI: 10.1007/s13735-017-0141-z.
- [2] Y. Xu, M.Li, L. Cui, S. Huang, F. Wei, and M. Zhou. “Layoutlm: Pre-training of text and layout for doc ument image understanding”. In: Proceedings of the 26th ACMSIGKDDinternational conference on knowl edge discovery & data mining. 2020, 1192–1200. DOI: 10.1145/3394486.3403172.
- [3] H.Shao, L. Wang, R. Chen, H. Li, and Y. Liu. “Safety enhanced autonomous driving using interpretable sensor fusion transformer”. In: Conference on Robot Learning. PMLR. 2023, 726–737. DOI: 10.48550/arXiv.2207.14024.
- [4] L. Hou, Y. Cheng, N. Shazeer, N. Parmar, Y. Li, P. Korfiatis, T. M. Drucker, D. J. Blezek, and X. Song. High Resolution Medical Image Analysis with Spatial Partitioning. 2019. DOI: 10.48550/arXiv.1909.03108. arXiv: 1909.03108 [eess.IV].
- [5] Y. Yao, M. Xu, C. Choi, D. J. Crandall, E. M. Atkins, and B. Dariush. “Egocentric Vision-based Future Ve hicle Localization for Intelligent Driving Assistance Systems”. In: 2019 International Conference on Robotics and Automation (ICRA). 2019, 9711–9717. DOI: 10.1109/ICRA.2019.8794474.
- [6] E. Shelhamer, J. Long, and T. Darrell, (2017) “Fully Convolutional Networks for Semantic Segmentation" IEEETransactionsonPatternAnalysisandMachine Intelligence 39(4): 640–651. DOI: 10.1109/TPAMI.2016.2572683.
- [7] J. Fu, J. Liu, H. Tian, Y. Li, Y. Bao, Z. Fang, and H. Lu. “Dual Attention Network for Scene Segmentation”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2019. DOI: 10.1109/CVPR.2019.00326.
- [8] H.Zhao, J. Shi, X. Qi, X. Wang, and J. Jia. “Pyramid Scene Parsing Network”. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017. DOI: 10.1109/CVPR.2017.660.
- [9] L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, (2018) “DeepLab: Semantic Image Seg mentation with Deep Convolutional Nets, Atrous Convo lution, and Fully Connected CRFs" IEEE Transactions on Pattern Analysis and Machine Intelligence 40(4): 834–848. DOI: 10.1109/TPAMI.2017.2699184.
- [10] H. Huang, L. Lin, R. Tong, H. Hu, Q. Zhang, Y. Iwamoto, X. Han, Y.-W. Chen, and J. Wu. “UNet 3+: A Full-Scale Connected UNet for Medical Im age Segmentation”. In: ICASSP 2020- 2020 IEEE In ternational Conference on Acoustics, Speech and Signal Processing (ICASSP). 2020, 1055–1059. DOI: 10.1109/ICASSP40776.2020.9053405.
- [11] H. Zhao, X. Qi, X. Shen, J. Shi, and J. Jia. “IC Net for Real-Time Semantic Segmentation on High Resolution Images”. In: Proceedings of the European Conference on Computer Vision (ECCV). 2018. DOI: 10.48550/arXiv.1704.08545.
- [12] C. Yu, J. Wang, C. Peng, C. Gao, G. Yu, and N. Sang. “BiSeNet: Bilateral Segmentation Network for Real time Semantic Segmentation”. In: Proceedings of the European Conference on Computer Vision (ECCV). 2018. DOI: 10.1007/978-3-030-01261-8_20.
- [13] A. Paszke, A. Chaurasia, S. Kim, and E. Culurciello. ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation. 2016. DOI: 10.48550/arXiv.1606.02147. arXiv: 1606.02147 [cs.CV].
- [14] X.Ding,C.Xia,X.Zhang,X.Chu,J.Han,andG.Ding. RepMLP: Re-parameterizing Convolutions into Fully connected Layers for Image Recognition. 2022. DOI: 10.48550/arXiv.2105.01883. arXiv: 2105.01883 [cs.CV].
- [15] G. Bender, H. Liu, B. Chen, G. Chu, S. Cheng, P.-J. Kindermans, and Q. V. Le. “Can Weight Sharing Out perform Random Architecture Search? An Investiga tion With TuNAS”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2020. DOI: 10.1109/CVPR42600.2020.01433.
- [16] Y. Chen, W. Li, and L. Van Gool. “ROAD: Reality Oriented Adaptation for Semantic Segmentation of Urban Scenes”. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2018. DOI: 10.1109/CVPR.2018.00823.
- [17] J. Sun and Y. Li, (2021) “Multi-feature fusion network for road scene semantic segmentation" Computers & Electrical Engineering 92: 107155. DOI: 10.1016/j.compeleceng.2021.107155.
- [18] H. Pan, Y. Hong, W. Sun, and Y. Jia, (2023) “Deep Dual-Resolution Networks for Real-Time and Accurate Semantic Segmentation of Traffic Scenes" IEEE Transac tions on Intelligent Transportation Systems 24(3): 3448–3460. DOI: 10.1109/TITS.2022.3228042.
- [19] L. Sun, K. Yang, X. Hu, W. Hu, and K. Wang, (2020) “Real-Time Fusion Network for RGB-D Semantic Segmen tation Incorporating Unexpected Obstacle Detection for Road-Driving Images" IEEE Robotics and Automa tion Letters 5(4): 5558–5565. DOI: 10.1109/LRA.2020.3007457.
- [20] M. Orsic, I. Kreso, P. Bevandic, and S. Segvic. “In Defense of Pre-Trained ImageNet Architectures for Real-Time Semantic Segmentation of Road-Driving Images”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2019. DOI: 10.1109/CVPR.2019.01289.
- [21] H.Zhang,C.Wu,Z.Zhang,Y. Zhu, H. Lin, Z. Zhang, Y. Sun, T. He, J. Mueller, R. Manmatha, M. Li, and A. Smola. “ResNeSt: Split-Attention Networks”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops. 2022, 2736–2746. DOI: 10.1109/CVPRW56347.2022.00309.
- [22] B.Cheng,M.D.Collins,Y.Zhu,T.Liu,T.S.Huang,H. Adam, andL.-C. Chen. “Panoptic-DeepLab: A Sim ple, Strong, and Fast Baseline for Bottom-Up Panop tic Segmentation”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2020. DOI: 10.1109/CVPR42600.2020.01249.
- [23] S. Borse, Y. Wang, Y. Zhang, and F. Porikli. “Inverse Form: A Loss Function for Structured Boundary AwareSegmentation”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2021, 5901–5911. DOI: 10.1109/CVPR46437.2021.00584.
- [24] Z. Chen, Y. Duan, W. Wang, J. He, T. Lu, J. Dai, and Y. Qiao. Vision Transformer Adapter for Dense Predictions. 2023. DOI: 10.48550/arXiv.2205.08534. arXiv: 2205. 08534 [cs.CV].
- [25] E. Xie, W. Wang, Z. Yu, A. Anandkumar, J. M. Al varez, and P. Luo. “SegFormer: Simple and Efficient Design for Semantic Segmentation with Transform ers”. In: Advances in Neural Information Processing Sys tems. Ed. by M. Ranzato, A. Beygelzimer, Y. Dauphin, P. Liang, and J. W. Vaughan. 34. Curran Associates, Inc., 2021, 12077–12090. DOI: 10.48550/arXiv.2105.15203.
- [26] D. Bashkirova, M. Abdelfattah, Z. Zhu, J. Akl, F. Al ladkani, P. Hu, V. Ablavsky, B. Calli, S. A. Bargal, and K. Saenko. “ZeroWaste Dataset: Towards Deformable Object Segmentation inCluttered Scenes”. In: Proceed ings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2022, 21147–21157. DOI: 10.1109/CVPR52688.2022.02047.
- [27] S. Sabour, N. Frosst, and G. E. Hinton. “Dynamic Routing Between Capsules”. In: Advances in Neural Information Processing Systems. Ed. by I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vish wanathan,andR.Garnett.30.CurranAssociates,Inc., 2017. DOI: 10.48550/arXiv.1710.09829.
- [28] V. Mazzia, F. Salvetti, and M. Chiaberge, (2021) “Efficient-capsnet: Capsule network with self-attention routing" Scientific reports 11(1): 14634. DOI: 10.1038/s41598-021-93977-0.
- [29] J. Rajasegaran, V. Jayasundara, S. Jayasekara, H. Jayasekara, S. Seneviratne, and R. Rodrigo. “Deep Caps: Going Deeper With Capsule Networks”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2019. DOI: 10.1109/CVPR.2019.01098.
- [30] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. 2017. DOI: 10.48550/arXiv.1704.04861. arXiv: 1704.04861 [cs.CV].
- [31] A. Kirillov, Y. Wu, K. He, and R. Girshick. “PointRend: Image Segmentation As Rendering”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2020. DOI: 10.1109/CVPR42600.2020.00982.
- [32] L. Reiher, B. Lampe, and L. Eckstein. “A Sim2Real Deep Learning Approach for the Transformation of Images fromMultiple Vehicle-Mounted Cameras to a Semantically Segmented Image in Bird’s Eye View”. In: 2020 IEEE 23rd International Conference on Intel ligent Transportation Systems (ITSC). 2020, 1–7. DOI: 10.1109/ITSC45102.2020.9294462.
- [33] M. Yang, K. Yu, C. Zhang, Z. Li, and K. Yang. “DenseASPP for Semantic Segmentation in Street Scenes”. In: Proceedings of the IEEE Conference on Com puter Vision and Pattern Recognition (CVPR). 2018. DOI: 10.1109/CVPR.2018.00388.
- [34] T. Takikawa, D. Acuna, V. Jampani, and S. Fidler. “Gated-SCNN: Gated Shape CNNs for Semantic Seg mentation”. In: Proceedings of the IEEE/CVF Interna tional Conference on Computer Vision (ICCV). 2019. DOI: 10.1109/ICCV.2019.00533.
- [35] J. Lee, D. Kim, J. Ponce, and B. Ham. “SFNet: Learn ing Object-Aware Semantic Correspondence”. In: Pro ceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2019. DOI: 10.1109/CVPR.2019.00238.
- [36] R.P.K.Poudel,S.Liwicki,andR.Cipolla.Fast-SCNN: Fast Semantic Segmentation Network. 2019. DOI: 10.48550/arXiv.1902.04502. arXiv: 1902.04502 [cs.CV].
- [37] E. Romera, J. M. Álvarez, L. M. Bergasa, and R. Ar royo, (2018) “ERFNet: Efficient Residual Factorized Con vNet for Real-Time Semantic Segmentation" IEEE Trans actions on Intelligent Transportation Systems 19(1): 263–272. DOI: 10.1109/TITS.2017.2750080.
- [38] Z. Guo, Z. Chen, T. Yu, J. Chen, and S. Liu. “Progres sive Image Inpainting with Full-Resolution Residual Network”. In: Proceedings of the 27th ACM Interna tional Conference on Multimedia. MM ’19. Nice, France: Association for Computing Machinery, 2019, 2496 2504. DOI: 10.1145/3343031.3351022.
- [39] S. Wang, L. Yi, Q. Chen, Z. Meng, H. Dong, and Z. He. “Edge-aware Fully Convolutional Network with CRF-RNNLayer for Hippocampus Segmenta tion”. In: 2019 IEEE 8th Joint International Information Technology and Artificial Intelligence Conference (ITAIC). 2019, 803–806. DOI: 10.1109/ITAIC.2019.8785801.
- [40] H.Zhao, J. Shi, X. Qi, X. Wang, and J. Jia. “Pyramid scene parsing network”. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2017, 2881–2890. DOI: 10.48550/arXiv.1612.01105.
- [41] H.Wang, X. Jiang, H. Ren, Y. Hu, and S. Bai. “Swift Net: Real-Time Video Object Segmentation”. In: Pro ceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2021, 1296–1305. DOI: 10.1109/CVPR46437.2021.00135.
- [42] H. Li, P. Xiong, H. Fan, and J. Sun. “DFANet: Deep Feature AggregationforReal-TimeSemanticSegmen tation”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2019. DOI: 10.1109/CVPR.2019.00975.
- [43] S. Yan, C. Wu, L. Wang, F. Xu, L. An, K. Guo, and Y. Liu. “DDRNet: Depth Map Denoising and Refine ment for Consumer Depth Cameras Using Cascaded CNNs”. In: Proceedings of the European Conference on Computer Vision (ECCV). 2018. DOI: 10.1007/978-3 030-01249-6_10.