Lujie Zhou1, Jianwu Dang1,2, and Zhenhai Zhang1

1School of Automation and Electrical Engineering, Lanzhou Jiaotong University, Lanzhou, 730070, P.R. China
2Gansu Provincial Engineering Research Center for Artificial Intelligence and Graphic Image Processing, Lanzhou, 730070, China


Received: December 10, 2019
Accepted: October 13, 2020
Publication Date: April 1, 2021

 Copyright The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are cited.



During high-speed train operation, on-board safety computers record large volumes of text-based log data. Machine learning methods can use these logs to help technicians make correct fault diagnosis decisions. However, the class imbalance of on-board log data degrades fault diagnosis performance, lowering the recognition accuracy for the fault class. To address this problem, this work proposes a fault diagnosis method for on-board equipment based on imbalanced text classification. First, the Term Frequency-Inverse Document Frequency (TF-IDF) algorithm is used to extract text features from the on-board log data and transform them into vectors. Then, an improved bagging ensemble model is established within the bagging framework, with the Kernel Extreme Learning Machine (KELM) as the base classifier. Majority-class samples are randomly under-sampled to create balanced subsets, and these subsets are used to train the base classifiers. In this way, one imbalanced classification problem is converted into several balanced classification problems, which ensures the diversity of the base classifiers and improves recognition of the fault class. Experiments on on-board log data from a railway bureau show that the model improves the accuracy, recall, precision, and F-measure of fault diagnosis, as well as its ROC curve and AUC.
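The pipeline described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a binary problem (label 1 for the minority fault class, label 0 for the majority class), a plain unsmoothed TF-IDF weighting, an RBF kernel, and majority voting across base classifiers. The names `tfidf`, `KELM`, `under_bagging`, and `predict`, the parameter values `C` and `gamma`, and the toy log lines are all hypothetical.

```python
import math
import random
from collections import Counter

def tfidf(docs):
    """Map tokenised, non-empty log lines to TF-IDF vectors (unsmoothed variant)."""
    vocab = sorted({w for d in docs for w in d})
    df = Counter(w for d in docs for w in set(d))  # document frequency per term
    n = len(docs)
    vectors = []
    for d in docs:
        tf = Counter(d)
        vectors.append([tf[w] / len(d) * math.log(n / df[w]) for w in vocab])
    return vectors

def rbf(a, b, gamma):
    """RBF kernel; the kernel choice is an assumption of this sketch."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

class KELM:
    """Binary kernel ELM: alpha = (I/C + K)^-1 t, with targets t in {-1, +1}."""
    def __init__(self, C=10.0, gamma=1.0):
        self.C, self.gamma = C, gamma

    def fit(self, X, y):
        self.X = X
        # Kernel matrix with the ridge term 1/C added on the diagonal.
        K = [[rbf(a, b, self.gamma) + (1.0 / self.C if i == j else 0.0)
              for j, b in enumerate(X)] for i, a in enumerate(X)]
        self.alpha = solve(K, [1.0 if t == 1 else -1.0 for t in y])
        return self

    def score(self, x):
        return sum(a * rbf(x, xi, self.gamma) for a, xi in zip(self.alpha, self.X))

def under_bagging(X, y, n_models=5, seed=0):
    """Train one KELM per balanced subset: all minority samples plus an
    equally sized random draw from the majority class."""
    rng = random.Random(seed)
    minority = [i for i, t in enumerate(y) if t == 1]
    majority = [i for i, t in enumerate(y) if t == 0]
    models = []
    for _ in range(n_models):
        idx = minority + rng.sample(majority, len(minority))
        models.append(KELM().fit([X[i] for i in idx], [y[i] for i in idx]))
    return models

def predict(models, x):
    """Majority vote over the base classifiers (an assumed combination rule)."""
    votes = sum(1 for m in models if m.score(x) > 0)
    return 1 if 2 * votes > len(models) else 0

# Hypothetical toy logs: five normal lines (label 0) and two fault lines (label 1).
logs = [
    "speed supervision normal", "balise telegram received", "brake test passed",
    "speed profile updated", "radio link normal",
    "balise telegram timeout", "radio link timeout",
]
labels = [0, 0, 0, 0, 0, 1, 1]
X = tfidf([line.split() for line in logs])
ensemble = under_bagging(X, labels, n_models=5)
print([predict(ensemble, x) for x in X])
```

Because each base classifier sees every fault sample but a different random draw of normal samples, the subsets differ from one another, which is where the diversity of the ensemble comes from. The sketch requires the minority class to be no larger than the majority class, since the majority indices are sampled without replacement.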

Keywords: On-board Equipment; Fault Diagnosis; Imbalanced Text Classification; Kernel Extreme Learning Machine; Bagging Ensemble




Enter your name and email below to receive latest published articles in Journal of Applied Science and Engineering.