Kuang-Yi Chou3, Huan-Chao Keh1, Nan-Ching Huang1, Shing-Hwa Lu2, Ding-An Chiang1 and Yuan-Cheng Cheng1

1Department of CSIE, Tamkang University, Tamsui, Taiwan 251, R.O.C.
2Department of Urology, School of Medicine, National Yang-Ming University and Department of Urology, Zhong Xiao Branch, Taipei City Hospital, Urological Center, Taipei, Taiwan 115, R.O.C.
3Center of General Education, National Taipei University of Nursing and Health Sciences, Taipei, Taiwan 112, R.O.C.


 

Received: December 2, 2010
Accepted: June 20, 2011
Publication Date: March 1, 2012

DOI: https://doi.org/10.6180/jase.2012.15.1.10


ABSTRACT


Data mining techniques are used extensively in medical applications, and one of the key tools is the decision tree. When a decision tree is represented as a collection of rules, the antecedents of individual rules may contain irrelevant values. When this complete set of rules is applied to medical examinations, the irrelevant values problem may impose an unnecessary economic burden on both the patient and society. We use hypothyroid disease as an example to study the irrelevant values problem of decision trees in medical examination. Hypothyroid disease is associated with the mechanism of thyroid-stimulating hormone (TSH). Physicians combine many kinds of information, such as the patient's clinical records, medical images, and symptoms, before the final diagnosis and treatment, especially surgical operation. Therefore, to avoid generating rules that suffer from the irrelevant values problem, we propose a new algorithm that removes irrelevant values while converting the decision tree to rules, using information already present in the decision tree. Our algorithm is able to handle both discrete and continuous values.


Keywords: Decision Tree, Classification, Irrelevant Values, Missing Branches, Medical Examination
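
The irrelevant values problem described in the abstract can be illustrated with a small sketch. This is not the algorithm proposed in the paper, which the abstract does not detail. It is a naive check under stated assumptions: the tree is a toy nested dictionary, all attributes are discrete, there are no missing branches, and the attribute names, values, class labels, and function names (`tree_to_rules`, `irrelevant_conditions`) are hypothetical.

```python
# Toy tree as nested dicts: {attribute: {value: subtree_or_class_label}}.
# Attribute names, values, and class labels are hypothetical.
toy_tree = {
    "A": {
        "a1": {"B": {"b1": "negative", "b2": "positive"}},
        "a2": {"B": {"b1": "negative", "b2": "negative"}},
    }
}


def tree_to_rules(tree, path=()):
    """Return one (conditions, label) rule per root-to-leaf path."""
    if not isinstance(tree, dict):
        return [(dict(path), tree)]  # leaf: the path so far is the antecedent
    (attribute, branches), = tree.items()  # each internal node tests one attribute
    rules = []
    for value, subtree in branches.items():
        rules.extend(tree_to_rules(subtree, path + ((attribute, value),)))
    return rules


def irrelevant_conditions(rules):
    """Naive check (an assumption, not the paper's method).

    A condition is flagged when dropping it from a rule leaves an
    antecedent that no rule with a different label is compatible with.
    """
    flagged = []
    for conds, label in rules:
        droppable = []
        for attribute in conds:
            reduced = {a: v for a, v in conds.items() if a != attribute}
            conflict = any(
                other_label != label
                and all(other.get(a, v) == v for a, v in reduced.items())
                for other, other_label in rules
            )
            if not conflict:
                droppable.append(attribute)
        flagged.append(droppable)
    return flagged


rules = tree_to_rules(toy_tree)
flags = irrelevant_conditions(rules)
for (conds, label), droppable in zip(rules, flags):
    print(conds, "->", label, "| individually droppable:", droppable)
```

In this toy tree, the rule for `A = a1` and `B = b1` does not need its condition on `A`, because `B = b1` leads to the same class under every value of `A`. In a medical setting, such a condition would correspond to an examination that does not affect the conclusion. Each flag is computed one condition at a time, so two flagged conditions in the same rule cannot necessarily both be dropped together.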

