T. Gayathri1 and D. Lalitha Bhaskari1

1Research Scholar, Department of CS&SE, AUCE (A), Visakhapatnam, Andhra Pradesh, India


Received: July 17, 2021
Accepted: September 18, 2021
Publication Date: December 6, 2021

 Copyright The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are cited.

DOI: https://doi.org/10.6180/jase.202208_25(4).0019


Big data in healthcare refers to the massive quantity of healthcare data accumulated from many sources, such as electronic health records (EHR), medical imaging, genomic sequencing, pharmacological research, wearables, and medical devices. MapReduce is one of the models commonly employed to process and classify big data. Data clustering, a significant data mining technique, has been extensively investigated in recent years for handling the diversity of data and the varied requirements of applications. In this view, this paper develops an enhanced metaheuristic algorithm based clustering and classification model with MapReduce (EMACC-MR) framework for big data environments. The presented EMACC-MR model combines an oppositional cuckoo search optimization algorithm (OCSOA) with Density-Based Spatial Clustering of Applications with Noise (DBSCAN) for clustering and a wavelet kernel extreme learning machine (WKELM) model for classification. The inclusion of the oppositional based learning (OBL) concept helps to improve the convergence rate of the cuckoo search optimization algorithm (CSOA). A Hadoop MapReduce environment is employed to handle big data. The proposed OCSOA model improves the clustering quality, and the MapReduce architecture copes with the large-scale dataset. To validate the proposed model experimentally, two benchmark datasets, namely the Activity Recognition and diabetes datasets, are used. The simulation outcomes confirmed that the presented model outperforms the compared methods in terms of several evaluation parameters.
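As an illustration of the oppositional based learning idea mentioned above, the following is a minimal sketch of OBL-style population initialization for a minimization problem. It is not the authors' implementation: the function names (`opposite`, `obl_init`), the `sphere` objective, the bounds, and the population size are all assumptions chosen for the example. The sketch uses the standard definition of an opposite point, `lower + upper - x` in each dimension, and keeps the fittest candidates from the union of a random population and its opposites.

```python
import random


def opposite(x, lower, upper):
    """Return the opposite point of x inside the box [lower, upper], per dimension."""
    return [lo + hi - xi for xi, lo, hi in zip(x, lower, upper)]


def obl_init(fitness, lower, upper, n, seed=0):
    """Sample n random candidates, add their opposites, keep the n fittest.

    Fitness is minimized. All names here are hypothetical, for illustration only.
    """
    rng = random.Random(seed)
    population = [
        [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        for _ in range(n)
    ]
    # Evaluate each candidate together with its opposite point.
    candidates = population + [opposite(x, lower, upper) for x in population]
    return sorted(candidates, key=fitness)[:n]


def sphere(x):
    """Toy objective (assumption): sum of squares, minimum at the origin."""
    return sum(v * v for v in x)


# Hypothetical usage: four starting "nests" in a 2-D search space.
nests = obl_init(sphere, lower=[-5.0, -5.0], upper=[5.0, 5.0], n=4)
```

Evaluating a candidate and its opposite together gives the search a better chance of starting near a good region, which is the usual argument for why OBL can speed up convergence of a metaheuristic such as cuckoo search.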

Keywords: Big data analytics, Neural Network, Clustering, Classification process, Metaheuristics, MapReduce

