An Integration of Archerfish Hunter Spotted Hyena Optimization and Improved ELM Classifier for Multicollinear Big Data Classification Tasks

Published in Neural Processing Letters

Abstract

Big data mining has emerged as an active field of research, and traditional data mining approaches frequently fail to handle the complexity of massive datasets. One of the most widely used strategies for big data classification is MapReduce, which combines a map phase and a reduce phase: the map phase filters and sorts the data, and the reduce phase combines the partial outputs into the final classification result. This paper proposes a novel MapReduce framework for big data classification built on an Archerfish Hunter Spotted Hyena Optimization-based Improved Extreme Learning Machine (AHSHO-IELM) classifier. The IELM algorithm integrates the Extreme Learning Machine (ELM) technique with Principal Component Analysis (PCA) to overcome the multicollinearity problem and to reduce training and testing time. The AHSHO method combines the Archerfish Hunter Optimization and Spotted Hyena Optimization algorithms to improve the selection of optimal parameters for the IELM classifier, which increases classification accuracy and reduces the error rate. The performance of the proposed framework is evaluated using Accuracy, Sensitivity, Specificity, Computational time, F1-Score, Matthews correlation coefficient, and scale-up factor. For a mapper value (X) of 5, the proposed approach yields accuracy, specificity, and sensitivity of 99%, 99%, and 98.3% on the Rotten Tomatoes movie review dataset and 99.3%, 99%, and 98% on the Dermatology dataset. According to the results of the analysis, the proposed big data classifier is effective for both single-class and multi-class classification.
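The idea of pairing PCA with an ELM to cope with multicollinear features can be illustrated with a minimal NumPy sketch. This is not the authors' IELM implementation: the function names, the synthetic data, the tanh activation, the ridge term, and the fixed hidden-layer size are all assumptions made for illustration, and the AHSHO parameter search and the MapReduce layer are left out entirely. The sketch only shows that projecting near-duplicate features onto a few principal components yields well-conditioned inputs for the closed-form ELM solve.

```python
import numpy as np


def pca_fit(X, n_components):
    """Fit PCA on X; return the mean, projection matrix, and per-component scale."""
    mean = X.mean(axis=0)
    # Rows of Vt are the orthonormal principal directions of the centered data.
    _, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    components = Vt[:n_components].T
    scale = S[:n_components] / np.sqrt(X.shape[0] - 1)  # std. dev. of each score
    return mean, components, scale


def pca_transform(X, mean, components, scale):
    """Project X onto the retained components and scale scores to unit variance."""
    return (X - mean) @ components / scale


def elm_fit(Z, y, n_classes, n_hidden, rng, ridge=1e-3):
    """Train an ELM: random fixed hidden layer, closed-form output weights."""
    W = rng.normal(size=(Z.shape[1], n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random hidden biases
    H = np.tanh(Z @ W + b)                       # hidden-layer output matrix
    T = np.eye(n_classes)[y]                     # one-hot targets
    # Ridge-regularized least squares for the output weights (assumed variant).
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden), H.T @ T)
    return W, b, beta


def elm_predict(Z, W, b, beta):
    """Return the predicted class index for each row of Z."""
    return np.argmax(np.tanh(Z @ W + b) @ beta, axis=1)


def demo(seed=0):
    """Hypothetical demo on synthetic, deliberately multicollinear data."""
    rng = np.random.default_rng(seed)
    n = 400
    y = rng.integers(0, 2, size=n)
    latent = rng.normal(size=(n, 2)) + 4.0 * y[:, None]
    mix = np.array([[1.0, 1.0], [1.0, -1.0]])
    # Six features built from two latent ones: near-copies and exact linear mixes.
    X = np.hstack([latent, latent + 1e-6 * rng.normal(size=(n, 2)), latent @ mix])
    X_train, X_test, y_train, y_test = X[:300], X[300:], y[:300], y[300:]

    mean, components, scale = pca_fit(X_train, n_components=2)
    Z_train = pca_transform(X_train, mean, components, scale)
    Z_test = pca_transform(X_test, mean, components, scale)

    W, b, beta = elm_fit(Z_train, y_train, n_classes=2, n_hidden=20, rng=rng)
    accuracy = float(np.mean(elm_predict(Z_test, W, b, beta) == y_test))
    return {
        "accuracy": accuracy,
        "cond_raw": float(np.linalg.cond(np.cov(X_train, rowvar=False))),
        "cond_pca": float(np.linalg.cond(np.cov(Z_train, rowvar=False))),
    }


if __name__ == "__main__":
    print(demo())
```

Comparing `cond_raw` with `cond_pca` in the returned dictionary shows how the covariance of the raw features is close to singular while the covariance of the PCA scores is not. In the proposed framework, quantities such as the hidden-layer size would presumably be chosen by the AHSHO search; here they are fixed by hand.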

Author information

Corresponding author

Correspondence to S. Chidambaram.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Human and Animal Rights

This article does not contain any studies with human or animal subjects performed by any of the authors.

Informed Consent

Informed consent does not apply as this was a retrospective review with no identifying patient information.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chidambaram, S., Gowthul Alam, M.M. An Integration of Archerfish Hunter Spotted Hyena Optimization and Improved ELM Classifier for Multicollinear Big Data Classification Tasks. Neural Process Lett 54, 2049–2077 (2022). https://doi.org/10.1007/s11063-021-10718-0

  • DOI: https://doi.org/10.1007/s11063-021-10718-0