DOI: 10.1145/3422713.3422725
research-article

K-means Clustering Based Undersampling for Lower Back Pain Data

Published: 23 October 2020

Abstract

Many people suffer from low back pain (LBP), and it is important to identify LBP at an early stage. Classification algorithms in machine learning can help predict whether a person suffers from LBP, but class imbalance is a common problem in real-world datasets, including the LBP dataset. This paper proposes LBP diagnosis based on k-means clustering combined with undersampling. The first strategy (KSS) combines k-means with stratified random sampling to undersample. The second strategy (KMD) combines k-means with Manhattan distance to undersample. Experiments were conducted on the LBP dataset using classification systems, and performance was evaluated using the area under the curve (AUC) metric. The results show that the highest classification accuracy (0.92) is obtained when KSS is combined with logistic regression on the LBP dataset. KSS combined with a linear SVM has higher accuracy and stability.
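
The abstract names the two strategies but does not give their details, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that the majority class is clustered with k-means, that each cluster receives a share of the kept samples proportional to its size, that KSS draws that share at random within each cluster, and that KMD keeps the points closest to each centroid by Manhattan distance. All function names, parameters, and the synthetic data are hypothetical.

```python
import random

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean_sq(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, clusters)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]

    def assign():
        return [min(range(k), key=lambda c: euclidean_sq(p, centroids[c]))
                for p in points]

    for _ in range(iters):
        labels = assign()
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    labels = assign()
    clusters = [[p for p, l in zip(points, labels) if l == c] for c in range(k)]
    return centroids, clusters

def cluster_quotas(sizes, n_keep):
    """Split n_keep across clusters in proportion to cluster size."""
    total = sum(sizes)
    raw = [s * n_keep / total for s in sizes]
    quotas = [int(r) for r in raw]
    # Hand out the leftover slots by largest fractional remainder.
    order = sorted(range(len(sizes)), key=lambda i: raw[i] - quotas[i], reverse=True)
    for i in order[: n_keep - sum(quotas)]:
        quotas[i] += 1
    return quotas

def kss_undersample(majority, n_keep, k, seed=0):
    """Assumed KSS: k-means, then stratified random sampling per cluster."""
    _, clusters = kmeans(majority, k, seed=seed)
    quotas = cluster_quotas([len(c) for c in clusters], n_keep)
    rng = random.Random(seed)
    kept = []
    for members, q in zip(clusters, quotas):
        kept.extend(rng.sample(members, q))
    return kept

def kmd_undersample(majority, n_keep, k, seed=0):
    """Assumed KMD: k-means, then keep points nearest each centroid (L1)."""
    centroids, clusters = kmeans(majority, k, seed=seed)
    quotas = cluster_quotas([len(c) for c in clusters], n_keep)
    kept = []
    for centroid, members, q in zip(centroids, clusters, quotas):
        ranked = sorted(members, key=lambda p: manhattan(p, centroid))
        kept.extend(ranked[:q])
    return kept

# Synthetic stand-in for an imbalanced dataset: 60 majority, 20 minority samples.
data_rng = random.Random(42)
majority = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(60)]
n_minority = 20

kss = kss_undersample(majority, n_minority, k=3)
kmd = kmd_undersample(majority, n_minority, k=3)
print(len(kss), len(kmd))
```

In both variants the majority class is reduced to the size of the minority class, so a downstream classifier such as logistic regression or a linear SVM trains on balanced classes. The proportional quota is a design choice of this sketch; the paper may allocate samples per cluster differently.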


    Published In

    ICBDT '20: Proceedings of the 3rd International Conference on Big Data Technologies
    September 2020
    250 pages
    ISBN: 9781450387859
    DOI: 10.1145/3422713

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. K-means
    2. Low back pain
    3. Manhattan distance
    4. Stratified random sampling
    5. Undersampling

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Funding Sources

    • Natural Science Foundation of Shandong Province
    • Major Scientific and Technological Innovation Project of Shandong Province

    Conference

    ICBDT 2020
