DOI: 10.1145/3608298.3608304
research-article

The Impact of Feature Normalization on Different Feature Types of Medical Datasets

Published: 18 October 2023

Abstract

To obtain quality data mining results, data pre-processing is usually performed as part of the knowledge discovery in databases (KDD) process. Feature normalization, or scaling, is one important pre-processing step: many datasets contain features with broad ranges of values, and feature normalization rescales each feature value to a fixed range, usually between 0 and 1. Medical domain datasets usually fall into three types: categorical, numerical, and mixed. This paper examines the effect of feature normalization on these three types of medical datasets. Our experimental results indicate that, for the categorical datasets and some of the mixed datasets, feature normalization does not necessarily make the k-NN and SVM classifiers perform better than the same classifiers without normalization. For the numerical datasets, on the other hand, k-NN and SVM with feature normalization perform better than the baseline classifiers.
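
The abstract does not say which normalization formula the paper uses. As an illustration only, the sketch below assumes min-max scaling, one common way to rescale each feature to the range 0 to 1. The function name `min_max_normalize`, the sample records, and the choice to map a constant feature to 0.0 are all assumptions made for this example, not details taken from the paper.

```python
from typing import List, Sequence


def min_max_normalize(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Rescale each column (feature) of `rows` to the range [0, 1]."""
    if not rows:
        return []
    # Transpose so that each tuple holds all values of one feature.
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        scaled.append([
            # A constant feature has no spread; mapping it to 0.0 is an
            # assumed convention, not something the paper specifies.
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, lows, highs)
        ])
    return scaled


# Hypothetical records: (age in years, serum cholesterol in mg/dL).
patients = [[30.0, 180.0], [45.0, 240.0], [60.0, 300.0]]
normalized = min_max_normalize(patients)
print(normalized)  # [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
```

Before scaling, the cholesterol column spans 120 units and the age column only 30, so a Euclidean distance such as the one k-NN typically uses would be dominated by cholesterol. After scaling, both features span the same range and contribute comparably.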

    Published In

    ICMHI '23: Proceedings of the 2023 7th International Conference on Medical and Health Informatics
    May 2023
    386 pages
    ISBN: 9798400700712
    DOI: 10.1145/3608298

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. data preprocessing
    2. feature normalization
    3. medical datasets
    4. pattern classification

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICMHI 2023

    Bibliometrics

    Article Metrics (reflecting downloads up to 20 Jan 2025)

    • Total Citations: 0
    • Total Downloads: 26
    • Downloads (Last 12 months): 16
    • Downloads (Last 6 weeks): 1