DOI: 10.1145/3608298.3608315
research-article

A Novel Prediction Approach for Effective Medical Data Mining

Published: 18 October 2023

Abstract

Data mining techniques have been applied to many medical problems, especially disease prediction. For instance, given a dataset of normal and cancerous patients, the goal is to develop a model that predicts whether a new (unknown) patient belongs to the normal or the cancerous class. In general, the model is constructed by applying a machine learning technique to a collected training set. However, the quality of the training set can affect the model's final prediction performance: if the training set contains a certain amount of noisy data (or outliers), performance can degrade. In the literature, instance selection is performed over a given training set to filter out noisy data, and the reduced training set of non-noisy data is used to develop the prediction model. In this paper, we present a novel approach in which instance selection instead divides a given training set into noisy and non-noisy subsets, and each subset is used to train its own model. During prediction, the instance selection step is also executed over the testing set, and the resulting noisy and non-noisy test subsets are passed to their corresponding models. Experimental results on various medical domain datasets show that the proposed approach performs better than the baseline, which is based on the conventional instance selection approach.
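The split-and-route idea in the abstract can be illustrated with a minimal sketch. The abstract does not name the instance selection algorithm, the classifiers, or how the testing set is partitioned, so everything concrete here is an assumption: an ENN-style rule (an instance counts as "noisy" when its label disagrees with the majority of its k nearest neighbours), a tiny 1-nearest-neighbour classifier standing in for any model, a labelled testing set as in an offline evaluation, and hypothetical toy data. The function names (`split_noisy`, `train_two_models`, `routed_accuracy`) are illustrative only.

```python
import math
from collections import Counter

def knn_vote(x, X, y, k, skip=None):
    """Majority label among the k nearest neighbours of x in (X, y)."""
    order = sorted((i for i in range(len(X)) if i != skip),
                   key=lambda i: math.dist(x, X[i]))
    return Counter(y[i] for i in order[:k]).most_common(1)[0][0]

def split_noisy(X, y, k=3):
    """Assumed ENN-style rule: an instance is 'noisy' if its own label
    disagrees with the vote of its k nearest neighbours."""
    clean, noisy = [], []
    for i, (x, label) in enumerate(zip(X, y)):
        (clean if knn_vote(x, X, y, k, skip=i) == label else noisy).append(i)
    return clean, noisy

class NearestNeighbour:
    """Tiny 1-NN classifier; a stand-in for any learning algorithm."""
    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self
    def predict(self, x):
        return knn_vote(x, self.X, self.y, k=1)

def train_two_models(X, y, k=3):
    """Train one model per subset; if a subset is empty, reuse the other."""
    clean, noisy = split_noisy(X, y, k)
    def fit(idx):
        return NearestNeighbour().fit([X[i] for i in idx], [y[i] for i in idx])
    return fit(clean or noisy), fit(noisy or clean)

def routed_accuracy(models, X_test, y_test, k=3):
    """Split the (labelled) testing set the same way and send each
    subset to its corresponding model."""
    clean_model, noisy_model = models
    clean, noisy = split_noisy(X_test, y_test, k)
    preds = [None] * len(X_test)
    for i in clean:
        preds[i] = clean_model.predict(X_test[i])
    for i in noisy:
        preds[i] = noisy_model.predict(X_test[i])
    return sum(p == t for p, t in zip(preds, y_test)) / len(y_test)

# Hypothetical toy data: two clusters, each with one off-pattern instance.
X_train = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
           (5, 5), (5, 6), (6, 5), (6, 6), (5.5, 5.5)]
y_train = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
X_test = [(0.1, 0.1), (0.1, 0.9), (0.9, 0.1), (0.9, 0.9), (0.5, 0.6),
          (5.1, 5.1), (5.1, 5.9), (5.9, 5.1), (5.9, 5.9), (5.5, 5.6)]
y_test = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

clean_model, noisy_model = train_two_models(X_train, y_train)
routed = routed_accuracy((clean_model, noisy_model), X_test, y_test)
# Conventional baseline: only the model trained on the non-noisy subset.
baseline = routed_accuracy((clean_model, clean_model), X_test, y_test)
```

On this toy data the routed pair classifies all ten test points correctly, while the clean-only baseline misses the two off-pattern points. This says nothing about the paper's reported results; it only shows the mechanics of training and testing on matched subsets.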


Published In

ACM Other conferences
ICMHI '23: Proceedings of the 2023 7th International Conference on Medical and Health Informatics
May 2023
386 pages
ISBN: 9798400700712
DOI: 10.1145/3608298

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Medical data mining
  2. instance selection
  3. machine learning

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICMHI 2023

Bibliometrics

  • Total citations: 0
  • Total downloads: 27
  • Downloads (last 12 months): 12
  • Downloads (last 6 weeks): 4

Reflects downloads up to 08 Mar 2025