
Estimating replicability of classifier learning experiments

Author:

Remco R. Bouckaert

ICML '04: Proceedings of the twenty-first international conference on Machine learning

Page 15

https://doi.org/10.1145/1015330.1015338

Published: 04 July 2004

Abstract

Replicability of machine learning experiments measures how likely it is that the outcome of one experiment is repeated when the experiment is performed with a different randomization of the data. In this paper, we present an efficient estimator of the replicability of an experiment. More precisely, the estimator is unbiased and has the lowest variance in the class of estimators formed by a linear combination of outcomes of experiments on a given data set.

We gathered empirical data for comparing experiments consisting of different sampling schemes and hypothesis tests. Both factors are shown to have an impact on the replicability of experiments. The data suggest that sign tests should not be used because of their low replicability. Ranked sum tests show better performance, but the combination of a sorted runs sampling scheme with a t-test gives the most desirable performance judged on Type I and Type II error and replicability.
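The abstract does not state the estimator's formula, so the following is only a minimal sketch under an assumption: that replicability is estimated as the fraction of pairs of experiments, run on the same data set with different randomizations, whose hypothesis-test outcomes agree. The function name `pairwise_agreement` and the boolean encoding of outcomes (True meaning the test rejected the null hypothesis) are hypothetical choices made for illustration and are not taken from the paper.

```python
def pairwise_agreement(outcomes):
    """Fraction of unordered pairs of experiments with the same outcome.

    `outcomes` is a sequence of booleans, one per repeated experiment:
    True if the hypothesis test rejected the null hypothesis, else False.
    This is an illustrative assumption, not the paper's stated definition.
    """
    n = len(outcomes)
    if n < 2:
        raise ValueError("need at least two experiment outcomes")

    rejects = sum(1 for outcome in outcomes if outcome)
    accepts = n - rejects

    # Pairs that agree: both reject, or both fail to reject.
    agreeing_pairs = rejects * (rejects - 1) // 2 + accepts * (accepts - 1) // 2
    total_pairs = n * (n - 1) // 2
    return agreeing_pairs / total_pairs


if __name__ == "__main__":
    # Ten repetitions with a 5/5 split between reject and accept.
    split = [True] * 5 + [False] * 5
    print(pairwise_agreement(split))
    # Ten repetitions that all reach the same decision.
    print(pairwise_agreement([True] * 10))
```

Under this assumption, the value is 1 when every repetition reaches the same decision and falls toward its minimum as the decisions approach an even split, which matches the intuition that a test whose verdict flips with the randomization is poorly replicable.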




Published In

ICML '04: Proceedings of the twenty-first international conference on Machine learning

July 2004

934 pages

ISBN:1581138385

DOI:10.1145/1015330

Conference Chair:
Carla Brodley
Purdue University/Tufts University

Publisher

Association for Computing Machinery

New York, NY, United States


Acceptance Rates

Overall Acceptance Rate 140 of 548 submissions, 26%
