
Trusting My Predictions: On the Value of Instance-Level Analysis

Published: 09 April 2024

Abstract

Machine Learning solutions have spread across many domains, including critical applications. The development of such models usually relies on a dataset of labeled data, which is split into training and test sets so that the accuracy of the models in replicating the test labels can be assessed. This process is often iterated in a cross-validation procedure to obtain average performance estimates. But is the average predictive performance on test sets enough to assess the trustworthiness of a Machine Learning model? This paper discusses the importance of knowing which individual observations of a dataset are more challenging than others, and how this characteristic can be measured and used to improve classification performance and trustworthiness. A set of strategies for measuring the hardness level of the instances of a dataset is surveyed, and a Python package containing their implementation is provided.
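
To make the instance-level view concrete, the sketch below estimates a simple hardness score per observation: the fraction of times each instance is misclassified when it falls in the test fold of a repeated cross-validation, across a small pool of classifiers. This is only an illustration in the spirit of the instance hardness measures the paper surveys, not the API of the accompanying Python package; the dataset and classifier pool are arbitrary choices built on scikit-learn.

```python
# Minimal sketch (not the authors' package): per-instance hardness as the
# fraction of misclassifications over repeated stratified cross-validation
# with a small, arbitrary pool of classifiers.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

pool = [
    make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    RandomForestClassifier(n_estimators=100, random_state=0),
    GaussianNB(),
]

cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)
errors = np.zeros(len(y))  # misclassification counts per instance
trials = np.zeros(len(y))  # times each instance appeared in a test fold

for train_idx, test_idx in cv.split(X, y):
    for clf in pool:
        clf.fit(X[train_idx], y[train_idx])
        pred = clf.predict(X[test_idx])
        errors[test_idx] += (pred != y[test_idx])
        trials[test_idx] += 1

# Hardness in [0, 1]: higher means the instance is misclassified more often,
# regardless of the model's average accuracy across the whole test set.
hardness = errors / trials
print("hardest instances:", np.argsort(hardness)[-5:])
```

Average accuracy can be high even when the same few observations are misclassified by every model in every repetition; a per-instance score such as this one exposes exactly those observations for further inspection.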


Cited By

  • Assessor Models for Explaining Instance Hardness in Classification Problems. 2024 International Joint Conference on Neural Networks (IJCNN), 1–8. DOI: 10.1109/IJCNN60899.2024.10651521. Online publication date: 30-Jun-2024.
  • Measuring Latent Traits of Instance Hardness and Classifier Ability using Boltzmann Machines. 2024 International Joint Conference on Neural Networks (IJCNN), 1–8. DOI: 10.1109/IJCNN60899.2024.10651497. Online publication date: 30-Jun-2024.


    Published In

    ACM Computing Surveys, Volume 56, Issue 7
    July 2024
    1006 pages
    EISSN: 1557-7341
    DOI: 10.1145/3613612

    Publisher

    Association for Computing Machinery, New York, NY, United States

    Publication History

    Published: 09 April 2024
    Online AM: 09 August 2023
    Accepted: 03 August 2023
    Revised: 20 May 2023
    Received: 15 October 2022
    Published in CSUR Volume 56, Issue 7


    Author Tags

    1. Instance hardness
    2. Meta-learning

    Qualifiers

    • Research-article

    Funding Sources

    • Brazilian research agencies FAPESP and CNPq


    Article Metrics

    • Downloads (last 12 months): 423
    • Downloads (last 6 weeks): 14
    Reflects downloads up to 09 Jan 2025
