Abstract
Numerous sophisticated machine learning tools (e.g. ensembles or deep networks) have shown outstanding accuracy on different numeric forecasting tasks. In many real-world application domains the numeric predictions of these models drive important and costly decisions. Frequently, decision makers require more than a black-box model before they can “trust” the predictions enough to base their decisions on them. In this context, understanding these black boxes has become one of the hot topics in Machine Learning and Data Mining research. This paper proposes a series of visualisation tools that help in understanding the predictive performance of non-interpretable regression models. More specifically, these tools allow the user to relate the expected error of any model to the values of the predictor variables. This type of information allows end-users to correctly assess the risks associated with using the models, by showing how concrete values of the predictors may affect their performance. Our illustrations with different real-world data sets and learning algorithms provide insights into the type of usage and information these tools bring to both the data analyst and the end-user.
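As a rough illustration of the kind of analysis described above (relating a model's expected error to the values of a predictor variable), the following Python sketch, which is not the authors' implementation, uses scikit-learn and pandas to train a random forest regressor, compute absolute errors on held-out data, and summarise the mean error per quantile bin of one predictor, together with the number of cases and the percentage of the data in each bin. The dataset, the chosen predictor and the learner are placeholders picked only for the example.

```python
# Minimal sketch (not the authors' tools): relate a model's error to the
# values of one predictor by binning the predictor and summarising the
# absolute error per bin. Dataset, column name and learner are placeholders.
import numpy as np
import pandas as pd
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

data = fetch_california_housing(as_frame=True)
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Train an opaque model whose error behaviour we want to inspect.
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
abs_err = np.abs(model.predict(X_test) - y_test)

# Bin one predictor into quantile-based intervals and summarise the mean
# absolute error per bin, with the number of cases in each bin and its
# share of the evaluated data.
predictor = "MedInc"  # any predictor column of interest
bins = pd.qcut(X_test[predictor], q=10, duplicates="drop")
summary = (
    pd.DataFrame({"bin": bins, "abs_err": abs_err})
    .groupby("bin", observed=True)["abs_err"]
    .agg(mean_error="mean", n_cases="size")
)
summary["pct_of_data"] = 100 * summary["n_cases"] / len(X_test)
print(summary)
```

Plotting such a per-bin error summary (for instance as a bar chart over the bins) gives the kind of error-versus-predictor view that the proposed visualisation tools aim to provide.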
Notes
1. Below each bin we show the number of training cases it contains and the corresponding percentage of the full data set.
Acknowledgments
This work is partially funded by the Portuguese Science and Technology Foundation (FCT) through the NITROLIMIT project (PTDC/CTA-AMB/30997/2017) and the project UID/EEA/50014/2019. The work of L. Torgo was undertaken, in part, thanks to funding from the Canada Research Chairs program.
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Areosa, I., Torgo, L. (2019). Visual Interpretation of Regression Error. In: Moura Oliveira, P., Novais, P., Reis, L. (eds) Progress in Artificial Intelligence. EPIA 2019. Lecture Notes in Computer Science, vol. 11805. Springer, Cham. https://doi.org/10.1007/978-3-030-30244-3_39
DOI: https://doi.org/10.1007/978-3-030-30244-3_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30243-6
Online ISBN: 978-3-030-30244-3