Evaluating outlier probabilities: assessing sharpness, refinement, and calibration using stratified and weighted measures

Published: 19 July 2024

Abstract

An outlier probability is the probability that an observation is an outlier. Typically, outlier detection algorithms calculate real-valued outlier scores to identify outliers. Converting outlier scores into outlier probabilities increases the interpretability of outlier scores for domain experts and makes outlier scores from different outlier detection algorithms comparable. Although several transformations to convert outlier scores to outlier probabilities have been proposed in the literature, there is no common understanding of good outlier probabilities and no standard approach to evaluate outlier probabilities. We require that good outlier probabilities be sharp, refined, and calibrated. To evaluate these properties, we adapt and propose novel measures that use ground-truth labels indicating which observation is an outlier or an inlier. The refinement and calibration measures partition the outlier probabilities into bins or use kernel smoothing. Compared to the evaluation of probability in supervised learning, several aspects are relevant when evaluating outlier probabilities, mainly due to the imbalanced and often unsupervised nature of outlier detection. First, stratified and weighted measures are necessary to evaluate the probabilities of outliers well. Second, the joint use of the sharpness, refinement, and calibration errors makes it possible to independently measure the corresponding characteristics of outlier probabilities. Third, equiareal bins, where the product of observations per bin times bin length is constant, balance the number of observations per bin and bin length, allowing accurate evaluation of different outlier probability ranges. Finally, we show that good outlier probabilities, according to the proposed measures, improve the performance of the follow-up task of converting outlier probabilities into labels for outliers and inliers.
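The equiareal-bin condition stated in the abstract (observations per bin times bin length is constant) can be illustrated with a small sketch. The abstract does not say how the bins are constructed, so everything below is an assumption made for illustration: the helper names `bin_areas`, `greedy_edges`, and `equiareal_edges` are hypothetical, and the greedy pass over a grid of candidate edges combined with bisection on the target area is only one plausible way to approximate such bins, not the authors' procedure. Because counts are integers, the areas come out roughly equal rather than exactly equal, and the final bin may be smaller.

```python
import numpy as np

def bin_areas(p, edges):
    """Return count * width for every bin (the quantity equiareal bins hold constant)."""
    counts, _ = np.histogram(p, bins=edges)
    return counts * np.diff(edges)

def greedy_edges(p_sorted, area, grid):
    """Close a bin at the first grid point where count * width reaches `area`."""
    edges = [grid[0]]
    for e in grid[1:]:
        left = edges[-1]
        # Number of observations in [left, e)
        count = np.searchsorted(p_sorted, e) - np.searchsorted(p_sorted, left)
        if count * (e - left) >= area:
            edges.append(e)
    if edges[-1] < grid[-1]:
        edges.append(grid[-1])  # remainder bin; its area may be below `area`
    return np.array(edges)

def equiareal_edges(p, n_bins, grid_size=1001, n_iter=40):
    """Bisect on the target area until the greedy pass yields about n_bins bins.

    Assumption: probabilities lie in [0, 1]. The result is approximate and
    may contain a different number of bins if no area gives exactly n_bins.
    """
    p_sorted = np.sort(np.asarray(p, dtype=float))
    grid = np.linspace(0.0, 1.0, grid_size)
    lo, hi = 0.0, float(len(p_sorted))  # count * width can never exceed n * 1.0
    best = greedy_edges(p_sorted, hi, grid)
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        edges = greedy_edges(p_sorted, mid, grid)
        # Keep the candidate whose bin count is closest to the request
        if abs(len(edges) - 1 - n_bins) <= abs(len(best) - 1 - n_bins):
            best = edges
        if len(edges) - 1 > n_bins:
            lo = mid  # too many bins: raise the target area
        else:
            hi = mid  # enough or too few bins: lower the target area
    return best

# Synthetic probabilities skewed toward 0 (illustrative data, not from the paper)
rng = np.random.default_rng(0)
p = rng.beta(0.5, 5.0, size=1000)
edges = equiareal_edges(p, n_bins=10)
print(edges)
print(bin_areas(p, edges))
```

On skewed data like this, the resulting bins should be narrow where probabilities are dense and wide where they are sparse, which sits between equal-width bins (constant length) and equal-frequency bins (constant count).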

    Published In

    Data Mining and Knowledge Discovery, Volume 38, Issue 6
    Nov 2024
    893 pages

    Publisher

    Kluwer Academic Publishers

    United States

    Publication History

    Published: 19 July 2024
    Accepted: 29 June 2024
    Received: 04 December 2023

    Author Tags

    1. Outlier detection
    2. Anomaly detection
    3. Unsupervised learning
    4. Outlier probabilities
    5. Calibration
    6. Refinement
    7. Sharpness
    8. Outlier ensembles

    Qualifiers

    • Research-article

    Funding Sources

    • Danmarks Frie Forskningsfond, Denmark
    • Johannes Gutenberg-Universität Mainz (1030)
