Energy-based out-of-distribution detection

Authors:

Yixuan Li

NIPS'20: Proceedings of the 34th International Conference on Neural Information Processing Systems

Article No.: 1802, Pages 21464 - 21475

Published: 06 December 2020

Abstract

Determining whether inputs are out-of-distribution (OOD) is an essential building block for safely deploying machine learning models in the open world. However, previous methods relying on the softmax confidence score suffer from overconfident posterior distributions for OOD data. We propose a unified framework for OOD detection that uses an energy score. We show that energy scores better distinguish in- and out-of-distribution samples than the traditional approach using the softmax scores. Unlike softmax confidence scores, energy scores are theoretically aligned with the probability density of the inputs and are less susceptible to the overconfidence issue. Within this framework, energy can be flexibly used as a scoring function for any pre-trained neural classifier as well as a trainable cost function to shape the energy surface explicitly for OOD detection. On a CIFAR-10 pre-trained WideResNet, using the energy score reduces the average FPR (at TPR 95%) by 18.03% compared to the softmax confidence score. With energy-based training, our method outperforms the state-of-the-art on common benchmarks.
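The abstract does not spell out the scoring function, so the following is only a minimal sketch. It assumes the log-sum-exp form of free energy commonly used when a classifier's logits are read as an energy-based model, E(x) = -T log Σ_i exp(f_i(x)/T), and it assumes that a higher score means "more in-distribution" when computing FPR at a fixed TPR. The function names `energy_score`, `max_softmax`, and `fpr_at_tpr` are hypothetical, chosen for illustration.

```python
import math


def energy_score(logits, temperature=1.0):
    """Assumed form: E(x) = -T * log(sum_i exp(f_i(x) / T)).

    Lower energy is read as "more in-distribution".
    """
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max before exponentiating, for stability
    return -temperature * (m + math.log(sum(math.exp(s - m) for s in scaled)))


def max_softmax(logits):
    """Softmax confidence baseline: the largest softmax probability."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    return max(exps) / sum(exps)


def fpr_at_tpr(in_scores, out_scores, tpr=0.95):
    """Fraction of OOD samples accepted at the threshold that keeps `tpr`
    of the in-distribution samples. Assumes higher score = in-distribution,
    so pass negative energies here.
    """
    ranked = sorted(in_scores, reverse=True)
    # Small tolerance guards against floating-point error in tpr * n.
    k = max(1, math.ceil(tpr * len(ranked) - 1e-9))
    threshold = ranked[k - 1]
    return sum(s >= threshold for s in out_scores) / len(out_scores)


# Two logit vectors that differ only by a constant shift.
logits_a = [5.0, 0.0, 0.0]
logits_b = [105.0, 100.0, 100.0]

# Softmax is shift-invariant, so the confidence is the same for both ...
print(max_softmax(logits_a), max_softmax(logits_b))
# ... while the energy still reflects the overall magnitude of the logits.
print(energy_score(logits_a), energy_score(logits_b))
```

The toy comparison illustrates one way the two scores can disagree: shifting every logit by the same constant leaves the softmax confidence unchanged but lowers the energy. In practice the logits would come from a pre-trained classifier rather than hand-written lists.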

Published In

NIPS '20: Proceedings of the 34th International Conference on Neural Information Processing Systems

December 2020

22651 pages

ISBN: 9781713829546

Editors: H. Larochelle (Google Research), M. Ranzato (Facebook AI Research), R. Hadsell (DeepMind), M.F. Balcan (Carnegie Mellon University), H. Lin (National Taiwan University)

Copyright © 2020 Neural Information Processing Systems Foundation, Inc.

Publisher

Curran Associates Inc.

Red Hook, NY, United States

Qualifiers

Research-article
Research
Refereed limited
