research-article

Unsupervised time series outlier detection with diversity-driven convolutional ensembles

Published: 01 November 2021

Abstract

With the sweeping digitalization of societal, medical, industrial, and scientific processes, sensing technologies are being deployed that produce increasing volumes of time series data, thus fueling a plethora of new or improved applications. In this setting, outlier detection is frequently important, and while solutions based on neural networks exist, they leave room for improvement in terms of both accuracy and efficiency. With the objective of achieving such improvements, we propose a diversity-driven, convolutional ensemble. To improve accuracy, the ensemble employs multiple basic outlier detection models built on convolutional sequence-to-sequence autoencoders that can capture temporal dependencies in time series. Further, a novel diversity-driven training method maintains diversity among the basic models, with the aim of improving the ensemble's accuracy. To improve efficiency, the approach enables a high degree of parallelism during training. In addition, it is able to transfer some model parameters from one basic model to another, which reduces training time. We report on extensive experiments using real-world multivariate time series that offer insight into the design choices underlying the new approach and offer evidence that it is capable of improved accuracy and efficiency.
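
As a rough illustration of the ensemble idea in the abstract, the sketch below scores each time step by how poorly several basic models reconstruct it. It is not the authors' method. The stand-in "models" here are fixed moving averages with different window sizes rather than trained convolutional sequence-to-sequence autoencoders, and combining the per-model errors with a median is an assumption, since the abstract does not say how scores are aggregated. The names `moving_average` and `ensemble_outlier_scores` are hypothetical.

```python
from statistics import median


def moving_average(series, window):
    """Stand-in reconstruction: centered moving average (a fixed 1-D convolution).

    Assumption: a trained autoencoder would take this role in a real system.
    """
    half = window // 2
    out = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        out.append(sum(series[lo:hi]) / (hi - lo))
    return out


def ensemble_outlier_scores(series, windows=(3, 5, 7)):
    """Per time step, take the median over basic models of the squared
    reconstruction error. Median aggregation is an illustrative assumption."""
    per_model_errors = []
    for w in windows:
        recon = moving_average(series, w)
        per_model_errors.append([(x - r) ** 2 for x, r in zip(series, recon)])
    # zip(*...) regroups the errors by time step across all basic models.
    return [median(errs) for errs in zip(*per_model_errors)]


# Toy univariate series with one obvious spike at index 4.
series = [1.0, 1.1, 0.9, 1.0, 8.0, 1.0, 1.1, 0.9, 1.0]
scores = ensemble_outlier_scores(series)
top = max(range(len(scores)), key=scores.__getitem__)
print(top, round(scores[top], 2))
```

Varying the window size is only a crude way to make the basic models differ. The abstract instead describes a training method that explicitly maintains diversity among the basic models, and parameter transfer between them to reduce training time; neither is modeled in this sketch.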

Published In

Proceedings of the VLDB Endowment, Volume 15, Issue 3
November 2021
364 pages
ISSN: 2150-8097

Publisher

VLDB Endowment
