DOI: 10.1609/aaai.v37i6.25845
Research Article

InParformer: Evolutionary Decomposition Transformers with Interactive Parallel Attention for Long-Term Time Series Forecasting

Published: 07 February 2023

Abstract

Long-term time series forecasting (LTSF) provides substantial benefits for numerous real-world applications, but it places heavy demands on a model's capacity to capture long-range dependencies. Recent Transformer-based models have significantly improved LTSF performance. It is worth noting, however, that the Transformer and its self-attention mechanism were originally proposed to model language sequences, whose tokens (i.e., words) are discrete and highly semantic. Unlike language sequences, most time series consist of sequential, continuous numeric points. Individual time steps carry temporal redundancy and are only weakly semantic, so relying on time-domain tokens alone makes it hard to capture the overall properties of a series (e.g., its overall trend and periodic variations). To address these problems, we propose a novel Transformer-based forecasting model named InParformer, built around an Interactive Parallel Attention (InPar Attention) mechanism. InPar Attention learns long-range dependencies comprehensively in both the frequency and time domains. To improve its learning capacity and efficiency, we further design several mechanisms, including query selection, key-value pair compression, and recombination. Moreover, InParformer is constructed with evolutionary seasonal-trend decomposition modules to enhance the extraction of intricate temporal patterns. Extensive experiments on six real-world benchmarks show that InParformer outperforms state-of-the-art forecasting Transformers.
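
The two ideas the abstract centers on, seasonal-trend decomposition of the input series and attention computed over a frequency-domain representation, can be illustrated with a short sketch. The code below is a minimal illustration under stated assumptions, not the paper's implementation: it assumes an Autoformer-style moving-average decomposition and a toy attention that scores queries against keys on their FFT magnitudes. All module and parameter names here (SeriesDecomp, FrequencyAttention, kernel_size) are hypothetical.

```python
# Minimal sketch (not the paper's code): (a) seasonal-trend decomposition of a
# series via a moving average, (b) a toy attention whose scores are computed in
# the frequency domain via the FFT.
import torch
import torch.nn as nn


class SeriesDecomp(nn.Module):
    """Split a series into trend (moving average) and seasonal (residual) parts."""

    def __init__(self, kernel_size: int = 25):
        super().__init__()
        self.kernel_size = kernel_size
        self.avg = nn.AvgPool1d(kernel_size, stride=1, padding=0)

    def forward(self, x: torch.Tensor):
        # x: (batch, length, channels)
        # Pad both ends so the moving average keeps the original length.
        front = x[:, :1, :].repeat(1, (self.kernel_size - 1) // 2, 1)
        back = x[:, -1:, :].repeat(1, self.kernel_size // 2, 1)
        padded = torch.cat([front, x, back], dim=1)
        trend = self.avg(padded.transpose(1, 2)).transpose(1, 2)
        seasonal = x - trend
        return seasonal, trend


class FrequencyAttention(nn.Module):
    """Toy frequency-domain attention: queries and keys are compared on their
    FFT magnitudes instead of their raw time-domain values."""

    def forward(self, q, k, v):
        # q, k, v: (batch, length, d_model)
        q_f = torch.fft.rfft(q, dim=1).abs()   # (batch, freq, d_model)
        k_f = torch.fft.rfft(k, dim=1).abs()
        scores = torch.softmax(q_f @ k_f.transpose(1, 2) / q.size(-1) ** 0.5, dim=-1)
        # Mix the values in the frequency domain, then return to the time domain.
        v_f = torch.fft.rfft(v, dim=1)
        out_f = torch.einsum("bqk,bkd->bqd", scores.to(v_f.dtype), v_f)
        return torch.fft.irfft(out_f, n=v.size(1), dim=1)


if __name__ == "__main__":
    x = torch.randn(2, 96, 8)                    # 2 series, length 96, 8 channels
    seasonal, trend = SeriesDecomp(25)(x)
    y = FrequencyAttention()(seasonal, seasonal, seasonal)
    print(seasonal.shape, trend.shape, y.shape)  # all torch.Size([2, 96, 8])
```

Running the script prints three identical shapes, confirming that decomposition and the frequency-domain mixing both preserve the series length; how InParformer actually selects queries, compresses key-value pairs, and recombines the two domains is described in the full paper.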


Cited By

  • Generative AI for Self-Adaptive Systems: State of the Art and Research Roadmap. ACM Transactions on Autonomous and Adaptive Systems, 19(3): 1-60 (2024). https://doi.org/10.1145/3686803




Published In

AAAI'23/IAAI'23/EAAI'23: Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence
February 2023
16496 pages
ISBN: 978-1-57735-880-0

Sponsors

  • Association for the Advancement of Artificial Intelligence

Publisher

AAAI Press

Publication History

Published: 07 February 2023

Qualifiers

  • Research-article
  • Research
  • Refereed limited


