
WAG-NAT: Window Attention and Generator Based Non-Autoregressive Transformer for Time Series Forecasting

Published: 26 September 2023

Abstract

Time series forecasting plays a crucial part in many real-world applications. Recent studies have demonstrated the power of Transformers to model long-range dependencies in time series forecasting tasks. Nevertheless, the quadratic computational complexity of self-attention remains the major obstacle to their application. Previous studies have focused on structural adjustments to the attention mechanism to achieve more efficient computation. In contrast, local attention outperforms full attention in both feature extraction and computational cost, owing to the sparsity it introduces into the attention mechanism. Moreover, inference speed is a key practical concern. In response, we develop WAG-NAT, a novel non-autoregressive Transformer based on window attention and a generator. The generator enables inference in a single forward pass. The window attention module combines a window self-attention layer, which captures local patterns, with a window interaction layer, which fuses information across windows. Experimental results show that WAG-NAT delivers a distinct improvement in prediction accuracy over RNNs, CNNs, and previous Transformer-based models across various benchmarks. Our implementation is available at https://github.com/cybisolated/WAG-NAT.
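
To make the two ideas in the abstract concrete, the sketch below shows how a window attention block (local self-attention inside fixed-size windows plus a cross-window interaction layer) and a non-autoregressive generator head might be composed in PyTorch. The class names, the mean-pooling used to summarize each window, the linear projection head, and the assumption that the sequence length divides evenly by the window size are all illustrative choices on our part, not the paper's exact design; see the linked repository for the authors' implementation.

```python
import torch
import torch.nn as nn


class WindowAttentionBlock(nn.Module):
    """Sketch of a window attention module: self-attention within fixed-size
    windows, followed by an interaction layer that fuses information across
    windows (hypothetical structure, not the paper's exact layers)."""

    def __init__(self, d_model: int, n_heads: int, window_size: int):
        super().__init__()
        self.window_size = window_size
        # Attention applied independently inside each window (local patterns).
        self.window_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Attention over per-window summaries (cross-window information fusion).
        self.interaction_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); seq_len assumed divisible by window_size.
        b, t, d = x.shape
        w = self.window_size
        n_win = t // w

        # 1) Window self-attention: fold windows into the batch dimension so
        #    attention never crosses a window boundary (hence the sparsity).
        xw = x.reshape(b * n_win, w, d)
        local, _ = self.window_attn(xw, xw, xw)
        x = self.norm1(x + local.reshape(b, t, d))

        # 2) Window interaction: mean-pool each window into one summary token,
        #    attend across windows, then broadcast the result back.
        summary = x.reshape(b, n_win, w, d).mean(dim=2)      # (b, n_win, d)
        fused, _ = self.interaction_attn(summary, summary, summary)
        x = x + fused.repeat_interleave(w, dim=1)            # one token per window
        return self.norm2(x)


class NonAutoregressiveGenerator(nn.Module):
    """Sketch of a generator head that emits the whole forecast horizon in one
    forward pass, instead of decoding one step at a time."""

    def __init__(self, d_model: int, horizon: int):
        super().__init__()
        self.proj = nn.Linear(d_model, horizon)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, seq_len, d_model) -> pool, then predict all steps at once.
        return self.proj(h.mean(dim=1))                      # (batch, horizon)


block = WindowAttentionBlock(d_model=64, n_heads=4, window_size=12)
head = NonAutoregressiveGenerator(d_model=64, horizon=24)
x = torch.randn(8, 96, 64)           # 8 series, 96 input steps, 64 features
forecast = head(block(x))            # (8, 24): full horizon in a single pass
```

The non-autoregressive head is what the abstract's "single forward pass" claim refers to: because the generator maps the encoded history directly to all future steps, inference cost does not grow with the forecast horizon the way step-by-step autoregressive decoding does.
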

Published In

Artificial Neural Networks and Machine Learning – ICANN 2023: 32nd International Conference on Artificial Neural Networks, Heraklion, Crete, Greece, September 26–29, 2023, Proceedings, Part VI
Sep 2023, 620 pages
ISBN: 978-3-031-44222-3
DOI: 10.1007/978-3-031-44223-0
Editors: Lazaros Iliadis, Antonios Papaleonidas, Plamen Angelov, Chrisina Jayne

Publisher

Springer-Verlag, Berlin, Heidelberg

Publication History

Published: 26 September 2023

Author Tags

  1. Time series forecasting
  2. Non-autoregressive Transformer
  3. Window attention
  4. Deep learning
