Abstract
The attention-based Transformer architecture is gaining popularity across many machine learning tasks. In this study, we explore the suitability of Transformers for time series forecasting, a crucial problem in many domains. We perform an extensive experimental study of the Transformer with different architectural and hyper-parameter configurations over 12 datasets comprising more than 50,000 time series. The forecasting accuracy and computational efficiency of Transformers are compared with those of state-of-the-art deep learning networks such as LSTMs and CNNs. The results show that Transformers can outperform traditional recurrent or convolutional models thanks to their capacity to capture long-term dependencies, obtaining the most accurate forecasts in five of the twelve datasets. However, Transformers are generally more difficult to parameterize and exhibit greater variability in their results. In terms of efficiency, Transformer models proved less competitive in inference time and comparable to the LSTM in training time.
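For illustration only, the sketch below shows how a Transformer encoder can be applied to univariate multi-step forecasting of the kind studied in the paper: a window of past values is embedded, processed by self-attention layers, and mapped to the forecast horizon. This is not the authors' implementation; the window length, horizon, and model sizes are arbitrary assumptions chosen to keep the example minimal.

import torch
import torch.nn as nn

class TransformerForecaster(nn.Module):
    # Hypothetical configuration: 48-step input window, 12-step horizon.
    def __init__(self, input_len=48, horizon=12, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.input_proj = nn.Linear(1, d_model)                          # embed each scalar observation
        self.pos_emb = nn.Parameter(torch.zeros(1, input_len, d_model))  # learned positional encoding
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                           dim_feedforward=128, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, horizon)                          # multi-step output from last position

    def forward(self, x):                                                # x: (batch, input_len)
        h = self.input_proj(x.unsqueeze(-1)) + self.pos_emb
        h = self.encoder(h)                                              # self-attention over the whole window
        return self.head(h[:, -1, :])                                    # (batch, horizon)

# Usage: predict the next 12 points from a window of 48 past values.
model = TransformerForecaster()
y_hat = model(torch.randn(8, 48))                                        # -> shape (8, 12)

The self-attention layers let every output position attend directly to any past position in the window, which is the mechanism behind the long-term dependency modelling discussed in the abstract.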
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Lara-Benítez, P., Gallego-Ledesma, L., Carranza-García, M., Luna-Romera, J.M. (2021). Evaluation of the Transformer Architecture for Univariate Time Series Forecasting. In: Alba, E., et al. (eds.) Advances in Artificial Intelligence. CAEPIA 2021. Lecture Notes in Computer Science, vol. 12882. Springer, Cham. https://doi.org/10.1007/978-3-030-85713-4_11
DOI: https://doi.org/10.1007/978-3-030-85713-4_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-85712-7
Online ISBN: 978-3-030-85713-4
eBook Packages: Computer Science (R0)