
Clustering Nonlinear, Nonstationary Time Series Using BSLEX

Methodology and Computing in Applied Probability

Abstract

Accurate clustering of time series is a challenging problem for data arising in areas such as financial markets, biomedical studies, and environmental sciences, especially when some or all of the series exhibit nonlinearity and nonstationarity. When a subset of the series exhibits nonlinear characteristics, frequency domain clustering methods based on higher-order spectral properties, such as the bispectra or trispectra, are useful. While these methods address nonlinearity, they rely on the assumption that the series are stationary. We propose the Bispectral Smooth Localized Complex EXponential (BSLEX) approach for clustering nonlinear and nonstationary time series. BSLEX is an extension of the SLEX approach for linear, nonstationary series, and overcomes the challenges of both nonlinearity and nonstationarity through smooth partitions of the nonstationary time series into stationary subsets in a dyadic fashion. The performance of the BSLEX approach is illustrated via simulation, where several nonstationary or nonlinear time series are clustered, and via accurate clustering of the records of 16 seismic events, eight of which are earthquakes and eight of which are explosions. We also illustrate the utility of the approach by clustering S&P 100 financial returns.
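The idea of splitting a series dyadically and comparing higher-order spectra block by block can be illustrated with a rough sketch. This is not the BSLEX procedure itself: it assumes a hard (non-smooth) split into 2^level equal blocks rather than SLEX smooth windows, uses a raw, unsmoothed bispectrum magnitude, and compares series by Euclidean distance between log-scaled features. All function names and the `level` and `n_freq` parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def dyadic_blocks(x, level):
    """Split x into 2**level equal-length, non-overlapping blocks.

    Assumption: a hard split stands in for the smooth SLEX partition.
    """
    x = np.asarray(x, dtype=float)
    n_blocks = 2 ** level
    block_len = len(x) // n_blocks
    if block_len == 0:
        raise ValueError("level is too deep for the series length")
    # Drop any trailing samples that do not fill a whole block.
    return x[: n_blocks * block_len].reshape(n_blocks, block_len)


def raw_bispectrum(block, n_freq):
    """Unsmoothed bispectrum magnitude on the grid j, k = 1..n_freq.

    Uses |X(j) X(k) conj(X(j + k))| / n, where X is the DFT of the
    mean-centred block. A practical estimator would smooth this.
    """
    n = len(block)
    X = np.fft.fft(block - block.mean())
    j = np.arange(1, n_freq + 1)
    jj, kk = np.meshgrid(j, j, indexing="ij")
    return np.abs(X[jj] * X[kk] * np.conj(X[(jj + kk) % n])) / n


def bispectral_features(x, level=2, n_freq=8):
    """Concatenate log-scaled block bispectra into one feature vector."""
    blocks = dyadic_blocks(x, level)
    return np.concatenate(
        [np.log1p(raw_bispectrum(b, n_freq)).ravel() for b in blocks]
    )


def pairwise_distances(series_list, level=2, n_freq=8):
    """Euclidean distances between the feature vectors of several series."""
    feats = np.array(
        [bispectral_features(s, level, n_freq) for s in series_list]
    )
    diff = feats[:, None, :] - feats[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))
```

The resulting distance matrix could then be passed to any standard clustering routine, for example hierarchical clustering. The choice of `level` fixes how finely the series is partitioned, and in this sketch it is the same for every series, which is a simplification.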

Author information

Corresponding author

Correspondence to Jane L. Harvill.

Cite this article

Harvill, J.L., Kohli, P. & Ravishanker, N. Clustering Nonlinear, Nonstationary Time Series Using BSLEX. Methodol Comput Appl Probab 19, 935–955 (2017). https://doi.org/10.1007/s11009-016-9528-1

  • DOI: https://doi.org/10.1007/s11009-016-9528-1
