Target-class guided sample length reduction and training set selection of univariate time-series

Abstract

Novelty/anomaly detection in time-series (TS) data is an active research domain. It is a one-class classification (OCC) task, in which only target-class samples are present during training and samples from other classes are unavailable. The performance of OCC algorithms depends on the quality and quantity of features and training samples, because not all features/samples are equally important for target-class representation. The present research focuses on OCC of univariate time-series (UTS) and proposes a novel way to acquire knowledge of the target class to ensure its strong separation from samples of other classes. Apart from a large number of training samples, a large sample length (span) increases computational complexity and brings the inherent “curse of dimensionality” (here, the sample span is treated as the dimension of the time series). In this context, the present article offers a concurrent approach to target-class guided sample span reduction and training sample selection for UTS data. Initially, a vector representation is obtained using state-of-the-art dissimilarity-based representation (DBR) techniques; a novel target-class supervised sample span reduction algorithm based on Eigenspace analysis is then applied to obtain the minimal sample span. Furthermore, state-of-the-art prototype methods are used to select the most promising training samples as target-class representatives. Finally, one-class support vector machine (OCSVM), 1-nearest neighbour (1-NN) and isolation forest (IF) are used to evaluate the performance of the proposed approach. Extensive experiments are performed on the archive of 85 univariate datasets provided by the University of California, Riverside (UCR) and the University of East Anglia (UEA) (this repository is also known as the UCR/UEA archive).
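
As a rough illustration of the kind of pipeline the abstract describes, the following Python sketch builds a dissimilarity-based representation from target-class series, selects a few training samples as prototypes, and scores new series with a 1-NN rule. It is not the authors' implementation: the Euclidean distance, the farthest-first prototype heuristic, the synthetic sine-wave data, and all function names are assumptions made for this example, and the Eigenspace-based span reduction step is omitted.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_prototypes(train, k):
    """Farthest-first traversal (an assumed stand-in for a prototype method):
    greedily pick k training series that are mutually far apart."""
    chosen = [0]
    while len(chosen) < min(k, len(train)):
        best = max(
            (i for i in range(len(train)) if i not in chosen),
            key=lambda i: min(euclidean(train[i], train[j]) for j in chosen),
        )
        chosen.append(best)
    return [train[i] for i in chosen]


def dbr(series, prototypes):
    """Dissimilarity-based representation: distances to each prototype."""
    return [euclidean(series, p) for p in prototypes]


def nn_score(series, train_vectors, prototypes):
    """1-NN novelty score: distance from the DBR vector of `series`
    to its nearest training vector (larger means more novel)."""
    v = dbr(series, prototypes)
    return min(euclidean(v, t) for t in train_vectors)


# Synthetic target class (hypothetical data): noisy sine waves of span 50.
rng = random.Random(0)


def sine(n=50, phase=0.0, noise=0.05):
    return [math.sin(2 * math.pi * t / n + phase) + rng.gauss(0, noise)
            for t in range(n)]


train = [sine(phase=rng.uniform(-0.2, 0.2)) for _ in range(30)]
prototypes = select_prototypes(train, 5)
train_vectors = [dbr(s, prototypes) for s in train]

normal_score = nn_score(sine(), train_vectors, prototypes)
novel_score = nn_score([0.0] * 50, train_vectors, prototypes)  # flat series
print(normal_score, novel_score)
```

With five prototypes, each series of span 50 is represented by a 5-dimensional vector, which is the sense in which a DBR can shorten the effective sample length. In practice a threshold on the score, chosen from target-class data only, would separate target samples from novelties.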

Author information

Authors and Affiliations

Institute of Technical Education and Research, Siksha ‘O’ Anusandhan (Deemed to be University), Bhubaneswar, Odisha, 751030, India
Sanjay Kumar Sonbhadra
Indian Institute of Information Technology Allahabad, Prayagraj, 211015, India
Sonali Agarwal & P. Nagabhushan

Corresponding author

Correspondence to Sanjay Kumar Sonbhadra.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Sonbhadra, S.K., Agarwal, S. & Nagabhushan, P. Target-class guided sample length reduction and training set selection of univariate time-series. Appl Intell 53, 7056–7073 (2023). https://doi.org/10.1007/s10489-022-03761-4

Accepted: 10 May 2022
Published: 13 July 2022
Issue Date: March 2023
DOI: https://doi.org/10.1007/s10489-022-03761-4
