[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article

A Linear Time Approach to Computing Time Series Similarity Based on Deep Metric Learning

Published: 01 October 2022 Publication History

Abstract

Time series similarity computation is a fundamental primitive that underpins many time series data analysis tasks. However, many existing time series similarity measures have a high computation cost. While there has been much research effort for reducing the computational cost, such effort is usually specific to one similarity measure. We propose <sc>NeuTS</sc> (<bold>Neu</bold>ral metric learning for <bold>T</bold>ime <bold>S</bold>eries) to accelerate time series similarity computation in a generic fashion. <sc>NeuTS</sc> computes the similarity of a given time series pair in linear time and generic to handle any existing similarity measures. <sc>NeuTS</sc> samples a number of seed time series from the given database, and then uses their pair-wise similarities as guidance to approximate the similarity function with a neural metric learning framework. <sc>NeuTS</sc> features two novel modules to achieve accurate approximation of the similarity function: (1) a local attention memory module that augments existing recurrent neural networks for time series encoding; and (2) a distance-weighted ranking loss that effectively transcribes information from the seed-based guidance. With these two modules, <sc>NeuTS</sc> can yield high accuracies and fast convergence rates even if the training data is small. Our experiments with five real-life datasets and four similarity measures (Fr&#x00E9;chet, Hausdorff, ERP and DTW) show that <sc>NeuTS</sc> outperforms baselines consistently and significantly. Specifically, it achieves over 80 percent accuracies in most settings, while obtaining 50x-1000x speedup over bruteforce methods and 3x-350x speedup over approximate algorithms for top-k similarity search.

References

[1]
P. K. Agarwal, K. Fox, J. Pan, and R. Ying, “Approximating dynamic time warping and edit distance for a pair of point sequences,” in Proc. 24th ACM SIGSPATIAL Int. Conf. Advances Geographic Inf. Syst., 2016, pp. 6:1–6:16.
[2]
H. Alt and M. Godau, “Computing the fréchet distance between two polygonal curves,” Int. J. Comput. Geometry Appl., vol. 5, pp. 75–91, 1995.
[3]
T. Ando and J. Bai, “Clustering huge number of financial time series: A panel data approach with high-dimensional predictors and factor structures,” J. Amer. Statist. Assoc., vol. 112, no. 519, pp. 1182–1198, 2017.
[4]
S. Atev, G. Miller, and N. P. Papanikolopoulos, “Clustering of vehicle trajectories,” IEEE Trans. Intell. Transp. Syst., vol. 11, no. 3, pp. 647–657, Sep. 2010.
[5]
A. Backurs and A. Sidiropoulos, “Constant-distortion embeddings of hausdorff metrics into constant-dimensional l_p spaces,” Approximation Randomization Combinatorial Optim. Algorithms Techn., vol. 6, pp. 1:1–1:15, 2016.
[6]
J. Bromley, I. Guyon, Y. LeCun, E. Säckinger, and R. Shah, “Signature verification using a siamese time delay neural network,” in Proc. 6th Int. Conf. Neural Inf. Process. Syst., 1993, pp. 737–744.
[7]
Z. Cao, T. Qin, T. Liu, M. Tsai, and H. Li, “Learning to rank: From pairwise approach to listwise approach,” in Proc. 24th Int. Conf. Mach. Learn., 2007, pp. 129–136.
[8]
S. Chandar, S. Ahn, H. Larochelle, P. Vincent, G. Tesauro, and Y. Bengio, “Hierarchical memory networks,” 2016,.
[9]
L. Chen and R. T. Ng, “On the marriage of lp-norms and edit distance,” in Proc. 30th Int. Conf. Very large Data Bases, 2004, pp. 792–803.
[10]
L. Chen, M. T. Özsu, and V. Oria, “Robust and fast similarity search for moving object trajectories,” in Proc. ACM SIGMOD Int. Conf. Manage. Data, 2005, pp. 491–502.
[11]
Y. Chenet al., “The UCR time series classification archive,” Jul. 2015.
[12]
Z. Chen, H. T. Shen, X. Zhou, Y. Zheng, and X. Xie, “Searching trajectories by locations: An efficiency study,” in Proc. ACM SIGMOD Int. Conf. Manage. Data, 2010, pp. 255–266.
[13]
H. A. Dauet al., “The UCR time series archive,” IEEE/CAA J. Automatica Sinica, vol. 6, no. 6, pp. 1293–1305, Nov. 2019.
[14]
H. Ding, G. Trajcevski, P. Scheuermann, X. Wang, and E. J. Keogh, “Querying and mining of time series data: experimental comparison of representations and distance measures,” Proc. VLDB Endowment, vol. 1, no. 2, pp. 1542–1552, 2008. [Online]. Available: http://www.vldb.org/pvldb/vol1/1454226.pdf
[15]
A. Driemel and F. Silvestri, “Locality-sensitive hashing of curves,” in Proc. 33rd Int. Symp. Comput. Geometry, 2017, pp. 37:1–37:16.
[16]
K. Echihabi, K. Zoumpatianos, T. Palpanas, and H. Benbrahim, “The lernaean hydra of data series similarity search: An experimental evaluation of the state of the art,” Proc. VLDB Endowment, vol. 12, no. 2, pp. 112–127, 2018.
[17]
M. Ester, H. Kriegel, J. Sander, and X. Xu, “A density-based algorithm for discovering clusters in large spatial databases with noise,” in Proc. 2nd Int. Conf. Knowl. Discovery Data Mining, 1996, pp. 226–231.
[18]
M. Farach-Colton and P. Indyk, “Approximate nearest neighbor algorithms for hausdorff metrics via embeddings,” in Proc. 40th Annu. Symp. Foundations Comput. Sci., 1999, pp. 171–180.
[19]
X. Gong, Y. Xiong, W. Huang, L. Chen, Q. Lu, and Y. Hu, “Fast similarity search of multi-dimensional time series via segment rotation,” in Proc. Int. Conf. Database Syst. Advanced Appl., 2015, pp. 108–124.
[20]
M. G. Gowanlock and H. Casanova, “Distance threshold similarity searches: Efficient trajectory indexing on the GPU,” IEEE Trans. Parallel Distrib. Syst., vol. 27, no. 9, pp. 2533–2545, Sep. 2016.
[21]
A. Graves, G. Wayne, and I. Danihelka, “Neural turing machines,” 2014,.
[22]
X. Jinet al., “Similarity/dissimilarity calculation methods of dna sequences: A survey,” J. Mol. Graph. Modelling, vol. 76, pp. 342–355, 2017.
[23]
V. Kreinovich, “Arbitrary nonlinearity is sufficient to represent all functions by neural networks: A theorem,” Neural Netw., vol. 4, no. 3, pp. 381–383, 1991.
[24]
T. Lee and S. Lee, “OMT: Overlap minimizing top-down bulk loading algorithm for r-tree,” in Proc. Int. Conf. Advanced Inf. Syst. Eng., 2003.
[25]
X. Li, K. Zhao, G. Cong, C. S. Jensen, and W. Wei, “Deep representation learning for trajectory similarity computation,” in Proc. IEEE 34th Int. Conf. Data Eng., 2018, pp. 617–628.
[26]
R. Manmatha, C. Wu, A. J. Smola, and P. Krähenbühl, “Sampling matters in deep embedding learning,” in Proc. IEEE Int. Conf. Comput. Vis., 2017, pp. 2859–2867.
[27]
B. McFee and G. R. G. Lanckriet, “Metric learning to rank,” in Proc. Int. Conf. Mach. Learn., 2010, pp. 775–782.
[28]
L. Moreira-Matias, J. Gama, M. Ferreira, J. Mendes-Moreira, and L. Damas, “Time-evolving O-D matrix estimation using high-speed GPS data streams,” Expert Syst. Appl., vol. 44, pp. 275–288, 2016.
[29]
A. Pavloet al., “Self-driving database management systems,” in Proc. Biennial Conf. Innovative Data Syst. Res., 2017.
[30]
W. Pei, D. M. J. Tax, and L. van der Maaten, “Modeling time series similarity with siamese recurrent networks,” 2016, arXiv, vol. abs/1603.04713.
[31]
Q. Qian, R. Jin, S. Zhu, and Y. Lin, “Fine-grained visual categorization via multi-stage metric learning,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2015, pp. 3716–3724.
[32]
T. Rakthanmanonet al., “Searching and mining trillions of time series subsequences under dynamic time warping,” in Proc. 18th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2012, pp. 262–270.
[33]
A. Rosenberg and J. Hirschberg, “V-measure: A conditional entropy-based external cluster evaluation measure,” in Proc. Joint Conf. Empir. Methods Natural Lang. Process. Comput. Natural Lang. Learn., 2007, pp. 410–420.
[34]
H. Sakoe and S. Chiba, “Dynamic programming algorithm optimization for spoken word recognition,” IEEE Trans. Acoust. Speech Signal Process., vol. 26, no. 1, pp. 43–49, Feb. 1978.
[35]
S. Salvador and P. Chan, “Toward accurate dynamic time warping in linear time and space,” Intell. Data Anal., vol. 11, no. 5, pp. 561–580, 2007.
[36]
S. Shang, L. Chen, Z. Wei, C. S. Jensen, K. Zheng, and P. Kalnis, “Trajectory similarity join in spatial networks,” Proc. VLDB Endowment, vol. 10, no. 11, pp. 1178–1189, 2017.
[37]
S. Sukhbaatar, A. Szlam, J. Weston, and R. Fergus, “End-to-end memory networks,” in Proc. 28th Int. Conf. Neural Inf. Process. Syst., 2015, pp. 2440–2448.
[38]
J. Weston, S. Chopra, and A. Bordes, “Memory networks,” 2014,.
[39]
D. Xie, F. Li, and J. M. Phillips, “Distributed trajectory similarity search,” Proc. VLDB Endowment, vol. 10, no. 11, pp. 1478–1489, 2017.
[40]
D. Yao, G. Cong, C. Zhang, and J. Bi, “Computing trajectory similarity in linear time: A generic seed-guided neural metric learning approach,” in Proc. IEEE 35th Int. Conf. Data Eng., 2019, pp. 1358–1369.
[41]
B. Yi, H. V. Jagadish, and C. Faloutsos, “Efficient retrieval of similar time sequences under time warping,” in Proc. 14th Int. Conf. Data Eng., 1998, pp. 201–208.
[42]
X. Zhan, S. V. Ukkusuri, and P. S. C. Rao, “Dynamics of functional failures and recovery in complex road networks,” Physical Rev. E, vol. 96, no. 5, 2017, Art. no.
[43]
Y. Zheng, X. Xie, and W. Ma, “Geolife: A collaborative social networking service among user, location and trajectory,” IEEE Data Eng. Bull., vol. 33, no. 2, pp. 32–39, Jun. 2010.

Cited By

View all
  • (2024)Causal Discovery from Temporal Data: An Overview and New PerspectivesACM Computing Surveys10.1145/370529757:4(1-38)Online publication date: 23-Nov-2024
  • (2024)RE-Trace: Re-identification of Modified GPS TrajectoriesACM Transactions on Spatial Algorithms and Systems10.1145/364368010:4(1-28)Online publication date: 5-Feb-2024
  • (2022)Representing spatial trajectories as distributionsProceedings of the 36th International Conference on Neural Information Processing Systems10.5555/3600270.3601268(13731-13744)Online publication date: 28-Nov-2022
  • Show More Cited By

Index Terms

  1. A Linear Time Approach to Computing Time Series Similarity Based on Deep Metric Learning
        Index terms have been assigned to the content through auto-classification.

        Recommendations

        Comments

        Please enable JavaScript to view thecomments powered by Disqus.

        Information & Contributors

        Information

        Published In

        cover image IEEE Transactions on Knowledge and Data Engineering
        IEEE Transactions on Knowledge and Data Engineering  Volume 34, Issue 10
        Oct. 2022
        502 pages

        Publisher

        IEEE Educational Activities Department

        United States

        Publication History

        Published: 01 October 2022

        Qualifiers

        • Research-article

        Contributors

        Other Metrics

        Bibliometrics & Citations

        Bibliometrics

        Article Metrics

        • Downloads (Last 12 months)0
        • Downloads (Last 6 weeks)0
        Reflects downloads up to 24 Dec 2024

        Other Metrics

        Citations

        Cited By

        View all
        • (2024)Causal Discovery from Temporal Data: An Overview and New PerspectivesACM Computing Surveys10.1145/370529757:4(1-38)Online publication date: 23-Nov-2024
        • (2024)RE-Trace: Re-identification of Modified GPS TrajectoriesACM Transactions on Spatial Algorithms and Systems10.1145/364368010:4(1-28)Online publication date: 5-Feb-2024
        • (2022)Representing spatial trajectories as distributionsProceedings of the 36th International Conference on Neural Information Processing Systems10.5555/3600270.3601268(13731-13744)Online publication date: 28-Nov-2022
        • (2021)A Graph-based Approach for Trajectory Similarity Computation in Spatial NetworksProceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining10.1145/3447548.3467337(556-564)Online publication date: 14-Aug-2021

        View Options

        View options

        Media

        Figures

        Other

        Tables

        Share

        Share

        Share this Publication link

        Share on social media