[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article

On Reporting Durable Patterns in Temporal Proximity Graphs

Published: 14 May 2024 Publication History

Abstract

Finding patterns in graphs is a fundamental problem in databases and data mining. In many applications, graphs are temporal and evolve over time, so we are interested in finding durable patterns, such as triangles and paths, which persist over a long time. While there has been work on finding durable simple patterns, existing algorithms do not have provable guarantees and run in strictly super-linear time. The paper leverages the observation that many graphs arising in practice are naturally proximity graphs or can be approximated as such, where nodes are embedded as points in some high-dimensional space, and two nodes are connected by an edge if they are close to each other. We work with an implicit representation of the proximity graph, where nodes are additionally annotated by time intervals, and design near-linear-time algorithms for finding (approximately) durable patterns above a given durability threshold. We also consider an interactive setting where a client experiments with different durability thresholds in a sequence of queries; we show how to compute incremental changes to result patterns efficiently in time near-linear to the size of the changes.

References

[1]
Doubling dimension in real-world graphs. https://slideplayer.com/slide/5331329/. Accessed: 2023-04--24.
[2]
A. Abboud and V. V. Williams. Popular conjectures imply strong lower bounds for dynamic problems. In FOCS, pages 434--443. IEEE, 2014.
[3]
P. K. Agarwal, X. Hu, S. Sintos, and J. Yang. Dynamic enumeration of similarity joins. In 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021), 2021.
[4]
P. K. Agarwal, X. Hu, S. Sintos, and J. Yang. On reporting durable patterns in temporal proximity graphs. https: //arxiv.org/abs/2403.16312, 2024.
[5]
N. Alon, T. Kaufman, M. Krivelevich, and D. Ron. Testing triangle-freeness in general graphs. SIAM Journal on Discrete Mathematics, 22(2):786--819, 2008.
[6]
M. Araujo, S. Günnemann, S. Papadimitriou, C. Faloutsos, P. Basu, A. Swami, E. E. Papalexakis, and D. Koutra. Discovery of "comet" communities in temporal and labeled graphs com2. Knowledge and Information Systems, 46(3):657--677, 2016.
[7]
C. Berkholz, J. Keppeler, and N. Schweikardt. Answering conjunctive queries under updates. In proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI symposium on Principles of database systems, pages 303--318, 2017.
[8]
A. Beygelzimer, S. Kakade, and J. Langford. Cover trees for nearest neighbor. In Proceedings of the 23rd international conference on Machine learning, pages 97--104, 2006.
[9]
A. Björklund, R. Pagh, V. V. Williams, and U. Zwick. Listing triangles. In International Colloquium on Automata, Languages, and Programming, pages 223--234. Springer, 2014.
[10]
M. Borassi, A. Epasto, S. Lattanzi, S. Vassilvitskii, and M. Zadimoghaddam. Better sliding window algorithms to maximize subadditive and diversity objectives. In Proceedings of the 38th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, pages 254--268, 2019.
[11]
N. H. Bshouty, Y. Li, and P. M. Long. Using the doubling dimension to analyze the generalization of learning algorithms. Journal of Computer and System Sciences, 75(6):323--335, 2009.
[12]
H. Cai, V. W. Zheng, and K. C.-C. Chang. A comprehensive survey of graph embedding: Problems, techniques, and applications. IEEE transactions on knowledge and data engineering, 30(9):1616--1637, 2018.
[13]
T. M. Chan. Optimal partition trees. Discrete & Computational Geometry, 47(4):661--690, 2012.
[14]
T. M. Chan. Klee's measure problem made easy. In 2013 IEEE 54th annual symposium on foundations of computer science, pages 410--419. IEEE, 2013.
[15]
T. M. Chan. Finding triangles and other small subgraphs in geometric intersection graphs. In Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1777--1805. SIAM, 2023.
[16]
J. Chen. Algorithmic graph embeddings. Theoretical Computer Science, 181(2):247--266, 1997.
[17]
P. Cunningham and S. J. Delany. k-nearest neighbour classifiers-a tutorial. ACM computing surveys (CSUR), 54(6):1--25, 2021.
[18]
M. Damian, S. Pandit, and S. Pemmaraju. Distributed spanner construction in doubling metric spaces. In Principles of Distributed Systems: 10th International Conference, OPODIS 2006, Bordeaux, France, December 12--15, 2006. Proceedings 10, pages 157--171. Springer, 2006.
[19]
S. Deng, S. Lu, and Y. Tao. Space-query tradeoffs in range subgraph counting and listing. In 26th International Conference on Database Theory (ICDT 2023). Schloss-Dagstuhl-Leibniz Zentrum für Informatik, 2023.
[20]
T. K. Dey and Y. Wang. Computational topology for data analysis. Cambridge University Press, 2022.
[21]
H. Edelsbrunner and J. L. Harer. Computational topology: an introduction. American Mathematical Society, 2022.
[22]
D. Eppstein and J. Erickson. Iterated nearest neighbors and finding minimal polytopes. Discrete & Computational Geometry, 11(3):321--350, 1994.
[23]
J. Erickson. Static-to-dynamic transformations. http://jeffe.cs.illinois.edu/teaching/datastructures/notes/01- statictodynamic.pdf.
[24]
E. Facco, M. d'Errico, A. Rodriguez, and A. Laio. Estimating the intrinsic dimension of datasets by a minimal neighborhood information. Scientific reports, 7(1):12140, 2017.
[25]
A. E. Feldmann and D. Marx. The parameterized hardness of the k-center problem in transportation networks. Algorithmica, 82:1989--2005, 2020.
[26]
M. Franzke, T. Emrich, A. Züfle, and M. Renz. Pattern search in temporal social networks. In Proceedings of the 21st International Conference on Extending Database Technology, 2018.
[27]
J. Gao, P. K. Agarwal, and J. Yang. Durable top-k queries on temporal data. Proceedings of the VLDB Endowment, 11(13):2223--2235, 2018.
[28]
J. Gao, S. Sintos, P. K. Agarwal, and J. Yang. Durable top-k instant-stamped temporal records with user-specified scoring functions. In 2021 IEEE 37th International Conference on Data Engineering (ICDE), pages 720--731. IEEE, 2021.
[29]
J. Gao, Y. Xu, P. K. Agarwal, and J. Yang. Efficiently answering durability prediction queries. In Proceedings of the 2021 International Conference on Management of Data, pages 591--604, 2021.
[30]
M.-G. Gong, L.-J. Zhang, J.-J. Ma, and L.-C. Jiao. Community detection in dynamic social networks based on multiobjective immune algorithm. Journal of computer science and technology, 27(3):455--467, 2012.
[31]
S. Har-Peled and M. Mendel. Fast construction of nets in low dimensional metrics, and their applications. In Proceedings of the twenty-first annual symposium on Computational geometry, pages 150--158, 2005.
[32]
D. S. Hochbaum. Approximating covering and packing problems: set cover, vertex cover, independent set, and related problems. In Approximation algorithms for NP-hard problems, pages 94--143. 1996.
[33]
X. Hu, S. Sintos, J. Gao, P. K. Agarwal, and J. Yang. Computing complex temporal join queries efficiently. In Proceedings of the 2022 International Conference on Management of Data, pages 2076--2090, 2022.
[34]
M. Idris, M. Ugarte, and S. Vansummeren. The dynamic yannakakis algorithm: Compact and efficient query processing under updates. In Proceedings of the 2017 ACM International Conference on Management of Data, pages 1259--1274, 2017.
[35]
A. Itai and M. Rodeh. Finding a minimum circuit in a graph. In Proceedings of the ninth annual ACM symposium on Theory of computing, pages 1--10, 1977.
[36]
H. Kaplan, K. Klost,W. Mulzer, L. Roditty, P. Seiferth, and M. Sharir. Triangles and girth in disk graphs and transmission graphs. In 27th Annual European Symposium on Algorithms (ESA 2019). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2019.
[37]
N. Kumar, L. Zhang, and S. Nayar. What is a good nearest neighbors algorithm for finding similar patches in images? In Computer Vision--ECCV 2008: 10th European Conference on Computer Vision, Marseille, France, October 12--18, 2008, Proceedings, Part II 10, pages 364--378. Springer, 2008.
[38]
A. Kutuzov, M. Dorgham, O. Oliynyk, C. Biemann, and A. Panchenko. Making fast graph-based algorithms with graph metric embeddings. In ACL 2019--57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, pages 3349--3355, 2020.
[39]
Y.-R. Lin, Y. Chi, S. Zhu, H. Sundaram, and B. L. Tseng. Facetnet: a framework for analyzing communities and their evolutions in dynamic networks. In Proceedings of the 17th international conference on World Wide Web, pages 685--694, 2008.
[40]
G. Locicero, G. Micale, A. Pulvirenti, and A. Ferro. Temporalri: a subgraph isomorphism algorithm for temporal networks. In Complex Networks & Their Applications IX: Volume 2, Proceedings of the Ninth International Conference on Complex Networks and Their Applications COMPLEX NETWORKS 2020, pages 675--687. Springer, 2021.
[41]
d. B. Mark, C. Otfried, v. K. Marc, and O. Mark. Computational geometry algorithms and applications. Spinger, 2008.
[42]
T. E. Ng and H. Zhang. Predicting internet network distance with coordinates-based approaches. In Proceedings. Twenty-First Annual Joint Conference of the IEEE Computer and Communications Societies, volume 1, pages 170--179. IEEE, 2002.
[43]
H. Q. Ngo, E. Porat, C. Ré, and A. Rudra. Worst-case optimal join algorithms. Journal of the ACM (JACM), 65(3):1--40, 2018.
[44]
M. Overmars and J. van Leeuwen. Worst-case optimal insertion and deletion methods for decomposable searching problems. Inf. Process. Lett., 12(4):168--173, 1981.
[45]
M. H. Overmars. The design of dynamic data structures, volume 156. Springer Science & Business Media, 1987.
[46]
M. Patrascu. Towards polynomial lower bounds for dynamic problems. In Proceedings of the forty-second ACM symposium on Theory of computing, pages 603--610, 2010.
[47]
P. Pope, C. Zhu, A. Abdelkader, M. Goldblum, and T. Goldstein. The intrinsic dimension of images and its impact on learning. In 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3--7, 2021, 2021.
[48]
K. Semertzidis and E. Pitoura. Durable graph pattern queries on historical graphs. In 2016 IEEE 32nd International Conference on Data Engineering (ICDE), pages 541--552. IEEE, 2016.
[49]
J. B. Tenenbaum, V. d. Silva, and J. C. Langford. A global geometric framework for nonlinear dimensionality reduction. science, 290(5500):2319--2323, 2000.
[50]
T. L. Veldhuizen. Leapfrog triejoin: A simple, worst-case optimal join algorithm. In Proc. International Conference on Database Theory, 2014.
[51]
K. Verbeek and S. Suri. Metric embedding, hyperbolic space, and social networks. In Proceedings of the thirtieth annual symposium on Computational geometry, pages 501--510, 2014.
[52]
J. Yang and J. Leskovec. Defining and evaluating network communities based on ground-truth. In Proceedings of the ACM SIGKDD Workshop on Mining Data Semantics, pages 1--8, 2012.
[53]
Y. Yang, D. Yan, H. Wu, J. Cheng, S. Zhou, and J. C. Lui. Diversified temporal subgraph pattern mining. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1965--1974, 2016.
[54]
H. Zeng, H. Zhou, A. Srivastava, R. Kannan, and V. Prasanna. Accurate, efficient and scalable graph embedding. In 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 462--471. IEEE, 2019.
[55]
X. Zhao, A. Sala, C. Wilson, H. Zheng, and B. Y. Zhao. Orion: shortest path estimation for large social graphs. networks, 1:5, 2010.
[56]
X. Zhao, A. Sala, H. Zheng, and B. Y. Zhao. Efficient shortest paths on massive social graphs. In 7th International Conference on Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom), pages 77--86. IEEE, 2011.

Cited By

View all
  • (2024)Output-sensitive Conjunctive Query EvaluationProceedings of the ACM on Management of Data10.1145/36958382:5(1-24)Online publication date: 4-Nov-2024

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the ACM on Management of Data
Proceedings of the ACM on Management of Data  Volume 2, Issue 2
PODS
May 2024
852 pages
EISSN:2836-6573
DOI:10.1145/3665155
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 May 2024
Published in PACMMOD Volume 2, Issue 2

Permissions

Request permissions for this article.

Author Tags

  1. cover tree
  2. doubling dimension
  3. durability
  4. durable pattern
  5. proximity graph
  6. temporal graph

Qualifiers

  • Research-article

Funding Sources

  • NSF
  • US-Israel grant

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)81
  • Downloads (Last 6 weeks)8
Reflects downloads up to 11 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Output-sensitive Conjunctive Query EvaluationProceedings of the ACM on Management of Data10.1145/36958382:5(1-24)Online publication date: 4-Nov-2024

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media