[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1007/978-3-031-09574-0_7guideproceedingsArticle/Chapter ViewAbstractPublication PagesConference Proceedingsacm-pubtype
Article

Lossy Kernelization of Same-Size Clustering

Published: 29 June 2022 Publication History

Abstract

In this work, we study the k-median clustering problem with an additional equal-size constraint on the clusters, from the perspective of parameterized preprocessing. Our main result is the first lossy (2-approximate) polynomial kernel for this problem, parameterized by the cost of clustering. We complement this result by establishing lower bounds for the problem that eliminate the existences of an (exact) kernel of polynomial size and a PTAS.

References

[1]
Aggarwal CC and Reddy CK Data Clustering: Algorithms and Applications 2013 Boca Raton CRC Press
[2]
Agrawal, A., Saurabh, S., Tale, P.: On the parameterized complexity of contraction to generalization of trees. In: 12th International Symposium on Parameterized and Exact Computation (IPEC). LIPIcs, vol. 89, pp. 1:1–1:12. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik (2017)
[3]
Ahmadian, S., Norouzi-Fard, A., Svensson, O., Ward, J.: Better guarantees for k-means and Euclidean k-median by primal-dual algorithms. In: 58th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2017, pp. 61–72. IEEE Computer Society (2017)
[4]
Alon N and Sudakov B On two segmentation problems J. Algorithms 1999 33 1 173-184
[5]
Arora, S., Raghavan, P., Rao, S.: Approximation schemes for Euclidean k-medians and related problems. In: Thirtieth Annual ACM Symposium on Theory of Computing, STOC 1998, pp. 106–113. ACM, New York (1998)
[6]
Baker, D., Braverman, V., Huang, L., Jiang, S.H.C., Krauthgamer, R., Wu, X.: Coresets for clustering in graphs of bounded treewidth. In: International Conference on Machine Learning, pp. 569–579. PMLR (2020)
[7]
Bandyapadhyay, S., Fomin, F.V., Purohit, N., Simonov, K.: Lossy kernelization of same-size clustering. CoRR abs/2107.07383 (2021)
[8]
Bandyapadhyay, S., Fomin, F.V., Simonov, K.: On coresets for fair clustering in metric and Euclidean spaces and their applications. In: 48th International Colloquium on Automata, Languages, and Programming (ICALP). LIPIcs, vol. 198, pp. 23:1–23:15. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2021)
[9]
Bandyapadhyay, S., Varadarajan, K.R.: On variants of k-means clustering. In: 32nd International Symposium on Computational Geometry, SoCG 2016. LIPIcs, Boston, MA, USA, 14–18 June 2016, vol. 51, pp. 14:1–14:15. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2016)
[10]
Basu S, Davidson I, and Wagstaff K Constrained Clustering: Advances in Algorithms, Theory, and Applications 2008 Boca Raton CRC Press
[11]
Bhattacharya A, Jaiswal R, and Kumar A Faster algorithms for the constrained k-means problem Theory Comput. Syst. 2018 62 1 93-115
[12]
Byrka, J., Fleszar, K., Rybicki, B., Spoerhase, J.: Bi-factor approximation algorithms for hard capacitated k-median problems. In: Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2015, pp. 722–736. SIAM (2015)
[13]
Byrka, J., Pensyl, T.W., Rybicki, B., Srinivasan, A., Trinh, K.: An improved approximation for k-median and positive correlation in budgeted optimization. ACM Trans. Algorithms 13(2), 23:1–23:31 (2017)
[14]
Byrka J, Rybicki B, and Uniyal S Louveaux Q and Skutella M An approximation algorithm for uniform capacitated k-median problem with 1+ϵ capacity violation Integer Programming and Combinatorial Optimization 2016 Cham Springer 262-274
[15]
Charikar M, Guha S, Tardos É, and Shmoys DB A constant-factor approximation algorithm for the k-median problem J. Comput. Syst. Sci. 2002 65 1 129-149
[16]
Chuzhoy, J., Rabani, Y.: Approximating k-median with non-uniform capacities. In: Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2005, pp. 952–958. SIAM (2005)
[17]
Cohen-Addad, V.: Approximation schemes for capacitated clustering in doubling metrics. In: ACM-SIAM Symposium on Discrete Algorithms, SODA 2020, pp. 2241–2259. SIAM (2020)
[18]
Cohen-Addad, V., Gupta, A., Kumar, A., Lee, E., Li, J.: Tight FPT approximations for k-median and k-means. In: 46th International Colloquium on Automata, Languages, and Programming (ICALP). LIPIcs, vol. 132, pp. 42:1–42:14. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2019)
[19]
Cohen-Addad V, Klein PN, and Mathieu C Local search yields approximation schemes for k-means and k-median in Euclidean and minor-free metrics SIAM J. Comput. 2019 48 2 644-667
[20]
Cohen-Addad, V., Li, J.: On the fixed-parameter tractability of capacitated clustering. In: 46th International Colloquium on Automata, Languages, and Programming (ICALP). LIPIcs, vol. 132, pp. 41:1–41:14. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2019)
[21]
Cohen-Addad, V., Mathieu, C.: Effectiveness of local search for geometric optimization. In: 31st International Symposium on Computational Geometry, SoCG 2015. LIPIcs, vol. 34, pp. 329–343. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2015)
[22]
Cohen-Addad, V., de Mesmay, A., Rotenberg, E., Roytman, A.: The bane of low-dimensionality clustering. In: Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 441–456. SIAM (2018)
[23]
Cygan M et al. Parameterized Algorithms 2015 Cham Springer
[24]
Dell, H., Marx, D.: Kernelization of packing problems. CoRR abs/1812.03155 (2018)
[25]
Demirci, H.G., Li, S.: Constant approximation for capacitated k-median with (1+ε)-capacity violation. In: 43rd International Colloquium on Automata, Languages, and Programming (ICALP). LIPIcs, vol. 55, pp. 73:1–73:14. Schloss Dagstuhl-Leibniz-Zentrum für Informatik (2016)
[26]
Ding H and Xu J A unified framework for clustering constrained data without locality property Algorithmica 2020 82 4 808-852
[27]
Downey RG and Fellows MR Fundamentals of Parameterized Complexity 2013 London Springer
[28]
Dvořák, P., Feldmann, A.E., Knop, D., Masařík, T., Toufar, T., Veselý, P.: Parameterized approximation schemes for Steiner trees with small number of Steiner vertices. CoRR abs/1710.00668 (2017)
[29]
Eiben, E., Kumar, M., Mouawad, A.E., Panolan, F., Siebertz, S.: Lossy kernels for connected dominating set on sparse graphs. In: 34th International Symposium on Theoretical Aspects of Computer Science (STACS). LIPIcs, vol. 96, pp. 29:1–29:15. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik (2018)
[30]
Feldman, D., Langberg, M.: A unified framework for approximating and clustering data. In: 43rd Annual ACM Symposium on Theory of Computing (STOC), pp. 569–578. ACM (2011)
[31]
Feldman D, Schmidt M, and Sohler C Turning big data into tiny data: constant-size coresets for k-means, PCA, and projective clustering SIAM J. Comput. 2020 49 3 601-657
[32]
Feldmann AE, Karthik CS, Lee E, and Manurangsi P A survey on approximation in parameterized complexity: hardness and algorithms Algorithms 2020 13 6 146
[33]
Feng, Q., Zhang, Z., Huang, Z., Xu, J., Wang, J.: A unified framework of FPT approximation algorithms for clustering problems. In: 31st International Symposium on Algorithms and Computation (ISAAC). LIPIcs, vol. 181, pp. 5:1–5:17. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2020)
[34]
Fomin FV, Golovach PA, and Panolan F Parameterized low-rank binary matrix approximation Data Min. Knowl. Discov. 2020 34 2 478-532
[35]
Fomin FV, Lokshtanov D, Saurabh S, and Zehavi M Kernelization: Theory of Parameterized Preprocessing 2019 Cambridge Cambridge University Press
[36]
Friggstad Z, Rezapour M, and Salavatipour MR Local search yields a PTAS for k-means in doubling metrics SIAM J. Comput. 2019 48 2 452-480
[37]
Har-Peled, S., Mazumdar, S.: On coresets for k-means and k-median clustering. In: Proceedings of the 36th Annual ACM Symposium on Theory of Computing (STOC), pp. 291–300. ACM (2004)
[38]
Höppner F and Klawonn F Jain LC, Sato-Ilic M, Virvou M, Tsihrintzis GA, Balas VE, and Abeynayake C Clustering with size constraints Computational Intelligence Paradigms, Innovative Applications 2008 Heidelberg Springer 167-180
[39]
Jain K and Vazirani VV Approximation algorithms for metric facility location and k-median problems using the primal-dual schema and Lagrangian relaxation J. ACM 2001 48 2 274-296
[40]
Krithika, R., Misra, P., Rai, A., Tale, P.: Lossy kernels for graph contraction problems. In: 36th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS). LIPIcs, vol. 65, pp. 23:1–23:14. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik (2016)
[41]
Kumar, A., Sabharwal, Y., Sen, S.: Linear-time approximation schemes for clustering problems in any dimensions. J. ACM 57(2), 5:1–5:32 (2010)
[42]
Li, S.: On uniform capacitated k-median beyond the natural LP relaxation. ACM Trans. Algorithms 13(2), 22:1–22:18 (2017)
[43]
Li S and Svensson O Approximating k-median via pseudo-approximation SIAM J. Comput. 2016 45 2 530-547
[44]
Li, T.: A general model for clustering binary data. In: KDD 2005, pp. 188–197 (2005)
[45]
Lokshtanov, D., Panolan, F., Ramanujan, M.S., Saurabh, S.: Lossy kernelization. In: 49th Annual ACM Symposium on Theory of Computing (STOC), pp. 224–237. ACM (2017)
[46]
Manurangsi, P., Raghavendra, P.: A birthday repetition theorem and complexity of approximating dense CSPs. In: 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik (2017)
[47]
Petrank E The hardness of approximation: gap location Comput. Complex. 1994 4 133-157
[48]
Siebertz, S.: Lossy kernels for connected distance-r domination on nowhere dense graph classes. CoRR abs/1707.09819 (2017)
[49]
Sohler, C., Woodruff, D.P.: Strong coresets for k-median and subspace approximation: goodbye dimension. In: 59th Annual Symposium on Foundations of Computer Science (FOCS), pp. 802–813. IEEE (2018)
[50]
Sweeney L k-anonymity: a model for protecting privacy Int. J. Uncertain. Fuzziness Knowl. Based Syst. 2002 10 05 557-570
[51]
Vallejo-Huanga, D., Morillo, P., Ferri, C.: Semi-supervised clustering algorithms for grouping scientific articles. In: International Conference on Computational Science (ICCS) (2017). Procedia Comput. Sci. 108, 325–334. Elsevier
[52]
Zhang, Z., Li, T., Ding, C., Zhang, X.: Binary matrix factorization with applications. In: ICDM 2007, pp. 391–400. IEEE (2007)

Index Terms

  1. Lossy Kernelization of Same-Size Clustering
        Index terms have been assigned to the content through auto-classification.

        Recommendations

        Comments

        Please enable JavaScript to view thecomments powered by Disqus.

        Information & Contributors

        Information

        Published In

        cover image Guide Proceedings
        Computer Science – Theory and Applications: 17th International Computer Science Symposium in Russia, CSR 2022, Virtual Event, June 29 – July 1, 2022, Proceedings
        Jun 2022
        363 pages
        ISBN:978-3-031-09573-3
        DOI:10.1007/978-3-031-09574-0

        Publisher

        Springer-Verlag

        Berlin, Heidelberg

        Publication History

        Published: 29 June 2022

        Author Tags

        1. k-median clustering
        2. parameterized approximation
        3. kernelization
        4. lossy kernels

        Qualifiers

        • Article

        Contributors

        Other Metrics

        Bibliometrics & Citations

        Bibliometrics

        Article Metrics

        • 0
          Total Citations
        • 0
          Total Downloads
        • Downloads (Last 12 months)0
        • Downloads (Last 6 weeks)0
        Reflects downloads up to 05 Jan 2025

        Other Metrics

        Citations

        View Options

        View options

        Media

        Figures

        Other

        Tables

        Share

        Share

        Share this Publication link

        Share on social media