[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/3377930.3390210acmconferencesArticle/Chapter ViewAbstractPublication PagesgeccoConference Proceedingsconference-collections
research-article

Advanced statistical analysis of empirical performance scaling

Published: 26 June 2020 Publication History

Abstract

Theoretical running time complexity analysis is a widely adopted method for studying the scaling behaviour of algorithms. However, theoretical analysis remains intractable for many high-performance, heuristic algorithms. Recent advances in statistical methods for empirical running time scaling analysis have shown that many state-of-the-art algorithms can achieve significantly better scaling in practice than expected. However, current techniques have only been successfully applied to study algorithms on randomly generated instance sets, since they require instances that can be grouped into "bins", where each instance in a bin has the same size. In practice, real-world instance sets with this property are rarely available. We introduce a novel method that overcomes this limitation. We apply our method to a broad range of scenarios and demonstrate its effectiveness by revealing new insights into the scaling of several prominent algorithms; e.g., the SAT solver lingeling often appears to achieve sub-polynomial scaling on prominent bounded model checking instances, and the training times of scikit-learn's implementation of SVMs scale as a lower-degree polynomial than expected (≈ 1.51 instead of 2).

References

[1]
C. Ansótegui, Y. Malitsky, H. Samulowitz, M. Sellmann, and K. Tierney. 2015. Model-Based Genetic Algorithms for Algorithm Configuration. In Proc. of IJCAI 2015. 733--739.
[2]
J Bentley and MD McIlroy. 1993. Engineering a sort function. SPE 23, 11 (1993), 1249--1265.
[3]
A Biere. 2008. Hardware Model Checking Competition. http://fmv.jku.at/hwmcc/. (2008). Last visited on 2019-02-15.
[4]
A. Biere. 2017. CaDiCaL, Lingeling, Plingeling, Treengeling and YalSAT Entering the SAT Competition 2017. In Proc. of SAT Comp. 2017. 14--15.
[5]
A Biere, K Heljanko, and S Wieringa. 2011. AIGER 1.9 And Beyond. Technical Report. FMV Reports Series, Institute for Formal Models and Verification, Johannes Kepler University.
[6]
L Breiman. 2001. Random forests. Mach. Learn. 45, 1 (2001), 5--32.
[7]
R Brown. 2015. Building a Balanced k-d Tree in O(kn logn) Time. JCGT 4, 1 (2015), 50--68.
[8]
E Coppa, C Demetrescu, and I Finocchi. 2012. Input-sensitive profiling. ACM SIGPLAN Notices 47, 6 (2012), 89--98.
[9]
Corinna Cortes and Vladimir Vapnik. 1995. Support-vector networks. Machine learning 20, 3 (1995), 273--297.
[10]
P Cunningham and SJ Delany. 2007. k-Nearest neighbour classifiers. MCS 34, 8 (2007), 1--17.
[11]
K Eggensperger, M Lindauer, and F Hutter. 2018. Neural Networks for Predicting Algorithm Runtime Distributions. In Proc. of IJCAI. 1442--1448.
[12]
E Fix and J Hodges Jr. 1951. Discriminatory analysis-nonparametric discrimination: consistency properties. Technical Report. UC Berkeley.
[13]
S Goldsmith. 2008. Measuring empirical computational complexity. Ph.D. Dissertation. UC Berkeley.
[14]
S Goldsmith, A Aiken, and D Wilkerson. 2007. Measuring empirical computational complexity. In Proc. of ESEC/FSE. ACM, 395--404.
[15]
M Heule and H van Marren. 2009. march_hi: Solver description. In Proc. of SAT Comp. 2009. 23--24.
[16]
C Hoare. 1962. Quicksort. Comput. J. 5, 1 (1962), 10--16.
[17]
P Holland and R Welsch. 1977. Robust regression using iteratively reweighted least-squares. Commun. Stat. Theory Methods 6, 9 (1977), 813--827.
[18]
F Hutter, H Hoos, and K Leyton-Brown. 2011. Sequential model-based optimization for general algorithm configuration. In Proc. of LION (LNCS), Vol. 6683. 507--523.
[19]
F Hutter, M López-Ibáñez, C Fawcett, M Lindauer, H Hoos, K Leyton-Brown, and T Stützle. 2014. AClib: A Benchmark Library for Algorithm Configuration. In Proc. of LION. 36--40.
[20]
T Joachims. 1999. In Advances in Kernel Methods. MIT Press, Chapter Making Large-scale Support Vector Machine Learning Practical, 169--184.
[21]
T Joachims. 2006. Training linear SVMs in linear time. In Proc. of KDD. ACM, 217--226.
[22]
R Koenker and K Hallock. 2001. Quantile regression. JEP 15, 4 (2001), 143--156.
[23]
Y LeCun, L Bottou, Y Bengio, and P Haffner. 1998. Gradient-based learning applied to document recognition. Proc. of the IEEE 86, 11 (1998), 2278--2324.
[24]
CLi, C Huang, and R Xu. 2014. Balance between intensification and diversification: a unity of opposites. In Proc. of SAT Comp. 2014. 10--11.
[25]
C McGeoch, P Sanders, R Fleischer, P Cohen, and D Precup. 2002. In Experimental Algorithmics. Springer-Verlag New York, Inc., Chapter Using Finite Experiments to Study Asymptotic Performance, 93--126.
[26]
Z Mu and H Hoos. 2015. On the empirical time complexity of random 3-SAT at the phase transition. In Proc. of IJCAI. 367--373.
[27]
Z Mu, H Hoos, and T Stützle. 2016. The Impact of Automated Algorithm Configuration on the Scaling Behaviour of State-of-the-Art Inexact TSP Solvers. In Proc. of LION (LNCS), Vol. 10079. 157--172.
[28]
Y Nagata and S Kobayashi. 2013. A Powerful Genetic Algorithm Using Edge Assembly Crossover for the Traveling Salesman Problem. INFORMS JOC 25, 2 (2013), 346--363.
[29]
N Newman, A Fréchette, and K Leyton-Brown. 2017. Deep optimization for spectrum repacking. Commun. ACM 61, 1 (2017), 97--104.
[30]
Y Pushak and Z Mu and H Hoos. 2020. Empirical Scaling Analyzer: An Automated System for Empirical Analysis of Performance Scaling. AI Communications, to appear (2020).
[31]
F Pedregosa, G Varoquaux, A Gramfort, V Michel, B Thirion, O Grisel, M Blondel, P Prettenhofer, R Weiss, V Dubourg, J Vanderplas, A Passos, D Cournapeau, M Brucher, M Perrot, and E Duchesnay. 2011. Scikit-learn: Machine Learning in Python. JMLR 12 (2011), 2825--2830.
[32]
T Peters. 2015. Timsort description. (2015).
[33]
K Yu and MC Jones. 1998. Local linear quantile regression. JASA 93, 441 (1998), 228--237.
[34]
D Zaparanuks and M Hauswirth. 2012. Algorithmic profiling. ACM SIGPLAN Notices 47, 6 (2012), 67--76.

Cited By

View all

Index Terms

  1. Advanced statistical analysis of empirical performance scaling

          Recommendations

          Comments

          Please enable JavaScript to view thecomments powered by Disqus.

          Information & Contributors

          Information

          Published In

          cover image ACM Conferences
          GECCO '20: Proceedings of the 2020 Genetic and Evolutionary Computation Conference
          June 2020
          1349 pages
          ISBN:9781450371285
          DOI:10.1145/3377930
          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Sponsors

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          Published: 26 June 2020

          Permissions

          Request permissions for this article.

          Check for updates

          Author Tags

          1. empirical performance scaling
          2. empirical running time scaling
          3. empirical scaling analysis

          Qualifiers

          • Research-article

          Conference

          GECCO '20
          Sponsor:

          Acceptance Rates

          Overall Acceptance Rate 1,669 of 4,410 submissions, 38%

          Contributors

          Other Metrics

          Bibliometrics & Citations

          Bibliometrics

          Article Metrics

          • Downloads (Last 12 months)2
          • Downloads (Last 6 weeks)0
          Reflects downloads up to 29 Jan 2025

          Other Metrics

          Citations

          Cited By

          View all

          View Options

          Login options

          View options

          PDF

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader

          Figures

          Tables

          Media

          Share

          Share

          Share this Publication link

          Share on social media