More Web Proxy on the site http://driver.im/

research-article

Advanced statistical analysis of empirical performance scaling

Authors:

Holger H. HoosAuthors Info & Claims

GECCO '20: Proceedings of the 2020 Genetic and Evolutionary Computation Conference

Pages 236 - 244

https://doi.org/10.1145/3377930.3390210

Published: 26 June 2020 Publication History

Abstract

Theoretical running time complexity analysis is a widely adopted method for studying the scaling behaviour of algorithms. However, theoretical analysis remains intractable for many high-performance, heuristic algorithms. Recent advances in statistical methods for empirical running time scaling analysis have shown that many state-of-the-art algorithms can achieve significantly better scaling in practice than expected. However, current techniques have only been successfully applied to study algorithms on randomly generated instance sets, since they require instances that can be grouped into "bins", where each instance in a bin has the same size. In practice, real-world instance sets with this property are rarely available. We introduce a novel method that overcomes this limitation. We apply our method to a broad range of scenarios and demonstrate its effectiveness by revealing new insights into the scaling of several prominent algorithms; e.g., the SAT solver lingeling often appears to achieve sub-polynomial scaling on prominent bounded model checking instances, and the training times of scikit-learn's implementation of SVMs scale as a lower-degree polynomial than expected (≈ 1.51 instead of 2).

References

[1]

C. Ansótegui, Y. Malitsky, H. Samulowitz, M. Sellmann, and K. Tierney. 2015. Model-Based Genetic Algorithms for Algorithm Configuration. In Proc. of IJCAI 2015. 733--739.

[2]

J Bentley and MD McIlroy. 1993. Engineering a sort function. SPE 23, 11 (1993), 1249--1265.

[3]

A Biere. 2008. Hardware Model Checking Competition. http://fmv.jku.at/hwmcc/. (2008). Last visited on 2019-02-15.

[4]

A. Biere. 2017. CaDiCaL, Lingeling, Plingeling, Treengeling and YalSAT Entering the SAT Competition 2017. In Proc. of SAT Comp. 2017. 14--15.

[5]

A Biere, K Heljanko, and S Wieringa. 2011. AIGER 1.9 And Beyond. Technical Report. FMV Reports Series, Institute for Formal Models and Verification, Johannes Kepler University.

[6]

L Breiman. 2001. Random forests. Mach. Learn. 45, 1 (2001), 5--32.

Digital Library

[7]

R Brown. 2015. Building a Balanced k-d Tree in O(kn logn) Time. JCGT 4, 1 (2015), 50--68.

[8]

E Coppa, C Demetrescu, and I Finocchi. 2012. Input-sensitive profiling. ACM SIGPLAN Notices 47, 6 (2012), 89--98.

Digital Library

[9]

Corinna Cortes and Vladimir Vapnik. 1995. Support-vector networks. Machine learning 20, 3 (1995), 273--297.

[10]

P Cunningham and SJ Delany. 2007. k-Nearest neighbour classifiers. MCS 34, 8 (2007), 1--17.

[11]

K Eggensperger, M Lindauer, and F Hutter. 2018. Neural Networks for Predicting Algorithm Runtime Distributions. In Proc. of IJCAI. 1442--1448.

[12]

E Fix and J Hodges Jr. 1951. Discriminatory analysis-nonparametric discrimination: consistency properties. Technical Report. UC Berkeley.

[13]

S Goldsmith. 2008. Measuring empirical computational complexity. Ph.D. Dissertation. UC Berkeley.

[14]

S Goldsmith, A Aiken, and D Wilkerson. 2007. Measuring empirical computational complexity. In Proc. of ESEC/FSE. ACM, 395--404.

Digital Library

[15]

M Heule and H van Marren. 2009. march_hi: Solver description. In Proc. of SAT Comp. 2009. 23--24.

[16]

C Hoare. 1962. Quicksort. Comput. J. 5, 1 (1962), 10--16.

[17]

P Holland and R Welsch. 1977. Robust regression using iteratively reweighted least-squares. Commun. Stat. Theory Methods 6, 9 (1977), 813--827.

[18]

F Hutter, H Hoos, and K Leyton-Brown. 2011. Sequential model-based optimization for general algorithm configuration. In Proc. of LION (LNCS), Vol. 6683. 507--523.

[19]

F Hutter, M López-Ibáñez, C Fawcett, M Lindauer, H Hoos, K Leyton-Brown, and T Stützle. 2014. AClib: A Benchmark Library for Algorithm Configuration. In Proc. of LION. 36--40.

[20]

T Joachims. 1999. In Advances in Kernel Methods. MIT Press, Chapter Making Large-scale Support Vector Machine Learning Practical, 169--184.

[21]

T Joachims. 2006. Training linear SVMs in linear time. In Proc. of KDD. ACM, 217--226.

Digital Library

[22]

R Koenker and K Hallock. 2001. Quantile regression. JEP 15, 4 (2001), 143--156.

[23]

Y LeCun, L Bottou, Y Bengio, and P Haffner. 1998. Gradient-based learning applied to document recognition. Proc. of the IEEE 86, 11 (1998), 2278--2324.

[24]

CLi, C Huang, and R Xu. 2014. Balance between intensification and diversification: a unity of opposites. In Proc. of SAT Comp. 2014. 10--11.

[25]

C McGeoch, P Sanders, R Fleischer, P Cohen, and D Precup. 2002. In Experimental Algorithmics. Springer-Verlag New York, Inc., Chapter Using Finite Experiments to Study Asymptotic Performance, 93--126.

[26]

Z Mu and H Hoos. 2015. On the empirical time complexity of random 3-SAT at the phase transition. In Proc. of IJCAI. 367--373.

[27]

Z Mu, H Hoos, and T Stützle. 2016. The Impact of Automated Algorithm Configuration on the Scaling Behaviour of State-of-the-Art Inexact TSP Solvers. In Proc. of LION (LNCS), Vol. 10079. 157--172.

[28]

Y Nagata and S Kobayashi. 2013. A Powerful Genetic Algorithm Using Edge Assembly Crossover for the Traveling Salesman Problem. INFORMS JOC 25, 2 (2013), 346--363.

Digital Library

[29]

N Newman, A Fréchette, and K Leyton-Brown. 2017. Deep optimization for spectrum repacking. Commun. ACM 61, 1 (2017), 97--104.

Digital Library

[30]

Y Pushak and Z Mu and H Hoos. 2020. Empirical Scaling Analyzer: An Automated System for Empirical Analysis of Performance Scaling. AI Communications, to appear (2020).

[31]

F Pedregosa, G Varoquaux, A Gramfort, V Michel, B Thirion, O Grisel, M Blondel, P Prettenhofer, R Weiss, V Dubourg, J Vanderplas, A Passos, D Cournapeau, M Brucher, M Perrot, and E Duchesnay. 2011. Scikit-learn: Machine Learning in Python. JMLR 12 (2011), 2825--2830.

Digital Library

[32]

T Peters. 2015. Timsort description. (2015).

[33]

K Yu and MC Jones. 1998. Local linear quantile regression. JASA 93, 441 (1998), 228--237.

[34]

D Zaparanuks and M Hauswirth. 2012. Algorithmic profiling. ACM SIGPLAN Notices 47, 6 (2012), 67--76.

Digital Library

Cited By

Dilkas P(2023)Generating Random Instances of Weighted Model CountingIntegration of Constraint Programming, Artificial Intelligence, and Operations Research10.1007/978-3-031-33271-5_26(395-416)Online publication date: 23-May-2023
https://doi.org/10.1007/978-3-031-33271-5_26
Kautz H(2022)The third AI summerAI Magazine10.1002/aaai.1203643:1(105-125)Online publication date: 31-Mar-2022
https://dl.acm.org/doi/10.1002/aaai.12036

Index Terms

Advanced statistical analysis of empirical performance scaling
1. General and reference
  1. Cross-computing tools and techniques

Recommendations

Empirical scaling analyzer: An automated system for empirical analysis of performance scaling

The time complexity of algorithms, i.e., the scaling of the time required for solving a problem instance as a function of instance size, is of key interest in theoretical computer science and practical applications. In this work, we present a fully ...
Empirical Scaling Analyser: An Automated System for Empirical Analysis of Performance Scaling
GECCO Companion '15: Proceedings of the Companion Publication of the 2015 Annual Conference on Genetic and Evolutionary Computation

The time complexity of problems and algorithms, i.e., the scaling of the time required for solving a problem instance as a function of instance size, is of key interest in theoretical computer science and practical applications. This paper presents an ...
Evolution of statistical analysis in empirical software engineering research: Current state and steps forward
Highlights
- A manual review of statistical usage in 161 papers in 5 top journals.
- A semi-...
Abstract
Software engineering research is evolving and papers are increasingly based on empirical data from a multitude of sources, using statistical tests to determine if and to what degree empirical evidence supports their hypotheses. To ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences

GECCO '20: Proceedings of the 2020 Genetic and Evolutionary Computation Conference

June 2020

1349 pages

ISBN:9781450371285

DOI:10.1145/3377930

General Chair:
Carlos Artemio Coello Coello
CINVESTAV-IPN

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGEVO: ACM Special Interest Group on Genetic and Evolutionary Computation

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 June 2020

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article

Conference

GECCO '20

Sponsor:

SIGEVO

GECCO '20: Genetic and Evolutionary Computation Conference

July 8 - 12, 2020

Cancún, Mexico

Acceptance Rates

Overall Acceptance Rate 1,669 of 4,410 submissions, 38%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

2
Total Citations
View Citations
74
Total Downloads

Downloads (Last 12 months)2
Downloads (Last 6 weeks)0

Reflects downloads up to 29 Jan 2025

Other Metrics

View Author Metrics

Citations

Cited By

Dilkas P(2023)Generating Random Instances of Weighted Model CountingIntegration of Constraint Programming, Artificial Intelligence, and Operations Research10.1007/978-3-031-33271-5_26(395-416)Online publication date: 23-May-2023
https://doi.org/10.1007/978-3-031-33271-5_26
Kautz H(2022)The third AI summerAI Magazine10.1002/aaai.1203643:1(105-125)Online publication date: 31-Mar-2022
https://dl.acm.org/doi/10.1002/aaai.12036

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Figures

Tables

Media

View Table of Conten