More Web Proxy on the site http://driver.im/

Article

Free access

Automatically reducing tree-structured test inputs

Authors:

Michael PradelAuthors Info & Claims

ASE '17: Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering

Pages 861 - 871

Published: 30 October 2017 Publication History

Abstract

Reducing the test input given to a program while preserving some property of interest is important, e.g., to localize faults or to reduce test suites. The well-known delta debugging algorithm and its derivatives automate this task by repeatedly reducing a given input. Unfortunately, these approaches are limited to blindly removing parts of the input and cannot reduce the input by restructuring it. This paper presents the Generalized Tree Reduction (GTR) algorithm, an effective and efficient technique to reduce arbitrary test inputs that can be represented as a tree, such as program code, PDF files, and XML documents. The algorithm combines tree transformations with delta debugging and a greedy backtracking algorithm. To reduce the size of the considered search space, the approach automatically specializes the tree transformations applied by the algorithm based on examples of input trees. We evaluate GTR by reducing Python files that cause interpreter crashes, JavaScript files that cause browser inconsistencies, PDF documents with malicious content, and XML files used to tests an XML validator. The GTR algorithm reduces the trees of these files to 45.3%, 3.6%, 44.2%, and 1.3% of the original size, respectively, outperforming both delta debugging and another state-of-the-art algorithm.

References

[1]

I. Vessey, “Expertise in debugging computer programs: A process analysis,” International Journal of Man-Machine Studies, vol. 23, no. 5, pp. 459–494, 1985.

[2]

A. Zeller and R. Hildebrandt, “Simplifying and isolating failure-inducing input,” IEEE Transactions on Software Engineering, vol. 28, no. 2, pp. 183–200, 2002.

Digital Library

[3]

G. Misherghi and Z. Su, “HDD: Hierarchical delta debugging,” in Proceedings of the 28th international conference on Software engineering. ACM, 2006, pp. 142–151.

Digital Library

[4]

J. Regehr, Y. Chen, P. Cuoq, E. Eide, C. Ellison, and X. Yang, “Testcase reduction for C compiler bugs,” in ACM SIGPLAN Notices, vol. 47, no. 6. ACM, 2012, pp. 335–346.

Digital Library

[5]

Python Software Foundation. (2017) Python AST library. {Online}. Available: https://docs.python.org/2/library/ast.html

[6]

J. Patra and M. Pradel, “Learning to fuzz: Application-independent fuzz testing with probabilistic, generative models of input data,” TU Darmstadt, Department of Computer Science, Tech. Rep. TUD-CS- 2016-14664, Nov 2016.

[7]

A. Hidayat. (2017) Esprima. {Online}. Available: http://esprima.org/

[8]

Multiple authors. (2016) Learning from ”Big Code”. {Online}. Available: http://learnbigcode.github.io/datasets/

[9]

M. Parkour. (2011) Contagio malware dump. {Online}. Available: http://contagiodump.blogspot.de/2010/08/ malicious-documents-archive-for.html

[10]

Y. Shinyama. (2014) PDFMiner. {Online}. Available: https://euske. github.io/pdfminer/

[11]

iText Group NV. (2017) iText PDF. {Online}. Available: http: //itextpdf.com/

[12]

F. Schmitt, J. Gassen, and E. Gerhards-Padilla, “PDF Scrutinizer: Detecting JavaScript-based attacks in PDF documents,” in Privacy, Security and Trust (PST), 2012 Tenth Annual International Conference on. IEEE, 2012, pp. 104–111.

Digital Library

[13]

Hindawi. (2017) Hindawi XML Corpus. {Online}. Available: https: //www.hindawi.com/corpus/

[14]

xmllint. (2017) The XML C parser and toolkit of Gnome. {Online}. Available: http://xmlsoft.org/xmllint.html

[15]

gcov. (2017) gcov – a test coverage program. {Online}. Available: https://gcc.gnu.org/onlinedocs/gcc-4.3.4/gcc/Gcov.html

[16]

R. Hodován and A. Kiss, “Practical improvements to the minimizing delta debugging algorithm,” in Proceedings of the 11th International Joint Conference on Software Technologies, 2016, pp. 241–248.

[17]

R. Hodován and Á. Kiss, “Modernizing hierarchical delta debugging,” in Proceedings of the 7th International Workshop on Automating Test Case Design, Selection, and Evaluation. ACM, 2016, pp. 31–37.

Digital Library

[18]

Z. Lin and X. Zhang, “Deriving input syntactic structure from execution,” in Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of software engineering. ACM, 2008, pp. 83–93.

Digital Library

[19]

S. Artzi, A. Kiezun, J. Dolby, F. Tip, D. Dig, A. Paradkar, and M. D. Ernst, “Finding bugs in dynamic web applications,” in Proceedings of the 2008 international symposium on Software testing and analysis. ACM, 2008, pp. 261–272.

Digital Library

[20]

J. Clause and A. Orso, “Penumbra: automatically identifying failurerelevant inputs using dynamic tainting,” in Proceedings of the eighteenth international symposium on Software testing and analysis. ACM, 2009, pp. 249–260.

Digital Library

[21]

P. Oehlert, “Violating assumptions with fuzzing,” IEEE Security & Privacy, vol. 3, no. 2, pp. 58–62, 2005.

Digital Library

[22]

P. Godefroid, A. Kiezun, and M. Y. Levin, “Grammar-based whitebox fuzzing.” in PLDI, 2008, pp. 206–215.

Digital Library

[23]

V. Ganesh, T. Leek, and M. C. Rinard, “Taint-based directed whitebox fuzzing,” in 31st International Conference on Software Engineering, ICSE 2009, May 16-24, 2009, Vancouver, Canada, Proceedings, 2009, pp. 474–484.

Digital Library

[24]

C. Holler, K. Herzig, and A. Zeller, “Fuzzing with code fragments.” in USENIX Security Symposium, 2012, pp. 445–458.

Digital Library

[25]

V. Le, M. Afshari, and Z. Su, “Compiler validation via equivalence modulo inputs,” in ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’14, Edinburgh, United Kingdom - June 09 - 11, 2014, 2014, pp. 216–226. {Online}. Available:

Digital Library

[26]

C. Lidbury, A. Lascu, N. Chong, and A. F. Donaldson, “Many-core compiler fuzzing,” in Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, Portland, OR, USA, June 15-17, 2015, 2015, pp. 65–76. {Online}. Available:

Digital Library

[27]

B. Daniel, D. Dig, K. Garcia, and D. Marinov, “Automated testing of refactoring engines,” in European Software Engineering Conference and International Symposium on Foundations of Software Engineering (ESEC/FSE). ACM, 2007, pp. 185–194.

Digital Library

[28]

P. Godefroid, M. Y. Levin, and D. A. Molnar, “Automated whitebox fuzz testing,” in Network and Distributed System Security Symposium (NDSS), 2008.

[29]

P. Saxena, S. Hanna, P. Poosankam, and D. Song, “Flax: Systematic discovery of client-side validation vulnerabilities in rich web applications.” in NDSS, 2010.

[30]

S. Lekies, B. Stock, and M. Johns, “25 million flows later: Large-scale detection of DOM-based XSS,” in ACM Conference on Computer and Communications Security, 2013, pp. 1193–1204.

Digital Library

[31]

A. Zeller, “Isolating cause-effect chains from computer programs,” in Proceedings of the 10th ACM SIGSOFT symposium on Foundations of software engineering. ACM, 2002, pp. 1–10.

Digital Library

[32]

M. Renieres and S. P. Reiss, “Fault localization with nearest neighbor queries,” in Automated Software Engineering, 2003. Proceedings. 18th IEEE International Conference on. IEEE, 2003, pp. 30–39.

Digital Library

[33]

T. Janssen, R. Abreu, and A. J. van Gemund, “Zoltar: A toolset for automatic fault localization,” in Proceedings of the 2009 IEEE/ACM International Conference on Automated Software Engineering. IEEE Computer Society, 2009, pp. 662–664.

Digital Library

[34]

H. Agrawal, J. R. Horgan, S. London, and W. E. Wong, “Fault localization using execution slices and dataflow tests.” in ISSRE, vol. 95, 1995, pp. 143–151.

[35]

M. Weiser, “Program slicing,” in Proceedings of the 5th international conference on Software engineering. IEEE Press, 1981, pp. 439–449.

Digital Library

[36]

H. Agrawal and J. R. Horgan, “Dynamic program slicing,” in ACM SIGPLAN Notices, vol. 25, no. 6. ACM, 1990, pp. 246–256.

Digital Library

[37]

N. Gupta, H. He, X. Zhang, and R. Gupta, “Locating faulty code using failure-inducing chops,” in Proceedings of the 20th IEEE/ACM international Conference on Automated software engineering. ACM, 2005, pp. 263–272.

Digital Library

[38]

M. Burger and A. Zeller, “Minimizing reproduction of software failures,” in Proceedings of the 2011 International Symposium on Software Testing and Analysis. ACM, 2011, pp. 221–231.

Digital Library

[39]

M. Hammoudi, B. Burg, G. Bae, and G. Rothermel, “On the use of delta debugging to reduce recordings and facilitate debugging of web applications,” in Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering. ACM, 2015, pp. 333–344.

Digital Library

[40]

M. Jose and R. Majumdar, “Cause clue clauses: error localization using maximum satisfiability,” ACM SIGPLAN Notices, vol. 46, no. 6, pp. 437–446, 2011.

Digital Library

[41]

A. Leitner, M. Oriol, A. Zeller, I. Ciupa, and B. Meyer, “Efficient unit test case minimization,” in Proceedings of the twenty-second IEEE/ACM international conference on Automated software engineering. ACM, 2007, pp. 417–420.

Digital Library

[42]

Y. Lei and J. H. Andrews, “Minimization of randomized unit test cases,” in Software Reliability Engineering, 2005. ISSRE 2005. 16th IEEE International Symposium on. IEEE, 2005, pp. 10–pp.

Digital Library

[43]

A. Groce, M. A. Alipour, C. Zhang, Y. Chen, and J. Regehr, “Cause reduction for quick testing,” in Software Testing, Verification and Validation (ICST), 2014 IEEE Seventh International Conference on. IEEE, 2014, pp. 243–252.

Digital Library

[44]

M. A. Alipour, A. Shi, R. Gopinath, D. Marinov, and A. Groce, “Evaluating non-adequate test-case reduction,” in Automated Software Engineering (ASE), 2016 31st IEEE/ACM International Conference on. IEEE, 2016, pp. 16–26.

Digital Library

[45]

M. Höschele and A. Zeller, “Mining input grammars from dynamic taints,” in Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering, ASE 2016, Singapore, September 3-7, 2016, 2016, pp. 720–725.

Digital Library

[46]

O. Bastani, R. Sharma, A. Aiken, and P. Liang, “Synthesizing program input grammars,” in PLDI, 2017.

Digital Library

Cited By

Tian JZhang MXu ZTian YDong YSun CChandra SBlincoe KTonella P(2023)Ad Hoc Syntax-Guided Program ReductionProceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3611643.3613101(2137-2141)Online publication date: 30-Nov-2023
https://dl.acm.org/doi/10.1145/3611643.3613101
Kallingal Joshy AChen XSteenhoek BLe WCadar CZhang X(2021)Validating static warnings via testing code fragmentsProceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3460319.3464832(540-552)Online publication date: 11-Jul-2021
https://dl.acm.org/doi/10.1145/3460319.3464832
Chen JPatra JPradel MXiong YZhang HHao DZhang L(2020)A Survey of Compiler TestingACM Computing Surveys10.1145/336356253:1(1-36)Online publication date: 6-Feb-2020
https://dl.acm.org/doi/10.1145/3363562
Show More Cited By

Recommendations

Evaluating String Distance Metrics for Reducing Automatically Generated Test Suites
AST '24: Proceedings of the 5th ACM/IEEE International Conference on Automation of Software Test (AST 2024)

Regression test suites can have a large number of test cases, especially automatically generated ones, and tend to grow in size, making it costly to run the entire test suite. Test suite reduction aims to eliminate some test cases to reduce the test ...
Reducing Test Inputs Using Information Partitions
CAV '09: Proceedings of the 21st International Conference on Computer Aided Verification

Automatic symbolic techniques to generate test inputs, for example, through concolic execution, suffer from <em>path explosion</em> : the number of paths to be symbolically solved for grows exponentially with the number of inputs. In many applications ...
Reducing Test Runtime by Transforming Test Fixtures
ASE '24: Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering

Software testing is a fundamental part of software development, but the cost of running tests can be high. Existing approaches to speed up testing such as test-suite reduction or regression test selection aim to run only a subset of tests from the full ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Guide Proceedings

ASE '17: Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering

October 2017

1033 pages

ISBN:9781538626849

General Chair:
Grigore Rosu
University of Illinois at Urbana-Champaign, USA
,
Program Chairs:
Massimiliano Di Penta
University of Sannio, Italy
,
Tien N. Nguyen
University of Texas at Dallas, USA

Sponsors

Publisher

IEEE Press

Publication History

Published: 30 October 2017

Qualifiers

Article

Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

10
Total Citations
View Citations
222
Total Downloads

Downloads (Last 12 months)49
Downloads (Last 6 weeks)2

Reflects downloads up to 05 Jan 2025

Other Metrics

View Author Metrics

Citations

Cited By

Tian JZhang MXu ZTian YDong YSun CChandra SBlincoe KTonella P(2023)Ad Hoc Syntax-Guided Program ReductionProceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3611643.3613101(2137-2141)Online publication date: 30-Nov-2023
https://dl.acm.org/doi/10.1145/3611643.3613101
Kallingal Joshy AChen XSteenhoek BLe WCadar CZhang X(2021)Validating static warnings via testing code fragmentsProceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3460319.3464832(540-552)Online publication date: 11-Jul-2021
https://dl.acm.org/doi/10.1145/3460319.3464832
Chen JPatra JPradel MXiong YZhang HHao DZhang L(2020)A Survey of Compiler TestingACM Computing Surveys10.1145/336356253:1(1-36)Online publication date: 6-Feb-2020
https://dl.acm.org/doi/10.1145/3363562
Chen JMa HZhang LGrundy JLe Goues CLo D(2020)Enhanced compiler bug isolation via memoized searchProceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering10.1145/3324884.3416570(78-89)Online publication date: 21-Dec-2020
https://dl.acm.org/doi/10.1145/3324884.3416570
Dutta SZhang WHuang ZMisailovic SDumas MPfahl DApel SRusso A(2019)Storm: program reduction for testing and debugging probabilistic programming systemsProceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3338906.3338972(729-739)Online publication date: 12-Aug-2019
https://dl.acm.org/doi/10.1145/3338906.3338972
Chen JHan JSun PZhang LHao DZhang LDumas MPfahl DApel SRusso A(2019)Compiler bug isolation via effective witness test program generationProceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3338906.3338957(223-234)Online publication date: 12-Aug-2019
https://dl.acm.org/doi/10.1145/3338906.3338957
Tolksdorf SLehmann DPradel MZhang DMøller A(2019)Interactive metamorphic testing of debuggersProceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3293882.3330567(273-283)Online publication date: 10-Jul-2019
https://dl.acm.org/doi/10.1145/3293882.3330567
Kiss ÁHodován RGyimóthy TPrasetya WVos TGetir S(2018)HDDr: a recursive variant of the hierarchical Delta debugging algorithmProceedings of the 9th ACM SIGSOFT International Workshop on Automating TEST Case Design, Selection, and Evaluation10.1145/3278186.3278189(16-22)Online publication date: 5-Nov-2018
https://dl.acm.org/doi/10.1145/3278186.3278189
Heo KLee WPashakhanloo PNaik MLie DMannan MBackes MWang X(2018)Effective Program Debloating via Reinforcement LearningProceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security10.1145/3243734.3243838(380-394)Online publication date: 15-Oct-2018
https://dl.acm.org/doi/10.1145/3243734.3243838
Sun CLi YZhang QGu TSu ZChaudron MCrnkovic IChechik MHarman M(2018)PersesProceedings of the 40th International Conference on Software Engineering10.1145/3180155.3180236(361-371)Online publication date: 27-May-2018
https://dl.acm.org/doi/10.1145/3180155.3180236

View Options

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Media

Figures

Other

Tables

View Table of Contents