[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.5555/3155562.3155669guideproceedingsArticle/Chapter ViewAbstractPublication PagesaseConference Proceedingsconference-collections
Article
Free access

Automatically reducing tree-structured test inputs

Published: 30 October 2017 Publication History

Abstract

Reducing the test input given to a program while preserving some property of interest is important, e.g., to localize faults or to reduce test suites. The well-known delta debugging algorithm and its derivatives automate this task by repeatedly reducing a given input. Unfortunately, these approaches are limited to blindly removing parts of the input and cannot reduce the input by restructuring it. This paper presents the Generalized Tree Reduction (GTR) algorithm, an effective and efficient technique to reduce arbitrary test inputs that can be represented as a tree, such as program code, PDF files, and XML documents. The algorithm combines tree transformations with delta debugging and a greedy backtracking algorithm. To reduce the size of the considered search space, the approach automatically specializes the tree transformations applied by the algorithm based on examples of input trees. We evaluate GTR by reducing Python files that cause interpreter crashes, JavaScript files that cause browser inconsistencies, PDF documents with malicious content, and XML files used to tests an XML validator. The GTR algorithm reduces the trees of these files to 45.3%, 3.6%, 44.2%, and 1.3% of the original size, respectively, outperforming both delta debugging and another state-of-the-art algorithm.

References

[1]
I. Vessey, “Expertise in debugging computer programs: A process analysis,” International Journal of Man-Machine Studies, vol. 23, no. 5, pp. 459–494, 1985.
[2]
A. Zeller and R. Hildebrandt, “Simplifying and isolating failure-inducing input,” IEEE Transactions on Software Engineering, vol. 28, no. 2, pp. 183–200, 2002.
[3]
G. Misherghi and Z. Su, “HDD: Hierarchical delta debugging,” in Proceedings of the 28th international conference on Software engineering. ACM, 2006, pp. 142–151.
[4]
J. Regehr, Y. Chen, P. Cuoq, E. Eide, C. Ellison, and X. Yang, “Testcase reduction for C compiler bugs,” in ACM SIGPLAN Notices, vol. 47, no. 6. ACM, 2012, pp. 335–346.
[5]
Python Software Foundation. (2017) Python AST library. {Online}. Available: https://docs.python.org/2/library/ast.html
[6]
J. Patra and M. Pradel, “Learning to fuzz: Application-independent fuzz testing with probabilistic, generative models of input data,” TU Darmstadt, Department of Computer Science, Tech. Rep. TUD-CS- 2016-14664, Nov 2016.
[7]
A. Hidayat. (2017) Esprima. {Online}. Available: http://esprima.org/
[8]
Multiple authors. (2016) Learning from ”Big Code”. {Online}. Available: http://learnbigcode.github.io/datasets/
[9]
M. Parkour. (2011) Contagio malware dump. {Online}. Available: http://contagiodump.blogspot.de/2010/08/ malicious-documents-archive-for.html
[10]
Y. Shinyama. (2014) PDFMiner. {Online}. Available: https://euske. github.io/pdfminer/
[11]
iText Group NV. (2017) iText PDF. {Online}. Available: http: //itextpdf.com/
[12]
F. Schmitt, J. Gassen, and E. Gerhards-Padilla, “PDF Scrutinizer: Detecting JavaScript-based attacks in PDF documents,” in Privacy, Security and Trust (PST), 2012 Tenth Annual International Conference on. IEEE, 2012, pp. 104–111.
[13]
Hindawi. (2017) Hindawi XML Corpus. {Online}. Available: https: //www.hindawi.com/corpus/
[14]
xmllint. (2017) The XML C parser and toolkit of Gnome. {Online}. Available: http://xmlsoft.org/xmllint.html
[15]
gcov. (2017) gcov – a test coverage program. {Online}. Available: https://gcc.gnu.org/onlinedocs/gcc-4.3.4/gcc/Gcov.html
[16]
R. Hodován and A. Kiss, “Practical improvements to the minimizing delta debugging algorithm,” in Proceedings of the 11th International Joint Conference on Software Technologies, 2016, pp. 241–248.
[17]
R. Hodován and Á. Kiss, “Modernizing hierarchical delta debugging,” in Proceedings of the 7th International Workshop on Automating Test Case Design, Selection, and Evaluation. ACM, 2016, pp. 31–37.
[18]
Z. Lin and X. Zhang, “Deriving input syntactic structure from execution,” in Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of software engineering. ACM, 2008, pp. 83–93.
[19]
S. Artzi, A. Kiezun, J. Dolby, F. Tip, D. Dig, A. Paradkar, and M. D. Ernst, “Finding bugs in dynamic web applications,” in Proceedings of the 2008 international symposium on Software testing and analysis. ACM, 2008, pp. 261–272.
[20]
J. Clause and A. Orso, “Penumbra: automatically identifying failurerelevant inputs using dynamic tainting,” in Proceedings of the eighteenth international symposium on Software testing and analysis. ACM, 2009, pp. 249–260.
[21]
P. Oehlert, “Violating assumptions with fuzzing,” IEEE Security & Privacy, vol. 3, no. 2, pp. 58–62, 2005.
[22]
P. Godefroid, A. Kiezun, and M. Y. Levin, “Grammar-based whitebox fuzzing.” in PLDI, 2008, pp. 206–215.
[23]
V. Ganesh, T. Leek, and M. C. Rinard, “Taint-based directed whitebox fuzzing,” in 31st International Conference on Software Engineering, ICSE 2009, May 16-24, 2009, Vancouver, Canada, Proceedings, 2009, pp. 474–484.
[24]
C. Holler, K. Herzig, and A. Zeller, “Fuzzing with code fragments.” in USENIX Security Symposium, 2012, pp. 445–458.
[25]
V. Le, M. Afshari, and Z. Su, “Compiler validation via equivalence modulo inputs,” in ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’14, Edinburgh, United Kingdom - June 09 - 11, 2014, 2014, pp. 216–226. {Online}. Available:
[26]
C. Lidbury, A. Lascu, N. Chong, and A. F. Donaldson, “Many-core compiler fuzzing,” in Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, Portland, OR, USA, June 15-17, 2015, 2015, pp. 65–76. {Online}. Available:
[27]
B. Daniel, D. Dig, K. Garcia, and D. Marinov, “Automated testing of refactoring engines,” in European Software Engineering Conference and International Symposium on Foundations of Software Engineering (ESEC/FSE). ACM, 2007, pp. 185–194.
[28]
P. Godefroid, M. Y. Levin, and D. A. Molnar, “Automated whitebox fuzz testing,” in Network and Distributed System Security Symposium (NDSS), 2008.
[29]
P. Saxena, S. Hanna, P. Poosankam, and D. Song, “Flax: Systematic discovery of client-side validation vulnerabilities in rich web applications.” in NDSS, 2010.
[30]
S. Lekies, B. Stock, and M. Johns, “25 million flows later: Large-scale detection of DOM-based XSS,” in ACM Conference on Computer and Communications Security, 2013, pp. 1193–1204.
[31]
A. Zeller, “Isolating cause-effect chains from computer programs,” in Proceedings of the 10th ACM SIGSOFT symposium on Foundations of software engineering. ACM, 2002, pp. 1–10.
[32]
M. Renieres and S. P. Reiss, “Fault localization with nearest neighbor queries,” in Automated Software Engineering, 2003. Proceedings. 18th IEEE International Conference on. IEEE, 2003, pp. 30–39.
[33]
T. Janssen, R. Abreu, and A. J. van Gemund, “Zoltar: A toolset for automatic fault localization,” in Proceedings of the 2009 IEEE/ACM International Conference on Automated Software Engineering. IEEE Computer Society, 2009, pp. 662–664.
[34]
H. Agrawal, J. R. Horgan, S. London, and W. E. Wong, “Fault localization using execution slices and dataflow tests.” in ISSRE, vol. 95, 1995, pp. 143–151.
[35]
M. Weiser, “Program slicing,” in Proceedings of the 5th international conference on Software engineering. IEEE Press, 1981, pp. 439–449.
[36]
H. Agrawal and J. R. Horgan, “Dynamic program slicing,” in ACM SIGPLAN Notices, vol. 25, no. 6. ACM, 1990, pp. 246–256.
[37]
N. Gupta, H. He, X. Zhang, and R. Gupta, “Locating faulty code using failure-inducing chops,” in Proceedings of the 20th IEEE/ACM international Conference on Automated software engineering. ACM, 2005, pp. 263–272.
[38]
M. Burger and A. Zeller, “Minimizing reproduction of software failures,” in Proceedings of the 2011 International Symposium on Software Testing and Analysis. ACM, 2011, pp. 221–231.
[39]
M. Hammoudi, B. Burg, G. Bae, and G. Rothermel, “On the use of delta debugging to reduce recordings and facilitate debugging of web applications,” in Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering. ACM, 2015, pp. 333–344.
[40]
M. Jose and R. Majumdar, “Cause clue clauses: error localization using maximum satisfiability,” ACM SIGPLAN Notices, vol. 46, no. 6, pp. 437–446, 2011.
[41]
A. Leitner, M. Oriol, A. Zeller, I. Ciupa, and B. Meyer, “Efficient unit test case minimization,” in Proceedings of the twenty-second IEEE/ACM international conference on Automated software engineering. ACM, 2007, pp. 417–420.
[42]
Y. Lei and J. H. Andrews, “Minimization of randomized unit test cases,” in Software Reliability Engineering, 2005. ISSRE 2005. 16th IEEE International Symposium on. IEEE, 2005, pp. 10–pp.
[43]
A. Groce, M. A. Alipour, C. Zhang, Y. Chen, and J. Regehr, “Cause reduction for quick testing,” in Software Testing, Verification and Validation (ICST), 2014 IEEE Seventh International Conference on. IEEE, 2014, pp. 243–252.
[44]
M. A. Alipour, A. Shi, R. Gopinath, D. Marinov, and A. Groce, “Evaluating non-adequate test-case reduction,” in Automated Software Engineering (ASE), 2016 31st IEEE/ACM International Conference on. IEEE, 2016, pp. 16–26.
[45]
M. Höschele and A. Zeller, “Mining input grammars from dynamic taints,” in Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering, ASE 2016, Singapore, September 3-7, 2016, 2016, pp. 720–725.
[46]
O. Bastani, R. Sharma, A. Aiken, and P. Liang, “Synthesizing program input grammars,” in PLDI, 2017.

Cited By

View all
  • (2023)Ad Hoc Syntax-Guided Program ReductionProceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3611643.3613101(2137-2141)Online publication date: 30-Nov-2023
  • (2021)Validating static warnings via testing code fragmentsProceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3460319.3464832(540-552)Online publication date: 11-Jul-2021
  • (2020)A Survey of Compiler TestingACM Computing Surveys10.1145/336356253:1(1-36)Online publication date: 6-Feb-2020
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Guide Proceedings
ASE '17: Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering
October 2017
1033 pages
ISBN:9781538626849

Sponsors

Publisher

IEEE Press

Publication History

Published: 30 October 2017

Qualifiers

  • Article

Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)49
  • Downloads (Last 6 weeks)2
Reflects downloads up to 05 Jan 2025

Other Metrics

Citations

Cited By

View all
  • (2023)Ad Hoc Syntax-Guided Program ReductionProceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3611643.3613101(2137-2141)Online publication date: 30-Nov-2023
  • (2021)Validating static warnings via testing code fragmentsProceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3460319.3464832(540-552)Online publication date: 11-Jul-2021
  • (2020)A Survey of Compiler TestingACM Computing Surveys10.1145/336356253:1(1-36)Online publication date: 6-Feb-2020
  • (2020)Enhanced compiler bug isolation via memoized searchProceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering10.1145/3324884.3416570(78-89)Online publication date: 21-Dec-2020
  • (2019)Storm: program reduction for testing and debugging probabilistic programming systemsProceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3338906.3338972(729-739)Online publication date: 12-Aug-2019
  • (2019)Compiler bug isolation via effective witness test program generationProceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3338906.3338957(223-234)Online publication date: 12-Aug-2019
  • (2019)Interactive metamorphic testing of debuggersProceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3293882.3330567(273-283)Online publication date: 10-Jul-2019
  • (2018)HDDr: a recursive variant of the hierarchical Delta debugging algorithmProceedings of the 9th ACM SIGSOFT International Workshop on Automating TEST Case Design, Selection, and Evaluation10.1145/3278186.3278189(16-22)Online publication date: 5-Nov-2018
  • (2018)Effective Program Debloating via Reinforcement LearningProceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security10.1145/3243734.3243838(380-394)Online publication date: 15-Oct-2018
  • (2018)PersesProceedings of the 40th International Conference on Software Engineering10.1145/3180155.3180236(361-371)Online publication date: 27-May-2018

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media