DOI: 10.1145/3482909.3482911

On Using Decision Tree Coverage Criteria for Testing Machine Learning Models

Published: 12 October 2021

Abstract

Over the past decade, there has been growing interest in applying machine learning (ML) to a myriad of tasks. Owing to this interest, the adoption of ML-based systems has gone mainstream. This widespread adoption, however, poses new challenges for software testers, who must improve the quality and reliability of ML-based solutions. To cope with the challenges of testing ML-based systems, we propose novel test adequacy criteria based on decision tree models. Unlike the traditional approach to testing ML models, which relies on manual collection and labelling of data, our criteria leverage the internal structure of decision tree models to guide the selection of test inputs. Specifically, we introduce decision tree coverage (DTC) and boundary value analysis (BVA) as approaches to systematically guide the creation of effective test data that exercises key structural elements of a given decision tree model. To evaluate these criteria, we carried out an experiment using 12 datasets. We measured the effectiveness of test inputs in terms of the difference in the model's behavior between the test inputs and the training data. The experiment results indicate that our testing criteria can be used to guide the generation of effective test data.
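
To make these criteria concrete, the following Python sketch shows one plausible reading of them on a scikit-learn decision tree: DTC measured as the fraction of leaf nodes reached by a test set, and BVA realized by generating inputs just below and just above each internal node's split threshold. The dataset, the leaf-based coverage definition, and the epsilon perturbation are assumptions made for illustration, not details taken from the paper.

# Illustrative sketch only: leaf-based coverage and threshold-based boundary
# inputs are this example's assumptions, not the paper's exact definitions.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
tree = model.tree_

# Decision tree coverage (DTC) read as leaf coverage: the fraction of leaf
# nodes reached by at least one test input.
leaf_ids = np.where(tree.children_left == -1)[0]
reached = np.unique(model.apply(X_test))
dtc = len(np.intersect1d(reached, leaf_ids)) / len(leaf_ids)
print(f"Leaf coverage of the test set: {dtc:.0%}")

# Boundary value analysis (BVA): derive test inputs that sit just below and
# just above each internal node's split threshold on the tested feature.
eps = 1e-3
boundary_inputs = []
for node in np.where(tree.children_left != -1)[0]:
    feature, threshold = tree.feature[node], tree.threshold[node]
    for delta in (-eps, +eps):
        x = X_train.mean(axis=0)  # seed input; any representative point works
        x[feature] = threshold + delta
        boundary_inputs.append(x)
boundary_inputs = np.array(boundary_inputs)
print(f"Generated {len(boundary_inputs)} boundary-value test inputs")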

Published In

SAST '21: Proceedings of the 6th Brazilian Symposium on Systematic and Automated Software Testing
September 2021
63 pages
ISBN:9781450385039
DOI:10.1145/3482909
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Decision Tree
  2. Software Testing
  3. Testing Criterion

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • CAPES
  • FAPESP
  • CNPq

Conference

SAST '21

Acceptance Rates

Overall Acceptance Rate 45 of 92 submissions, 49%
