Research article
DOI: 10.1145/3475716.3475781

An Empirical Study of Rule-Based and Learning-Based Approaches for Static Application Security Testing

Published: 11 October 2021

Abstract

Background: Static Application Security Testing (SAST) tools aim to help developers detect security issues in source code, typically by scanning it against rule-based checks for known vulnerability patterns. However, significant shortcomings of these tools, notably high false positive rates, have made learning-based Software Vulnerability Prediction (SVP) models an increasingly popular alternative. Aims: Despite the similar objectives of the two approaches, their comparative value remains unexplored. We provide an empirical analysis of SAST tools and SVP models to identify their relative capabilities for source code security analysis. Method: We evaluate the detection and assessment performance of several common SAST tools and SVP models on a variety of vulnerability datasets. We further assess the viability and potential benefits of combining the two approaches. Results: SAST tools and SVP models provide similar detection capabilities, but SVP models exhibit better overall performance for both detection and assessment. Unifying the two approaches is difficult because they exhibit little synergy. Conclusions: Our study yields 12 main findings on the capabilities and synergy of the two approaches, from which we derive recommendations for their use and improvement.
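The rule-based approach the abstract describes can be made concrete with a short sketch. Token-based SAST tools such as Flawfinder and RATS essentially match each source line against a table of patterns for known-dangerous constructs and report a weakness category for every hit. Everything below (the rule table, the CWE mappings, and the C snippet being scanned) is a hypothetical illustration, not rules or data from the study.

```python
import re

# Hypothetical rule table in the spirit of token-based SAST tools such as
# Flawfinder or RATS: each rule maps a regex for a dangerous C construct
# to a CWE category. Illustrative only; not the rules used in the study.
RULES = {
    "CWE-120": re.compile(r"\b(strcpy|strcat|sprintf|gets)\s*\("),  # buffer overflow
    "CWE-78": re.compile(r"\bsystem\s*\("),                         # OS command injection
}

def scan(source: str) -> list[tuple[int, str]]:
    """Return (line_number, cwe_id) for every rule that matches a line."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for cwe, pattern in RULES.items():
            if pattern.search(line):
                findings.append((lineno, cwe))
    return findings

# A made-up C fragment to scan.
c_code = """\
char buf[16];
strcpy(buf, user_input);                     /* unbounded copy */
snprintf(buf, sizeof buf, "%s", user_input); /* bounded: not flagged */
system(command);                             /* shell injection risk */
"""

print(scan(c_code))  # [(2, 'CWE-120'), (4, 'CWE-78')]
```

Because the matching is purely lexical, a call such as `strcpy(dst, "ok")` that is provably safe would still be flagged; this context-blindness is the mechanism behind the high false positive rates the abstract attributes to SAST tools, whereas SVP models instead learn a classifier over code features to predict which components are vulnerable.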




Published In

ESEM '21: Proceedings of the 15th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)
October 2021, 368 pages
ISBN: 9781450386654
DOI: 10.1145/3475716

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. Machine Learning
  2. Security
  3. Static Application Security Testing

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Cyber Security Cooperative Research Centre

Conference

ESEM '21

Acceptance Rates

ESEM '21 paper acceptance rate: 24 of 124 submissions (19%)
Overall acceptance rate: 130 of 594 submissions (22%)


Cited By

  • (2024) Advances and challenges in artificial intelligence text generation. Frontiers of Information Technology & Electronic Engineering 25(1), 64-83. DOI: 10.1631/FITEE.2300410. Online: 8 Feb 2024
  • (2024) Do Developers Use Static Application Security Testing (SAST) Tools Straight Out of the Box? A Large-Scale Empirical Study. Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, 454-460. DOI: 10.1145/3674805.3690750. Online: 24 Oct 2024
  • (2024) Automatic Data Labeling for Software Vulnerability Prediction Models: How Far Are We? Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, 131-142. DOI: 10.1145/3674805.3686675. Online: 24 Oct 2024
  • (2024) Code Defect Detection Model with Multi-layer Bi-directional Long Short Term Memory based on Self-Attention Mechanism. Proceedings of the 2023 7th International Conference on Electronic Information Technology and Computer Engineering, 1656-1660. DOI: 10.1145/3650400.3650676. Online: 17 Apr 2024
  • (2024) Evaluating C/C++ Vulnerability Detectability of Query-Based Static Application Security Testing Tools. IEEE Transactions on Dependable and Secure Computing, 1-18. DOI: 10.1109/TDSC.2024.3354789. Online: 2024
  • (2024) Methods and Algorithms for Cross-Language Search of Source Code Fragments. 2024 International Conference on Information Technologies (InfoTech), 1-4. DOI: 10.1109/InfoTech63258.2024.10701403. Online: 11 Sep 2024
  • (2024) LLM-CloudSec: Large Language Model Empowered Automatic and Deep Vulnerability Analysis for Intelligent Clouds. IEEE INFOCOM 2024 Workshops (INFOCOM WKSHPS), 1-6. DOI: 10.1109/INFOCOMWKSHPS61880.2024.10620804. Online: 20 May 2024
  • (2024) Incivility detection in open source code review and issue discussions. Journal of Systems and Software 209(C). DOI: 10.1016/j.jss.2023.111935. Online: 14 Mar 2024
  • (2024) Securing tomorrow: a comprehensive survey on the synergy of Artificial Intelligence and information security. AI and Ethics. DOI: 10.1007/s43681-024-00529-z. Online: 30 Jul 2024
  • (2024) Seq2Seq-AFL: Fuzzing via sequence-to-sequence model. International Journal of Machine Learning and Cybernetics 15(10), 4403-4421. DOI: 10.1007/s13042-024-02153-z. Online: 23 Apr 2024
