research-article

HGIVul: Detecting inter-procedural vulnerabilities based on hypergraph convolution

Published: 01 August 2023

Abstract

Context:

Detecting source code vulnerabilities is one way to block cyber attacks at an early stage. Vulnerability-triggering code typically involves one or more procedures, yet current research focuses mainly on the code within a single procedure. Lacking a comprehensive analysis of the multiple procedures related to a vulnerability, current methods suffer from imbalanced false-positive and false-negative rates, especially when detecting inter-procedural vulnerabilities.

Objective:

This paper proposes HGIVul, an inter-procedural vulnerability detection method for source code based on hypergraph convolution. The key idea of HGIVul is to derive syntactic-semantic characteristics from multiple procedures in a suitable code information space, which yields more balanced detection.

Methods:

First, the potential vulnerability-related code trace across multiple procedures is located via the static analyzer Infer. Then, HGIVul reconstructs a soft inter-procedural control flow graph (ICFG) from the trace to restore the complex relationships among code in multiple procedures. Next, HGIVul performs multi-level graph convolution on the soft ICFG to capture holistic code characteristics across the procedures. Finally, a classifier is applied to the extracted code features for vulnerability detection.
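
The abstract does not state the exact convolution layer, so the following is only a minimal sketch of one hypergraph convolution step, assuming the commonly used form X' = Dv^(-1/2) H W De^(-1) H^T Dv^(-1/2) X Θ from the hypergraph neural network literature. The function name `hypergraph_conv`, the toy incidence matrix (statements as nodes, procedures as hyperedges), and the feature values are all hypothetical and are not taken from HGIVul.

```python
import math

def hypergraph_conv(X, H, theta, edge_w=None):
    """One hypergraph convolution step (assumed standard form, not HGIVul's own layer).

    X:      n x d node features (list of lists)
    H:      n x m incidence matrix, H[v][e] = 1 if node v belongs to hyperedge e
    theta:  d x k projection matrix (learned in a real model, fixed here)
    edge_w: optional list of m hyperedge weights (defaults to all ones)
    """
    n, m, d, k = len(H), len(H[0]), len(X[0]), len(theta[0])
    w = edge_w or [1.0] * m
    # Node degree: weighted number of hyperedges containing each node.
    dv = [sum(w[e] * H[v][e] for e in range(m)) for v in range(n)]
    # Hyperedge degree: number of nodes in each hyperedge.
    de = [sum(H[v][e] for v in range(n)) for e in range(m)]

    def inv_sqrt(x):
        return 1.0 / math.sqrt(x) if x > 0 else 0.0

    # Step 1: scale node features by Dv^(-1/2).
    Xs = [[X[v][j] * inv_sqrt(dv[v]) for j in range(d)] for v in range(n)]
    # Step 2: gather nodes into hyperedges, W De^(-1) H^T Xs.
    E = [[(w[e] / de[e] if de[e] > 0 else 0.0)
          * sum(H[v][e] * Xs[v][j] for v in range(n))
          for j in range(d)] for e in range(m)]
    # Step 3: scatter hyperedge features back to nodes, Dv^(-1/2) H E.
    Y = [[inv_sqrt(dv[v]) * sum(H[v][e] * E[e][j] for e in range(m))
          for j in range(d)] for v in range(n)]
    # Step 4: linear projection with theta.
    return [[sum(Y[v][j] * theta[j][c] for j in range(d)) for c in range(k)]
            for v in range(n)]

# Hypothetical toy input: 4 statements, 2 procedures; statement 2 is shared.
H = [[1, 0], [1, 0], [1, 1], [0, 1]]
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
theta = [[1.0, 0.0], [0.0, 1.0]]  # identity projection, for illustration only
features = hypergraph_conv(X, H, theta)
```

In this toy input, statement 3 starts with zero features and receives information only through the hyperedge it shares with statement 2, which is the kind of cross-procedure mixing a hyperedge allows in a single step.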

Results:

The experimental results show that HGIVul outperforms the compared methods in detecting vulnerabilities and identifying vulnerability types, with F1-measures of 66.33% and 79.58% for detection and identification, respectively. Moreover, the cross-project experiment indicates that HGIVul has better detection ability.
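
For readers unfamiliar with the metric, the F1-measure is the harmonic mean of precision and recall, so it drops when false positives and false negatives are imbalanced. The sketch below only illustrates that definition; the counts are hypothetical and are not results from the paper.

```python
def f1_score(tp, fp, fn):
    """F1-measure from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: a balanced detector versus one that misses many bugs.
balanced = f1_score(tp=60, fp=40, fn=40)  # precision and recall are equal
skewed = f1_score(tp=30, fp=0, fn=70)     # perfect precision, low recall
```

With these made-up counts the skewed detector scores lower than the balanced one even though it never raises a false alarm.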

Conclusion:

The proposed HGIVul achieves more balanced detection performance than the related state-of-the-art methods, which shows that fusing syntactic–semantic information from multiple procedures benefits inter-procedural vulnerability detection. In addition, the results on five real-world projects indicate that HGIVul is feasible for detection in practice.

Cited By

  • (2024) A Full-Lifecycle Malicious Code Detection Scheme Based on RASP and Random Forest, Advanced Intelligent Computing Technology and Applications, pp. 281-293, doi:10.1007/978-981-97-5666-7_24. Online publication date: 5-Aug-2024
  • (2023) Learning to Detect Memory-related Vulnerabilities, ACM Transactions on Software Engineering and Methodology 33:2, pp. 1-35, doi:10.1145/3624744. Online publication date: 23-Dec-2023

Published In

Information and Software Technology, Volume 160, Issue C
Aug 2023
251 pages

Publisher

Butterworth-Heinemann

United States

Author Tags

  1. Vulnerability detection
  2. Inter-procedural vulnerability
  3. Hypergraph neural network
  4. Software security engineering
  5. Static analysis

Qualifiers

  • Research-article
