Title: Explainable C/C++ vulnerability detection

Authors: Zhen Huang; Amy Aumpansub; Sameer Shaik

Addresses: School of Computing, DePaul University, Chicago, IL, USA (all authors)

Abstract: Detecting software vulnerabilities in C/C++ code is critical for ensuring software security. In this paper, we explore the use of neural networks to detect vulnerabilities using program slices that capture syntactic and semantic information. Our approach involves extracting vulnerability-related constructs such as API function calls, array usage, pointer usage, and arithmetic expressions, and converting them into numerical vectors. We experiment with two approaches: one where we randomly sample and downsample non-vulnerable data to balance the dataset, and another where we include all vulnerable data points and match them with an equal number of non-vulnerable points. Our model achieves high precision (90.7%), F1-score (93.5%), and Matthews correlation coefficient (MCC 86.8%), outperforming prior work in these metrics. We also use local interpretable model-agnostic explanations (LIME) to provide clear insights into why code segments are flagged as vulnerable. This approach improves both the accuracy and interpretability of vulnerability detection for developers.
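The abstract's pipeline of extracting vulnerability-related constructs (API calls, array usage, pointer usage, arithmetic) and converting them to numerical vectors could look roughly like the following minimal sketch. The construct list, regexes, and four-element feature layout are illustrative assumptions, not the paper's actual slicing or encoding scheme.

```python
import re

# Hypothetical simplification of the feature-extraction step: scan C source
# lines for vulnerability-related constructs and map each line to a small
# numeric vector [risky_api_calls, array_uses, pointer_uses, arithmetic_ops].
RISKY_APIS = {"strcpy", "strcat", "sprintf", "gets", "memcpy"}

def line_features(line):
    """Count four illustrative construct categories in one source line."""
    calls = re.findall(r"\b(\w+)\s*\(", line)          # function-call names
    pointer_uses = line.count("*") + line.count("->")  # dereference/member access
    stripped = line.replace("->", "")                  # don't count '->' as '-'
    return [
        sum(1 for c in calls if c in RISKY_APIS),      # risky API calls
        line.count("["),                               # array subscripts
        pointer_uses,                                  # pointer usage
        len(re.findall(r"\+\+|--|[+\-/%]=?", stripped)),  # arithmetic ops
    ]

code = [
    "strcpy(buf, input);",
    "int x = a[i] + b[i];",
    "p->next = *q;",
]
vectors = [line_features(l) for l in code]
# vectors == [[1, 0, 0, 0], [0, 2, 0, 1], [0, 0, 2, 0]]
```

A real implementation would operate on program slices rather than raw lines and feed the resulting vectors to a neural network, as the abstract describes.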

Keywords: software vulnerabilities; vulnerability detection; explainable AI; deep learning; neural networks; program analysis.

DOI: 10.1504/IJICS.2025.149450

International Journal of Information and Computer Security, 2025 Vol.28 No.3, pp.348 - 376

Received: 03 Dec 2023
Accepted: 16 Jan 2025

Published online: 31 Oct 2025
