Broken Promises: Measuring Confounding Effects in Learning-based Vulnerability Discovery
2023
Conference / Journal
Authors
Konrad Rieck, Niklas Risse, Lukas Pirch, Martin Härterich, Tom Ganz, Erik Imgrund
Research Hub
Research Hub C: Secure Systems
Research Challenges
RC 9: Intelligent Security Systems
Abstract
Several learning-based vulnerability detection methods have been proposed to assist developers during the secure software development life-cycle. In particular, recent large transformer networks have shown remarkably high performance on various vulnerability detection and localization benchmarks. However, these models have also been shown to struggle to accurately locate the root cause of flaws and to generalize to out-of-distribution samples. In this work, we investigate this problem and identify spurious correlations as the main obstacle to transferability and generalization, resulting in performance losses of up to 30% for current models. We propose a method to measure the impact of these spurious correlations on learning models and to estimate their true, unbiased performance. We present several strategies to counteract the underlying confounding bias, but ultimately our work highlights the limitations of laboratory evaluations for complex learning tasks such as vulnerability discovery.
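The confounding effect the abstract describes can be illustrated with a deliberately simplified, hypothetical sketch (this is not the paper's method, and all names here are invented): a model that latches onto a spurious token correlated with the "vulnerable" label scores perfectly on a confounded benchmark, yet drops to chance level once the confounder is removed from the test set.

```python
# Hypothetical toy illustration of a spurious correlation (not the paper's
# measurement method): in the confounded data, vulnerable samples always
# contain an artifact token, so a shortcut "model" that only checks for
# that token appears highly accurate.
import random

random.seed(0)

SPURIOUS = "dbg_log"  # invented artifact token correlated with the label

def make_sample(vulnerable, confounded):
    # Minimal token bag standing in for a code snippet.
    tokens = ["memcpy" if vulnerable else "strncpy", "len", "buf"]
    if vulnerable and confounded:
        tokens.append(SPURIOUS)  # inject the spurious correlation
    return tokens, vulnerable

# Shortcut "model": predicts vulnerable iff the spurious token is present.
def predict(tokens):
    return SPURIOUS in tokens

def accuracy(data):
    return sum(predict(tokens) == label for tokens, label in data) / len(data)

biased_test = [make_sample(i % 2 == 0, confounded=True) for i in range(200)]
unbiased_test = [make_sample(i % 2 == 0, confounded=False) for i in range(200)]

print(accuracy(biased_test))    # 1.0 on the confounded benchmark
print(accuracy(unbiased_test))  # 0.5 once the shortcut disappears
```

The gap between the two accuracies is the kind of bias-induced performance loss the abstract refers to: an evaluation on confounded data overstates what the model has actually learned about vulnerabilities.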