Comma Separated Vulnerabilities: Detecting Formula Injection in the Wild
2025, Conference / Journal
Authors
Louis Bettels, David Klein, Martin Johns, Manuel Karl
Research Hub
Research Hub C: Secure Systems
Research Challenges
RC 8: Security with Untrusted Components
Abstract
Comma-Separated Values (CSV) is one of the premier data exchange formats due to its simplicity and software independence. When people want to analyze the contained data, they typically import the CSV file into a spreadsheet application, such as Microsoft Excel. Spreadsheet applications are used across many sensitive industries and government sectors for financial, supply chain, or human resources management tasks.
In this work, we investigate the prevalence of formula injection, an overlooked security risk. This vulnerability class abuses the lack of separation between data and text in the CSV format to inject malicious formulas that are evaluated on import. Consequences of such an attack range from data exfiltration to remote code execution. To assess the severity of this threat, we first analyzed eight spreadsheet applications for formulas usable for nefarious purposes and four libraries for their provided security protections, of which there are none. This lack of security mechanisms means applications have to actively defend against formula injection. To determine whether they do so, and to study the prevalence of formula injection vulnerabilities in open-source Java applications, we propose a static analysis tool, CSVScan, that detects user-controlled input reaching CSV exports.
We uncover eight applications containing code patterns at risk of formula injection. Out of those, four are vulnerable in realistic scenarios, allowing unprivileged users to attack users with higher privileges.
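To illustrate what "actively defending" against formula injection can look like in a Java application, the following is a minimal sketch of one commonly recommended mitigation: prefixing cells that begin with a formula trigger character with a single quote before export. It is not taken from the paper and is not how CSVScan works. The class name `CsvCellSanitizer`, its methods, and the chosen set of trigger characters (`=`, `+`, `-`, `@`, tab, carriage return) are assumptions for illustration, and the set may not cover every spreadsheet application.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of any library or of the paper's tooling.
final class CsvCellSanitizer {

    // Assumption: leading characters that spreadsheet applications
    // commonly interpret as the start of a formula.
    private static final String FORMULA_TRIGGERS = "=+-@\t\r";

    // Neutralizes a single cell value and returns it as a quoted CSV field.
    static String neutralize(String value) {
        if (value == null || value.isEmpty()) {
            return "";
        }
        String safe = value;
        // Prefix with a single quote so the cell is treated as text on import.
        if (FORMULA_TRIGGERS.indexOf(value.charAt(0)) >= 0) {
            safe = "'" + value;
        }
        // Always quote the field and double any embedded quotes.
        return "\"" + safe.replace("\"", "\"\"") + "\"";
    }

    // Builds one CSV row from untrusted cell values.
    static String toCsvRow(List<String> cells) {
        return cells.stream()
                .map(CsvCellSanitizer::neutralize)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // The second cell mimics user-controlled input carrying a formula.
        List<String> row = List.of("alice", "=HYPERLINK(\"http://attacker.example\")", "42");
        System.out.println(toCsvRow(row));
    }
}
```

One trade-off of this sketch is that legitimate values starting with a trigger character, such as negative numbers, are also prefixed and will be imported as text. Whether that is acceptable depends on the application.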