Ruhr-Uni-Bochum

Comma Separated Vulnerabilities: Detecting Formula Injection in the Wild

2025

Conference / Journal

Research Hub

Research Hub C: Secure Systems

Research Challenges

RC 8: Security with Untrusted Components

Abstract

Comma-Separated Values (CSV) is one of the premier data exchange formats due to its simplicity and software independence. Once humans want to analyze the contained data, they import the CSV file into a spreadsheet application, such as Microsoft Excel. Spreadsheet applications are used across many sensitive industries or government sectors for financial, supply chain, or human resources management tasks.
In this work, we investigate the prevalence of formula injection, an overlooked security risk. This vulnerability class abuses the lack of separation between data and text in the CSV format to inject malicious formulas that are evaluated on import. Consequences of such an attack range from data exfiltration to remote code execution. To assess the severity of this threat, we first analyzed eight spreadsheet applications for formulas usable for nefarious purposes and four libraries for their provided security protections, of which there are none. This lack of security mechanisms means applications have to actively defend against formula injection. To determine whether they do so, and to study the prevalence of formula injection vulnerabilities in open-source Java applications, we propose a static analysis tool, CSVScan, that detects user-controlled input reaching CSV exports.
We uncover eight applications containing code patterns at risk of formula injection. Of those, four are vulnerable in realistic scenarios, allowing unprivileged users to attack users with higher privileges.
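
To illustrate what actively defending against formula injection in a CSV export can look like, the following is a minimal Java sketch of a commonly recommended mitigation (in line with OWASP's CSV injection guidance). It is not taken from the paper or from CSVScan. The class name `CsvCellGuard` and its methods are hypothetical, and the set of trigger characters is an assumption that may not cover every spreadsheet application.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the paper or of any CSV library.
final class CsvCellGuard {
    // Assumed set of leading characters that spreadsheet applications
    // may interpret as the start of a formula.
    private static final String FORMULA_TRIGGERS = "=+-@\t\r";

    private CsvCellGuard() {}

    /** Prefixes a single quote so the value is imported as text, not as a formula. */
    public static String neutralize(String value) {
        if (value == null || value.isEmpty()) {
            return "";
        }
        return FORMULA_TRIGGERS.indexOf(value.charAt(0)) >= 0 ? "'" + value : value;
    }

    /** Neutralizes the value, then wraps it in double quotes and doubles embedded quotes. */
    public static String toCsvField(String value) {
        String safe = neutralize(value);
        return "\"" + safe.replace("\"", "\"\"") + "\"";
    }

    /** Joins the sanitized fields into one CSV row. */
    public static String toCsvRow(List<String> cells) {
        return cells.stream()
                .map(CsvCellGuard::toCsvField)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // The second cell stands in for user-controlled input reaching the export.
        System.out.println(toCsvRow(List.of("alice", "=HYPERLINK(\"http://attacker.example\")", "42")));
    }
}
```

One trade-off of this sketch: legitimate values such as negative numbers (for example `-5`) are also prefixed, so an application may prefer to validate typed columns separately rather than rely on prefixing alone.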

Tags

Program Analysis