Ruhr-Uni-Bochum

Andrei Sabelfeld (Chalmers University of Technology)

"Black Ostrich: Web Application Scanning with String Solvers"


When: 9 May 2023, 2:00 p.m.
Where: Building TZR ("MB"), Level 1, Room S-MO-104, Universitätsstraße 142, 44799 Bochum
Online participation: Zoom webinar

Topic: Black Ostrich: Web Application Scanning with String Solvers

Abstract: Securing web applications remains a pressing challenge. Unfortunately, the state of the art in web crawling and security scanning still falls short of deep crawling. A major roadblock is the crawlers’ limited ability to pass input validation checks when web applications require data in a certain format, such as an email address, phone number, or zip code. This talk presents Black Ostrich, a principled approach to deep web crawling and scanning. The key idea is to equip web crawling with string constraint-solving capabilities to dynamically infer suitable inputs from regular expression patterns in web applications and thereby pass input validation checks. To enable this use of constraint solvers, we develop new automata-based techniques to handle complex real-world regular expressions, including support for the relevant features of ECMA JavaScript regular expressions. We implement our approach by extending and combining the Ostrich constraint solver with the Black Widow web crawler. We evaluate Black Ostrich on a set of 8,820 unique validation patterns gathered from over 21,667,978 forms from a combination of the July 2021 Common Crawl and the Tranco top 100K. For these forms and reconstructions of input elements corresponding to the patterns, we demonstrate that Black Ostrich achieves 99% coverage of the form validations, compared to an average of 36% for the state-of-the-art scanners. Moreover, out of the 66,377 domains using these patterns, we solve all patterns on 66,309 (99%), while the combined efforts of the other scanners cover 52,632 (79%). We further show that our approach can boost coverage by evaluating it on three open-source applications. Our empirical studies include a study of email validation patterns, which also demonstrates that our regular expression encoding is practical: 213 (26%) of the 825 email validation patterns found liberally admit XSS injection payloads.
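To make the idea in the abstract concrete, here is a minimal Python sketch of the underlying problem: finding a string that passes a regular-expression validation check, and checking whether a liberal email pattern lets an XSS payload through. This is a naive brute-force stand-in and not the automata-based solving used by Black Ostrich. The helper names (`find_input`, `admits`), the tiny alphabet, the two email patterns, and the payload are all assumptions made for illustration, and Python's `re` only approximates ECMA JavaScript regular expression semantics.

```python
import itertools
import re
from typing import Optional

# Hypothetical illustration only: a brute-force search, NOT the
# automata-based string solving described in the abstract.
ALPHABET = "0a@."  # tiny assumed alphabet to keep the search space small


def find_input(pattern: str, max_len: int = 6) -> Optional[str]:
    """Return the first string (shortest first) that fully matches pattern."""
    compiled = re.compile(pattern)
    for length in range(max_len + 1):
        for chars in itertools.product(ALPHABET, repeat=length):
            candidate = "".join(chars)
            # An HTML `pattern` attribute must match the whole value,
            # which fullmatch approximates here.
            if compiled.fullmatch(candidate):
                return candidate
    return None  # nothing found within the length bound


def admits(pattern: str, payload: str) -> bool:
    """Check whether a validation pattern lets a given payload through."""
    return re.fullmatch(pattern, payload) is not None


# Assumed example patterns and payload, not taken from the study's data set.
strict_email = r"[a-z0-9]+@[a-z0-9]+\.[a-z]+"
liberal_email = r".+@.+\..+"
payload = "<script>alert(1)</script>@a.b"

print(find_input(r"[0-9]{5}"))         # 00000
print(find_input(strict_email))        # 0@0.a
print(admits(liberal_email, payload))  # True
print(admits(strict_email, payload))   # False
```

Brute force like this grows exponentially with input length and alphabet size, which is one reason a constraint solver is attractive for real-world patterns.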

Joint work with Benjamin Eriksson, Amanda Stjerna, Riccardo De Masellis, and Philipp Ruemmer, to appear in the ACM Conference on Computer and Communications Security (CCS) 2023.

Biography: Andrei Sabelfeld is a Professor at Chalmers University of Technology. Before joining Chalmers as faculty, he was a Research Associate at Cornell University in Ithaca, NY, USA. His research spans foundations and applications across a range of topics in cybersecurity and privacy. He has received a number of prestigious prizes and awards from ERC, SSF, VR, WASP, Chalmers, Google, Amazon, and Meta (Facebook). Today, he leads a group of researchers at Chalmers engaged in a number of internationally visible projects on software security, web security, IoT security, security foundations, and applied cryptography.

To the YouTube video