FP-tracer: Fine-grained Browser Fingerprinting Detection via Taint-tracking and Multi-level Entropy-based Thresholds
2024Conference / Journal
Authors
Thomas Barber Daniele Antonioli Luca Compagna Martin Johns David Klein Angel Cuevas Rumin Ruben Cuevas Rumin Miguel Bermejo Lukas Hock Soumaya Boussaha
Research Hub
								
									Research Hub C: Sichere Systeme
									
								
							
Research Challenges
										
											RC 7: Building Secure Systems
										
											RC 8: Security with Untrusted Components
										
									
Abstract
Browser fingerprinting is an effective technique to track web users by building a fingerprint from their browser attributes. It is also stealthy because the tracker uses legitimate JavaScript API calls offered by the browser engine, which can be obfuscated before they are sent to a (third-party) server. Current browser fingerprinting methodologies employ coarse-grained collection and classification techniques, such as binary classification of fingerprinters based on the number of non-obfuscated exfiltrated attributes. As a result, they produce inconsistent findings. Meanwhile, the privacy of millions of web users is at risk daily. We address this gap by presenting FP-tracer, a novel methodology to detect and classify browser fingerprinters based on dynamic taint tracking and joint entropy classification. Our methodology enables detecting first- and third-party fingerprinters even when they use obfuscation by tainting attributes, propagating them, and logging when they are leaked (via 62 sources and 25 sinks). Moreover, it discriminates the invasiveness of fingerprinting activities, even from the same service, by measuring the joint entropy of the collected attributes and clustering them. We implement FP-tracer by extending Foxhound, a privacy-oriented Firefox fork with numeric type tainting, more taint tracking sources and sinks, support for multiple sources, and better logging capabilities. We embed our implementation in our automated crawling infrastructure, which is capable of testing websites in parallel using programmable and reproducible logic. We will open-source our implementation. We evaluate FP-tracer by performing a large-scale crawl over the Tranco Top 100K, and detect, amongst others, audio, canvas, and storage fingerprinting on the web. Among others, we find high fingerprinting activities in 8% of domains, with more moderate activity reaching 75%. Notably, fingerprinting is almost five times more likely to be performed by third-party scripts for high activity levels. In addition, we measure that the most severe category of fingerprinting obfuscates 46% of transmitted attributes, and 38% of fingerprinters involve two or more domains. Finally, we find that existing consent banners do not provide an effective defense against browser fingerprinting.