FP-tracer: Fine-grained Browser Fingerprinting Detection via Taint-tracking and Entropy-based Thresholds

Authors: Soumaya Boussaha (SAP Security Research / EURECOM), Lukas Hock (SAP Security Research), Miguel Bermejo (UC3M), Ruben Cuevas Rumin (UC3M), Angel Cuevas Rumin (UC3M), David Klein (Technische Universität Braunschweig), Martin Johns (Technische Universität Braunschweig), Luca Compagna (SAP Security Research), Daniele Antonioli (EURECOM), Thomas Barber (SAP Security Research)

Volume: 2024
Issue: 3
Pages: 540–560
DOI: https://doi.org/10.56553/popets-2024-0092

Abstract: Browser fingerprinting is an effective technique to track web users by building a fingerprint from their browser attributes. It is also stealthy because the tracker uses legitimate JavaScript API calls offered by the browser engine, which can be obfuscated before they are sent to a (third-party) server. Current browser fingerprinting methodologies employ coarse-grained collection and classification techniques, such as binary classification of fingerprinters based on the number of non-obfuscated exfiltrated attributes. As a result, they produce inconsistent findings. Meanwhile, the privacy of millions of web users is at risk daily. We address this gap by presenting FP-tracer, a novel methodology to detect and classify browser fingerprinters based on dynamic taint tracking and joint entropy classification. Our methodology enables detecting first- and third-party fingerprinters even when they use obfuscation by tainting attributes, propagating them, and logging when they are leaked (via 62 sources and 25 sinks). Moreover, it discriminates the invasiveness of fingerprinting activities, even from the same service, by measuring the joint entropy of the collected attributes and clustering them. We implement FP-tracer by extending Foxhound, a privacy-oriented Firefox fork, with numeric type tainting, more taint tracking sources and sinks, support for multiple sources, and better logging capabilities. We embed our implementation in our automated crawling infrastructure, which is capable of testing websites in parallel using programmable and reproducible logic. We will open-source our implementation. We evaluate FP-tracer by performing a large-scale crawl over the Tranco Top 100K, and detect, among others, audio, canvas, and storage fingerprinting on the web. We find high fingerprinting activity in 8% of domains, with more moderate activity reaching 75%. Notably, fingerprinting is almost five times more likely to be performed by third-party scripts for high activity levels. In addition, we measure that the most severe category of fingerprinting obfuscates 46% of transmitted attributes, and 38% of fingerprinters involve two or more domains. Finally, we find that existing consent banners do not provide an effective defense against browser fingerprinting.
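
To make the notion of joint entropy in the abstract concrete, the following is a minimal sketch of how the Shannon joint entropy of a set of collected attributes could be computed over a population of browsers. It is an illustration only, not the FP-tracer implementation: the function name `joint_entropy`, the attribute names, and the four sample browsers are hypothetical, and the abstract does not specify how the paper estimates entropy or sets its thresholds.

```python
import math
from collections import Counter


def joint_entropy(observations, attributes):
    """Shannon joint entropy, in bits, of `attributes` over `observations`.

    `observations` is a list of dicts (one per browser); `attributes` is the
    list of keys a script collected together. Both are hypothetical inputs.
    """
    # Count how often each combination of attribute values occurs.
    counts = Counter(
        tuple(obs.get(name) for name in attributes) for obs in observations
    )
    total = sum(counts.values())
    # H(X1, ..., Xn) = -sum over combinations of p * log2(p)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical population of four browsers with two attributes each.
browsers = [
    {"userAgent": "UA-1", "screen": "1920x1080"},
    {"userAgent": "UA-1", "screen": "1366x768"},
    {"userAgent": "UA-2", "screen": "1920x1080"},
    {"userAgent": "UA-2", "screen": "1366x768"},
]

one_attr = joint_entropy(browsers, ["userAgent"])
two_attrs = joint_entropy(browsers, ["userAgent", "screen"])
print(one_attr, two_attrs)
```

In this toy population, collecting `userAgent` alone splits the browsers into two equally likely groups (1 bit), while collecting both attributes together distinguishes all four (2 bits). A script that leaks a higher-entropy combination of attributes is therefore more identifying, which is the intuition behind ranking fingerprinting activity by joint entropy rather than by attribute count.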

Keywords: Browser fingerprinting, Tainting, JavaScript, Privacy, Entropy, GDPR

Copyright in PoPETs articles is held by their authors. This article is published under a Creative Commons Attribution 4.0 license.