Explaining DrawnApart, a remote GPU fingerprinting technique

DrawnApart is a new technique to fingerprint GPUs using the WebGL API. It can distinguish identical GPUs.

Browsing the web exposes us to constant tracking. In recent years, due in large part to the general public's growing interest in privacy on the web, tracking methods have come under scrutiny and new techniques are regularly revealed. Along with colleagues from the Ben-Gurion University of the Negev and the University of Adelaide, we uncovered a new threat to privacy that can efficiently fingerprint your GPU, making you more identifiable on the Internet.

📰
DRAWNAPART: A Device Identification Technique based on Remote GPU Fingerprinting
Tomer Laor, Naif Mehanna, Antonin Durey, Vitaly Dyadyuk, Pierre Laperdrix, Clémentine Maurice, Yossi Oren, Romain Rouvoy, Walter Rudametkin and Yuval Yarom
NDSS'2022

Citation

@inproceedings{laor:hal-03526240,
  TITLE = {{DRAWNAPART: A Device Identification Technique based on Remote GPU Fingerprinting}},
  AUTHOR = {Laor, Tomer and Mehanna, Naif and Durey, Antonin and Dyadyuk, Vitaly and Laperdrix, Pierre and Maurice, Cl{\'e}mentine and Oren, Yossi and Rouvoy, Romain and Rudametkin, Walter and Yarom, Yuval},
  URL = {https://hal.inria.fr/hal-03526240},
  BOOKTITLE = {{Network and Distributed System Security Symposium}},
  ADDRESS = {San Diego, United States},
  YEAR = {2022},
  MONTH = Feb,
  DOI = {10.14722/ndss.2022.24093},
  PDF = {https://hal.inria.fr/hal-03526240/file/ndss22_laor.pdf},
  HAL_ID = {hal-03526240},
  HAL_VERSION = {v1},
}

In a nutshell, our newly discovered technique, which we named DrawnApart, measures the time it takes for the GPU to draw specific points in a 3D environment. What we found is that even devices with the exact same GPU present noticeable variations that can be used to track users on the web. Notably, in our paper, DrawnApart extends the tracking time of a device by up to 67% compared to known methods.

Tracking on the web

Tracking on the web is nothing new. As far back as 2001, the New York Times shared concerns about the growing use of cookies on the web. The rapid adoption of the Internet has only made things worse, and tracking keeps evolving towards more advanced techniques.
Cookie-based tracking has been the most widespread approach. It started as a simple technique and has changed over time into a complex pipeline of first- and third-party cookies that allows for very precise tracking across the web. Fouad et al. explain in great detail most of the advanced cookie-based tracking techniques that are available on the modern web. Advertisers and trackers exploit these possibilities, which keeps cookie tracking the most widespread and effective tracking method.
However, as users start to realize that their privacy is at risk from constant targeting and tracking, browsers are starting to take a stand against these practices. For instance, private browsing modes have been introduced in all major browsers and allow users to wipe data persisted by websites during the browsing session. Other actors in the market have taken a harsher stand: the Brave browser embeds an ad-blocker to restrict most of the tracking and the unwanted advertising. The Tor browser includes much stricter options to limit tracking, at the cost of speed and some breakage. Safari and Firefox have also included many features to reduce tracking, like limiting third-party cookies.

This new scrutiny of cookie tracking, together with growing user awareness, has pushed advertisers to look towards other techniques to track users. A potential candidate that has been studied is browser fingerprinting. Because it is stateless (nothing is stored on the client device), it is more difficult to block than cookies, while still being able to discriminate between devices. The Am I Unique platform provides a testing tool that outlines the uniqueness of your setup.
One of the difficulties of browser fingerprint tracking is its limited stability. As attributes evolve over time, it becomes harder to keep track of a device, because the fingerprint changes frequently. This can be countered by choosing a specific set of stable attributes or by taking advantage of hardware-based attributes, which are significantly more stable and arguably more difficult to consistently forge.

We found one such hardware-based method, which we call DrawnApart. Our work outlines how the WebGL API can be exploited to generate a GPU fingerprint through JavaScript. This, combined with other attributes, provides a significant boost of over 66% to the average tracking time over the state-of-the-art algorithm for browser fingerprint tracking (our previous work, called FP-Stalker).

What is DrawnApart?

DrawnApart is a Graphics Processing Unit (GPU) fingerprinting technique that attempts to uniquely identify your graphics card. It runs as unprivileged JavaScript in the browser and takes advantage of the WebGL API, which is enabled by default in all major browsers. In our research paper, which is to be presented at the Network and Distributed System Security Symposium 2022, we show that DrawnApart extends the average tracking time of the state of the art by almost 67%. More importantly, we can find differences between what would otherwise be identical hardware (e.g., the same model and version of GPU from the same vendor, bought in the same purchase order). DrawnApart works with both dedicated and integrated GPUs.

To better understand how DrawnApart works, it is important to understand that GPUs consist of highly parallelized architectures, composed of many Execution Units (EUs). These can all perform independent operations simultaneously. On the software side, WebGL is a cross-platform API for 3D rendering that is implemented in most major browsers. WebGL abstracts the process of rendering into a multi-step pipeline. Two of those steps are exploited by DrawnApart: the vertex shader and the fragment shader. The first takes care of positioning a point in an area, while the second mainly determines its color.

Overview of our GPU fingerprinting technique: (1) points are rendered in parallel using several EUs; (2) the EU drawing point i executes a stall function (dark), while other EUs return a hard-coded value (light); (3) the execution time of each iteration is bounded by the slowest EU.

The first step in the DrawnApart fingerprinting process is to instruct the WebGL API to draw a number of points in parallel. For most points, the vertex shader returns immediately. For a specific set of points, however, we execute a stall function that significantly delays the rendering process. Since a non-stalled point returns almost instantly, a timing measurement for the set of points is dominated by the time taken by the stalled points to execute. DrawnApart then builds a trace by repeating this drawing process multiple times. The official demo shows the result of this process. This trace can then be used as a GPU fingerprint.
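To make the measurement loop more concrete, here is a minimal simulation in Python. It is only a sketch and not the code used in the paper: it does not touch WebGL, and the per-EU stall times, the jitter value, and the names `draw_iteration` and `collect_trace` are assumptions made up for illustration. It only models the idea that the slowest EU bounds the time of each draw.

```python
import random

def draw_iteration(eu_stall_ms, stalled_point, jitter_ms, rng):
    """Simulate one draw: each point runs on its own EU, all in parallel.

    Only the EU that handles `stalled_point` runs the stall function; the
    others return a hard-coded value, modeled here as taking no time.
    """
    per_eu_times = []
    for point, stall_ms in enumerate(eu_stall_ms):
        if point == stalled_point:
            per_eu_times.append(stall_ms + rng.uniform(0.0, jitter_ms))
        else:
            per_eu_times.append(0.0)
    # The draw finishes when the slowest EU finishes.
    return max(per_eu_times)

def collect_trace(eu_stall_ms, repeats, jitter_ms=0.05, seed=0):
    """Stall each point in turn, `repeats` times, and record every timing."""
    rng = random.Random(seed)
    trace = []
    for stalled_point in range(len(eu_stall_ms)):
        for _ in range(repeats):
            trace.append(draw_iteration(eu_stall_ms, stalled_point, jitter_ms, rng))
    return trace

# Two hypothetical GPUs of the same model whose EUs differ slightly in speed.
gpu_a = [10.0, 10.2, 9.9, 10.1]
gpu_b = [10.0, 9.8, 10.3, 10.1]
trace_a = collect_trace(gpu_a, repeats=3)
trace_b = collect_trace(gpu_b, repeats=3)
```

In this toy model the two traces differ only because the per-EU stall times differ, which is the kind of variation the technique relies on.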

Visual representation of raw traces collected by DrawnApart, from a single device
The measurements are divided into 16 groups of 11, where in each group we stall a different point. The color of a point indicates the rendering time, ranging from virtually 0 (white) to 90 ms (blue). Red vertical bars indicate group boundaries.
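The layout described in this caption (16 groups of 11 measurements) can be sketched as a simple reshaping step. The helper names `split_trace` and `group_medians` are hypothetical, and the per-group median is only an illustrative summary, not the feature extraction used in the paper.

```python
from statistics import median

GROUPS = 16      # one group per stalled point, as in the caption
PER_GROUP = 11   # measurements per group, as in the caption

def split_trace(trace, groups=GROUPS, per_group=PER_GROUP):
    """Cut a flat list of timings into consecutive groups."""
    if len(trace) != groups * per_group:
        raise ValueError("trace length does not match groups * per_group")
    return [trace[g * per_group:(g + 1) * per_group] for g in range(groups)]

def group_medians(trace):
    """Summarize each group by its median timing (illustrative choice)."""
    return [median(group) for group in split_trace(trace)]

# Synthetic trace: every measurement in group g takes exactly g milliseconds.
synthetic = [float(g) for g in range(GROUPS) for _ in range(PER_GROUP)]
medians = group_medians(synthetic)
```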

Exploiting the trace

DrawnApart is evaluated in two scenarios, as described in the paper. The first is a controlled environment, which lets us keep the different variables in check. It is composed of several groups of devices, all equipped with identical GPUs within the same group, that DrawnApart intends to distinguish.
For this purpose, the evaluation consists of correctly classifying devices using a Random Forest classifier. The following table presents the resulting accuracy gain over the base rate and shows how DrawnApart provides a significant boost in identification capability in the controlled environment.
Note that we mention three types of timers: onscreen, offscreen, and GPU. Each of these timing methods comes with its own trade-offs. The onscreen method makes use of Window.requestAnimationFrame, which is called after the rendering is done. Measurements are recorded once an iteration of the method has been completed. The onscreen method is compatible with all the major browsers, but it is limited by a capped refresh rate (the browser's maximum frame rate). Consequently, one element of the trace must take at least 16 ms to be correctly measured.
The offscreen method is more limited in terms of compatibility at the time of writing, as it works mostly with Chromium-based browsers and requires manual activation on Firefox.
The GPU timing technique is a variation of the offscreen method that measures the duration of a set of graphics commands directly on the GPU side (as opposed to measuring on the CPU side, as both previous methods do). However, this method has serious disadvantages: it varies strongly between GPU architectures and is no longer supported by the Google SwiftShader renderer.
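The 16 ms floor of the onscreen method can be illustrated with a simplified model. The sketch below assumes a 60 Hz display and assumes that a measurement is only observed at the next frame callback. Both are simplifications, and `onscreen_measurement` is a hypothetical helper, not part of any browser API.

```python
import math

FRAME_MS = 1000 / 60  # assumed 60 Hz refresh rate, about 16.67 ms per frame

def onscreen_measurement(render_ms, frame_ms=FRAME_MS):
    """Round a render time up to the next frame boundary.

    Under this model, any draw that finishes within one frame is reported
    as taking one full frame, so shorter timings cannot be told apart.
    """
    frames = max(1, math.ceil(render_ms / frame_ms))
    return frames * frame_ms

short_a = onscreen_measurement(3.0)
short_b = onscreen_measurement(12.0)
longer = onscreen_measurement(20.0)
```

In this model, `short_a` and `short_b` are reported as the same value, which is why each element of the trace has to last longer than one frame to carry information.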



The second scenario, which is much more representative of the real world, uses traces collected in the wild through the Am I Unique extension. The extension performs a scheduled data collection in the background; each collection comprises the attributes shown on the Am I Unique fingerprinting page and 7 DrawnApart traces. As such, the collected dataset is representative of a real-world setting, with users from different parts of the globe, different hardware and software configurations, and a realistic computing load. We collected data from over 2,500 devices, with more than 1,605 distinct GPUs. The main task is to distinguish and consistently track the users over time. This scenario poses additional challenges compared to the controlled environment: an attacker (in this case, we are acting as the attacker) cannot use long training phases, because the user must be identified in real time. The attacker should also handle new devices as they arrive, without having to retrain the classifier each time.
To respond to these challenges, we proposed a pipeline built around an embedding neural network that takes the raw traces as input and outputs an embedding of each trace in Euclidean space. Consequently, the attacker simply has to compute the Euclidean distance between the embedding of an incoming trace and the embeddings of the traces previously collected and stored, which amounts to comparing the similarity of GPU fingerprints. If the Euclidean distance is below a threshold, the two traces are considered very similar and are classified as originating from the same device.
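The matching step can be sketched as follows. This is a minimal illustration under assumptions: the embeddings are taken as given (the neural network is not modeled), and the two-dimensional vectors, the threshold value, the device ids, and the helper name `match_device` are all made up for the example.

```python
import math

def match_device(embedding, known, threshold):
    """Return the id of the closest stored device, or None if none is close.

    `known` maps a device id to the list of embeddings stored for it.
    A match requires the Euclidean distance to be below `threshold`.
    """
    best_id, best_dist = None, math.inf
    for device_id, stored_embeddings in known.items():
        for stored in stored_embeddings:
            dist = math.dist(embedding, stored)  # Euclidean distance
            if dist < best_dist:
                best_id, best_dist = device_id, dist
    return best_id if best_dist < threshold else None

# Hypothetical store of previously seen devices.
known = {"dev-1": [[0.0, 0.0]], "dev-2": [[3.0, 4.0]]}

seen_before = match_device([0.1, 0.0], known, threshold=0.5)
unknown = match_device([10.0, 10.0], known, threshold=0.5)

# A device that matches nothing is simply enrolled; no retraining is needed.
if unknown is None:
    known["dev-3"] = [[10.0, 10.0]]
```

Because the comparison is a plain distance computation, adding a new device only means storing its embedding, which fits the requirement that the classifier should not be retrained for each newcomer.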
We chose to evaluate DrawnApart in the wild in two steps. First, we evaluate our system with no information other than the processed trace (the output of the pipeline described above), using a k-Nearest Neighbor classifier, and we compare the resulting accuracy to a base rate. We find that DrawnApart significantly improves on the base rate.
Second, we evaluate our method when integrated into a browser fingerprint tracking algorithm, namely FP-Stalker, which is the state-of-the-art algorithm for linking browser fingerprints over time.

DrawnApart integration in FP-Stalker

Compared to the original version of FP-Stalker, our DrawnApart algorithm offers a boost of almost 67% to the average tracking time.

Tracking time boost with DrawnApart compared to the original version of FP-Stalker

Countermeasures

While the effectiveness of DrawnApart introduces serious privacy risks for users, a number of countermeasures could address these issues:

  • Blocking JavaScript is an effective countermeasure, and most browsers and extensions allow blocking JavaScript. This is a rather radical solution that comes with the downside of breaking a lot of sites, since most websites rely heavily on JavaScript for displaying their content or maintaining a good user experience. You can also look to browser extensions to enable or disable JavaScript at a more granular level (choosing which scripts to allow). NoScript is a big favorite, and uMatrix is nice too!
  • Disabling the WebGL API could be a good tradeoff compared to blocking all JavaScript. However, it is to be noted that even if only 1% of the Alexa Top 10k showed any usage of the WebGL API, disabling it completely could introduce unexpected breakage. Firefox and a few other browsers already allow users to deactivate the WebGL API.
  • Introducing random variation of the clock speed would also be an efficient countermeasure. DrawnApart's traces already contain some noise, and thus, introducing random variations in clock speed would make the data even noisier and thus less exploitable.
  • Preventing parallel execution would directly impact the inner workings of our technique. However, this countermeasure comes with the heavy cost of severely limiting the performance of the WebGL API.
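The intuition behind the clock-speed countermeasure in the list above can be illustrated with a toy experiment. This sketch assumes Gaussian timing noise and a made-up base trace; the noise levels and the helper name `observe` are hypothetical and do not come from the paper.

```python
import math
import random

def observe(base_trace, noise_ms, rng):
    """Return one noisy observation of a device's underlying timings."""
    return [t + rng.gauss(0.0, noise_ms) for t in base_trace]

rng = random.Random(1)

# Made-up underlying timings for a single device (176 measurements).
base = [10.0 + 0.1 * (i % 16) for i in range(176)]

# Distance between two observations of the SAME device, with low noise...
clean_gap = math.dist(observe(base, 0.05, rng), observe(base, 0.05, rng))
# ...and with strong random variation, as a clock-speed defense might add.
noisy_gap = math.dist(observe(base, 2.0, rng), observe(base, 2.0, rng))
```

With more noise, two traces from the same device drift further apart, so a distance-based comparison has a harder time linking them.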

It should be noted that current mitigations against Canvas Fingerprinting that introduce noise in renderings (see Brave's farble output) cannot be applied against DrawnApart as it is based on measuring execution time and not on comparing rendered images.

Conclusion

Throughout our paper, DrawnApart is shown to efficiently identify devices both in a controlled environment and in real-world scenarios. We show that the technique works against what are otherwise identical devices. Through our data collection performed on the Am I Unique platform, we show that it is feasible in practice. DrawnApart offers a boost of almost 67% to the average tracking time of the state-of-the-art browser fingerprinting algorithm, outlining the importance of hardware fingerprinting in web tracking.

DrawnApart Demo

In the following video, Vitaly Dyadyuk, co-author of the paper, swaps the CPUs (with integrated graphics) of two identical computers showing that DrawnApart is able to detect a full CPU swap.

DrawnApart has received media coverage, most notably: