Exploiting GPU Caches from the Browser with WebGPU

2024-06-27 · Master's Thesis

Roland Czerny

Abstract

The use of GPUs for general-purpose computation has increased steadily in recent years. The massive parallelism of GPUs benefits many applications, including security-critical computations such as AES. As computations on sensitive data become more common on GPUs, the security scrutiny they receive must increase as well. Meanwhile, WebGPU makes compute shaders easily accessible from every web browser. Previous studies have shown that GPUs are susceptible to several cache-based side-channel attacks originally designed for CPUs, and prior work has demonstrated Prime+Probe attacks in various scenarios from native code.

This work presents the first side-channel attack from within the browser using the WebGPU API. We construct several generic primitives to build a self-configuring attack that works across a variety of devices. We present a technique to distinguish L2 cache hits from cache misses in WebGPU, an essential building block for every cache attack. We use this building block to automatically detect parameters such as the cache hit-miss threshold, the L2 cache size, and the number of cache sets. We demonstrate the effectiveness of these primitives on 12 desktop GPUs from 5 different generations and 2 vendors.
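To illustrate what a cache hit-miss threshold is, the following minimal sketch derives a cutoff from two sets of latency samples and uses it to classify a new measurement. It is an assumption-laden illustration, not the detection procedure of the thesis: the function names, the midpoint-of-medians rule, and the sample values are all hypothetical, and the latencies are in arbitrary timer ticks rather than measured on any GPU.

```python
from statistics import median


def hit_miss_threshold(hit_times, miss_times):
    """Return a cutoff halfway between the median hit and median miss latency.

    Hypothetical helper: the midpoint rule is an assumption made for
    illustration, not the method used in the thesis.
    """
    hit_med = median(hit_times)
    miss_med = median(miss_times)
    if hit_med >= miss_med:
        raise ValueError("expected cache hits to be faster than cache misses")
    return (hit_med + miss_med) / 2


def classify(latency, threshold):
    """Label a single latency sample as a cache hit or a cache miss."""
    return "hit" if latency < threshold else "miss"


# Made-up latencies in arbitrary timer ticks (not measured data).
hits = [100, 104, 98, 102, 101]
misses = [200, 210, 195, 205, 198]
threshold = hit_miss_threshold(hits, misses)
```

Medians are used instead of means so that a few outliers, for example from scheduling noise, do not shift the cutoff.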

A crucial step in mounting a Prime+Probe cache attack is finding eviction sets. We present the first parallel eviction-set-finding algorithm. The algorithm is tailored to GPUs and reliably finds more than 80% of the sets on all but one of the tested Nvidia GPUs. Finding the sets takes 2 to 12 minutes, depending on the GPU. We further evaluate a native-to-browser data-exfiltration scenario. For this, we construct the first Prime+Probe covert channel from a native CUDA application to a WebGPU application in the browser. Our covert channel reaches a true channel capacity of up to 10.9 kB/s. The self-configuring nature of our algorithms and the brief time frame of under 15 minutes enable drive-by attacks during internet browsing. Our attack requires no user interaction and works across a variety of Nvidia GPUs.
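The abstract does not state how the true channel capacity is computed. A common convention for covert channels, assumed here, models the channel as a binary symmetric channel and discounts the raw transmission rate by the binary entropy of the bit error rate. The sketch below shows that calculation; the function names are hypothetical.

```python
import math


def binary_entropy(p):
    """Binary entropy H2(p) in bits, with H2(0) = H2(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def true_capacity(raw_rate, error_rate):
    """Capacity of a binary symmetric channel: raw_rate * (1 - H2(error_rate)).

    raw_rate may be in any unit (for example kB/s); the result has the same
    unit. Assumes bit errors are independent and symmetric.
    """
    return raw_rate * (1.0 - binary_entropy(error_rate))
```

Under this model, an error-free channel keeps its full raw rate, while a channel with a 50% bit error rate carries no information at all.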

The content of this thesis was a major contribution to a conference paper at AsiaCCS’24 by Giner et al. [1]. The paper demonstrates more practical attacks using our generic primitives. The attacks presented in the paper include an inter-keystroke timing attack, which exploits timing variations between keystrokes, and an end-to-end attack that compromises the entire AES encryption process running natively in CUDA.

Our work demonstrates that access to GPUs from the browser can be a powerful tool for attackers. We suggest that browsers treat access to the GPU the same way as access to other security- and privacy-sensitive resources, such as the microphone or camera.

Cite

@inproceedings{czerny2024exploiting,
  title={{Exploiting GPU Caches from the Browser with WebGPU}},
  author={Czerny, Roland},
  booktitle={Master's Thesis},
  year={2024},
}