Use your Nvidia GPU's VRAM as swap space on Linux
Motivation and Use Cases
- Intended for machines with limited, non-upgradable RAM but relatively large, often-idle Nvidia VRAM (e.g., laptops in hybrid GPU mode, desktops with big gaming/AI cards).
- Lets unused VRAM serve as swap instead of writing to SSD, potentially reducing flash wear and putting otherwise wasted memory to use.
- Seen as especially attractive when GPU-heavy workloads (gaming) and RAM-heavy workloads (“productivity”) do not run concurrently, so VRAM can be repurposed while the GPU is idle.
Performance and Implementation Concerns
- Reported throughput (~1.3 GB/s on a laptop RTX 3070) is far below theoretical PCIe/VRAM bandwidth.
- The thread attributes this to:
  - A user-space implementation built on NBD, which is known to be relatively slow.
  - Extra copies through a bounce buffer and many kernel/user context switches per 4K page.
  - Limited queue depth and poor request coalescing with NBD.
  - ZRAM compression overhead (though some think this is minor).
- Swapping to NVMe is described as a highly optimized, zero-copy, DMA-based path that can be faster in practice.
- Suggestions: move to ublk or a custom kernel block driver to reduce overhead and increase concurrency.
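As a rough illustration of why per-page overhead matters, the reported throughput can be converted into an average time budget per 4K page. This is a back-of-the-envelope sketch, not a measurement: it assumes 1 GB = 10^9 bytes and that pages are handled one at a time, and the function name `per_page_budget_us` is made up for this example.

```python
PAGE_SIZE = 4096  # bytes; the 4K page size mentioned in the thread


def per_page_budget_us(throughput_bytes_per_s: float,
                       page_size: int = PAGE_SIZE) -> float:
    """Average time per page in microseconds at a given throughput.

    Assumes strictly serial handling (queue depth 1), which is a
    simplification made for this estimate.
    """
    pages_per_second = throughput_bytes_per_s / page_size
    return 1e6 / pages_per_second


reported = 1.3e9  # ~1.3 GB/s, the figure reported for the laptop RTX 3070

print(f"{reported / PAGE_SIZE:.0f} pages/s")            # 317383 pages/s
print(f"{per_page_budget_us(reported):.2f} us per page")  # 3.15 us per page
```

Under these assumptions, each page gets roughly 3 microseconds end to end, which leaves little room for several kernel/user round trips plus an extra copy. That is consistent with the thread's suggestion that deeper queues and fewer context switches (for example via ublk) are where the headroom is.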
Hardware and Architectural Limits
- VRAM cannot simply be added to system RAM: most desktop GPUs are not cache-coherent with the CPU, so their memory must be mapped uncached or write-combined, which makes it very slow as “real” RAM.
- Datacenter GPUs and future CXL-style devices can offer coherency, but latency remains much higher than DRAM.
- Historical and alternative approaches exist (MTD/phram, vramfs, OpenCL/Vulkan-based ramdisks, Windows GpuRamDrive).
Swap Semantics and SSD Wear
- Some see VRAM swap as a way to avoid SSD wear; others argue real-world swap usage usually doesn’t meaningfully shorten SSD life.
- Broader debate on swap:
  - Some configure large swap (often for suspend-to-disk) and rely on it to prevent abrupt OOM.
  - Others prefer minimal or no swap, treating heavy swapping as equivalent to a “dead” system.
  - One view: swap’s main role is fair reclamation between anonymous and file-backed memory, not emergency RAM.
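The SSD-wear disagreement is easier to reason about with numbers. The sketch below estimates how long a drive's rated write endurance would last under a given daily volume of swap writes. Every figure in it is a hypothetical placeholder chosen for illustration (none comes from the thread), and the estimate ignores write amplification and all non-swap writes.

```python
def years_to_exhaust_endurance(rated_tbw: float,
                               swap_writes_gb_per_day: float) -> float:
    """Years until the rated endurance is consumed by swap writes alone.

    rated_tbw: endurance rating in terabytes written (1 TB = 10**12 bytes).
    swap_writes_gb_per_day: average swap writes per day (1 GB = 10**9 bytes).
    """
    total_bytes = rated_tbw * 1e12
    bytes_per_day = swap_writes_gb_per_day * 1e9
    return total_bytes / bytes_per_day / 365.0


# Hypothetical drive rated for 600 TBW.
light = years_to_exhaust_endurance(600, 20)    # occasional swapping
heavy = years_to_exhaust_endurance(600, 500)   # constant heavy swapping

print(f"light: {light:.1f} years")  # light: 82.2 years
print(f"heavy: {heavy:.1f} years")  # heavy: 3.3 years
```

With these made-up inputs, light swapping is irrelevant to drive lifetime while sustained heavy swapping is not, which may explain why both positions appear in the discussion.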
Usability, Risks, and Power
- Concerns that using VRAM as swap may:
  - Prevent GPU power-gating, hurting battery life on laptops.
  - Compete with graphical or compute workloads and cause crashes if VRAM runs out (especially on dynamic Wayland compositors).
- Dynamic enabling/disabling of VRAM swap on GPU pressure is desired but not fully solved; flagged as a to-do area.
- The classic microkernel-style concern (the swap daemon needing to page in its own memory) is addressed by keeping the daemon's pages pinned and unswappable.
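One common way for a user-space process on Linux to keep its own pages from being swapped out is the `mlockall(2)` system call. The sketch below shows that call through `ctypes`; it is a minimal illustration and not necessarily how the project discussed in the thread does its pinning. The helper names are invented for this example, and the `MCL_*` constants are the Linux values for x86 and ARM (a few architectures use different numbers).

```python
import ctypes
import os

# Linux values from <sys/mman.h> on x86 and ARM (assumption: other
# architectures may define these differently).
MCL_CURRENT = 1  # lock pages that are currently mapped
MCL_FUTURE = 2   # also lock pages mapped later

# Load the C library of the running process (Linux-only sketch).
_libc = ctypes.CDLL(None, use_errno=True)


def pin_process_memory() -> tuple[bool, int]:
    """Try to lock all current and future pages of this process in RAM.

    Returns (True, 0) on success, or (False, errno) on failure. Failure is
    common without CAP_IPC_LOCK or a sufficiently high RLIMIT_MEMLOCK.
    """
    if _libc.mlockall(MCL_CURRENT | MCL_FUTURE) == 0:
        return True, 0
    return False, ctypes.get_errno()


def unpin_process_memory() -> None:
    """Undo pin_process_memory() by unlocking all pages."""
    _libc.munlockall()


if __name__ == "__main__":
    ok, err = pin_process_memory()
    if ok:
        print("all pages pinned; this process will not be swapped out")
    else:
        print(f"mlockall failed: {os.strerror(err)}")
```

A swap backend would call something like this once at startup, before it registers itself as a swap device, so that servicing a page-in request never depends on the daemon's own memory being paged in first.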