Use your Nvidia GPU's VRAM as swap space on Linux

Motivation and Use Cases

  • Intended for machines with limited, non-upgradable RAM but relatively large, often-idle Nvidia VRAM (e.g., laptops in hybrid GPU mode, desktops with big gaming/AI cards).
  • Lets unused VRAM serve as swap instead of writing to SSD, potentially saving flash wear and exploiting otherwise wasted memory.
  • Seen as especially attractive when the two kinds of workload (gaming vs. “productivity” / heavy RAM use) do not run at the same time, so VRAM can be repurposed while the GPU is idle.

Performance and Implementation Concerns

  • Reported throughput (~1.3 GB/s on a laptop RTX 3070) is far below theoretical PCIe/VRAM bandwidth.
  • The thread attributes this to:
    • A user-space implementation built on NBD (network block device), which is known to be relatively slow.
    • Extra copies through a bounce buffer and many kernel/user context switches per 4 KiB page.
    • Limited queue depth and poor request coalescing with NBD.
    • ZRAM compression overhead (though some think this is minor).
  • Swapping to NVMe is described as a highly optimized, zero-copy, DMA-based path that can be faster in practice.
  • Suggestions: move to ublk or a custom kernel block driver to reduce overhead and increase concurrency.
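
As a rough sanity check on the figures above, the reported throughput can be converted into a time budget per page, and a toy model can show why fixed per-request overhead and queue depth dominate. This is a back-of-envelope sketch, not a measurement: it assumes one 4 KiB page per request, and the 2 µs per-request overhead and 16 GB/s link bandwidth used in the example calls are illustrative placeholders, not numbers from the thread.

```python
PAGE_SIZE = 4096  # bytes; assumes one swap page per block request


def per_page_budget_us(throughput_bytes_per_s: float) -> float:
    """Microseconds available per 4 KiB page at a sustained throughput."""
    return PAGE_SIZE / throughput_bytes_per_s * 1e6


def modeled_throughput(overhead_us: float, link_bytes_per_s: float,
                       queue_depth: int = 1) -> float:
    """Toy model: every request pays a fixed overhead plus its transfer
    time, and queue_depth requests overlap perfectly. The result is
    capped at the link bandwidth."""
    per_request_s = overhead_us * 1e-6 + PAGE_SIZE / link_bytes_per_s
    return min(queue_depth * PAGE_SIZE / per_request_s, link_bytes_per_s)


# Reported figure from the thread: ~1.3 GB/s on a laptop RTX 3070.
print(f"{per_page_budget_us(1.3e9):.2f} us per page")

# Hypothetical overhead and link speed, at increasing queue depths.
for depth in (1, 8, 64):
    gbps = modeled_throughput(2.0, 16e9, depth) / 1e9
    print(f"queue depth {depth}: {gbps:.1f} GB/s")
```

In this model a single outstanding request leaves most of the link idle, which is consistent with the thread's suggestion that deeper queues and fewer context switches (ublk or a kernel driver) matter more than raw PCIe bandwidth.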

Hardware and Architectural Limits

  • VRAM cannot simply be added to system RAM: most desktop GPUs are not cache-coherent with the CPU, so their memory must be mapped uncached or write-combined, which makes it very slow as “real” RAM.
  • Datacenter GPUs and future CXL-style devices can offer coherency, but latency remains much higher than DRAM.
  • Historical and alternative approaches exist (MTD/phram, vramfs, OpenCL/Vulkan-based ramdisks, Windows GpuRamDrive).

Swap Semantics and SSD Wear

  • Some see VRAM swap as a way to avoid SSD wear; others argue real-world swap usage usually doesn’t meaningfully shorten SSD life.
  • Broader debate on swap:
    • Some configure large swap (often for suspend-to-disk) and rely on it to prevent abrupt OOM.
    • Others prefer minimal or no swap, treating heavy swapping as equivalent to a “dead” system.
    • One view: swap’s main role is fair reclamation between anonymous and file-backed memory, not emergency RAM.
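
The SSD-wear argument comes down to simple arithmetic on a drive's endurance rating. The sketch below shows the calculation only; the 600 TBW rating, the 20 GB/day of swap writes, and the write-amplification factor are hypothetical inputs chosen for illustration, not figures from the thread or from any specific drive.

```python
def years_to_rated_endurance(tbw_rating_tb: float,
                             swap_writes_gb_per_day: float,
                             write_amplification: float = 1.0) -> float:
    """Years of swap writes needed to reach a drive's rated
    terabytes-written (TBW) figure, using decimal units."""
    total_bytes = tbw_rating_tb * 1e12
    daily_bytes = swap_writes_gb_per_day * 1e9 * write_amplification
    return total_bytes / daily_bytes / 365.0


# Hypothetical drive and workload (assumptions, see above).
print(round(years_to_rated_endurance(600, 20), 1))       # 82.2
print(round(years_to_rated_endurance(600, 20, 3.0), 1))  # 27.4
```

Under these assumed inputs swap alone would take decades to reach the rating; a heavier workload or a lower-endurance drive changes the result proportionally.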

Usability, Risks, and Power

  • Concerns that using VRAM as swap may:
    • Prevent GPU power-gating, hurting battery life on laptops.
    • Compete with graphical or compute workloads and cause crashes if VRAM runs out (especially on dynamic Wayland compositors).
  • Dynamic enabling/disabling of VRAM swap on GPU pressure is desired but not fully solved; flagged as a to-do area.
  • The classic microkernel-style concern (the user-space swap daemon having its own pages swapped out to the device it serves) is addressed by keeping the daemon's pages pinned, i.e. unswappable.
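
Pinning a daemon's pages is usually done on Linux with the mlockall(2) system call. The sketch below shows one way a user-space process could do this through ctypes; it is not taken from the project discussed in the thread. The function name pin_process_memory is hypothetical, and the MCL_* constants are the values used on x86 and ARM Linux, which is an assumption that does not hold on every architecture.

```python
import ctypes
import ctypes.util
import os

# Values from <sys/mman.h> on x86 and ARM Linux (assumption; some
# architectures use different values).
MCL_CURRENT = 1  # lock pages that are mapped right now
MCL_FUTURE = 2   # also lock pages mapped later


def pin_process_memory(lock_future: bool = True) -> tuple[bool, str]:
    """Ask the kernel to keep this process's pages out of swap.

    Returns (ok, message) instead of raising, so the caller can decide
    whether running unpinned is acceptable."""
    libc = ctypes.CDLL(ctypes.util.find_library("c") or None, use_errno=True)
    flags = MCL_CURRENT | (MCL_FUTURE if lock_future else 0)
    if libc.mlockall(flags) != 0:
        return False, os.strerror(ctypes.get_errno())
    return True, "locked"


ok, message = pin_process_memory(lock_future=False)
print("pinned" if ok else f"not pinned: {message}")
```

The call typically fails unless the process has the CAP_IPC_LOCK capability or a large enough RLIMIT_MEMLOCK, so a swap daemon would normally treat failure here as a reason to refuse to start.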