All You Need Is 4x 4090 GPUs to Train Your Own Model

Hardware capabilities & training scale

  • 4× RTX 4090s (24 GB each) give 96 GB of total VRAM, which the author says is enough to train LLMs from scratch up to ~1B parameters.
  • Other commenters argue 96 GB should support full fine-tuning of models up to ~5B parameters with techniques like gradient checkpointing.
  • Author reports ~7 days to train a 500M-parameter model on 100B tokens.
  • Parallelism is typically done via Distributed Data Parallel (DDP), which replicates the model on each GPU, rather than by pooling VRAM over NVLink (which the 4090 lacks).
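The ~7-day figure above can be sanity-checked with a back-of-envelope compute estimate. This sketch uses the standard ~6·N·D FLOPs approximation for transformer training; the per-GPU throughput (~165 TFLOPS dense BF16 tensor) and the 40% utilization figure are assumptions, not from the discussion. It lands within ~2× of the reported ~7 days, i.e. the same ballpark:

```python
# Back-of-envelope check of "~7 days for 500M params on 100B tokens".
# Assumptions (not from the thread): the ~6*N*D FLOPs rule of thumb for
# transformer training, ~165 TFLOPS dense BF16 tensor throughput per
# RTX 4090, and a model FLOPs utilization (MFU) of 40%.

N_PARAMS = 500e6          # model parameters
N_TOKENS = 100e9          # training tokens
FLOPS_PER_GPU = 165e12    # approximate dense BF16 tensor peak, RTX 4090
N_GPUS = 4
MFU = 0.40                # assumed fraction of peak actually achieved

total_flops = 6 * N_PARAMS * N_TOKENS            # ~3.0e20 FLOPs
effective_rate = N_GPUS * FLOPS_PER_GPU * MFU    # sustained FLOPs/s
days = total_flops / effective_rate / 86_400     # seconds -> days

print(f"total training FLOPs: {total_flops:.1e}")
print(f"estimated wall time:  {days:.1f} days")
```

At these assumed numbers the estimate comes out around 13 days; a higher MFU or a shorter token budget would close the gap to the reported 7.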

Cost, cloud vs on-prem, and ROI

  • The 4× 4090 rig is framed as “all you need,” but commenters point out it is also ~$12k in hardware, plus an expensive CPU/motherboard with enough PCIe lanes.
  • Some argue renting 4× 4090 instances is cheaper and more flexible (roughly <$500 for a ~10-day training run).
  • Others note capex vs opex tradeoffs, resale value of GPUs, and desire to learn low-level quirks as reasons to own hardware.
  • GPU rental market is described as crowded, with competition, varying integrity, and occasional “scammy” behavior (e.g., overselling hardware).
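The capex-vs-opex argument above reduces to a simple breakeven calculation. This sketch uses only the figures quoted in the thread (~$12k hardware, ~$500 per ~10-day rented run) and deliberately ignores electricity, depreciation, resale value, and rental downtime:

```python
# Rough rent-vs-buy breakeven using the thread's own numbers.
# Ignores electricity, depreciation, resale value, and rental availability.

HARDWARE_COST = 12_000     # USD, 4x 4090 rig as framed in the discussion
RENTAL_COST_PER_RUN = 500  # USD, ~10-day rented 4x 4090 run

breakeven_runs = HARDWARE_COST / RENTAL_COST_PER_RUN  # runs to match capex
years_of_runs = breakeven_runs * 10 / 365             # back-to-back 10-day runs

print(f"breakeven after ~{breakeven_runs:.0f} training runs "
      f"(~{years_of_runs:.1f} years of continuous 10-day runs)")
```

The ~24-run breakeven explains why both camps have a case: occasional experimenters are better off renting, while someone training continuously (or valuing resale and hands-on learning) can justify owning.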

Power, cooling, and electrical requirements

  • With ~450 W per 4090 and dual 1500 W PSUs, total draw can approach 3 kW.
  • Several comments insist a dedicated 20–30 A circuit is effectively required, especially in US homes.
  • Discussion compares US vs EU/UK circuits, emphasizing total wattage limits and fire risk from overcurrent.
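The circuit-sizing concern above follows directly from Ohm's-law arithmetic. In this sketch, the ~1,200 W allowance for CPU, motherboard, drives, and PSU inefficiency is an assumption chosen to reproduce the ~3 kW total mentioned in the thread:

```python
# Why a dedicated 20-30 A circuit comes up: current draw at the wall.
# The 1,200 W "rest of system" figure is an assumption covering CPU,
# motherboard, drives, and PSU losses, matching the ~3 kW total above.

GPU_WATTS = 450
N_GPUS = 4
REST_OF_SYSTEM_WATTS = 1_200

total_watts = GPU_WATTS * N_GPUS + REST_OF_SYSTEM_WATTS  # ~3,000 W

for volts, label in [(120, "US 120 V"), (230, "EU/UK 230 V")]:
    amps = total_watts / volts
    print(f"{label}: {amps:.1f} A")
```

On a US 120 V circuit that is ~25 A, well past a standard 15 A household breaker, hence the calls for a dedicated high-amperage circuit; on 230 V it is only ~13 A, which is why the US/EU comparison keeps coming up.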

Model, data, and software-side questions

  • Many readers are more interested in what can realistically be trained and the data/curation process than in the rig itself.
  • Training data suggestions include starting from FineWebEdu.
  • Some ask for examples of model outputs and more detail on post-training methods (e.g., RL/RLHF).

Alternatives & tradeoffs

  • Suggestions include used A100s, 3090s, lower-end 40-series cards (4060/4070 Ti), Tesla P40s, or simply waiting for the 5090 (32 GB VRAM).
  • Objections to 3090s, M4 Mac minis, and Apple Silicon generally: older architectures, weaker memory bandwidth, and limited training support compared with CUDA.

Article quality & AI co-authorship

  • Multiple commenters feel parts of the article read like AI-generated marketing copy, especially references to gaming features (e.g., DLSS 3).
  • The author confirms the text was AI “co-authored”; some readers find this off-putting and prefer purely human-written explanations.