All You Need Is 4x 4090 GPUs to Train Your Own Model
Hardware capabilities & training scale
- 4× RTX 4090 (24 GB each) gives 96 GB of total VRAM, which the author says is enough to train LLMs from scratch at up to ~1B parameters.
- Other commenters argue 96 GB should support full fine-tuning of models up to ~5B parameters with techniques like gradient checkpointing.
- Author reports ~7 days to train a 500M-parameter model on 100B tokens.
- Parallelism is typically done via Distributed Data Parallel (DDP), which replicates the full model on each GPU and synchronizes gradients; 4090s lack NVLink, so VRAM cannot be pooled across cards.
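The capacity and speed claims above can be sanity-checked with back-of-envelope arithmetic. This sketch assumes the common rule of thumb of ~16 bytes per parameter for full fine-tuning with Adam in mixed precision (fp16 weights + fp16 grads + fp32 Adam moments + fp32 master weights), and uses the author's reported 100B tokens in ~7 days; the 80% headroom factor is an assumption, not a measurement.

```python
# Back-of-envelope checks on the VRAM and throughput claims. All
# constants are rough assumptions, not benchmarks.

GPUS = 4
TOTAL_VRAM_GB = GPUS * 24  # 4x RTX 4090, 24 GB each = 96 GB

# Full fine-tuning with Adam in mixed precision typically needs roughly
# 16 bytes/param: fp16 weights (2) + fp16 grads (2) + fp32 Adam
# moments (8) + fp32 master weights (4), before activations.
BYTES_PER_PARAM = 16

def max_trainable_params_billion(total_vram_gb, bytes_per_param, headroom=0.8):
    """Crude upper bound on model size, reserving 20% of VRAM for
    activations (assumes gradient checkpointing keeps them small)."""
    usable_bytes = total_vram_gb * 1e9 * headroom
    return usable_bytes / bytes_per_param / 1e9

# Throughput implied by the author's report: 100B tokens in ~7 days.
TOKENS = 100e9
SECONDS = 7 * 24 * 3600
tokens_per_sec_total = TOKENS / SECONDS
tokens_per_sec_per_gpu = tokens_per_sec_total / GPUS

print(f"~{max_trainable_params_billion(TOTAL_VRAM_GB, BYTES_PER_PARAM):.1f}B params fit")
print(f"~{tokens_per_sec_total:,.0f} tokens/s total, ~{tokens_per_sec_per_gpu:,.0f} per GPU")
```

The ~4.8B-parameter upper bound this yields is consistent with the commenters' ~5B estimate for full fine-tuning with gradient checkpointing.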
Cost, cloud vs on-prem, and ROI
- 4× 4090s are framed as “all you need” but also “and ~$12k” in hardware, plus an expensive CPU/motherboard with enough PCIe lanes.
- Some argue renting 4× 4090 instances is cheaper and more flexible (roughly <$500 for a ~10-day train).
- Others note capex vs opex tradeoffs, resale value of GPUs, and desire to learn low-level quirks as reasons to own hardware.
- GPU rental market is described as crowded, with competition, varying integrity, and occasional “scammy” behavior (e.g., overselling hardware).
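The capex-vs-opex argument reduces to a simple break-even count. This sketch uses the two figures from the thread (~$12k of hardware, roughly $500 per ~10-day rented run) and deliberately ignores electricity, resale value, and rental price drift, so it is illustrative only.

```python
# Rent-vs-buy break-even using the figures cited in the discussion.
# Ignores electricity, GPU resale value, and rental price changes.

HARDWARE_COST_USD = 12_000      # "~$12k" rig from the article
RENTAL_COST_PER_RUN_USD = 500   # "<$500 for a ~10-day train" estimate

runs_to_break_even = HARDWARE_COST_USD / RENTAL_COST_PER_RUN_USD
days_of_training = runs_to_break_even * 10

print(f"Owning pays off after ~{runs_to_break_even:.0f} ten-day runs "
      f"(~{days_of_training:.0f} days of continuous training)")
```

At these numbers the hardware only pays for itself after roughly 24 full training runs, which is why the intangibles (resale value, learning low-level quirks) carry much of the pro-ownership argument.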
Power, cooling, and electrical requirements
- With ~450 W per 4090 and dual 1500 W PSUs, total draw can approach 3 kW.
- Several comments insist a dedicated 20–30 A circuit is effectively required, especially in US homes.
- Discussion compares US vs EU/UK circuits, emphasizing total wattage limits and fire risk from overcurrent.
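The circuit-sizing claims can be checked with the standard continuous-load arithmetic. The ~600 W system overhead below is an assumption (CPU, motherboard, drives, PSU losses), and the 80% derating is the US NEC convention for continuous loads; applying the same derating to EU circuits is a simplification.

```python
# Sanity-check the dedicated-circuit claims. SYSTEM_OVERHEAD_WATTS and
# the 80% continuous-load derating applied to non-US circuits are
# assumptions for illustration.

GPU_WATTS = 450               # per RTX 4090, as cited in the thread
N_GPUS = 4
SYSTEM_OVERHEAD_WATTS = 600   # assumed CPU/board/drives/PSU losses

total_watts = GPU_WATTS * N_GPUS + SYSTEM_OVERHEAD_WATTS  # 2400 W

def circuit_capacity_watts(volts, amps, derate=0.8):
    """Continuous-load capacity of a branch circuit (80% of breaker
    rating, per the common US NEC rule of thumb)."""
    return volts * amps * derate

for label, volts, amps in [("US 120 V / 15 A", 120, 15),
                           ("US 120 V / 20 A", 120, 20),
                           ("US 120 V / 30 A", 120, 30),
                           ("EU 230 V / 16 A", 230, 16)]:
    cap = circuit_capacity_watts(volts, amps)
    status = "OK" if cap >= total_watts else "overloaded"
    print(f"{label}: {cap:.0f} W continuous -> {status}")
```

This makes the thread's point concrete: a standard US 15 A or 20 A circuit is overloaded by a ~2.4 kW rig, while a dedicated 30 A circuit or a typical 230 V EU circuit has headroom.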
Model, data, and software-side questions
- Many readers are more interested in what can realistically be trained and the data/curation process than in the rig itself.
- Training data suggestions include starting from FineWebEdu.
- Some ask for examples of model outputs and more detail on post-training methods (e.g., RL/RLHF).
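One way to frame "what can realistically be trained" is the data budget. The sketch below uses the ~20 tokens-per-parameter heuristic from the Chinchilla paper as a rough compute-optimal baseline; the exact multiplier is an approximation, not something from the article.

```python
# Data-budget arithmetic for the author's 500M-param / 100B-token run,
# using the rough Chinchilla heuristic of ~20 tokens per parameter
# for compute-optimal training.

CHINCHILLA_TOKENS_PER_PARAM = 20  # heuristic, approximate

def chinchilla_optimal_tokens(n_params):
    """Rough compute-optimal token count for a model of n_params."""
    return n_params * CHINCHILLA_TOKENS_PER_PARAM

params = 500e6                                 # author's model size
optimal = chinchilla_optimal_tokens(params)    # ~10B tokens
actual = 100e9                                 # author's token count

print(f"Compute-optimal: ~{optimal/1e9:.0f}B tokens; "
      f"author trained on {actual/optimal:.0f}x that")
```

By this heuristic the author's run is ~10× past compute-optimal, i.e. deliberately overtrained, which is a common choice when the goal is a small, capable model rather than the best loss per FLOP.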
Alternatives & tradeoffs
- Suggestions include used A100s, 3090s, lower-end 40-series (4060/4070 Ti), Tesla P40s, or simply waiting for the 5090 (32 GB VRAM).
- Objections to 3090s/M4 minis/Apple Silicon: older architectures, weaker memory bandwidth, limited training support vs CUDA.
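The memory-bandwidth objection can be made concrete with spec-sheet numbers. The figures below are approximate vendor specifications collected here for comparison (not from the thread), and worth double-checking against current datasheets before buying.

```python
# Approximate spec-sheet memory bandwidth (GB/s) for the alternatives
# discussed above. Vendor figures, listed here as assumptions to
# illustrate the bandwidth objection; verify before purchasing.

BANDWIDTH_GBPS = {
    "Apple M4 (base)": 120,
    "Tesla P40":       346,
    "RTX 4070 Ti":     504,
    "RTX 3090":        936,
    "RTX 4090":        1008,
    "A100 40GB":       1555,
    "RTX 5090":        1792,
}

for name, bw in sorted(BANDWIDTH_GBPS.items(), key=lambda kv: kv[1]):
    print(f"{name:16s} {bw:5d} GB/s")
```

The spread explains the objections: a base M4's unified memory moves data at roughly an eighth of a 4090's rate, and a P40 at roughly a third, which matters far more for training throughput than raw VRAM capacity alone.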
Article quality & AI co-authorship
- Multiple commenters feel parts of the article read like AI-generated marketing copy, especially references to gaming features (e.g., DLSS 3).
- Author confirms AI “co-authored” text; some readers find this off-putting and prefer purely human-written explanations.