Llama 3.1
Model capabilities & benchmark results
- Commenters highlight the 405B model as roughly competitive with GPT‑4o on several public benchmarks (MMLU, coding, math), and near top-tier in some user-run tests (e.g., NYT Connections, coding leaderboards).
- The 8B and 70B variants show notable gains over Llama 3, especially on MMLU, and are seen as more practical for most users.
- Some users report that GPT‑4o and Claude 3.5 still feel better in real coding and math tasks despite benchmark parity.
- Benchmarks are widely treated with caution; the LMSYS Elo rating is mentioned as more reflective of “real world” usage, though it has its own limitations.
Hardware requirements & running locally
- 405B is considered essentially out of reach for typical home hardware, even under 4–8 bit quantization; estimates include ~200 GB+ VRAM and multi‑GPU setups costing around $10k or more.
- Suggestions include multi‑GPU PC builds, Mac Studio clusters over Thunderbolt with tools like Exo, and CPU-only options that would be extremely slow.
- 8B and 70B models are commonly run locally via tools like Ollama, llama.cpp, and other frontends on single GPUs or high‑RAM Macs.
- Local tooling is still catching up with the new architecture details (e.g., RoPE changes), which causes some friction.
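The memory figures quoted above follow from simple arithmetic on parameter count and bits per weight. The sketch below is a rough back-of-the-envelope estimate, not a measurement: the helper name `weight_memory_gb` is made up for illustration, and it counts only the weights, ignoring KV cache, activations, and runtime overhead, so real requirements will be higher.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes.

    Assumption: every parameter is stored at the same bit width, with no
    per-block quantization metadata. Real quantized formats add some overhead.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # Model sizes taken from the discussion: 8B, 70B, and 405B parameters.
    for name, size in [("8B", 8), ("70B", 70), ("405B", 405)]:
        for bits in (4, 8, 16):
            gb = weight_memory_gb(size, bits)
            print(f"{name:>4} at {bits:>2}-bit: ~{gb:.1f} GB for weights")
```

Under these assumptions, 405B at 4-bit comes out at about 200 GB of weights, which matches the "~200 GB+" estimate in the thread, while 8B at 4-bit is about 4 GB, which is why it fits on a single consumer GPU.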
Hosting, pricing & ecosystem
- For serious use of 405B, commenters point to cloud providers (AWS, GCP, Azure), hosted inference platforms (Groq, Hyperbolic, Bedrock), and products that embed the model (WhatsApp, Meta AI, Poe, VS Code extensions).
- Discussion notes that open models don’t automatically mean cheap inference; hosted Llama pricing is often compared with that of proprietary models.
“Open source” vs “open weights” debate
- Strong disagreement over calling Llama “open source.”
- Critics note license restrictions (certain commercial users, military/nuclear use, acceptable‑use clauses) and the absence of training datasets, arguing this breaks with traditional open‑source and open‑science norms.
- Others argue that releasing weights plus code is still a major positive, and that strict semantic policing may discourage companies from opening anything.
- Several propose “open weights” or “nearly-open source” as more accurate terms.
Meta’s strategy & competitive landscape
- Multiple comments frame Meta’s releases as a “scorched earth” play to undercut proprietary labs by collapsing the base‑model moat.
- There is debate over whether any training “secret sauce” exists or whether compute scale plus open weights will commoditize base models, shifting profit to applications and compute.
- Some see heavy use of synthetic data for fine‑tuning as a key ingredient and a broader industry trend.
Regulation & regional access
- Europeans report that Meta’s chat product isn’t available in the EU, likely due to GDPR and the upcoming AI Act and DMA rules.
- Opinions split between viewing this as justified consumer protection vs. evidence that EU regulation slows access to new tech.