Llama 3.1

Model capabilities & benchmark results

  • Commenters highlight the 405B model as roughly competitive with GPT‑4o on several public benchmarks (MMLU, coding, math), and near top-tier in some user-run tests (e.g., NYT Connections, coding leaderboards).
  • The 8B and 70B variants show notable gains over Llama 3, especially on MMLU, and are seen as more practical for most users.
  • Some users report that GPT‑4o and Claude 3.5 still feel better in real coding and math tasks despite benchmark parity.
  • Benchmarks are widely treated with caution; the LMSys Elo rating is mentioned as more reflective of “real world” usage, but it has its own limitations.

Hardware requirements & running locally

  • 405B is considered essentially out of reach for typical home hardware, even with 4- to 8-bit quantization; estimates run to roughly 200 GB or more of VRAM and multi‑GPU setups costing around $10k or more.
  • Suggestions include multi‑GPU PC builds, Mac Studio clusters over Thunderbolt with tools like Exo, and CPU-only options that would be extremely slow.
  • 8B and 70B models are commonly run locally via tools like Ollama, llama.cpp, and other frontends on single GPUs or high‑RAM Macs.
  • There is ongoing work, and some friction, around tool support for new architecture details (e.g., RoPE changes).
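
The ~200 GB estimate quoted above is consistent with simple arithmetic on weight storage: parameter count times bits per weight, divided by eight. The sketch below is a rough, illustrative calculation only. The function name `weight_memory_gb` is made up for this example, and it counts the weights alone, ignoring KV cache, activations, and runtime overhead, so real requirements will be higher.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in decimal gigabytes.

    Assumption: every parameter is stored at the same precision and nothing
    else (KV cache, activations, framework overhead) is counted.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # Model sizes discussed in the thread, at common precisions.
    for name, size_b in [("8B", 8), ("70B", 70), ("405B", 405)]:
        for bits in (16, 8, 4):
            gb = weight_memory_gb(size_b, bits)
            print(f"{name:>5} @ {bits:>2}-bit: ~{gb:.1f} GB")
```

By this estimate 405B needs about 202.5 GB at 4-bit and about 405 GB at 8-bit for the weights alone, whereas 70B at 4-bit comes to about 35 GB. That gap is one way to read why commenters treat 8B and 70B as the practical local options.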

Hosting, pricing & ecosystem

  • For serious use of 405B, commenters point to cloud providers (AWS, GCP, Azure), specialized inference platforms (Groq, Hyperbolic, Bedrock), and APIs embedded in products (WhatsApp, Meta AI, Poe, VS Code extensions).
  • Discussion notes that open models don’t automatically mean cheap inference; hosted Llama pricing is often compared with that of proprietary models.

“Open source” vs “open weights” debate

  • Strong disagreement over calling Llama “open source.”
  • Critics note license restrictions (certain commercial users, military/nuclear use, acceptable‑use clauses) and the absence of training datasets, arguing this breaks with traditional open‑source and open‑science norms.
  • Others argue that releasing weights plus code is still a major positive, and that strict semantic policing may discourage companies from opening anything.
  • Several propose “open weights” or “nearly-open source” as more accurate terms.

Meta’s strategy & competitive landscape

  • Multiple comments frame Meta’s releases as a “scorched earth” play to undercut proprietary labs by collapsing the base‑model moat.
  • There is debate over whether any training “secret sauce” exists or whether compute scale plus open weights will commoditize base models, shifting profit to applications and compute.
  • Some see heavy use of synthetic data for fine‑tuning as a key ingredient and a broader industry trend.

Regulation & regional access

  • Europeans report that Meta’s chat product isn’t available in the EU, likely due to GDPR and upcoming AI Act and Digital Markets Act (DMA) rules.
  • Opinions split between viewing this as justified consumer protection vs. evidence that EU regulation slows access to new tech.