Apertus – Open Foundation Model for Sovereign AI

Overall focus

  • Discussion centers on Apertus as a “fully open” model (weights, data, training recipes) aimed at European/Swiss “sovereign AI” and whether that matters given its current capability.

Openness, pipelines, and “SOTA”

  • Many value Apertus for being genuinely open: open weights, open data, full training pipeline.
  • Some argue true “state of the art” should mean models that can be inspected and replicated, not closed “cutting-edge” systems from frontier labs.
  • Others maintain that frontier lab models remain the real performance SOTA, regardless of openness.

Model quality and practical use

  • Earlier Apertus versions were described as “pretty bad”; some testing suggests the new ones are still not competitive with top models.
  • Users report it’s workable as a backbone for RAG and some agents (e.g., legal consulting, translation), but not yet “agentic” or frontier-level.
  • Weaknesses include hallucinations in multilingual tasks and errors on basic language questions (e.g., conjugations, word spellings).

Training data, copyright, and ethics

  • Apertus is trained on FineWeb/Common Crawl data; some criticize this as unlicensed scraping that contradicts its “copyright-compliant” marketing.
  • Others argue scraping public web data for training is legal and that expanding copyright here would be harmful.
  • There’s demand for a “vegan” model trained only on licensed or public-domain data for ethical reasons.

Sovereign AI, geopolitics, and data locality

  • Strong theme: countries (especially in Europe) need their own AI capabilities to avoid dependence on US or Chinese tech, given concerns about US rule of law, surveillance, export controls, and political instability.
  • Some see Apertus and similar projects as being about building capability rather than about immediate model competitiveness.
  • Debate over which jurisdictions are safest for data (US vs EU vs Switzerland vs Nordics) and whether any country is truly “safe.”

Comparison with other open models

  • Other fully open or near-open pipelines mentioned: OLMo 3.1, K2 Think V2, Nvidia Nemotron, plus strong Chinese models (GLM, DeepSeek, Qwen).
  • Consensus that Nemotron and several Chinese models currently outperform Apertus; some users prefer them in production.

Local vs service models and UX

  • Several argue the real near-term battleground is local vs hosted LLMs, not just open vs closed.
  • Local models are already “good enough” for many tasks, but tooling and UX are confusing and fragmented.
  • Concern that poor local UX is pushing users toward centralized, closed services, reducing digital autonomy.

Compute, licensing, and compliance

  • The claim that “the Swiss have no GPUs” is refuted by references to the Alps supercomputer, which has thousands of Grace Hopper superchips.
  • The license includes a novel mechanism: users periodically download a hash-based filter that removes personal data from outputs in response to deletion requests; it is unclear how sustainable this is.
  • Some see Apertus mainly serving European compliance/sovereignty requirements rather than chasing peak benchmark scores.
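
The discussion does not describe how the hash-based output filter works internally. As a rough illustration only, the sketch below assumes the filter is a set of SHA-256 digests of normalized strings (for example, names covered by deletion requests) and that generated text is scanned in short word windows. The names normalize, digest, redact, max_ngram, and the "[REDACTED]" mask are hypothetical and are not taken from the Apertus license or tooling.

```python
import hashlib
import re


def normalize(text: str) -> str:
    # Lowercase and collapse whitespace so trivial variations hash identically.
    return " ".join(text.lower().split())


def digest(text: str) -> str:
    # Hash of the normalized string; the filter stores only such digests,
    # never the personal data itself (an assumption of this sketch).
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def redact(output: str, blocked_hashes: set[str],
           max_ngram: int = 3, mask: str = "[REDACTED]") -> str:
    # Split into alternating word and separator tokens so the original
    # spacing and punctuation can be reassembled afterwards.
    tokens = re.findall(r"\w+|\W+", output)
    word_idx = [i for i, t in enumerate(tokens) if re.fullmatch(r"\w+", t)]
    result = list(tokens)
    k = 0
    while k < len(word_idx):
        matched = False
        # Try the longest word window first, then shorter ones.
        for n in range(min(max_ngram, len(word_idx) - k), 0, -1):
            start, end = word_idx[k], word_idx[k + n - 1]
            phrase = "".join(tokens[start:end + 1])
            if digest(phrase) in blocked_hashes:
                result[start] = mask
                for j in range(start + 1, end + 1):
                    result[j] = ""
                k += n
                matched = True
                break
        if not matched:
            k += 1
    return "".join(result)


# Hypothetical filter content: one digest from a deletion request.
blocked = {digest("Jane Q Example")}
print(redact("Contact Jane Q  Example today.", blocked))
```

A design like this lets the filter be distributed without exposing the underlying personal data, but it only catches exact matches after normalization; misspellings or paraphrases would pass through, which is one reason commenters question how sustainable the approach is.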