Open-R1: an open reproduction of DeepSeek-R1
What “open” means for R1 and Open-R1
- DeepSeek-R1 is seen as “open-weights” only: weights are public, but training code and datasets are not.
- Several commenters argue a truly reproducible “open” model needs at least code + data, ideally also weights; others say expecting full datasets is unrealistic given legal and competitive risks.
- Open-R1’s goal is explicitly to rebuild the missing pieces (recipes, code, data) so others can train similar or better reasoning models, not just use DeepSeek’s weights.
Compute, cost, and feasibility of reproduction
- Confusion over the widely quoted ~$5.5M figure: some clarify this was for DeepSeek V3 base model, not R1 reasoning tuning.
- R1 reportedly used ~800k samples for reinforcement learning, leading some to think the “R1 trick” could be comparatively cheap once a strong base model exists.
- Skepticism remains about whether Open-R1 can match R1’s performance without comparable resources or hidden tricks.
Datasets, legality, and “knowledge laundering”
- Many believe no major lab will release raw training data due to copyright and terms-of-service liability, plus competitive advantage.
- One discussion describes a multi-step scheme: train on copyrighted data, generate synthetic data, then train a new model on that—framed as “knowledge laundering.”
- There is interest in fully open datasets (e.g., Allen Institute work, RedPajama), and proposals for a decentralized, deduplicated, community-maintained training-data archive.
Geopolitics, censorship, and trust
- Debate over whether Chinese models are especially untrustworthy or just differently “massaged” compared to US/European models.
- Some point out Western models are also heavily aligned and censored (especially around sexuality, politics, and safety topics).
- A few commenters “trust” some Western labs slightly more on political independence, but others argue US tech firms also “bend the knee” to power.
Open source vs big tech framing
- Several see DeepSeek and projects like Open-R1 as part of a broader battle: heavily-capitalized US incumbents vs open or non‑US efforts, not simply “US vs China.”
- Others push back on romanticizing open models as “gifts” or morally superior, and emphasize precise terminology (“open source” vs “open weights”).
Other domains for RL with verifiable rewards
- Suggested areas: law (case outcomes, codes), medical diagnosis (test results, outcomes), stochastic processes, robotics and chip design with simulators, RFP responses, management consulting, and any domain with good simulators or automated checks.
AI hype vs early web nostalgia
- Some compare today’s rapid AI progress to the early web or Web 2.0—continuous excitement, but with faster information flow now.
- Others express burnout: generative AI is seen as flooding the internet with low-quality content and undermining human connection.
Status of Open-R1 and criticism of the announcement
- Multiple readers stress this is just an announcement of an effort, not a working R1 reproduction; some call the headline misleading without evaluation numbers.
- Nonetheless, many welcome an independent, open attempt to replicate DeepSeek’s reasoning methods.
Security and backdoor concerns in local LLMs
- Worry that people are now “running anything,” reminiscent of the early Windows/Internet era.
- While runtimes like Ollama/llama.cpp are likened to relatively safe interpreters, commenters note that models used as agents—with tool and code-execution access—could, in theory, be trained to trigger hidden behaviors (date‑based or keyword‑based attacks).
- No concrete backdoor examples are given; risk is discussed as a plausible future vector, especially for widely adopted “open” models.
Crowdsourcing and how to help
- Some ask how to contribute data or effort; suggestions include crowdsourcing domain-specific data (e.g., local-language stories, speech), BOINC‑style distributed training, and building shared infrastructure for open datasets.
- One commenter half-jokingly says “we don’t need human help anymore, we have DeepSeek,” reflecting both excitement and anxiety.