Rio de Janeiro's "homegrown" LLM appears to be a merge of an existing model
Allegations About the Rio-3.5 Model
- Rio’s municipal IT arm released “Rio-3.5-Open-397B,” presented as a homegrown post-trained model based on Qwen3.5-397B with strong benchmark results, especially in Portuguese.
- A detailed GitHub issue argues the model is actually a simple merge: ~60% Nex-N2 Pro + ~40% base Qwen3.5-397B-A17B, with no evidence of further post-training or distillation.
- The merged model reportedly identifies itself by the source model’s name and reproduces blurbs that Nex was fine-tuned to emit, strengthening the “repackaged merge” claim.
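One way a claim like "~60% of one model plus ~40% of another" could be checked is to fit a single mixing coefficient over all weights by least squares. The sketch below is an illustration only and is not taken from the GitHub issue: the function name `estimate_mix` and the toy dictionaries are hypothetical, and flat Python lists stand in for real weight tensors.

```python
from typing import Dict, List, Tuple

# Hypothetical toy representation: tensor name -> flattened weights.
Weights = Dict[str, List[float]]


def estimate_mix(merged: Weights, finetune: Weights, base: Weights) -> Tuple[float, float]:
    """Fit merged ~= alpha * finetune + (1 - alpha) * base by least squares.

    Returns (alpha, relative_residual). A residual near zero means a plain
    linear merge explains the weights well under this simple model.
    """
    num = 0.0  # sum of (merged - base) * (finetune - base)
    den = 0.0  # sum of (finetune - base) ** 2
    for name in base:
        for m, f, b in zip(merged[name], finetune[name], base[name]):
            d = f - b
            num += (m - b) * d
            den += d * d
    if den == 0.0:
        raise ValueError("finetune and base are identical; alpha is undefined")
    alpha = num / den

    # Measure how much of (merged - base) the linear fit leaves unexplained.
    resid = 0.0
    total = 0.0
    for name in base:
        for m, f, b in zip(merged[name], finetune[name], base[name]):
            r = (m - b) - alpha * (f - b)
            resid += r * r
            total += (m - b) ** 2
    rel = (resid / total) ** 0.5 if total > 0.0 else 0.0
    return alpha, rel


# Toy data built as an exact 60/40 mix, purely for illustration.
base_w = {"w": [0.0, 0.0, 1.0]}
fine_w = {"w": [1.0, 2.0, 1.5]}
merged_w = {"w": [0.6 * f + 0.4 * b for f, b in zip(fine_w["w"], base_w["w"])]}
alpha_hat, rel_resid = estimate_mix(merged_w, fine_w, base_w)
```

On real checkpoints, further post-training or distillation after the merge would be expected to show up as a clearly non-zero residual, though what counts as "clearly" is a judgment call this sketch does not settle.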
Rio Team’s Response and Transparency Concerns
- The Hugging Face model card was later updated to admit it is a merge of Nex and Qwen plus “on-policy distillation,” and to apologize, claiming the wrong checkpoint (pre-distillation) was uploaded.
- Critics doubt this explanation, noting that the allegedly correct model has not (as of the discussion) been uploaded, and that affiliations on the HF page were edited/removed after the controversy.
- Some see this as misrepresentation of lab capabilities and possible misuse of public funds; others argue intent is unclear and want to wait for a new release and third-party verification.
Technical Discussion: Model Merging
- Commenters note that merging works when the models share an architecture; here, Nex is itself a Qwen3.5 finetune, so linear interpolation of weights is feasible.
- A simple scheme is described: each weight tensor in the merged model is a weighted average of corresponding tensors from the two source models.
- This technique has precedent (“Frankenstein models”) in both language and image models (e.g., Stable Diffusion), with mixed real-world benefits: often modest gains on specific benchmarks, degradation elsewhere.
- Discussion touches on “linear mode connectivity,” robustness of large models to such merges, and the broader practice of surgically editing weights (merging, “abliteration,” etc.).
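The weighted-average scheme described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Rio team's or any merge tool's actual code: `linear_merge` and the tensor names are hypothetical, and flat Python lists stand in for real weight tensors.

```python
from typing import Dict, List

# Hypothetical toy representation: tensor name -> flattened weights.
StateDict = Dict[str, List[float]]


def linear_merge(a: StateDict, b: StateDict, alpha: float) -> StateDict:
    """Return alpha * a + (1 - alpha) * b, computed tensor by tensor."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    # Merging only makes sense when both models have the same tensors.
    if a.keys() != b.keys():
        raise ValueError("models must have the same tensor names")
    merged: StateDict = {}
    for name, wa in a.items():
        wb = b[name]
        if len(wa) != len(wb):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [alpha * x + (1.0 - alpha) * y for x, y in zip(wa, wb)]
    return merged


# Toy example: 60% of a "finetune" and 40% of its "base".
finetune = {"layer0.weight": [1.0, 2.0], "layer0.bias": [0.5]}
base = {"layer0.weight": [0.0, 0.0], "layer0.bias": [0.0]}
merged = linear_merge(finetune, base, alpha=0.6)
```

The key and shape checks reflect the commenters' point that the source models must share an architecture for this kind of interpolation to be well defined.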
Political, Ethical, and Meta Reactions
- Some frame this as a “taxpayer-funded scam”; others note that the funding is ambiguous, though a public official has stated that public money was used.
- There is debate over the roles of the state and the private sector in AI: some favor national or local capability for the sake of sovereignty; others see it as state overreach in an area better left to industry.
- Several comments generalize this to a wider pattern of hype, over-claiming, and “marketing-first” behavior across the AI industry.
- A side thread criticizes the use of GitHub issues as public call-out venues and laments a perceived decline in HN discussion quality.