Show HN: Sweep, Open-weights 1.5B model for next-edit autocomplete

Next-edit vs FIM and use cases

  • Commenters clarify that FIM = “fill-in-the-middle”: the model sees both prefix and suffix and fills the gap.
  • “Next-edit” is framed as a more editing-oriented autocomplete: focused on what changes next in the current file, not generic code generation.
  • Several users are keen to test it in editors (Sublime, VSCode, Neovim, Zed, Emacs) specifically for inline/tab-complete, not chat.
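The FIM format discussed above can be made concrete. A minimal sketch of building a fill-in-the-middle prompt, using the special tokens documented for Qwen2.5-Coder (the token names are from that model family's card; other FIM-trained models use different sentinels):

```python
# Sketch of a FIM (fill-in-the-middle) prompt: the model sees both the
# prefix and the suffix, and generates the "middle" that joins them.
# Token names follow Qwen2.5-Coder's documented format.
def build_fim_prompt(prefix: str, suffix: str) -> str:
    return f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"

prompt = build_fim_prompt(
    prefix="def add(a, b):\n    return ",
    suffix="\n\nprint(add(1, 2))\n",
)
# The model's completion is inserted at the cursor position,
# i.e. between prefix and suffix.
```

"Next-edit" differs in that the prompt also encodes what the user just changed, so the model predicts the next edit rather than just filling a gap.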

Training approach, RL, and syntax vs semantics

  • A question compares RL fine-tuning to constrained decoding (e.g., grammar-based decoding) for enforcing syntax.
  • Responses argue constrained decoding mainly enforces syntax, not semantics or compiler correctness, and doesn’t improve the base model.
  • RL can jointly reward syntax, parse correctness, and compilation success, and “pushes” the model to learn better habits.
  • Also noted: constrained decoding is limited to CFG-like grammars and often hurts quality because it forces the model off-policy, i.e. it emits tokens the model would not have chosen on its own.
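The off-policy point can be illustrated with a toy logit-masking sketch. The function names and the stubbed "grammar" are illustrative, not from any real library: a grammar returns the set of token ids that keep the output syntactically valid, and all other logits are masked before picking a token.

```python
import math

# Toy sketch of grammar-constrained decoding: mask every logit whose
# token id the grammar disallows, then pick greedily from what's left.
def mask_logits(logits: list[float], allowed: set[int]) -> list[float]:
    return [x if i in allowed else -math.inf for i, x in enumerate(logits)]

def greedy_pick(logits: list[float]) -> int:
    return max(range(len(logits)), key=lambda i: logits[i])

# The model "prefers" token 2 (logit 2.0), but the grammar allows only {0, 3}:
logits = [0.1, 0.5, 2.0, 1.2]
masked = mask_logits(logits, allowed={0, 3})
choice = greedy_pick(masked)  # forced off-policy: token 3, not the preferred 2
```

The output is guaranteed syntactically valid, but the model is steered onto continuations it assigned low probability, which is the quality concern raised in the thread; RL instead rewards the model for producing valid (and compiling) code on-policy.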

Model quality, base models, and sizes

  • Derived from Qwen2.5-Coder; Qwen3 reportedly underperforms on their benchmark because it lacks FIM/autocomplete pretraining.
  • Claims that Sweep 1.5B significantly beats Qwen2.5 Coder 1.5B on their benchmark; an internal 7B model is said to be much stronger but not released.
  • Some users impressed by 1.5B performance (even used for simple chat/blog text); others find quality “fine but not amazing” and wish for 10–20B variants.

Local deployment, hardware, and tooling

  • 1.5B is small enough for CPU-only and consumer hardware; people report good speeds on M-series Macs and via LM Studio, llama.cpp, and Ollama.
  • Debate over whether a Raspberry Pi is actually usable: per-token generation may look tolerable, but prompt prefill over a long context can dominate end-to-end latency.
  • Multiple config snippets shared for using it in Zed, Neovim, and VSCode; JetBrains plugin currently uses a hosted larger model, not local weights.
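For local use along the lines discussed above, a hedged sketch of serving a GGUF build with llama.cpp's `llama-server` and querying its completion endpoint; the model filename is a placeholder, and the FIM sentinels assume the Qwen2.5-Coder token format:

```shell
# Serve a local GGUF quant of the model (path is a placeholder).
llama-server -m sweep-next-edit-1.5b.gguf --port 8080 &

# Query the completion endpoint with a FIM-style prompt.
curl http://localhost:8080/completion -d '{
  "prompt": "<|fim_prefix|>def add(a, b):\n    return <|fim_suffix|>\n<|fim_middle|>",
  "n_predict": 32
}'
```

Editor integrations (Zed, Neovim, VSCode) can then point their inline-completion provider at this local endpoint instead of a hosted service.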

Autocomplete vs agentic tools and IDE ecosystem

  • Strong sentiment that high-quality autocomplete is a “must-have” and often more useful than heavy agents for developers writing new code.
  • Some criticize JetBrains’ AI offering as late and underwhelming, pushing them toward VSCode or other tools; others still value JetBrains but feel the IDEs are stagnating.
  • Several users are excited that open-weight, small, specialized models may reduce dependence on Copilot/Cursor-style paid services.

Openness, data, and future directions

  • One thread challenges calling this “open source” without training data; consensus that it’s “open-weights,” not fully open.
  • Interest in: how the next-edit training data was mined from repos; the genetic algorithm used to evolve prompt templates; possibilities for users fine-tuning on their own specific stacks.
  • Broader excitement about democratized training of small, task-specific models and concerns that big labs over-optimize benchmarks instead of usability.