Show HN: Sweep, Open-weights 1.5B model for next-edit autocomplete

Next-edit vs FIM and use cases

  • Commenters clarify that FIM = “fill-in-the-middle”: the model sees both prefix and suffix and fills the gap.
  • “Next-edit” is framed as a more editing-oriented autocomplete: focused on what changes next in the current file, not generic code generation.
  • Several users are keen to test it in editors (Sublime, VSCode, Neovim, Zed, Emacs) specifically for inline/tab-complete, not chat.
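The FIM format discussed above can be made concrete. A minimal sketch of building a fill-in-the-middle prompt, using the special tokens documented for Qwen2.5-Coder (the token names are from that model family's card; other FIM-trained models use different sentinels):

```python
# Sketch of a FIM (fill-in-the-middle) prompt: the model sees both the
# prefix and the suffix, and generates the "middle" that joins them.
# Token names follow Qwen2.5-Coder's documented format.
def build_fim_prompt(prefix: str, suffix: str) -> str:
    return f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"

prompt = build_fim_prompt(
    prefix="def add(a, b):\n    return ",
    suffix="\n\nprint(add(1, 2))\n",
)
# The model's completion is inserted at the cursor position,
# i.e. between prefix and suffix.
```

"Next-edit" differs in that the prompt also encodes what the user just changed, so the model predicts the next edit rather than just filling a gap.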

Training approach, RL, and syntax vs semantics

  • A question compares RL fine-tuning to constrained decoding (e.g., grammar-based decoding) for enforcing syntax.
  • Responses argue constrained decoding mainly enforces syntax, not semantics or compiler correctness, and doesn’t improve the base model.
  • RL can jointly reward syntax, parse correctness, and compilation success, and “pushes” the model to learn better habits.
  • Also noted: constrained decoding is limited to CFG-like grammars and often hurts quality because it forces the model off-policy, i.e. it emits tokens the model would not have chosen on its own.
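The off-policy point can be illustrated with a toy logit-masking sketch. The function names and the stubbed "grammar" are illustrative, not from any real library: a grammar returns the set of token ids that keep the output syntactically valid, and all other logits are masked before picking a token.

```python
import math

# Toy sketch of grammar-constrained decoding: mask every logit whose
# token id the grammar disallows, then pick greedily from what's left.
def mask_logits(logits: list[float], allowed: set[int]) -> list[float]:
    return [x if i in allowed else -math.inf for i, x in enumerate(logits)]

def greedy_pick(logits: list[float]) -> int:
    return max(range(len(logits)), key=lambda i: logits[i])

# The model "prefers" token 2 (logit 2.0), but the grammar allows only {0, 3}:
logits = [0.1, 0.5, 2.0, 1.2]
masked = mask_logits(logits, allowed={0, 3})
choice = greedy_pick(masked)  # forced off-policy: token 3, not the preferred 2
```

The output is guaranteed syntactically valid, but the model is steered onto continuations it assigned low probability, which is the quality concern raised in the thread; RL instead rewards the model for producing valid (and compiling) code on-policy.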

Model quality, base models, and sizes

  • Derived from Qwen2.5-Coder; Qwen3 reportedly underperforms on their benchmark because it lacks FIM/autocomplete pretraining.
  • Claims that Sweep 1.5B significantly beats Qwen2.5 Coder 1.5B on their benchmark; an internal 7B model is said to be much stronger but not released.
  • Some users impressed by 1.5B performance (even used for simple chat/blog text); others find quality “fine but not amazing” and wish for 10–20B variants.

Local deployment, hardware, and tooling

  • 1.5B is small enough for CPU-only and consumer hardware; people report good speeds on M-series Macs and via LM Studio, llama.cpp, and Ollama.
  • Debate over whether a Raspberry Pi is actually usable: per-token generation may look tolerable, but prompt prefill over a long context can dominate end-to-end latency.
  • Multiple config snippets shared for using it in Zed, Neovim, and VSCode; JetBrains plugin currently uses a hosted larger model, not local weights.
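For local use along the lines discussed above, a hedged sketch of serving a GGUF build with llama.cpp's `llama-server` and querying its completion endpoint; the model filename is a placeholder, and the FIM sentinels assume the Qwen2.5-Coder token format:

```shell
# Serve a local GGUF quant of the model (path is a placeholder).
llama-server -m sweep-next-edit-1.5b.gguf --port 8080 &

# Query the completion endpoint with a FIM-style prompt.
curl http://localhost:8080/completion -d '{
  "prompt": "<|fim_prefix|>def add(a, b):\n    return <|fim_suffix|>\n<|fim_middle|>",
  "n_predict": 32
}'
```

Editor integrations (Zed, Neovim, VSCode) can then point their inline-completion provider at this local endpoint instead of a hosted service.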

Autocomplete vs agentic tools and IDE ecosystem

  • Strong sentiment that high-quality autocomplete is a “must-have” and often more useful than heavy agents for developers writing new code.
  • Some criticize JetBrains’ AI offering as late and underwhelming, pushing them toward VSCode or other tools; others still value JetBrains but feel the IDEs are stagnating.
  • Several users are excited that open-weight, small, specialized models may reduce dependence on Copilot/Cursor-style paid services.

Openness, data, and future directions

  • One thread challenges calling this “open source” without training data; consensus that it’s “open-weights,” not fully open.
  • Interest in: how the next-edit training data was mined from repos; the genetic algorithm used to evolve prompt templates; possibilities for users fine-tuning on their own specific stacks.
  • Broader excitement about democratized training of small, task-specific models and concerns that big labs over-optimize benchmarks instead of usability.