Show HN: Semble – Code search for agents that uses 98% fewer tokens than grep

Benchmarks & Evaluation

  • Current benchmarks measure retrieval quality (e.g., NDCG), not end‑to‑end agent performance.
  • Some commenters argue this is “the wrong thing” to optimize, since what matters is whether agents finish tasks faster/cheaper with equal or better quality.
  • Others share small, informal agent evals: Semble sometimes saves context tokens, but can increase latency or produce only marginal cost improvements.
  • There are calls for open, reproducible agent benchmarks (including harness configuration) and full-session cost/quality metrics.
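
For readers unfamiliar with the retrieval metric mentioned above, here is a minimal sketch of NDCG@k using the common linear-gain formulation. The function names and the graded relevance labels are illustrative assumptions, not taken from Semble's benchmark code.

```python
import math

def dcg(relevances):
    # Discounted cumulative gain: each result's relevance is divided by
    # log2(rank + 1), with ranks starting at 1 (enumerate starts at 0).
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg_at_k(relevances, k):
    # `relevances` lists hypothetical graded labels in the order the
    # search tool returned the results (higher means more relevant).
    actual = dcg(relevances[:k])
    ideal = dcg(sorted(relevances, reverse=True)[:k])
    return actual / ideal if ideal > 0 else 0.0
```

A score of 1.0 means the results came back in the best possible order. The score says nothing about whether an agent using those results completes its task, which is the gap commenters point to.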

Token Savings vs Grep

  • The “98% fewer tokens” claim is clarified as a comparison between the common agent loop of grep followed by a file read (cat) and Semble’s smaller, targeted snippets.
  • Several commenters note that grep itself is token‑free; the cost comes from agents reading large file chunks or entire files.
  • Some argue well‑prompted agents already use grep -C N or selective reads, making the savings less extreme; others say agents often just cat whole files in practice.
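
The difference between reading a whole file and reading only the lines around a match can be sketched as follows. This is a rough illustration, not a measurement of Semble or of any agent: the four-characters-per-token estimate is an assumed heuristic, and the function names are hypothetical.

```python
def estimate_tokens(text):
    # Assumption: roughly 4 characters per token. Real tokenizers differ.
    return len(text) // 4

def whole_file_cost(lines):
    # Cost of the "cat the whole file" pattern.
    return estimate_tokens("\n".join(lines))

def context_cost(lines, pattern, context=2):
    # Cost of keeping only matching lines plus `context` lines on each
    # side, similar in spirit to `grep -C N`.
    keep = set()
    for i, line in enumerate(lines):
        if pattern in line:
            start = max(0, i - context)
            end = min(len(lines), i + context + 1)
            keep.update(range(start, end))
    return estimate_tokens("\n".join(lines[i] for i in sorted(keep)))
```

On a file with one match among a few hundred lines, `context_cost` is a small fraction of `whole_file_cost`. How large the real-world gap is depends on which pattern an agent actually follows, which is the point of disagreement in the thread.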

Agent Integration, Trust & Behavior

  • Many LLMs are heavily trained on grep/rg and may distrust or over-query new tools, negating theoretical savings.
  • People discuss using hooks, memory files (e.g., AGENTS.md/CLAUDE.md), and explicit instructions to push models toward Semble or LSPs.
  • Reports of MCP/CLI integration issues include hanging processes, connection errors, and agents redundantly combining Semble with ripgrep.
  • There is concern that extra tools can make agents “dumber” by encouraging aggressive, shallow searching and more turns.

Comparisons to Other Tools

  • Semble is compared, conceptually or anecdotally, with ripgrep, LSPs, RTK, Headroom, context‑mode, Serena, codebase‑memory‑mcp, CK, cs, Cursor indexing, and ck‑style structured search.
  • Some users report that Semble indexes dramatically faster and returns more relevant code than CK on large repos.
  • Others prefer LSP‑based navigation for refactors and type‑aware analysis, seeing Semble as complementary.

Performance, Design & Scope

  • Indexing is reported as very fast; chunking uses tree‑sitter; the models are trained on several languages but are claimed to generalize more widely.
  • Semble is implemented in Python, chosen for familiarity, despite comments wishing for Rust/Go.
  • The tool is local, deterministic, and aims to do “one thing: fast semantic code search.”

Broader Concerns & Alternatives

  • Commenters suggest measuring additional metrics such as correction‑loop frequency and end‑to‑end session tokens/time.
  • Some argue that structured project docs (e.g., a curated PROJECT.md) or whole‑repo dumps for small projects can rival or beat specialized search in practice.
  • Security concerns focus on supply‑chain risks; maintainers emphasize local‑only behavior and minimal dependencies, but acknowledge transitive risks remain.
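
The session-level metrics suggested in this section (end‑to‑end tokens, time, and correction‑loop frequency) could be aggregated along these lines. This is a sketch under stated assumptions: the `Turn` record and its `is_correction` flag are hypothetical, and how a harness would decide that a turn counts as a correction is left open.

```python
from dataclasses import dataclass

@dataclass
class Turn:
    tokens: int          # tokens consumed by this agent turn
    seconds: float       # wall-clock time for this turn
    is_correction: bool  # hypothetical flag: the turn retried or fixed an earlier step

def summarize_session(turns):
    # Aggregate per-turn records into full-session totals.
    total = len(turns)
    corrections = sum(1 for t in turns if t.is_correction)
    return {
        "total_tokens": sum(t.tokens for t in turns),
        "total_seconds": sum(t.seconds for t in turns),
        "correction_rate": corrections / total if total else 0.0,
    }
```

Comparing these summaries for the same task with and without a search tool enabled would give the full-session view that retrieval-only benchmarks leave out.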