Show HN: Semble – Code search for agents that uses 98% fewer tokens than grep
Benchmarks & Evaluation
- Current benchmarks measure retrieval quality (e.g., NDCG), not end‑to‑end agent performance.
- Some commenters argue this is “the wrong thing” to optimize, since what matters is whether agents finish tasks faster/cheaper with equal or better quality.
- Others share small, informal agent evals: Semble sometimes saves context tokens, but can increase latency or produce only marginal cost improvements.
- There are calls for open, reproducible agent benchmarks (including harness configuration) and full-session cost/quality metrics.
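For readers unfamiliar with the metric, NDCG scores a ranked result list against an ideal ordering of the same items. A minimal Python sketch follows. The function names, and the simplification of building the ideal ranking from the retrieved list only, are assumptions for illustration and not Semble's benchmark code.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain for graded relevances given in ranked order."""
    # Rank 0 is divided by log2(2) = 1, rank 1 by log2(3), and so on.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg(relevances, k=None):
    """NDCG@k: DCG of the actual ranking divided by DCG of the ideal ranking.

    Assumption: the ideal ranking is the same relevances sorted descending,
    i.e. relevant items that were never retrieved are ignored.
    """
    ranked = relevances[:k]
    ideal = sorted(relevances, reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0

if __name__ == "__main__":
    # Hypothetical graded relevance of the top results for one query.
    print(ndcg([1, 3, 0, 2], k=3))
```

A score near 1.0 means the relevant chunks were ranked near the top. It says nothing about whether an agent then completed its task, which is the gap commenters point to.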
Token Savings vs Grep
- The “98% fewer tokens” claim is clarified as comparing the common grep + read file (cat) loop versus Semble’s smaller targeted snippets.
- Several note that grep itself is token‑free; the cost comes from agents reading large file chunks or entire files.
- Some argue well‑prompted agents already use grep -C N or selective reads, making the savings less extreme; others say agents often just cat whole files in practice.
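The gap between reading a whole file and reading only the lines around a match can be estimated with a rough sketch like the one below. It assumes a crude four-characters-per-token heuristic rather than a real tokenizer, and the helper names are hypothetical; it imitates the effect of grep -C N in pure Python and is not how Semble or any agent harness measures tokens.

```python
import re

CHARS_PER_TOKEN = 4  # assumption: crude heuristic, not a real tokenizer

def estimate_tokens(text):
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN

def context_lines(lines, pattern, n):
    """Return the lines within n lines of any match, similar to grep -C n."""
    rx = re.compile(pattern)
    keep = set()
    for i, line in enumerate(lines):
        if rx.search(line):
            keep.update(range(max(0, i - n), min(len(lines), i + n + 1)))
    return [lines[i] for i in sorted(keep)]

def savings(file_text, pattern, n=2):
    """Return (whole_file_tokens, snippet_tokens, fraction_saved)."""
    lines = file_text.splitlines()
    whole = estimate_tokens(file_text)
    snippet = estimate_tokens("\n".join(context_lines(lines, pattern, n)))
    fraction = (1 - snippet / whole) if whole else 0.0
    return whole, snippet, fraction

if __name__ == "__main__":
    # Synthetic 100-line file with a single matching line.
    body = [f"filler line {i}" for i in range(100)]
    body[50] = "def target(): pass"
    print(savings("\n".join(body), r"def target"))
```

The fraction saved depends almost entirely on file size and match count, which is consistent with the thread's point that the headline number reflects what the baseline agent reads, not grep itself.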
Agent Integration, Trust & Behavior
- Many LLMs are heavily trained on grep/rg and may distrust or over-query new tools, negating theoretical savings.
- People discuss using hooks, memory files (e.g., AGENTS.md/CLAUDE.md), and explicit instructions to push models toward Semble or LSPs.
- Reports of MCP/CLI integration issues include hanging processes, connection errors, and agents redundantly combining Semble with ripgrep.
- There is concern that extra tools can make agents “dumber” by encouraging aggressive, shallow searching and more turns.
Comparisons to Other Tools
- Compared conceptually or anecdotally with: ripgrep, LSPs, RTK, Headroom, context‑mode, Serena, codebase‑memory‑mcp, CK, cs, Cursor indexing, and ck‑style structured search.
- Some users report Semble indexing dramatically faster and returning more relevant code than CK on large repos.
- Others prefer LSP‑based navigation for refactors and type‑aware analysis, seeing Semble as complementary.
Performance, Design & Scope
- Indexing is reported as very fast; chunking uses tree‑sitter; models are trained on several languages but claimed to generalize more widely.
- Implemented in Python for familiarity, despite comments wishing for Rust/Go.
- The tool is local, deterministic, and aims to do “one thing: fast semantic code search.”
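To illustrate what syntax-aware chunking means, here is a small sketch that splits Python source into one chunk per top-level function or class. It deliberately swaps in Python's standard ast module as a stand-in for tree‑sitter so that it runs without extra dependencies; the function name and chunk format are assumptions, and Semble's actual chunker may differ substantially.

```python
import ast

def chunk_by_definition(source):
    """Split Python source into (name, start_line, end_line, text) chunks,
    one per top-level function or class.

    Stand-in for tree-sitter chunking: uses the stdlib ast module, so it
    only handles Python and only top-level definitions.
    """
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # Start at the first decorator, if any, so it stays with its definition.
            start = min([node.lineno] + [d.lineno for d in node.decorator_list])
            text = "\n".join(lines[start - 1 : node.end_lineno])
            chunks.append((node.name, start, node.end_lineno, text))
    return chunks

if __name__ == "__main__":
    sample = "import os\n\ndef a():\n    return 1\n\nclass B:\n    def m(self):\n        return 2\n"
    for name, start, end, _ in chunk_by_definition(sample):
        print(name, start, end)
```

Chunking on definition boundaries keeps each indexed unit self-contained, which is the usual motivation for parsing instead of splitting on fixed line counts.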
Broader Concerns & Alternatives
- Commenters suggest measuring additional metrics such as correction‑loop frequency and end‑to‑end session tokens and time.
- Some argue that structured project docs (e.g., a curated PROJECT.md) or whole‑repo dumps for small projects can rival or beat specialized search in practice.
- Security concerns focus on supply‑chain risks; maintainers emphasize local‑only behavior and minimal dependencies, but acknowledge transitive risks remain.
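The end‑to‑end metrics raised in the thread (full-session tokens, wall-clock time, correction-loop frequency) could be recorded with a structure along these lines. This is a sketch under stated assumptions: the class names, fields, and the idea of flagging a turn as a "correction" are hypothetical and do not come from any existing harness.

```python
from dataclasses import dataclass, field

@dataclass
class Turn:
    tokens_in: int
    tokens_out: int
    seconds: float
    # Hypothetical flag: the turn was spent fixing an earlier mistake.
    is_correction: bool = False

@dataclass
class Session:
    turns: list = field(default_factory=list)
    task_succeeded: bool = False

    def total_tokens(self):
        """Tokens consumed and produced across the whole session."""
        return sum(t.tokens_in + t.tokens_out for t in self.turns)

    def total_seconds(self):
        """Summed per-turn latency for the session."""
        return sum(t.seconds for t in self.turns)

    def correction_rate(self):
        """Fraction of turns flagged as corrections (0.0 for an empty session)."""
        if not self.turns:
            return 0.0
        return sum(t.is_correction for t in self.turns) / len(self.turns)

if __name__ == "__main__":
    s = Session(task_succeeded=True)
    s.turns.append(Turn(tokens_in=1200, tokens_out=300, seconds=1.5))
    s.turns.append(Turn(tokens_in=800, tokens_out=200, seconds=2.5, is_correction=True))
    print(s.total_tokens(), s.total_seconds(), s.correction_rate())
```

Comparing these totals across runs with and without a given search tool, on the same tasks and harness configuration, is the kind of reproducible comparison the thread asks for.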