Tabby: Self-hosted AI coding assistant
Overview of Tabby and Capabilities
- Self-hosted AI coding assistant with code completion and “codebase chat,” positioned as an on‑prem / team platform (SSO, access control, auth).
- Marketed as one of the few fully self-service on-prem options, with adopters saying performance is competitive with hosted tools.
- Built-in RAG/doc integration so it can be taught unfamiliar API frameworks via documentation ingestion.
Hardware, Models, and Performance
- Supports Nvidia (CUDA), AMD (via Vulkan), and Apple Silicon; Macs are “OK for individual use” but not ideal for multi‑user servers.
- Rule of thumb: roughly 1 GB of RAM/VRAM per 1B parameters (which corresponds to ~8-bit weights; heavier quantization needs less). Longer context windows add KV-cache memory on top of the weights.
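The rule of thumb above can be turned into a rough back-of-the-envelope calculator. This is a sketch only: the layer count, hidden dimension, and per-element KV-cache size are illustrative defaults (real models vary, and techniques like grouped-query attention shrink the cache considerably).

```python
def estimate_memory_gb(params_billion: float,
                       bits_per_weight: int = 8,
                       context_len: int = 4096,
                       n_layers: int = 32,
                       hidden_dim: int = 4096,
                       kv_bytes_per_elem: int = 2) -> float:
    """Rough VRAM estimate: model weights plus KV cache.

    Weights: params * bits/8. KV cache: two tensors (K and V) per
    layer, each of shape (context_len, hidden_dim), at
    kv_bytes_per_elem bytes per element (2 for fp16).
    """
    weights_gb = params_billion * 1e9 * (bits_per_weight / 8) / 1e9
    kv_gb = 2 * n_layers * context_len * hidden_dim * kv_bytes_per_elem / 1e9
    return weights_gb + kv_gb

# A 7B model at 8-bit with a 4K context lands a bit over 9 GB;
# at 4-bit the weights halve to ~3.5 GB.
print(round(estimate_memory_gb(7), 2))
print(round(estimate_memory_gb(7, bits_per_weight=4), 2))
```

With these toy defaults the numbers match the "~1 GB per 1B parameters" heuristic at 8-bit, and show why quantization and shorter contexts are the two main levers on a memory-constrained box.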
- Tiny models (1–3B) are “dumb” for conversational coding but fine for tab completions; 7–70B open models can surpass GPT‑4o‑mini for coding if hardware permits.
- Single‑GPU only by default; multi‑GPU use suggested via external backends like vLLM and OpenAI-compatible endpoints.
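Wiring an external multi-GPU backend would look roughly like the fragment below. The table and key names are recalled from Tabby's HTTP model-backend configuration and should be checked against the current docs; the model name and port are placeholders.

```toml
# ~/.tabby/config.toml — sketch, field names unverified
# Backend: a vLLM OpenAI-compatible server spread across 2 GPUs, e.g.
#   vllm serve Qwen/Qwen2.5-Coder-7B-Instruct --tensor-parallel-size 2

[model.chat.http]
kind = "openai/chat"
model_name = "Qwen2.5-Coder-7B-Instruct"   # placeholder
api_endpoint = "http://localhost:8000/v1"  # vLLM's default OpenAI route
api_key = ""
```

The design point from the thread: Tabby itself stays single-GPU, and multi-GPU scaling is delegated to whatever OpenAI-compatible server you stand up behind it.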
Deployment, IDE Support, and Alternatives
- Designed primarily for shared servers but can run on powerful personal machines or in Docker on‑prem.
- Community notes that an Eclipse client exists but is not prominently documented; users request support for VS2022, Sublime Text, Zed, and MSVC.
- Comparisons with other local setups (Ollama + Continue.dev, Twinny) highlight trade‑offs in ease of use, hardware, and licensing.
Telemetry, Licensing, and Business Model
- Community Edition collects non‑toggleable telemetry from the IDE extensions; per the struct shared in the thread, it is limited to hardware and model metadata.
- Confusion over “open source but up to 5 users” pricing; others clarify that open source does not mean cost-free for all uses and point to the license.
Code Quality, Skill Development, and Determinism
- Many worry LLMs generate “junior-level” or inefficient code, and that blind acceptance may stall developer growth.
- Counterpoints:
  - LLMs can accelerate capable devs and serve as a new abstraction layer, similar to moving from assembly to high-level languages.
  - Poor code quality self-corrects through tests, debugging, and maintenance pressures.
- Long subthread on determinism: traditional compilers vs stochastic LLMs, temperature/seed control, and whether nondeterminism is acceptable for production code.
Critiques of Company Practices
- One commenter reports an unpaid, multi‑round, take‑home–heavy interview ending in ghosting, sparking broader criticism of such hiring processes as disrespectful and a red flag.