2024-06-20

Why we no longer use LangChain for building our AI agents

Use of “agents” vs simple prompt flows

Several commenters argue many “agent” setups are just overcomplicated ways of doing what 2–3 sequential prompts plus a simple control loop can do.
Definitions of “agent” vary: non-deterministic workflows with autonomous termination criteria; LLMs outputting JSON tool calls; or multiple “characters” passing messages.
Some see real value where the LLM chooses actions in a loop (e.g., Voyager-like control loops, iterative RAG), others say agent hype has vastly outrun proven utility.

Critiques of LangChain

Common themes: excessive abstraction, “spaghetti” design, hard-to-debug pipelines, and poor/dated documentation.
Many found it easy for toy demos but painful once they needed customization (e.g., logprobs, nonstandard APIs, prompt tweaks, function calling details).
Abstractions often hide exactly what must be visible for effective prompt engineering and observability.
Some see it as overengineered for tasks that are “just string concatenation + HTTP calls + a loop,” and worry about lock‑in and difficulty removing it later.
There’s strong sentiment it was an early, now-mismatched abstraction from the pre-ChatGPT, completion-model era.

Perceived benefits and partial defenses

A minority report good experiences when:
- Using LangChain primarily as a provider-agnostic layer to swap models/embeddings.
- Treating it as a library of components (models, vector stores, parsers) rather than a full framework.
- Using LangGraph to express flows as state machines; praised for low-level, controllable orchestration.
- Leveraging LangSmith-style observability for tracing complex LLM flows.
Some argue frameworks are naturally opinionated; they help newcomers and prototypes, and being early in a new domain implies missteps and refactoring.

Abstractions, frameworks, and “good vs bad” design

Many say LLMs are still too immature for heavy frameworks; patterns aren’t stable, so abstractions quickly become wrong.
“Good” abstractions are said to handle cross-cutting concerns (telemetry, state, cost control, provider swapping), while “bad” ones try to hide prompts, intermediate steps, and control flow.
Comparisons are made to ORMs, web frameworks, and GraphQL: abstractions can be valuable, but only once the right level is understood.

Alternatives and lighter-weight tools

Frequently mentioned: Microsoft Semantic Kernel, LlamaIndex, LiteLLM, Vercel AI SDK, simple “strategy pattern” wrappers, OpenAI-compatible gateways, Instructor, Burr/Hamilton, Langroid, txtai, and homegrown minimal code.
Many teams report ripping out LangChain and replacing it with thin, explicit wrappers, smaller dependencies, and clearer code.

Related topics