Why we no longer use LangChain for building our AI agents

Use of “agents” vs simple prompt flows

  • Several commenters argue many “agent” setups are just overcomplicated ways of doing what 2–3 sequential prompts plus a simple control loop can do.
  • Definitions of “agent” vary: non-deterministic workflows with autonomous termination criteria; LLMs outputting JSON tool calls; or multiple “characters” passing messages.
  • Some see real value where the LLM chooses actions in a loop (e.g., Voyager-like control loops, iterative RAG); others say agent hype has vastly outrun proven utility.
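
To make the "LLM chooses actions in a loop" idea concrete, here is a minimal sketch of such a control loop in plain Python. Everything in it is an assumption for illustration: the tool names, the JSON shape `{"action": ..., "input": ...}`, and `scripted_llm`, which stands in for a real model call by replaying a fixed plan.

```python
import json
from typing import Callable

# Hypothetical tools; names and behavior are illustrative only.
TOOLS: dict[str, Callable[[str], str]] = {
    "upper": lambda text: text.upper(),
    "reverse": lambda text: text[::-1],
}


def run_agent(llm: Callable[[str], str], task: str, max_steps: int = 5) -> str:
    """Let the model pick one action per turn until it emits 'finish'."""
    transcript = f"Task: {task}\n"
    for _ in range(max_steps):
        reply = json.loads(llm(transcript))  # model is assumed to return a JSON action
        if reply["action"] == "finish":
            return reply["input"]
        result = TOOLS[reply["action"]](reply["input"])
        # Feed the tool result back so the next turn can see it.
        transcript += f"{reply['action']}({reply['input']!r}) -> {result!r}\n"
    raise RuntimeError("step limit reached without a 'finish' action")


def scripted_llm(transcript: str) -> str:
    """Stand-in for a real model call: replays a fixed plan, one step per turn."""
    step = transcript.count("\n") - 1  # one line for the task, one per completed step
    plan = [
        {"action": "upper", "input": "abc"},
        {"action": "reverse", "input": "ABC"},
        {"action": "finish", "input": "CBA"},
    ]
    return json.dumps(plan[step])
```

Calling `run_agent(scripted_llm, "shout abc backwards")` walks the three planned steps and returns the final input. The `max_steps` cap is the "simple control loop" part: termination does not depend solely on the model deciding to stop.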

Critiques of LangChain

  • Common themes: excessive abstraction, “spaghetti” design, hard-to-debug pipelines, and poor/dated documentation.
  • Many found it easy for toy demos but painful once they needed customization (e.g., logprobs, nonstandard APIs, prompt tweaks, function calling details).
  • Abstractions often hide exactly what must be visible for effective prompt engineering and observability.
  • Some see it as overengineered for tasks that are “just string concatenation + HTTP calls + a loop,” and worry about lock‑in and difficulty removing it later.
  • There’s strong sentiment that it was an early abstraction built for the pre-ChatGPT, completion-model era and is now mismatched with how chat models are used.

Perceived benefits and partial defenses

  • A minority report good experiences when:
    • Using LangChain primarily as a provider-agnostic layer to swap models/embeddings.
    • Treating it as a library of components (models, vector stores, parsers) rather than a full framework.
    • Using LangGraph to express flows as state machines; praised for low-level, controllable orchestration.
    • Leveraging LangSmith-style observability for tracing complex LLM flows.
  • Some argue frameworks are naturally opinionated; they help newcomers and prototypes, and being early in a new domain implies missteps and refactoring.
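
The "flows as state machines" point can be illustrated without any framework. The sketch below is plain Python and does not use LangGraph's API; the node names, the `"END"` sentinel, and the acceptance rule in `review` are all hypothetical.

```python
from typing import Callable

State = dict
Node = Callable[[State], str]  # a node updates the state and returns the next node's name


def draft(state: State) -> str:
    state["attempts"] = state.get("attempts", 0) + 1
    state["answer"] = f"draft {state['attempts']}"  # a real node would call a model here
    return "review"


def review(state: State) -> str:
    # Hypothetical acceptance rule: accept the second draft.
    return "END" if state["attempts"] >= 2 else "draft"


def run_graph(nodes: dict[str, Node], start: str, state: State, max_steps: int = 20) -> State:
    """Walk the graph from `start` until a node returns 'END'."""
    current = start
    trace = state.setdefault("trace", [])  # record visited nodes for debugging
    for _ in range(max_steps):
        if current == "END":
            return state
        trace.append(current)
        current = nodes[current](state)
    raise RuntimeError("graph did not reach END")
```

Running `run_graph({"draft": draft, "review": review}, "draft", {})` loops draft and review twice before stopping. Because every transition is an ordinary return value, the control flow stays visible and easy to log, which is the property commenters praised.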

Abstractions, frameworks, and “good vs bad” design

  • Many say the LLM ecosystem is still too immature for heavy frameworks; patterns aren’t stable, so abstractions quickly become wrong.
  • “Good” abstractions are said to handle cross-cutting concerns (telemetry, state, cost control, provider swapping), while “bad” ones try to hide prompts, intermediate steps, and control flow.
  • Comparisons are made to ORMs, web frameworks, and GraphQL: abstractions can be valuable, but only once the right level is understood.
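
As one reading of the "good" side of that distinction, the sketch below wraps a model call with telemetry while leaving the prompt fully visible to the caller. The names `CallLog`, `traced`, and `echo_model` are hypothetical, and `echo_model` stands in for a real provider call.

```python
import time
from dataclasses import dataclass, field
from functools import wraps
from typing import Callable


@dataclass
class CallLog:
    """Collects one record per model call."""
    records: list = field(default_factory=list)


def traced(log: CallLog) -> Callable:
    """Decorator factory: record prompt, completion, and latency for each call."""
    def decorator(call_model: Callable[[str], str]) -> Callable[[str], str]:
        @wraps(call_model)
        def wrapper(prompt: str) -> str:
            started = time.perf_counter()
            completion = call_model(prompt)
            log.records.append({
                "prompt": prompt,  # stored verbatim, nothing hidden or rewritten
                "completion": completion,
                "seconds": time.perf_counter() - started,
                "prompt_chars": len(prompt),
            })
            return completion
        return wrapper
    return decorator


log = CallLog()


@traced(log)
def echo_model(prompt: str) -> str:
    """Stand-in for a real provider call."""
    return f"echo: {prompt}"
```

The wrapper handles a cross-cutting concern (tracing) and nothing else: the caller still builds the prompt and decides what happens with the completion.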

Alternatives and lighter-weight tools

  • Frequently mentioned: Microsoft Semantic Kernel, LlamaIndex, LiteLLM, Vercel AI SDK, simple “strategy pattern” wrappers, OpenAI-compatible gateways, Instructor, Burr/Hamilton, Langroid, txtai, and homegrown minimal code.
  • Many teams report ripping out LangChain and replacing it with thin, explicit wrappers, smaller dependencies, and clearer code.
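
A thin "strategy pattern" wrapper of the kind mentioned above might look like the following sketch. `ChatModel`, `CannedModel`, `ShoutingModel`, and `summarize` are hypothetical names; a real implementation of `complete` would make an HTTP call to a provider.

```python
from typing import Protocol


class ChatModel(Protocol):
    """The only interface the application depends on."""
    def complete(self, prompt: str) -> str: ...


class CannedModel:
    """Stand-in provider that always returns a fixed reply (useful in tests)."""
    def __init__(self, reply: str) -> None:
        self.reply = reply

    def complete(self, prompt: str) -> str:
        return self.reply


class ShoutingModel:
    """Second stand-in provider, to show that swapping needs no other changes."""
    def complete(self, prompt: str) -> str:
        return prompt.upper()


def summarize(model: ChatModel, text: str) -> str:
    # The prompt is built by plain string concatenation, in plain sight.
    prompt = "Summarize in one sentence:\n" + text
    return model.complete(prompt)
```

Swapping providers means passing a different object to `summarize`; no other code changes, and the prompt stays in the application's own code instead of inside a framework.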