How I program with LLMs

When to Use LLMs & How to Trust Them

  • Strong theme: only use LLMs where you can verify or test the output.
  • One camp: “don’t use them for what you don’t know how to do”; others soften this to “don’t use them where you can’t validate.”
  • Many treat LLMs like a fast “intern”: good for drafts, but everything must be reviewed, tested, and often rewritten.
  • High‑risk domains (security, crypto, infra config, auth) are widely seen as inappropriate for blind LLM use.

Coding Workflows: Autocomplete, Search, Chat-Driven

  • Autocomplete: some claim 2–3x productivity, especially for boilerplate and repetitive patterns; others find it distracting or error‑prone and turn it off.
  • Search: LLM chat used as “smart Stack Overflow,” especially for error messages, obscure APIs, and navigating large/complex docs; many say web search has worsened.
  • Chat-driven programming works well for prototypes, glue code, and unfamiliar SDKs, but often degenerates into messy, redundant, or subtly buggy code that needs cleanup.

Tooling & IDE Integration

  • Tools like Cursor, Aider, Continue, Codeium, Copilot, and editor plugins (VS Code, JetBrains, Emacs) are heavily discussed.
  • Desiderata:
    • Tight integration with VCS (per-command commits, easy rollback).
    • Clear diffs and multi-file “agent mode” review workflows.
    • Ability to run tests/linters automatically and feed failures back to the model.
  • Some prefer using LLMs only in the browser/scratch files to keep interactions bounded and explicit.
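The "run tests and feed failures back" workflow several commenters ask for can be sketched as a small loop. This is a minimal illustration, not any particular tool's implementation: `ask_model` is a hypothetical stand-in for whatever LLM call you use, and both it and the test runner are injected as plain functions so the loop itself is testable without a model.

```python
from typing import Callable


def repair_loop(
    ask_model: Callable[[str, str], str],      # (code, failure_log) -> revised code
    run_tests: Callable[[str], tuple[bool, str]],  # code -> (passed, log)
    code: str,
    max_rounds: int = 3,
) -> tuple[str, bool]:
    """Run the tests; on failure, hand the failure log back to the
    model for another attempt. Stop when tests pass or rounds run out."""
    for _ in range(max_rounds):
        passed, log = run_tests(code)
        if passed:
            return code, True
        # The failure output is the key context the model needs.
        code = ask_model(code, log)
    passed, _ = run_tests(code)
    return code, passed
```

Keeping each round's result in version control (as the VCS-integration bullet suggests) makes every model attempt individually revertable.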

Security, Privacy & IP

  • Some companies have strict “no AI” policies over fears of code exfiltration, regulatory/contractual breaches, and licensing contamination.
  • Others note enterprises already trust many SaaS vendors with their source code, and that LLM vendors now offer no-training guarantees, enterprise tiers, and self‑hosted options.
  • There is concern about models regurgitating GPL or proprietary code and about competitors learning from leaked “secret sauce.”

Effects on Skills, Juniors & Learning

  • Worry: juniors may copy LLM code without real understanding, leading to fragile systems and overlooked security issues.
  • Counterpoint: LLMs are powerful tutors; they can accelerate learning of languages, libraries, and concepts when users actively interrogate and verify.
  • Several note that effective use correlates with strong communication skills and existing domain expertise.

Effectiveness, Limits & Domains

  • Works best for: glue code, wrappers, scripting, types, boilerplate, tests, CLI utilities, one‑off tools, and exploring new APIs.
  • Struggles with: large legacy codebases, complex refactors, novel algorithms, performance-sensitive or concurrent code, and big-context reasoning.
  • Context window limits and hallucinations remain major pain points; careful prompting, decomposition, and documentation for the LLM help but don’t eliminate issues.
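One common decomposition tactic for the context-window problem is to split a large file into chunks that each fit a rough token budget before sending them to the model. A minimal sketch follows; the 4-characters-per-token heuristic is an assumption for illustration, not a rule from any vendor's tokenizer.

```python
def chunk_source(text: str, max_tokens: int = 2000,
                 chars_per_token: int = 4) -> list[str]:
    """Split text at line boundaries into chunks whose size stays
    under an approximate token budget. Lossless: joining the chunks
    reproduces the original text."""
    budget = max_tokens * chars_per_token
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for line in text.splitlines(keepends=True):
        # Flush the current chunk before it would exceed the budget.
        if size + len(line) > budget and current:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append("".join(current))
    return chunks
```

Real tools split more carefully, at function or class boundaries, but the budget-and-flush shape is the same.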

Future Directions & Open Questions

  • Hoped-for advances: whole‑codebase refactoring, better handling of huge contexts, integrated testing and model checking, and domain‑specific models (e.g. per language or SDK).
  • Some foresee more DSLs and language experimentation; others expect adoption barriers for languages underrepresented in training data.
  • Overall sentiment: big productivity gains for certain workflows, but far from a universal or fully trustworthy replacement for experienced engineers.