A messy experiment that changed how I think about AI code analysis

Perceived Contribution of the Technique

  • Many find the core idea useful: pre-structure the codebase, add higher-level context, and then have the LLM reason about code like a more experienced reviewer.
  • Several note this mirrors how good human reviewers triage: understand architecture and impact first, then inspect details.
  • Some see it as an example of “domain-specific chain-of-thought” prompting applied to code analysis.
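Since the article withheld its actual grouping and context-extraction functions, here is a minimal, hypothetical sketch of the "pre-structure first" idea: group files by module and emit a high-level map for the LLM to read before any file contents. All names and the file list are illustrative assumptions, not the article's code.

```python
from collections import defaultdict

# Hypothetical file list standing in for a real repository scan.
files = [
    "src/auth/login.py",
    "src/auth/tokens.py",
    "src/billing/invoice.py",
    "tests/test_login.py",
]

def group_by_module(paths):
    """Group file paths by their top-level package directory."""
    groups = defaultdict(list)
    for path in paths:
        parts = path.split("/")
        key = "/".join(parts[:2]) if len(parts) > 2 else parts[0]
        groups[key].append(path)
    return dict(groups)

def build_context_header(groups):
    """Render a high-level map the LLM sees before any file contents."""
    lines = ["Project structure (review architecture before details):"]
    for module, members in sorted(groups.items()):
        lines.append(f"- {module}: {len(members)} file(s)")
    return "\n".join(lines)

print(build_context_header(group_by_module(files)))
```

The point of the sketch is ordering: the model receives an architectural summary before line-level code, mirroring how a senior reviewer triages.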

Prompting, Planning, and Agentic Workflows

  • Multiple commenters already ask models to “plan first, code later” and explicitly forbid code generation until an architecture or approach is agreed.
  • Existing tools (AI IDEs, coding agents, code search systems) already implement variants of:
    • Architecture discussion / project mapping
    • Context gathering via code search and call graphs
    • Multi-file editing, build/test/deploy loops
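The "plan first, code later" setup commenters describe can be as simple as a fixed instruction prefix that forbids code until an approach is approved. The exact wording below is an assumption for illustration, not a quoted rule from any specific tool:

```python
# Illustrative "plan first, code later" rules; wording is hypothetical.
PLAN_FIRST_RULES = """\
1. Do not write any code yet.
2. Restate the task and list the modules it affects.
3. Propose an approach and wait for explicit approval.
4. Only after approval, produce the diff."""

def make_planning_prompt(task: str) -> str:
    """Prefix a task with rules that forbid code until a plan is agreed."""
    return f"{PLAN_FIRST_RULES}\n\nTask: {task}"

print(make_planning_prompt("Add rate limiting to the login endpoint"))
```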

Context, Structure, and Transformer Behavior

  • Strong agreement that more and better-curated context dramatically improves LLM outputs.
  • Some push back on “context first” language, arguing transformers attend to the whole window at once; others counter that ordering and scaffolding still shape outputs in practice.

Skepticism: Missing Details, Evaluation, and Hype

  • Key functions in the article (file grouping, context extraction) are omitted, leading to accusations of “jazz hands” and hidden “secret sauce.”
  • Repeated calls for:
    • Actual source code, not just narratives
    • Benchmarks on diverse, realistic codebases and PRs
    • Metrics for correctness and significance, not just impressive anecdotes
  • Several see the tone as marketing-adjacent, typical of AI-hype content.

Junior vs Senior Analogy and Anthropomorphism

  • Heated debate over the claim that juniors read code linearly; some say this matches their early experience, others call it unrealistic and condescending.
  • The text was edited mid-thread, prompting questions about narrative reliability.
  • Many dislike anthropomorphizing AI as a “senior developer,” seeing it as misleading framing.

Real-World Use of Coding Assistants

  • Some report substantial productivity gains using tools like AI IDEs and agents for:
    • Boilerplate, stories, tests, localization, and documentation
    • Large-scale but low-conceptual work across many files
  • Others emphasize that such output still needs human review and often contains duplication or suboptimal patterns.

Limits: Hallucinations, APIs, and Verification

  • Concern that the showcased example may involve hallucinated details (e.g., fabricated PR references).
  • Common frustrations:
    • Invented APIs and mixed framework versions
    • Plausible-sounding but wrong suggestions
  • Suggestions include feeding concrete API docs, using RAG, and explicitly validating how often outputs are both correct and important.
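The "feed concrete API docs" suggestion amounts to lightweight retrieval: pull the most relevant documentation snippets and constrain the model to them. A minimal keyword-overlap sketch, with placeholder snippets and hypothetical function names:

```python
# Placeholder documentation snippets; a real system would index actual docs.
DOC_SNIPPETS = [
    "requests.get(url, timeout=None) sends an HTTP GET request.",
    "json.loads(s) parses a JSON string into Python objects.",
    "pathlib.Path.read_text() returns file contents as a string.",
]

def retrieve(query, snippets, k=2):
    """Rank snippets by shared lowercase tokens with the query."""
    q_tokens = set(query.lower().split())
    return sorted(
        snippets,
        key=lambda s: len(q_tokens & set(s.lower().split())),
        reverse=True,
    )[:k]

def grounded_prompt(query):
    """Constrain the model to retrieved, documented APIs."""
    context = "\n".join(retrieve(query, DOC_SNIPPETS))
    return f"Use only these documented APIs:\n{context}\n\nQuestion: {query}"
```

Production systems use embeddings rather than token overlap, but the structure is the same: retrieved ground truth goes into the prompt so the model has less room to invent APIs.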

Broader Reflections

  • Disagreement over whether tech debt is primarily a coding problem or a management problem.
  • Meta-discussion notes strong emotional reactions: some developers feel threatened or defensive; others accuse critics of Luddism.
  • Several see this work as one early step toward “engineering practical thinking patterns” for LLM-based tools.