Potential issues in curl found using AI assisted tools

Context: curl, AI, and a positive case

  • Thread centers on a rare positive story: dozens of real curl bugs surfaced via “AI-assisted tools,” in contrast to earlier waves of bogus, AI‑generated security reports that maintainers described as a DDoS.
  • Commenters stress that the title should emphasize “AI‑assisted security scanners,” not claim “AI found bugs” outright.

Human vetting vs ‘AI slop’

  • Key distinction:
    • Bad pattern: people paste code into general‑purpose LLMs and forward hallucinated “vulnerabilities” without understanding them.
    • Good pattern: professionals run specialized tools, then manually confirm each issue before reporting it.
  • Several note the asymmetry: unvetted AI reports are cheap to send but very expensive to triage; projects now ban repeat “slop” reporters.

How the AI security tools work

  • Tools mentioned include AI‑centric SAST products (e.g., ZeroPath, Corgea, Almanax); some founders join the thread to say they do not wrap traditional analyzers but use LLMs as core engines for detection and triage.
  • Others are skeptical, reading the marketing as “AI post‑processing” on classic static analysis; they propose reproducing the results by running verbose open‑source scanners and letting a generic LLM triage the output (a rough sketch of that workflow follows this list).
  • Bug reports were initially private due to potential security impact; resulting fixes are visible in curl PRs tagged with SARIF data.
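As a rough illustration of that “do it yourself” suggestion, the sketch below parses a SARIF report (the format the curl PRs reference) produced by any open‑source scanner and hands each finding to a generic LLM for a first‑pass verdict. This is not how the commercial tools work; the `ask_llm` helper is a hypothetical placeholder for whatever chat client you use, and only the SARIF 2.1.0 field names are real.

```python
import json
from pathlib import Path


def ask_llm(prompt: str) -> str:
    """Hypothetical stand-in for a chat-completion client of your choice."""
    raise NotImplementedError("plug in your own LLM client here")


def load_sarif_findings(sarif_path: str):
    """Yield (rule, message, file, line) tuples from a SARIF 2.1.0 report."""
    report = json.loads(Path(sarif_path).read_text())
    for run in report.get("runs", []):
        for result in run.get("results", []):
            loc = (result.get("locations") or [{}])[0].get("physicalLocation", {})
            yield (
                result.get("ruleId", "unknown"),
                result.get("message", {}).get("text", ""),
                loc.get("artifactLocation", {}).get("uri", "?"),
                loc.get("region", {}).get("startLine", 0),
            )


def triage(sarif_path: str) -> None:
    """First-pass LLM triage; a human still vets every verdict before reporting."""
    for rule, message, uri, line in load_sarif_findings(sarif_path):
        prompt = (
            f"A static analyzer rule '{rule}' flagged {uri}:{line}: {message}\n"
            "Is this a plausible real bug or a likely false positive? "
            "Answer in one short paragraph, ending with VERDICT: REAL or VERDICT: FP."
        )
        verdict = ask_llm(prompt)
        print(f"{uri}:{line} [{rule}] -> {verdict.strip().splitlines()[-1]}")
```

The key property, matching the “good pattern” above, is that the model only prioritizes; nothing goes to the maintainers without a human confirming it.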

Experiences with AI as coding companion

  • Many find LLMs more useful as reviewers/debuggers than as code generators:
    • Spot suspicious patterns, missing warning flags, or logic errors.
    • Assist in complex debugging (e.g., proposing hypotheses, driving gdb, tracing assembly).
  • Techniques that help: tailored prompts, planning modes, tool calling, excluding tests/docs, or asking the model to design its own “best prompt” (see the review sketch after this list).
  • Some note specialized tools (Cursor BugBot, Gemini 2.5 Pro, project‑aware reviewers) work better than generic chat.
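A minimal sketch of the “reviewer, not generator” usage described above: feed the model only production sources (skipping test and documentation trees, as some commenters suggest) with a prompt that asks for findings rather than rewrites. The directory names, prompt wording, and `ask_llm` helper are illustrative assumptions, not anything the thread specifies.

```python
from pathlib import Path


def ask_llm(prompt: str) -> str:
    """Hypothetical stand-in for a chat-completion client."""
    raise NotImplementedError("plug in your own LLM client here")


# Trees to skip so the model reviews only production code.
EXCLUDE_DIRS = {"tests", "test", "docs", "examples"}

REVIEW_PROMPT = """You are reviewing C code as a second pair of eyes, not rewriting it.
List suspicious patterns (unchecked return values, off-by-one bounds, missing frees,
missing compiler-warning coverage) with file and line plus a one-line rationale.
If nothing looks wrong, say so explicitly instead of inventing findings.
"""


def source_files(root: str, suffixes=(".c", ".h")):
    """Yield production source files, skipping excluded trees."""
    for path in Path(root).rglob("*"):
        if path.suffix in suffixes and not (set(path.parts) & EXCLUDE_DIRS):
            yield path


def review(root: str) -> None:
    for path in source_files(root):
        source = path.read_text(errors="replace")
        print(f"== {path} ==")
        print(ask_llm(f"{REVIEW_PROMPT}\n--- {path} ---\n{source}"))
```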

Limits, hallucinations, and need for validation

  • Hallucinations remain a central problem, especially in low‑level memory safety: convincing but wrong vulnerability reports are costly to verify.
  • Several security researchers argue that interactive, environment‑aware, tool‑driven architectures (gdb, multi‑agent loops, PoC generation) are required to validate findings at scale.
  • One suggestion: use AI to propose checks, then turn them into deterministic scripts/linters baked into CI (sketched below).
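What that suggestion might look like in practice: once a model has pointed out a class of mistake, freeze it into a plain pattern-based linter that runs on every commit with no model in the loop. The specific patterns below (unbounded string copies) are an assumed example, not rules from the thread.

```python
import re
import sys
from pathlib import Path

# Deterministic checks distilled from earlier (model-assisted) review findings.
BANNED = {
    re.compile(r"\bstrcpy\s*\("): "use a bounded copy (snprintf/strlcpy) instead of strcpy",
    re.compile(r"\bgets\s*\("): "gets() is unsafe; use fgets()",
}


def lint(root: str) -> int:
    """Scan C sources under `root`; return the number of violations found."""
    failures = 0
    for path in Path(root).rglob("*.c"):
        for lineno, line in enumerate(path.read_text(errors="replace").splitlines(), 1):
            for pattern, advice in BANNED.items():
                if pattern.search(line):
                    print(f"{path}:{lineno}: {advice}")
                    failures += 1
    return failures


if __name__ == "__main__":
    # Non-zero exit fails the CI job when any violation is present.
    sys.exit(1 if lint(sys.argv[1] if len(sys.argv) > 1 else ".") else 0)
```

Because the check is deterministic, it is cheap to run on every commit and never hallucinates; the model's role ends once the rule has been written down.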

Broader concerns and philosophy

  • Worries about:
    • Abuse of powerful scanning tools for zero‑day hunting or supply‑chain attacks.
    • Proprietary pricing and limited reproducibility of the results.
  • Broader debate over AI and creativity: some feel AI steals the “fun” of implementation; others say it frees them to focus on design and higher‑level creativity.
  • A recurring theme: AI is a powerful “bicycle for the mind” for competent practitioners, but dangerous and misleading for those who don’t know how to evaluate its output.