Potential issues in curl found using AI assisted tools
Context: curl, AI, and a positive case
- Thread centers on a rare positive story: dozens of real curl bugs surfaced via “AI-assisted tools,” in contrast to earlier waves of bogus, AI‑generated security reports that maintainers described as a DDoS.
- Commenters stress the title should emphasize “AI‑assisted security scanners,” not “AI found bugs” outright.
Human vetting vs ‘AI slop’
- Key distinction:
  - Bad pattern: people paste code into general-purpose LLMs and forward hallucinated “vulnerabilities” without understanding them.
  - Good pattern: professionals run specialized tools, then manually confirm each issue before reporting it.
- Several note the asymmetry: unvetted AI reports are cheap to send but very expensive to triage; projects now ban repeat “slop” reporters.
How the AI security tools work
- Tools mentioned include AI‑centric SAST products (e.g., ZeroPath, Corgea, Almanax); several founders join the thread to say their products do not merely wrap traditional analyzers but use LLMs as the core engine for detection and triage.
- Others are skeptical, reading the marketing as “AI post‑processing” layered on classic static analysis; they propose reproducing the results by running verbose open‑source scanners and letting a generic LLM triage the output (see the sketch after this list).
- Bug reports were initially private due to potential security impact; resulting fixes are visible in curl PRs tagged with SARIF data.
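A minimal sketch of that do-it-yourself reproduction, assuming cppcheck as the verbose open-source scanner and the OpenAI Python client as the generic LLM; the flags, model name, and prompt are illustrative placeholders, not a description of how the commercial tools work:

```python
#!/usr/bin/env python3
"""Sketch: run a verbose open-source scanner, then ask a generic LLM to
triage its findings. Tool choice, flags, and model name are placeholders."""
import subprocess

from openai import OpenAI  # assumes the official 'openai' package is installed


def run_scanner(src_dir: str) -> str:
    # cppcheck writes its findings to stderr; any SAST tool with text output works here.
    result = subprocess.run(
        ["cppcheck", "--enable=all", "--inline-suppr", src_dir],
        capture_output=True, text=True,
    )
    return result.stderr


def triage(findings: str) -> str:
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[
            {"role": "system", "content": (
                "You are a security triage assistant. For each finding, label it "
                "likely-real, likely-false-positive, or needs-human, "
                "with one sentence of reasoning."
            )},
            {"role": "user", "content": findings},
        ],
    )
    return resp.choices[0].message.content


if __name__ == "__main__":
    print(triage(run_scanner("./src")))
```

The point is the division of labor: a deterministic scanner generates candidates, the LLM only ranks and explains them, and a human still confirms anything worth reporting.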
Experiences with AI as coding companion
- Many find LLMs more useful as reviewers/debuggers than as code generators:
  - Spot suspicious patterns, missing warning flags, or logic errors.
  - Assist in complex debugging (e.g., proposing hypotheses, driving gdb, tracing assembly).
- Techniques that help: tailored prompts, planning modes, tool calling, excluding tests/docs from context, or asking the model to design its own “best prompt” (a review-workflow sketch follows this list).
- Some note specialized tools (Cursor BugBot, Gemini 2.5 Pro, project‑aware reviewers) work better than generic chat.
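As a rough illustration of the reviewer-rather-than-generator workflow, the sketch below feeds a staged git diff, with tests and docs excluded as suggested above, to an LLM and asks only for concerns; the pathspecs, prompt, and model name are assumptions:

```python
#!/usr/bin/env python3
"""Sketch: use an LLM as a reviewer rather than a generator. Feeds a staged
git diff (tests/docs excluded) to a model and asks only for concerns."""
import subprocess

from openai import OpenAI

REVIEW_PROMPT = (
    "Review this diff as a skeptical C reviewer. Point out suspicious patterns, "
    "missing error handling, and possible memory-safety issues. "
    "List concrete concerns only; do not restate the diff."
)


def staged_diff() -> str:
    # ':(exclude)' pathspecs keep test and doc churn out of the review context.
    return subprocess.run(
        ["git", "diff", "--cached", "--",
         ".", ":(exclude)tests/*", ":(exclude)docs/*"],
        capture_output=True, text=True, check=True,
    ).stdout


def review(diff: str) -> str:
    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder; the thread mentions several alternatives
        messages=[
            {"role": "system", "content": REVIEW_PROMPT},
            {"role": "user", "content": diff},
        ],
    )
    return resp.choices[0].message.content


if __name__ == "__main__":
    diff = staged_diff()
    print(review(diff) if diff.strip() else "Nothing staged to review.")
```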
Limits, hallucinations, and need for validation
- Hallucinations remain a central problem, especially in low‑level memory safety: convincing but wrong vulnerability reports are costly to verify.
- Several security researchers argue that interactive, environment‑aware, tool‑driven architectures (gdb, multi‑agent loops, PoC generation) are required to validate findings at scale.
- One suggestion: use AI to propose checks, then turn those into deterministic scripts/linters baked into CI (see the sketch below).
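A small example of that pattern: a rule an AI reviewer might have proposed once, then frozen into a deterministic script CI can run on every commit with no model in the loop. The banned-call list and the src/ layout are hypothetical:

```python
#!/usr/bin/env python3
"""Sketch: a check an AI reviewer might propose once, turned into a
deterministic script for CI. The banned-call rule and src/ layout are made up."""
import pathlib
import re
import sys

# Hypothetical rule suggested by an AI review: flag unbounded C string functions.
BANNED = re.compile(r"\b(strcpy|strcat|sprintf|gets)\s*\(")


def main() -> int:
    failures = []
    for path in pathlib.Path("src").rglob("*.c"):
        for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
            if BANNED.search(line):
                failures.append(f"{path}:{lineno}: banned call: {line.strip()}")
    for failure in failures:
        print(failure)
    return 1 if failures else 0  # a nonzero exit fails the CI job


if __name__ == "__main__":
    sys.exit(main())
```

Because the script is deterministic, its results are reproducible and cheap to triage, unlike a fresh LLM pass on every build.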
Broader concerns and philosophy
- Worries about:
  - Abuse of powerful scanning tools for zero‑day hunting or supply‑chain attacks.
  - Proprietary pricing and limited reproducibility of the results.
- Broader debate over AI and creativity: some feel AI steals the “fun” of implementation; others say it frees them to focus on design and higher‑level creativity.
- A recurring theme: AI is a powerful “bicycle for the mind” for competent practitioners, but dangerous and misleading for those who don’t know how to evaluate its output.