Potential issues in curl found using AI assisted tools
Context: curl, AI, and a positive case
- Thread centers on a rare positive story: dozens of real curl bugs surfaced via “AI-assisted tools,” in contrast to earlier waves of bogus, AI‑generated security reports that maintainers described as a DDoS.
- Commenters stress the title should emphasize “AI‑assisted security scanners,” not “AI found bugs” outright.
Human vetting vs ‘AI slop’
- Key distinction:
  - Bad pattern: people paste code into general-purpose LLMs and forward hallucinated “vulnerabilities” without understanding them.
  - Good pattern: professionals run specialized tools, then manually confirm each issue before reporting it.
- Several note the asymmetry: unvetted AI reports are cheap to send but very expensive to triage; projects now ban repeat “slop” reporters.
How the AI security tools work
- Tools mentioned include AI‑centric SAST products (e.g., ZeroPath, Corgea, Almanax); several founders join the thread to say their products do not merely wrap traditional analyzers but use LLMs as the core engine for detection and triage.
- Others are skeptical, reading the marketing as “AI post‑processing” layered on classic static analysis; they propose reproducing the results by running verbose open‑source scanners and letting a generic LLM triage the output (see the sketch after this list).
- Bug reports were initially private due to potential security impact; resulting fixes are visible in curl PRs tagged with SARIF data.
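A minimal sketch of that do-it-yourself reproduction, assuming cppcheck as the verbose open-source scanner and the OpenAI Python client as the generic LLM; the flags, model name, and prompt are illustrative placeholders, not a description of how the commercial tools work:

```python
#!/usr/bin/env python3
"""Sketch: run a verbose open-source scanner, then ask a generic LLM to
triage its findings. Tool choice, flags, and model name are placeholders."""
import subprocess

from openai import OpenAI  # assumes the official 'openai' package is installed


def run_scanner(src_dir: str) -> str:
    # cppcheck writes its findings to stderr; any SAST tool with text output works here.
    result = subprocess.run(
        ["cppcheck", "--enable=all", "--inline-suppr", src_dir],
        capture_output=True, text=True,
    )
    return result.stderr


def triage(findings: str) -> str:
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[
            {"role": "system", "content": (
                "You are a security triage assistant. For each finding, label it "
                "likely-real, likely-false-positive, or needs-human, "
                "with one sentence of reasoning."
            )},
            {"role": "user", "content": findings},
        ],
    )
    return resp.choices[0].message.content


if __name__ == "__main__":
    print(triage(run_scanner("./src")))
```

The point is the division of labor: a deterministic scanner generates candidates, the LLM only ranks and explains them, and a human still confirms anything worth reporting.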
Experiences with AI as coding companion
- Many find LLMs more useful as reviewers/debuggers than as code generators:
  - Spot suspicious patterns, missing warning flags, or logic errors.
  - Assist in complex debugging (e.g., proposing hypotheses, driving gdb, tracing assembly).
- Techniques that help: tailored prompts, planning modes, tool calling, excluding tests/docs from context, or asking the model to design its own “best prompt” (a review-workflow sketch follows this list).
- Some note specialized tools (Cursor BugBot, Gemini 2.5 Pro, project‑aware reviewers) work better than generic chat.
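As a rough illustration of the reviewer-rather-than-generator workflow, the sketch below feeds a staged git diff, with tests and docs excluded as suggested above, to an LLM and asks only for concerns; the pathspecs, prompt, and model name are assumptions:

```python
#!/usr/bin/env python3
"""Sketch: use an LLM as a reviewer rather than a generator. Feeds a staged
git diff (tests/docs excluded) to a model and asks only for concerns."""
import subprocess

from openai import OpenAI

REVIEW_PROMPT = (
    "Review this diff as a skeptical C reviewer. Point out suspicious patterns, "
    "missing error handling, and possible memory-safety issues. "
    "List concrete concerns only; do not restate the diff."
)


def staged_diff() -> str:
    # ':(exclude)' pathspecs keep test and doc churn out of the review context.
    return subprocess.run(
        ["git", "diff", "--cached", "--",
         ".", ":(exclude)tests/*", ":(exclude)docs/*"],
        capture_output=True, text=True, check=True,
    ).stdout


def review(diff: str) -> str:
    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder; the thread mentions several alternatives
        messages=[
            {"role": "system", "content": REVIEW_PROMPT},
            {"role": "user", "content": diff},
        ],
    )
    return resp.choices[0].message.content


if __name__ == "__main__":
    diff = staged_diff()
    print(review(diff) if diff.strip() else "Nothing staged to review.")
```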
Limits, hallucinations, and need for validation
- Hallucinations remain a central problem, especially in low‑level memory safety: convincing but wrong vulnerability reports are costly to verify.
- Several security researchers argue that interactive, environment‑aware, tool‑driven architectures (gdb, multi‑agent loops, PoC generation) are required to validate findings at scale.
- One suggestion: use AI to propose checks, then turn those into deterministic scripts/linters baked into CI (see the sketch below).
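A small example of that pattern: a rule an AI reviewer might have proposed once, then frozen into a deterministic script CI can run on every commit with no model in the loop. The banned-call list and the src/ layout are hypothetical:

```python
#!/usr/bin/env python3
"""Sketch: a check an AI reviewer might propose once, turned into a
deterministic script for CI. The banned-call rule and src/ layout are made up."""
import pathlib
import re
import sys

# Hypothetical rule suggested by an AI review: flag unbounded C string functions.
BANNED = re.compile(r"\b(strcpy|strcat|sprintf|gets)\s*\(")


def main() -> int:
    failures = []
    for path in pathlib.Path("src").rglob("*.c"):
        for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
            if BANNED.search(line):
                failures.append(f"{path}:{lineno}: banned call: {line.strip()}")
    for failure in failures:
        print(failure)
    return 1 if failures else 0  # a nonzero exit fails the CI job


if __name__ == "__main__":
    sys.exit(main())
```

Because the script is deterministic, its results are reproducible and cheap to triage, unlike a fresh LLM pass on every build.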
Broader concerns and philosophy
- Worries about:
  - Abuse of powerful scanning tools for zero‑day hunting or supply‑chain attacks.
  - Proprietary pricing and limited reproducibility of the results.
- Broader debate over AI and creativity: some feel AI steals the “fun” of implementation; others say it frees them to focus on design and higher‑level creativity.
- A recurring theme: AI is a powerful “bicycle for the mind” for competent practitioners, but dangerous and misleading for those who don’t know how to evaluate its output.