He asked AI to count carbs 27000 times. It couldn't give the same answer twice
Task feasibility: carbs from photos
- Many argue the problem is fundamentally under‑specified: a photo doesn’t reveal hidden ingredients, portion density, added oils, or internal contents.
- Examples: identical-looking foods can differ massively in calories; even carbs can vary with bread type, fillers, sauces.
- Others counter that for carbs specifically, typical foods (e.g., a plain white-bread cheese sandwich) allow a rough, consistent human estimate using prior knowledge.
LLM behavior: randomness and limitations
- Commenters note LLMs are probabilistic next-token predictors; repeated queries yield varied answers, even at low temperature.
- Some highlight that even with temperature near 0 and an identical prompt, differences in hardware, backend changes, and model design can still cause nondeterminism.
- Models also struggle to quantify their own confidence; numeric “confidence scores” often don’t reflect actual uncertainty.
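The two sources of variation commenters describe can be illustrated with a toy sketch. It is not how any particular model or backend works: the candidate scores, the temperature values, and the function name `sample_with_temperature` are all assumptions made up for illustration. The first part shows that sampling at a low but nonzero temperature still returns more than one answer; the second shows that floating-point sums depend on the order of operations, which is one commonly cited reason results can differ across hardware or batching even with greedy decoding.

```python
import math
import random


def sample_with_temperature(logits, temperature, rng):
    """Pick an index from `logits` using temperature-scaled softmax sampling."""
    if temperature <= 0:
        # Greedy decoding: always pick the highest-scoring option.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


# Hypothetical scores for three candidate answers, e.g. "30 g", "35 g", "45 g".
logits = [2.0, 1.8, 0.5]
rng = random.Random(0)

# Distinct answers seen over 1000 repeated queries at each setting.
low_temp_answers = {sample_with_temperature(logits, 0.2, rng) for _ in range(1000)}
greedy_answers = {sample_with_temperature(logits, 0.0, rng) for _ in range(1000)}

# Floating-point addition is not associative: the same three numbers summed
# in a different order give slightly different results.
sum_left = (0.1 + 0.2) + 0.3
sum_right = 0.1 + (0.2 + 0.3)
order_matters = sum_left != sum_right
```

In this toy setup greedy decoding always returns the same index, while the low-temperature run returns at least two different ones. In a real model, tiny numerical differences like the one in `sum_left` versus `sum_right` can flip which candidate scores highest when two candidates are nearly tied.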
Medical and ethical concerns
- Strong agreement that using generic LLMs as autonomous insulin-dosing calculators is dangerous.
- AI carb-counting features in diabetes tools and commercial apps are seen as potentially harmful or fraudulent, especially when marketed as accurate.
- Several insist the correct response from an AI here should be “I can’t tell” or a clearly caveated rough range, not a precise-seeming guess.
Critiques of the study and article
- Some see the result as “water is wet”: obvious to anyone technical; they view the article as clickbaity or shallow.
- Others defend it as an important quantitative demonstration for non-technical diabetics and policymakers, especially since prompts were taken from real insulin-related software.
- A few say the more interesting baseline would be human estimates or existing commercial apps, not just raw frontier models.
Better approaches and practical use cases
- Suggested improvements: include text descriptions, approximate weights, labels, barcodes, or Bluetooth scales; use specialized vision models plus nutrition databases.
- Some report success using LLMs to log food when the user supplies exact ingredients and weights, with the AI mainly doing aggregation and lookups.
- Consensus: image-only carb estimation should be treated as rough guidance at best, not as a medical-grade input.
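The "exact ingredients and weights" workflow reduces to a lookup and a sum, which a minimal sketch makes concrete. The table `CARBS_PER_100G` and its values are illustrative placeholders, not vetted nutrition data, and `total_carbs` is a hypothetical helper; a real tool would read per-100 g figures from a nutrition database or product label.

```python
# Illustrative placeholder values (grams of carbohydrate per 100 g of food).
# These are assumptions for the example, not vetted nutrition data.
CARBS_PER_100G = {
    "white bread": 49.0,
    "cheddar cheese": 1.3,
    "butter": 0.1,
}


def total_carbs(ingredients):
    """Sum carbs for a list of (ingredient name, weight in grams) pairs."""
    total = 0.0
    for name, grams in ingredients:
        if name not in CARBS_PER_100G:
            # Fail loudly instead of guessing a value for an unknown food.
            raise KeyError(f"no nutrition data for {name!r}")
        total += CARBS_PER_100G[name] * grams / 100.0
    return total


# A weighed cheese sandwich: every input is measured, nothing is inferred.
sandwich = [("white bread", 60.0), ("cheddar cheese", 30.0), ("butter", 10.0)]
carbs = total_carbs(sandwich)
```

Given the same inputs this calculation always returns the same number, and it raises an error for an unknown ingredient rather than inventing a value, which is the property commenters find missing from image-only estimates.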
Public understanding and AI marketing
- Many blame aggressive “AI can do anything” marketing and sci‑fi imagery for users treating LLMs as oracles.
- Calls for better AI literacy in schools and clearer vendor messaging about limitations and appropriate use cases.