Gemini 3.5 Flash

Pricing and Model Positioning

  • Gemini 3.5 Flash standard pricing is reported as ~$1.50/M input and $9/M output tokens, about 3× the previous Flash tier and similar to older Pro models.
  • Several commenters note confusion between batch/flex and on‑demand pricing; early posts misquoted the cheaper numbers.
  • Many see this as Flash effectively becoming the “new Pro”: cheaper than 3.1 Pro per token but no longer a “cheap fast model”.
  • Some suspect the increase is not cost-based but an attempt to move upmarket and reduce overload from underpriced Flash models.
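
As a rough sketch of what the reported on-demand rates mean per request, the arithmetic looks like this. The rates are the approximate figures quoted in the thread; the function name and the token counts are illustrative assumptions, not measurements.

```python
# Reported on-demand rates from the thread (USD per million tokens); approximate.
INPUT_USD_PER_M = 1.50
OUTPUT_USD_PER_M = 9.00


def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_rate: float = INPUT_USD_PER_M,
                     output_rate: float = OUTPUT_USD_PER_M) -> float:
    """Return the cost in USD of one request at per-million-token rates."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Hypothetical request: 20,000 input tokens and 2,000 output tokens.
cost = request_cost_usd(20_000, 2_000)
print(f"${cost:.4f}")  # prints $0.0480
```

Passing batch/flex rates through `input_rate` and `output_rate` instead of the defaults gives the other set of numbers, which is where the early misquotes in the thread seem to have come from.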

Performance, Benchmarks, and Token Use

  • Benchmarks look strong; several claim 3.5 Flash is near or at Sonnet‑class intelligence and beats 3.1 Pro on many tests.
  • Others highlight cost-per-task: Artificial Analysis shows 3.5 Flash costing ~74% more than 3.1 Pro to run their suite, while scoring lower.
  • Multiple hands-on tests show 3.5 Flash using many more “thinking” tokens, which can erase its speed and price advantages in real workloads.
  • Some report 3.5 Flash solves coding/design tasks in far fewer tokens than older Gemini models; others find it more verbose.
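
The cost-per-task argument above can be sketched as follows. This assumes thinking tokens are billed at the output rate; the second model's rates and all token counts are hypothetical placeholders chosen for illustration, not Artificial Analysis data or real 3.1 Pro pricing.

```python
def task_cost_usd(input_tokens: int, visible_output_tokens: int,
                  thinking_tokens: int, input_rate: float,
                  output_rate: float) -> float:
    """Cost in USD of one task, with rates in USD per million tokens.

    Assumption: thinking tokens are billed at the output rate.
    """
    billed_output = visible_output_tokens + thinking_tokens
    return (input_tokens * input_rate + billed_output * output_rate) / 1_000_000


# Same hypothetical task for both models: 10,000 input, 1,000 visible output tokens.
# Model A uses the 3.5 Flash rates reported in the thread and thinks a lot.
flash_cost = task_cost_usd(10_000, 1_000, thinking_tokens=8_000,
                           input_rate=1.50, output_rate=9.00)

# Model B is a made-up pricier-per-token model that thinks less.
other_cost = task_cost_usd(10_000, 1_000, thinking_tokens=3_000,
                           input_rate=2.00, output_rate=12.00)

print(f"cheaper-per-token model: ${flash_cost:.3f} per task")
print(f"pricier-per-token model: ${other_cost:.3f} per task")
print(f"cost ratio: {flash_cost / other_cost:.2f}x")
```

With these made-up numbers the model that is cheaper per token ends up costing more per task, which is the pattern the cost-per-task commenters describe. Different token counts can reverse the result, which fits the mixed reports on verbosity.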

Developer Experience, Tools, and Reliability

  • Antigravity 2.0 (CLI and GUI) is praised as a strong agent harness, but:
    • Quotas on the Gemini “AI Pro” plan were sharply reduced (e.g., “12 Pro prompts per 5 hours”, very easy to hit).
    • Users hit quota limits or 5xx errors after a handful of Antigravity sessions; some are canceling subscriptions.
    • Some complain that failed generations (e.g., image overload errors) still consume quota.
  • Google’s AI Studio and API are widely described as flaky and inconsistent compared to competitors.

Coding and Agentic Use

  • Opinions on coding are polarized:
    • Some say raw coding/reasoning is very strong for a “flash” model and competitive with higher tiers.
    • Others find 3.5 Flash clearly worse than frontier models in deep systems code, long-horizon refactors, and tool use.
  • A recurring theme is that Gemini models are “smart but stubborn”: they disregard AGENTS.md and other instructions, overbuild features, disable tooling, or ignore linters.
  • Agentic performance (multi-step tool use, large projects) is often called Gemini’s weak spot; several say this regressed vs older models.

Hallucinations, Knowledge Cutoff, and Search

  • Knowledge cutoff is January 2025 with “latest update May 2026”; some find the lag worrying given how quickly web data is being polluted by LLM output.
  • Many still see frequent hallucinations in legal, research, niche APIs, and gaming contexts, even with web search enabled.
  • Others argue that web-grounded harnesses for top models have made hallucinations rare for everyday questions, but not for specialized domains.

Competition and Local Models

  • DeepSeek V4 (especially Flash) and Qwen 3.6 are repeatedly cited as dramatically cheaper with “good enough” capability, especially for coding.
  • Several note that open‑weight models now approach last year’s frontier and can be run locally on high-end consumer hardware, which makes rising cloud prices harder to justify.
  • Some foresee a three‑tier future: free/cheap local models for most users, subscription “near frontier” models, and expensive frontier APIs for high‑value work.

Naming, Branding, and Strategy

  • Many find Google’s naming confusing: Flash vs Flash‑Lite vs Pro, shifting roles from release to release.
  • Some interpret 3.5 Flash being marked “stable” (not “preview”) plus the price hike as a long-term reset of the “cheap model” baseline, not a temporary spike.
  • Several suspect Google is prioritizing monetization and search integration over being the absolute frontier lab.

Culture, UX, and “Vibe”

  • A recurring complaint is Gemini’s personality: overly enthusiastic, flattering, and verbose, even when wrong; some users say this alone puts them off.
  • The “pelican on a bike” SVG benchmark shows 3.5 Flash generating elaborate, stylized but structurally flawed graphics, illustrating a tendency to “do a lot” rather than fix core mistakes.
  • Overall sentiment mixes respect for speed and raw capability with strong skepticism about pricing, reliability, and long-term trust.