Gemini 3.5 Flash
Pricing and Model Positioning
- Gemini 3.5 Flash standard pricing is reported as ~$1.50/M input and $9/M output tokens, about 3× the previous Flash tier and similar to older Pro models.
- Several commenters note confusion between batch/flex vs on‑demand pricing; early posts misquoted cheaper numbers.
- Many see this as Flash effectively becoming the “new Pro”: cheaper per token than 3.1 Pro, but no longer a “cheap fast model”.
- Some suspect the increase is not cost-based but an attempt to move upmarket and reduce overload from underpriced Flash models.
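As a rough illustration of the rates quoted above, the sketch below estimates the cost of a single on-demand request. The function name and token counts are hypothetical; the only figures taken from the thread are the ~$1.50/M input and $9/M output prices, which are themselves reported rather than confirmed. It also assumes that any "thinking" tokens are billed at the output rate.

```python
# Reported on-demand rates from the thread (USD per million tokens).
INPUT_USD_PER_M = 1.50
OUTPUT_USD_PER_M = 9.00


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request.

    Assumption: output_tokens includes any thinking tokens billed as output.
    """
    return (input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M) / 1_000_000


# Hypothetical request: 20k tokens of context in, 4k tokens out.
print(f"${request_cost_usd(20_000, 4_000):.4f}")
```

With these placeholder counts, the output side accounts for more than half of the cost despite being a fifth of the tokens, which is why the output rate dominates the discussion.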
Performance, Benchmarks, and Token Use
- Benchmarks look strong; several claim 3.5 Flash is near or at Sonnet‑class intelligence and beats 3.1 Pro on many tests.
- Others highlight cost-per-task: Artificial Analysis shows 3.5 Flash costing ~74% more than 3.1 Pro to run its benchmark suite, while scoring lower.
- Multiple hands-on tests show 3.5 Flash using many more “thinking” tokens, which can erase its speed and price advantages in real workloads.
- Some report 3.5 Flash solves coding/design tasks in far fewer tokens than older Gemini models; others find it more verbose.
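The cost-per-task argument comes down to simple arithmetic: a lower per-token price only helps if the model does not spend proportionally more tokens. A minimal sketch follows, in which the two models, their prices, and their token counts are all made-up placeholders (the thread gives no per-task token counts):

```python
def task_cost_usd(output_tokens: int, usd_per_m_output: float) -> float:
    """Cost of one task, counting only output (including thinking) tokens."""
    return output_tokens * usd_per_m_output / 1_000_000


def breakeven_token_ratio(cheap_usd_per_m: float, pricey_usd_per_m: float) -> float:
    """How many times more tokens the cheaper-per-token model can spend
    before it costs the same per task as the pricier one."""
    return pricey_usd_per_m / cheap_usd_per_m


# Hypothetical numbers: model A at $9/M output, model B at $12/M output.
ratio = breakeven_token_ratio(9.0, 12.0)
a = task_cost_usd(30_000, 9.0)   # A spends 30k tokens on the task
b = task_cost_usd(12_000, 12.0)  # B spends 12k tokens on the same task
print(f"break-even ratio: {ratio:.2f}, A: ${a:.3f}, B: ${b:.3f}")
```

In this made-up case A is cheaper per token but uses 2.5 times as many tokens, well past the break-even ratio, so it ends up costing more per task.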
Developer Experience, Tools, and Reliability
- Antigravity 2.0 (CLI and GUI) is praised as a strong agent harness, but:
  - Quotas on the Gemini “AI Pro” plan were sharply reduced (e.g., “12 Pro prompts per 5 hours”, very easy to hit).
  - People hit quota or 5xx errors after a handful of Antigravity sessions; some are canceling subscriptions.
  - Complaints that failed generations (e.g., image overload errors) still consume quota.
- Google’s AI Studio and API are widely described as flaky and inconsistent compared to competitors.
Coding and Agentic Use
- Opinions on coding are polarized:
  - Some say raw coding/reasoning is very strong for a “flash” model and competitive with higher tiers.
  - Others find 3.5 Flash clearly worse than frontier models in deep systems code, long-horizon refactors, and tool use.
- Recurring theme: Gemini models are “smart but stubborn”. They disregard AGENTS.md and other instructions, overbuild features, disable tooling, or ignore linters.
- Agentic performance (multi-step tool use, large projects) is often called Gemini’s weak spot; several say this regressed vs older models.
Hallucinations, Knowledge Cutoff, and Search
- Knowledge cutoff is January 2025 with “latest update May 2026”; some find the lag worrying, given how quickly web data is being polluted by LLM output.
- Many still see frequent hallucinations in legal, research, niche APIs, and gaming contexts, even with web search enabled.
- Others argue that web-grounded harnesses for top models have made hallucinations rare for everyday questions, but not for specialized domains.
Competition and Local Models
- DeepSeek V4 (especially Flash) and Qwen 3.6 are repeatedly cited as dramatically cheaper with “good enough” capability, especially for coding.
- Several note that open‑weight models now approach last year’s frontier and can be run locally on high-end consumer hardware, which makes cloud models less attractive as their prices rise.
- Some foresee a three‑tier future: free/cheap local models for most users, subscription “near frontier” models, and expensive frontier APIs for high‑value work.
Naming, Branding, and Strategy
- Many find Google’s naming confusing: Flash vs Flash‑Lite vs Pro, shifting roles from release to release.
- Some interpret 3.5 Flash being marked “stable” (not “preview”) plus the price hike as a long-term reset of the “cheap model” baseline, not a temporary spike.
- Several suspect Google is prioritizing monetization and search integration over being the absolute frontier lab.
Culture, UX, and “Vibe”
- A recurring complaint is Gemini’s personality: overly enthusiastic, flattering, and verbose, even when wrong; some users say this alone puts them off.
- The “pelican on a bike” SVG benchmark shows 3.5 Flash generating elaborate, stylized but structurally flawed graphics, illustrating a tendency to “do a lot” rather than fix core mistakes.
- Overall sentiment mixes respect for speed and raw capability with strong skepticism about pricing, reliability, and long-term trust.