Gemini-2.5-pro-preview-06-05

Versioning and Naming Confusion

  • Multiple “preview” variants (03-25 → 05-06 → 06-05) confuse users, especially with ambiguous US-style dates; several wish for semantic versioning (2.5.1, 2.5.2) or a 2.6 bump.
  • Some report Google silently redirecting older model IDs (e.g., requests for 03-25 answered by 05-06), breaking expectations of API stability (see the sketch after this list).
  • Silent checkpoint updates (1.5’s 001→002, 2.5’s 03-25→05-06→06-05) are contrasted with OpenAI’s more explicit versioning and notifications.
  • People are unsure which version runs in the Gemini web app and complain that even Google’s own launch pages mix 05‑06 and 06‑05 benchmark charts.
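
For API users, the sore point is that even a fully dated checkpoint ID behaves like a moving target. Below is a minimal sketch of pinning one, assuming the google-genai Python SDK and an API key in the environment; per the redirect reports above, even requests pinned to 03-25 were reportedly answered by the newer snapshot:

    from google import genai

    # Assumes GEMINI_API_KEY is set in the environment.
    client = genai.Client()

    # Pin the fully dated preview checkpoint. The complaint in the thread is
    # that requests to "gemini-2.5-pro-preview-03-25" were silently re-pointed
    # at the 05-06 snapshot server-side, so even a dated ID was not stable.
    response = client.models.generate_content(
        model="gemini-2.5-pro-preview-06-05",
        contents="Summarize the trade-offs of optimistic vs pessimistic locking.",
    )
    print(response.text)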

Model Behavior, Regressions, and Suspected Nerfs

  • Multiple reports that Gemini 2.5 Pro was excellent at long-form reasoning and summaries but recently became “forgetful” after a few turns, ignoring conversational history from only a few messages back (see the sketch after this list).
  • Some attribute this to intentional nerfs and “dark patterns” in the consumer app: undocumented rate limits masked as generic errors, forced sign‑outs when outputs get long, and possibly reduced reasoning effort on multi-turn chats.
  • Others describe earlier Gemini versions abruptly changing behavior (e.g., always greeting like a new chat despite full history).
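
The forgetfulness reports concern the consumer app, where Google decides how much history is replayed to the model. Over the API the caller supplies every prior turn explicitly, which is one way to separate app-side context truncation from a genuine model regression. A minimal sketch, again assuming the google-genai SDK:

    from google import genai
    from google.genai import types

    client = genai.Client()

    # Seed the chat with explicit prior turns. Over the API, nothing is
    # "forgotten" unless the caller drops it, so a model that ignores this
    # history is misbehaving on its own, not being starved of context.
    chat = client.chats.create(
        model="gemini-2.5-pro-preview-06-05",
        history=[
            types.Content(role="user", parts=[types.Part(text="My service is written in Go.")]),
            types.Content(role="model", parts=[types.Part(text="Noted: a Go codebase.")]),
        ],
    )

    reply = chat.send_message("Which language did I say my service uses?")
    print(reply.text)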

Benchmarks vs Lived Experience

  • The new version shows strong gains on Aider’s coding leaderboard (a jump from ~76.9 to 82.2) and on LMArena Elo, plus improved scores on puzzles like NYT Connections (see the sketch after this list).
  • However, several users say Gemini still lags Claude 4 / Opus / o3 on complex coding or reasoning, sometimes looping, giving up, or wrongly blaming TypeScript limitations.
  • Others report the opposite: Gemini catching SQL rewrite bugs Claude missed, or outperforming Claude on certain languages (Go) and data/ETL tasks.
  • Many express skepticism that public leaderboards reflect real work; Goodhart’s law and cherry-picked benchmarks are explicitly invoked.
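
For context on what an Elo gain means in practice: the Elo model maps a rating gap to an expected head-to-head preference rate, and modest gaps translate to near-coin-flip outcomes, which is part of why leaderboard moves alone leave skeptics unconvinced. A quick illustration (the ratings below are made up, not LMArena’s):

    def elo_expected_win_rate(r_a: float, r_b: float) -> float:
        """Probability that model A is preferred over model B under the Elo model."""
        return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

    # A 30-point Elo lead is only a ~54% preference rate head-to-head.
    print(f"{elo_expected_win_rate(1470.0, 1440.0):.3f}")  # ~0.543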

Coding Style and Developer UX

  • Common complaints: Gemini is overly verbose, litters code with trivial comments, renames variables unasked, touches unrelated lines, and sometimes drops brackets.
  • Some feel its style resembles an “inexperienced” programmer requiring constant nudging for concision, async patterns, and structure.
  • Others praise it as fast, cheap, and generally correct, especially compared to older models or for non-agentic “assist” use.

Tooling, Rate Limits, and Access

  • Users access Gemini 2.5 via Cursor, IDE agents (Zed, Roo Code, Cline), AI Studio, and the consumer chat app; in some of these, the new model must be selected manually.
  • AI Studio exposes a “thinking budget” slider, but higher “deep think” settings appear gated behind paid “Ultra” plans (see the sketch after this list).
  • Confusion persists over where rate limits apply: reports of new 100‑message/day caps in the Gemini app, looser limits via AI Studio/API, and unclear communication from Google.
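
The slider corresponds to a per-request parameter in the API. A minimal sketch, assuming the google-genai SDK’s ThinkingConfig; the field names here are that SDK’s, not something confirmed in the thread:

    from google import genai
    from google.genai import types

    client = genai.Client()

    # The AI Studio "thinking budget" slider maps to a per-request cap on
    # reasoning tokens: higher budgets trade latency and cost for (ideally)
    # more careful answers. The gated "deep think" tiers sit above whatever
    # budget the API grants here.
    response = client.models.generate_content(
        model="gemini-2.5-pro-preview-06-05",
        contents="Prove that the sum of two even integers is even.",
        config=types.GenerateContentConfig(
            thinking_config=types.ThinkingConfig(thinking_budget=2048),
        ),
    )
    print(response.text)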

Competitive Context and Perception

  • Some see Gemini’s progress as a serious challenge to OpenAI and question OpenAI’s sky-high valuation given its hardware costs and the data moats Google and Facebook enjoy.
  • Others argue OpenAI still has huge mindshare (“chatgpt” as a verb) and strong revenue projections, while Gemini’s real-world usefulness feels overhyped or even “astroturfed.”
  • Overall sentiment: Gemini 2.5 Pro (06‑05) is a strong, improving model with attractive cost/performance, but opinions are sharply split on whether it is truly best-in-class for coding and complex reasoning.