Computer use in Gemini 3.5 Flash
Ecosystem, MCP, and Missing Integrations
- Many see the lack of MCP / custom tool support in Gemini’s official apps as a major gap.
- Users relying on MCP gravitate to third‑party CLIs or their own frontends; this reduces the value of Gemini’s native apps and shifts evaluation to pure model/API vs cheaper or better‑fitting alternatives.
- Some note Google’s fragmented products (Gemini app, CLI, Antigravity) and incompatible subscriptions as a serious usability and trust issue.
Computer Use: Promise vs Problems
- Critics call screenshot‑driven “computer use” slow, insecure, brittle, expensive, and a token‑wasting hack compared to proper APIs or accessibility layers.
- Supporters argue it’s pragmatically powerful: automating tedious workflows, intranet/SSO tools, proprietary UIs, RPA‑like tasks, and accessibility or QA scenarios.
- There’s debate over which approach is better:
  - Reverse‑engineering APIs / DOMs,
  - Leveraging accessibility trees, or
  - Letting agents drive full desktops/VMs in sandboxes.
- Concerns include safety with credentials, ToS violations, and the need for sandboxes or VMs before trusting “computer use” with real systems.
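To make the accessibility-tree option concrete, here is a minimal sketch of locating a control by role and name instead of by pixels. The tree below is a hypothetical, simplified structure invented for illustration; real accessibility APIs expose many more fields, and nothing here reflects how Gemini's computer use actually works.

```python
from typing import Iterator, Optional

# Hypothetical, simplified accessibility tree (an assumption for illustration).
TREE = {
    "role": "window", "name": "Expenses",
    "children": [
        {"role": "textbox", "name": "Amount", "children": []},
        {"role": "group", "name": "Actions", "children": [
            {"role": "button", "name": "Cancel", "children": []},
            {"role": "button", "name": "Submit", "children": []},
        ]},
    ],
}


def walk(node: dict) -> Iterator[dict]:
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.get("children", []):
        yield from walk(child)


def find(node: dict, role: str, name: str) -> Optional[dict]:
    """Return the first node matching both role and name, or None."""
    for candidate in walk(node):
        if candidate.get("role") == role and candidate.get("name") == name:
            return candidate
    return None


if __name__ == "__main__":
    print(find(TREE, "button", "Submit"))
```

A lookup like this works on structured text rather than screenshots, which is the core of the argument that it could be cheaper and less brittle; the counterargument in the discussion is that many proprietary UIs expose no usable tree at all.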
UX, Apps, and “Agentic” Interfaces
- Gemini’s official apps are widely described as weak: poor instruction following, session loss, small context windows, and inconsistent behavior vs API.
- Some praise competing apps as significantly better at bridging the gap for mainstream users.
- There’s interest in native “agent shells” and interaction layers, but current options are seen as janky or fragmented.
Model Quality, Benchmarks & Positioning
- Discussion notes Google’s own chart showing Gemini 3.5 Flash trailing frontier models on the OSWorld benchmark, though close on some scores and much cheaper.
- Some think 3.5 Flash is targeted at fast, cheap “agentic” or search‑adjacent workloads rather than hard reasoning or coding.
- Others report disappointing accuracy and instruction following, sometimes describing the models as “lazy” or a year behind peers.
Guardrails, Refusals, and Regional Variation
- Several users encounter seemingly over‑aggressive refusals on benign topics (SIM transfers, backups, even cooking eggs).
- Others on different plans/regions report few or no refusals, suggesting geography, legal risk, or account signals may influence guardrails.
- Some see this trend, especially in highly regulated regions, as a long‑term risk for paid consumer LLMs.
PDFs, OCR, and Data Extraction
- Experiences with Gemini on PDFs and tables are highly mixed: from flawless table‑to‑CSV extraction to repeated failures and the model explicitly “giving up.”
- Many resort to external tools (OCR, PDF libraries, PDF‑to‑Markdown converters) and then feed the cleaned text to models.
- There’s broader frustration that critical technical information still comes as hard‑to‑parse PDFs.
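The "extract first, then feed cleaned text" workaround can be sketched roughly as follows. This assumes an external OCR or PDF tool has already produced plain text in which table columns are separated by two or more spaces; that layout and the function name `text_table_to_csv` are assumptions for illustration, not something the discussion specifies.

```python
import csv
import io
import re


def text_table_to_csv(text: str) -> str:
    """Convert whitespace-aligned table text into CSV.

    Assumes columns are separated by runs of two or more whitespace
    characters, so single spaces inside a cell are preserved.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between rows
        writer.writerow(re.split(r"\s{2,}", line))
    return out.getvalue()


if __name__ == "__main__":
    sample = "Part  Voltage  Max current\nR1  3.3 V  10 mA\n"
    print(text_table_to_csv(sample))
```

Real PDF tables often break this assumption (merged cells, wrapped rows, inconsistent spacing), which is consistent with the mixed results reported above.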
Coding, Agents, and Safety
- Users want a clear Gemini equivalent to coding agents that can clone repos, perform static analysis, and open PRs; current offerings via Antigravity/CLI are seen as immature or unreliable.
- Some report dangerous actions when using agentic tools (e.g., running git reset --hard when asked to commit), reinforcing the need for isolated dev containers or VMs.
- Overall sentiment: Gemini 3.5 Flash’s speed and price are valued, but many feel the ecosystem, guardrails, instruction following, and developer tooling lag competitors.
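As one illustration of the git reset --hard concern in the coding section, a wrapper around an agent's shell tool could flag obviously destructive git commands before they run. This is a minimal sketch: the `DESTRUCTIVE` table and `is_destructive` helper are hypothetical, the list is far from complete, and a check like this complements rather than replaces the isolated containers or VMs users ask for.

```python
import shlex

# Hypothetical denylist: (program, subcommand) -> flags that make it destructive.
# Illustrative only; a real policy would need to cover many more cases.
DESTRUCTIVE = {
    ("git", "reset"): {"--hard"},
    ("git", "clean"): {"-f", "-fd", "-fdx", "--force"},
    ("git", "push"): {"--force", "-f"},
}


def is_destructive(command: str) -> bool:
    """Return True if the command matches an entry in the denylist."""
    tokens = shlex.split(command)
    if len(tokens) < 2:
        return False
    flags = DESTRUCTIVE.get((tokens[0], tokens[1]))
    if flags is None:
        return False
    return any(tok in flags for tok in tokens[2:])


if __name__ == "__main__":
    for cmd in ["git commit -m 'fix typo'", "git reset --hard HEAD~1"]:
        verdict = "BLOCK" if is_destructive(cmd) else "allow"
        print(f"{verdict}: {cmd}")
```

A string-matching guard is easy to bypass (aliases, shell pipelines, other tools), so it is best read as a last-line sanity check inside an already sandboxed environment.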