Notes on Anthropic's Computer Use Ability
Overall Perception & Hype
- Many see “computer use” as a striking demo of what’s possible, akin to an early “ChatGPT moment.”
- Others dismiss it as overhyped robotic process automation (RPA) with a vision model, noting that examples like filling address bars or exporting CSVs are unimpressive compared with existing tools.
- Some doubt there is any real production adoption yet and see this as another step in an ongoing hype cycle.
Cost, Speed & Reliability
- Multiple reports describe it as expensive and slow: e.g., ~$5 just to find flights because of the many LLM calls involved, plus crashes when rate limits are hit.
- Users note frequent failures: incomplete tasks, incorrect success reports, and crashes.
- Consensus that it’s early-beta quality; useful for prototyping but not robust enough for critical workflows.
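The cost reports follow from simple arithmetic: an agent loop makes one LLM call per step, and each call typically resends a screenshot plus the growing history. A rough sketch of that calculation is below. Every number in it (step count, tokens per step, per-token prices) is a hypothetical value chosen for illustration, not a measured or published figure.

```python
def estimate_task_cost(steps, input_tokens_per_step, output_tokens_per_step,
                       usd_per_million_input, usd_per_million_output):
    """Rough cost of an agent loop in which each step is one LLM call."""
    input_cost = steps * input_tokens_per_step * usd_per_million_input / 1_000_000
    output_cost = steps * output_tokens_per_step * usd_per_million_output / 1_000_000
    return input_cost + output_cost


# Hypothetical inputs: 100 steps, 15k input tokens per step (screenshot + history),
# 300 output tokens per step, and assumed prices per million tokens.
cost = estimate_task_cost(
    steps=100,
    input_tokens_per_step=15_000,
    output_tokens_per_step=300,
    usd_per_million_input=3.0,
    usd_per_million_output=15.0,
)
print(f"${cost:.2f}")  # prints "$4.95"
```

Under these assumptions the input side dominates, which is why long tasks with many screenshots get expensive quickly.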
Use Cases & Business Value
- Suggested use cases: automating legacy UIs without APIs, internal office workflows, basic RPA, UI QA, and personal “agent” tasks like travel research.
- Skeptics argue most serious automation is better done via proper APIs and structured interfaces, with “computer use” remaining brittle and one-off.
- Others counter that many industries have entrenched GUI-only systems where high-level UI automation is the only practical option.
Technical Approach & Alternatives
- Debate over using pure vision + mouse/keyboard events vs leveraging accessibility APIs, DOM, or SSH/shell access.
- Some practitioners report that pixel-level control is currently inaccurate, costly, and needs heavy “feature engineering” and strict prompting to work.
- Vision-based control is seen by some as the most general long-term path; others view it as unnecessarily hard versus structured accessibility layers.
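The trade-off in this debate can be shown with a toy simulation. The sketch below is not any real automation API: `Element`, `click_at`, and `click_named` are hypothetical names, and the "UI" is just a list of labeled rectangles. It shows why a coordinate-based click is sensitive to layout changes while a label-based lookup, of the kind an accessibility tree or DOM allows, is not.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Element:
    name: str   # accessibility-style label, e.g. "Submit"
    left: int
    top: int
    width: int
    height: int

    def contains(self, x: int, y: int) -> bool:
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)


def click_at(ui: list[Element], x: int, y: int) -> Optional[str]:
    """Pixel-level action: returns the name of whatever is under (x, y)."""
    for el in ui:
        if el.contains(x, y):
            return el.name
    return None


def click_named(ui: list[Element], name: str) -> Optional[str]:
    """Structured action: finds the target by label, ignoring position."""
    for el in ui:
        if el.name == name:
            return el.name
    return None


before = [Element("Submit", 100, 200, 80, 30), Element("Cancel", 200, 200, 80, 30)]
# The same UI after a banner pushes everything down by 50 pixels.
after = [Element(e.name, e.left, e.top + 50, e.width, e.height) for e in before]

pixel_before = click_at(before, 120, 210)    # hits "Submit"
pixel_after = click_at(after, 120, 210)      # misses: nothing is at that point now
named_after = click_named(after, "Submit")   # still finds "Submit"
```

The structured lookup only works when the application exposes labels at all, which is the point made by those who favor vision as the more general approach.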
Security & Risk
- Concerns about letting an AI control shells or desktops: commenters cite real incidents involving misconfigurations, open ports, and destructive commands.
- Some note that “computer use” already carries a superset of the risks of remote shell access.
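One common way to narrow the shell risk described above is to check each proposed command against an allowlist before anything runs. The sketch below is a minimal illustration under stated assumptions: `ALLOWED_COMMANDS` and `is_allowed` are hypothetical names, the allowlist contents are arbitrary, and the check is far from a complete sandbox (an allowlisted `cat` can still read secrets, for example).

```python
import shlex

# Hypothetical allowlist of read-only programs; a real policy would be stricter.
ALLOWED_COMMANDS = {"ls", "cat", "grep", "pwd"}

# Characters that enable chaining, redirection, or substitution in a shell.
SHELL_METACHARACTERS = ";|&><`$\n"


def is_allowed(command_line: str) -> bool:
    """Accept only a single allowlisted program with no shell metacharacters."""
    if any(ch in command_line for ch in SHELL_METACHARACTERS):
        return False
    try:
        tokens = shlex.split(command_line)
    except ValueError:
        # Unbalanced quotes or similar parse errors are rejected outright.
        return False
    return bool(tokens) and tokens[0] in ALLOWED_COMMANDS


for proposed in ["ls -la", "rm -rf /", "ls; rm -rf /", "cat notes.txt"]:
    print(proposed, "->", "run" if is_allowed(proposed) else "blocked")
```

A guard like this addresses only the shell channel; GUI control can reach the same destructive actions through menus and dialogs, which is the "superset" concern.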
Economic, Social & Ad-Ecosystem Impacts
- Discussion of labor displacement, rising inequality, and whether these tools augment workers or erode jobs.
- Speculation that agentic browsing threatens ad-based and dark-pattern-driven business models, but that ads and paid influence will likely move into the agent layer itself.