Anthropic surpasses OpenAI to become most valuable AI startup
Model capabilities for coding
- Many compare Anthropic’s Opus/Claude Code vs OpenAI’s GPT‑5.5/Codex.
- Some find GPT‑5.5 clearly better on large, complex codebases and refactors, with fewer errors and strong apply_patch tooling.
- Others find Opus better at system architecture, product design, and idiomatic style, especially in languages like Rust.
- Several report both still hallucinate, invent APIs, or overreach on claims; no model is reliably factual on complex troubleshooting.
- DeepSeek V4 is often described as slightly weaker than recent Opus/GPT‑5.x but fast and dramatically cheaper, and good enough for many.
Harness / UX and workflow
- A major theme: the harness (Claude Code, Codex, Cursor, etc.) matters as much as the base model.
- Claude Code praised for autonomy, flow, planning, and handling underspecified prompts; feels like “autopilot,” especially for non-experts.
- Downsides: can ignore instructions, fight user choices, or waste tokens with over-elaborate steps and artifacts.
- Codex is seen as more literal and “driver required”: better if you know exactly what you want and can steer aggressively.
- Many say final code from top models is often indistinguishable; real differences are in steering effort, iteration speed, tone, and integration with tools/IDEs.
Marketing, vibes, and tribalism
- Long argument over whether Claude’s popularity reflects real superiority or is a “modern Tupperware party.”
- Some ran blind tests (same PRs implemented with different models) and found colleagues couldn’t tell which model wrote what, suggesting hype and confirmation bias.
- Others insist they can tell “night and day” differences in practice and that user experience, not just output, justifies preferences.
- Analogies invoked: Coke vs Pepsi, luxury cars, GPU arms race, Vim vs Emacs; many acknowledge developers are susceptible to branding, early-mover advantage, and social proof.
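
The blind tests mentioned above come down to one question: did reviewers identify the authoring model more often than random guessing would? A minimal sketch of that check follows, assuming a two-model comparison where each guess is right by chance half the time. The function name and the tallies are hypothetical and are not taken from the thread.

```python
from math import comb


def p_value_at_least(correct: int, trials: int, chance: float = 0.5) -> float:
    """Exact one-sided binomial tail.

    Returns the probability of seeing `correct` or more right guesses
    out of `trials` if every reviewer were guessing at random with
    success probability `chance`.
    """
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


# Hypothetical tallies for illustration only (not data from the thread).
print(p_value_at_least(11, 20))  # barely above half correct
print(p_value_at_least(18, 20))  # almost every guess correct
```

With 11 of 20 correct the tail probability is about 0.41, which is consistent with guessing; with 18 of 20 it is about 0.0002, which would support a "night and day" claim. A small sample like this says nothing about steering effort or iteration speed, which the thread treats as separate from output quality.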
Ethics, safety, and leadership
- Strong antipathy toward OpenAI leadership (e.g., perceived political donations, board conduct, DoD cooperation, abandoned safety rhetoric).
- Anthropic is praised by some for pushing back on unrestricted military use, hiring ethicists, pledging wealth donations, and refusing ad-tech business models.
- Others argue Anthropic is no better, or even worse: accused of fear-based safety marketing, working with defense contractors, hostile policies toward some users, and being least generous to open source.
- DeepSeek raises a separate dilemma, with commenters weighing unease about funding Chinese entities against their dislike of US tech CEOs; some prefer local or third-party-hosted open models.
Pricing, enterprise adoption, and competition
- Complaints that Claude subscriptions have low limits and that Anthropic’s token pricing and frequent changes feel like bait-and-switch or “hostage-taking.”
- Some enterprises choose Anthropic due to negotiated data-retention guarantees and earlier focus on business tools (Claude Code, Cowork, etc.).
- Others note growing developer shift toward Codex for coding, using Claude primarily for UX or design, and DeepSeek for cost savings.
- Many expect frontier model quality to converge; harness design, contracts, and ecosystem may matter more than small capability gaps.
Valuation and long-term outlook
- Many view both Anthropic and OpenAI valuations as bubble-like, possibly 10× too high, reminiscent of dot‑com and crypto.
- Some say they’re building long‑term AI infrastructure (compute, tooling) analogous to cloud providers and will persist even if open models catch up.
- Others argue that if open‑weight models near SOTA, closed labs lose their moat, and high valuations may not be sustainable.
- General uncertainty over who “wins”: proprietary labs, open models, or downstream application startups.