2026-05-30

Anthropic surpasses OpenAI to become most valuable AI startup

Model capabilities for coding

Many compare Anthropic’s Opus/Claude Code with OpenAI’s GPT‑5.5/Codex.
Some find GPT‑5.5 clearly better on large, complex codebases and refactors, with fewer errors and strong apply_patch tooling.
Others find Opus better at system architecture, product design, and idiomatic style, especially in languages like Rust.
Several report both still hallucinate, invent APIs, or overreach on claims; no model is reliably factual on complex troubleshooting.
DeepSeek V4 is often described as slightly weaker than recent Opus/GPT‑5.x models but fast, dramatically cheaper, and good enough for many.

Harness / UX and workflow

A major theme: the harness (Claude Code, Codex, Cursor, etc.) matters as much as the base model.
Claude Code is praised for autonomy, flow, planning, and handling underspecified prompts; it feels like “autopilot,” especially for non-experts.
Downsides: can ignore instructions, fight user choices, or waste tokens with over-elaborate steps and artifacts.
Codex is seen as more literal and “driver required”: better if you know exactly what you want and can steer aggressively.
Many say final code from top models is often indistinguishable; real differences are in steering effort, iteration speed, tone, and integration with tools/IDEs.

Marketing, vibes, and tribalism

Long argument over whether Claude’s popularity reflects real superiority or is a “modern Tupperware party.”
Some ran blind tests (same PRs implemented with different models) and found colleagues couldn’t tell which model wrote what, suggesting hype and confirmation bias.
Others insist they can tell “night and day” differences in practice and that user experience, not just output, justifies preferences.
Analogies invoked: Coke vs Pepsi, luxury cars, GPU arms race, Vim vs Emacs; many acknowledge developers are susceptible to branding, early-mover advantage, and social proof.

Ethics, safety, and leadership

Strong antipathy toward OpenAI leadership (e.g., perceived political donations, board conduct, DoD cooperation, abandoned safety rhetoric).
Anthropic is praised by some for pushing back on unrestricted military use, hiring ethicists, pledging wealth donations, and refusing ad-tech business models.
Others argue Anthropic is no better or worse: accused of fear-based safety marketing, working with defense contractors, hostile policies toward some users, and being least generous to open source.
DeepSeek raises a separate trade-off: concern about funding Chinese entities weighed against dislike of US tech CEOs; some prefer local or third-party-hosted open models.

Pricing, enterprise adoption, and competition

Complaints that Claude subscriptions have low limits and that Anthropic’s token pricing and frequent changes feel like bait-and-switch or “hostage-taking.”
Some enterprises choose Anthropic due to negotiated data-retention guarantees and earlier focus on business tools (Claude Code, Cowork, etc.).
Others note a growing developer shift toward Codex for coding, with Claude used primarily for UX or design and DeepSeek for cost savings.
Many expect frontier model quality to converge; harness design, contracts, and ecosystem may matter more than small capability gaps.

Valuation and long-term outlook

Many view both Anthropic and OpenAI valuations as bubble-like, possibly 10× too high, reminiscent of the dot‑com and crypto bubbles.
Some say the labs are building long‑term AI infrastructure (compute, tooling) analogous to cloud providers and will persist even if open models catch up.
Others argue that if open‑weight models approach SOTA, closed labs lose their moat, and high valuations may not be sustainable.
General uncertainty over who “wins”: proprietary labs, open models, or downstream application startups.
