OpenAI's o1-pro now available via API
Capabilities and Workflows
- Multiple commenters report that o1‑pro is exceptionally strong at whole‑codebase reasoning: feed it 100k+ tokens of source (sometimes plus third‑party libraries) and it finds subtle bugs, refactoring opportunities, and design issues that human reviewers missed.
- Typical “bug‑hunting” prompts are simple (“scan this codebase for bugs / improvements”) followed by a large pasted bundle of files or a git diff.
- Several tooling patterns have emerged: repo packers (e.g., Repomix), small CLIs, and editor extensions that concatenate files into a single markdown prompt with filenames and fenced code blocks, which some feel improves performance (a minimal packer sketch follows this list).
- Anecdotes include it designing nontrivial components (e.g., a .NET authorization filter, specialized audio/PCM plugins) in one interactive session, sometimes replacing days of research and iteration.
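A minimal sketch of the repo‑packing pattern described above, in the spirit of Repomix (the extension list and repo path are hypothetical; real tools add ignore rules, filtering, and token accounting):

```python
from pathlib import Path

EXTENSIONS = {".py", ".ts", ".cs", ".md"}  # hypothetical include list
FENCE = "`" * 3  # built programmatically so this sketch nests cleanly

def pack_repo(root: str) -> str:
    """Concatenate source files into one markdown prompt: a bug-hunting
    instruction up top, then each file under a filename header inside a
    fenced code block."""
    parts = ["Scan this codebase for bugs and improvements.\n"]
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in EXTENSIONS:
            lang = path.suffix.lstrip(".")
            body = path.read_text(errors="replace")
            parts.append(f"## {path.relative_to(root)}\n{FENCE}{lang}\n{body}\n{FENCE}\n")
    return "\n".join(parts)

if __name__ == "__main__":
    prompt = pack_repo("./my-project")  # hypothetical repo path
    print(f"~{len(prompt) // 4} tokens packed (crude 4-chars/token estimate)")
```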
Perceived Advantages vs Other Models
- Some users say all other models feel like a “waste of time” compared to o1‑pro, especially when pushing very large contexts and more abstract problem descriptions.
- Others report that benchmarks and their own experience show Claude 3.7 Sonnet or o3‑mini‑high as equal or better on many tasks, especially when problems are already broken down into well‑scoped steps.
- A recurring theme: o1‑pro’s advantage is at “one level up” of abstraction—doing more steps and inferring implied sub-tasks without hand‑holding, rather than raw IQ on small prompts.
Pricing, Value, and Human Comparison
- The pricing of $150 per million input tokens and $600 per million output tokens is widely viewed as extreme; many say they will restrict it to rare, high‑value calls or stick with the web UI subscription (a worked cost example follows this list).
- Debate over the economics: some argue that for focused, high‑impact work (e.g., generating complex, correct code in under an hour) it is still far cheaper than a skilled SWE; others show calculations in which even small projects accumulate significant bills while remaining far from replacing a human’s cost.
- One subthread compares a year of token spend to a $160k office worker, suggesting OpenAI is now within an order of magnitude on price, but with far lower autonomy and reliability.
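For concreteness, the arithmetic behind these comparisons at the quoted rates (the call sizes are illustrative):

```python
# o1-pro list pricing as quoted in the thread, in USD per token
INPUT_RATE = 150 / 1_000_000
OUTPUT_RATE = 600 / 1_000_000

def call_cost(input_tokens: int, output_tokens: int) -> float:
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# One "whole codebase" call: 100k tokens in, 10k tokens out
print(call_cost(100_000, 10_000))  # $15.00 + $6.00 = $21.00

# Calls per year before matching a $160k salary
print(160_000 / call_cost(100_000, 10_000))  # ~7619, or ~30 per working day
```

At that rate a handful of large calls per day is cheap next to a salary, while an always‑on assistant is not, which matches both sides of the debate above.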
Synthetic Data and Policy Concerns
- A suggested justification for the price is using o1‑pro to generate high‑quality synthetic data and evals for training or tuning cheaper models (sketched after this list).
- This runs into OpenAI’s terms, which forbid using their outputs to build competing models. Commenters note the irony given OpenAI’s own data sourcing, and question enforceability and whether high pricing is partly meant to deter “slow distillation.”
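A minimal sketch of the suggested distillation workflow, assuming the current OpenAI Python SDK (the seed prompts and output filename are illustrative, and the terms‑of‑service concern above applies to anything produced this way):

```python
import json
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Illustrative seed tasks; a real pipeline would mine or generate thousands.
seeds = [
    "Explain the borrow-checker error in this Rust snippet: ...",
    "Refactor this function to remove the N+1 query: ...",
]

# Use the expensive model to produce high-quality targets, then store
# prompt/response pairs as JSONL for tuning a cheaper model.
with open("synthetic.jsonl", "w") as f:
    for seed in seeds:
        resp = client.responses.create(model="o1-pro", input=seed)
        f.write(json.dumps({"prompt": seed, "response": resp.output_text}) + "\n")
```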
API / Technical and UX Issues
- o1‑pro is only available via the new Responses API, not Chat Completions, forcing code updates and a new request/response shape; some find the migration annoying and poorly documented (a minimal migration sketch follows this list).
- It does not support streaming and often feels very slow; some infer it may be running best‑of‑N sampling or hidden chains of thought internally, reasoning tokens users are billed for but never see.
- There are reports of failures and errors with very large (near‑limit) contexts, making “read the whole codebase+docs” workflows brittle.
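A minimal sketch of the migration itself, assuming the current OpenAI Python SDK (the prompt and timeout value are illustrative):

```python
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Old shape: Chat Completions (o1-pro is NOT served here)
# chat = client.chat.completions.create(
#     model="gpt-4o",
#     messages=[{"role": "user", "content": "Find bugs in ..."}],
# )
# text = chat.choices[0].message.content

# New shape: Responses API, the only route to o1-pro. No streaming is
# available and large requests can take minutes, so set a generous
# per-request timeout; hidden reasoning tokens are billed but not shown.
resp = client.responses.create(
    model="o1-pro",
    input="Scan this codebase for bugs and improvements:\n...",
    timeout=600,  # seconds; near-limit contexts also fail outright at times
)
print(resp.output_text)
```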
Limitations and Mixed Impressions
- Despite the large context window, some users find it still struggles to restructure long texts (e.g., detailed transcripts) without dropping facts, and it often over‑summarizes.
- Others say they see little difference between o1‑pro and cheaper models on “straightforward practical problems,” and keep o1‑pro as a slow, expensive “last resort” rather than a constant assistant.
- Knowledge cutoff (2023) and a 200k context window are described as underwhelming in 2025, though some joke that the older cutoff might make the model more optimistic.
Miscellaneous and Humor
- A playful subthread tracks the cost and quality of having o1‑pro generate an SVG of a pelican riding a bicycle, with jokes that it might be cheaper to buy a real pelican.
- Several digressions explore AGI/ASI economics, human vs model energy use, and pop‑culture references (e.g., The Matrix), mostly as light speculation rather than concrete conclusions.