Summary
- OpenAI is contemplating significant reductions in token prices, anticipating similar actions from Anthropic.
- The deliberations come as both firms prepare for their IPOs.
- Open-source inference providers are already offering services at a fraction of the cost of closed models, presenting corporate clients with alternatives before any potential price war starts.
OpenAI is exploring lower prices for developers and businesses in response to expected cuts from Anthropic, the Wall Street Journal reported. Discussions about the strategy are still evolving. Both companies filed confidentially for IPOs this month, and both have yet to achieve profitability.
"I think we'll have a lot of ways we can help people get more value for less spend," Sam Altman said at a recent event, according to the Wall Street Journal. The remark comes as OpenAI reported a -122% adjusted operating margin for the first quarter of 2026, meaning it lost $1.22 for every dollar of revenue.
The competitive pressure is evident. As Decrypt has reported, ChatGPT's share of global generative AI web traffic fell from 77.6% in May 2025 to 53.7% by April 2026. For the first time, more of the companies tracked by the Ramp AI Index pay for Anthropic's services than for OpenAI's. Anthropic's annualized run rate surged from $9 billion at the end of 2025 to $47 billion by May 2026, a 422% increase in just five months that is largely attributed to Claude Code. Q2 2026 marks Anthropic's first profitable quarter.
OpenAI has prioritized its own coding tool, Codex, but is still playing catch-up.
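The headline figures above can be checked with a few lines of arithmetic. The sketch below uses only numbers quoted in this article; the function names are illustrative, and it reads an operating margin as operating income divided by revenue.

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) / old * 100


def loss_per_dollar(margin_pct: float) -> float:
    """Dollars lost per dollar of revenue, given a negative operating margin in percent."""
    return -margin_pct / 100


# Anthropic's annualized run rate, in billions of dollars (figures from the article).
run_rate_growth = pct_change(9, 47)

# ChatGPT's share of generative AI web traffic, in percent (figures from the article).
share_drop_points = 77.6 - 53.7                # drop in percentage points
share_drop_relative = pct_change(77.6, 53.7)   # drop relative to the starting share

# OpenAI's reported adjusted operating margin for Q1 2026.
openai_loss = loss_per_dollar(-122)

print(f"Run-rate growth: {run_rate_growth:.0f}%")
print(f"Traffic share drop: {share_drop_points:.1f} points ({share_drop_relative:.1f}%)")
print(f"Loss per dollar of revenue: ${openai_loss:.2f}")
```

The traffic figure is worth reading both ways: the decline is 23.9 percentage points, which is a little under a third of the share ChatGPT held in May 2025.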
Both firms are racing to sign up as many clients as possible in what some consider the largest technology boom since the dot-com bubble. Companies of all kinds are eager to adopt AI. Uber's CTO said the company exhausted its entire 2026 AI budget by April, and some JP Morgan employees are spending more on AI than they earn in salary, according to the chief data officer of the bank's payments business.
Silicon Valley has a name for this: "tokenmaxxing," the practice of burning through AI tokens (the units of text that AI models process and bill for) as fast as possible, often without a clear return on investment. At AIPCon last week, Palantir CEO Alex Karp likened it to an addiction. JP Morgan analysts recently published a note titled "AI Bills Are Out of Control." The companies most exposed to the consequences of this trend are now contemplating a price war.
Tommy Shaughnessy of Delphi Ventures laid out the structural problem in a popular X post this week: the $20/month flat fee was always below the actual cost of heavy usage, a strategy intended to drive adoption rather than cover compute expenses. Once a real business needs AI at scale, it moves to the API, where it pays per token while consuming far more computing resources.
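A rough sketch of that flat-fee arithmetic follows. Only the $20 monthly fee comes from the article. The per-token price and the monthly usage are hypothetical placeholders chosen to make the math easy to follow, not any provider's actual rates, and the metered price is used here only as a stand-in for what the same usage would be billed at under the API.

```python
FLAT_FEE_USD = 20.0  # monthly subscription price cited in the article

# ASSUMPTION: hypothetical blended price per million tokens, for illustration only.
ASSUMED_PRICE_PER_MILLION_TOKENS_USD = 10.0


def metered_cost(tokens: int, price_per_million: float) -> float:
    """Monthly bill under per-token pricing."""
    return tokens / 1_000_000 * price_per_million


def break_even_tokens(flat_fee: float, price_per_million: float) -> float:
    """Token volume at which the metered bill equals the flat fee."""
    return flat_fee / price_per_million * 1_000_000


# ASSUMPTION: a hypothetical heavy user consuming 50 million tokens in a month.
heavy_usage_tokens = 50_000_000

break_even = break_even_tokens(FLAT_FEE_USD, ASSUMED_PRICE_PER_MILLION_TOKENS_USD)
heavy_bill = metered_cost(heavy_usage_tokens, ASSUMED_PRICE_PER_MILLION_TOKENS_USD)

print(f"Break-even volume: {break_even:,.0f} tokens per month")
print(f"Metered bill for heavy usage: ${heavy_bill:,.2f} versus a ${FLAT_FEE_USD:.0f} flat fee")
```

Under these placeholder numbers, any usage above the break-even volume is worth more at metered rates than the subscription brings in, which is the gap Shaughnessy describes.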
Not everyone agrees. Some argue that the Western AI oligopoly lets these firms charge ever-higher fees for processing requests, and they point to the much lower prices of Chinese models as evidence. If that is true, substantial price cuts could still be possible without jeopardizing the companies' finances.
Hot take: They're not subsidized; their margins are excessive. They are just absolutely exploiting API customers. Anyone who has used DeepSeek or hosted anything and calculated the hardware/power costs knows this https://t.co/XQ477Qw3Vv
— Roy (@usr_bin_roygbiv) June 11, 2026
Real enterprise applications are shifting to metered API pricing, and companies are burning through credits far faster than the flat fees implied. Meanwhile, open-source inference providers, companies that sell the computing power to run AI models, are expanding rapidly, driven by agentic tools. These platforms serve leading Chinese AI models such as DeepSeek, GLM, MiMo, Kimi, and Minimax, which compete with Claude Opus on coding tasks at approximately one-thirteenth the price of closed alternatives.
Shaughnessy noted, "Chinese labs open-source frontier-grade models." He added, "The model represents the largest expense for an inference provider, and they acquire it for free." As long as this trend continues, the floor for intelligence pricing keeps falling, which makes margin recovery at OpenAI or Anthropic a hard problem with no straightforward fix.
The picture would change only if China shifted to a closed-source approach, Shaughnessy pointed out, which would favor U.S. labs.
Currently, most of China's AI labs seem committed to the opposite strategy.
