Summary

  • On Tuesday, Anthropic introduced Claude Sonnet 5, pricing it at $2 per million input tokens and $10 per million output tokens until August 31, after which it will increase to $3/$15, significantly lower than Opus 4.8's $5/$25.
  • According to Anthropic's own assessments, Sonnet 5 performs roughly on par with Opus 4.8 on the GDPval-AA v2 knowledge-work benchmark.
  • Sonnet 5 is released without specific restrictions; however, Fable 5 and Mythos 5 remain unavailable for general use due to a June 12 export control order.

Anthropic has debuted Claude Sonnet 5, which it claims is "the most capable Sonnet model to date." The model is available by default to Free and Pro users, as well as on Max, Team, and Enterprise plans, in Claude Code, and via the API. Unlike previous Sonnet versions, this one is designed to complement the earlier Opus model rather than trail it.

In its announcement, Anthropic noted that Sonnet 5's capabilities are "similar to those of Opus 4.8, but at reduced prices." Developers can adjust an effort dial, or select different effort levels in the web app, to balance cost against accuracy for a given task, which lets Sonnet 5 cover work that previously required Opus pricing.

On the SWE-bench Pro coding benchmark, which evaluates problems from actively maintained repositories, Sonnet 5 achieved a score of 63.2%, compared to Sonnet 4.6's 58.1%.

In the GDPval-AA v2 benchmark, which assesses real-world professional tasks across 44 job categories, Sonnet 5 scored 1,618, closely matching Opus 4.8's score of 1,616. The results from Humanity's Last Exam show minimal differences: 57.4% for Sonnet 5 versus 57.9% for Opus 4.8.

Sonnet 5 also ships with a new tokenizer, which changes how text is split into tokens before the model processes it. Anthropic noted in a footnote, “Sonnet 5 is an improvement over Sonnet 4.6, but it employs a new tokenizer that modifies text processing, enhancing performance.” The trade-off is that the same input can produce more tokens: roughly 1.0 to 1.35 times as many, depending on the content type.

The introductory pricing of $2/$10 is intended to make the transition cost-neutral until August 31, after which standard pricing will apply at $3/$15.
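As a rough check on the cost-neutral claim, the arithmetic can be sketched in Python. The prices and the 1.0 to 1.35 token multiplier come from the figures above; the function names are hypothetical, and the sketch assumes the baseline is the same text billed at the $3/$15 standard rate with the old tokenizer's token count, which the announcement does not spell out.

```python
# Per-million-token prices in USD, as quoted in the article.
INTRO_PRICE = {"input": 2.00, "output": 10.00}     # through August 31
STANDARD_PRICE = {"input": 3.00, "output": 15.00}  # afterwards


def effective_price(price_per_million: float, token_multiplier: float) -> float:
    """Cost of text that used to be one million tokens, once the new
    tokenizer inflates the token count by token_multiplier."""
    return price_per_million * token_multiplier


def breakeven_multiplier(intro: float, standard: float) -> float:
    """Token inflation at which the intro price costs as much as the
    standard price did on the old token count (assumed baseline)."""
    return standard / intro


for multiplier in (1.0, 1.35):  # range quoted in the article
    for kind in ("input", "output"):
        cost = effective_price(INTRO_PRICE[kind], multiplier)
        print(f"{kind} at {multiplier}x: ${cost:.2f} vs ${STANDARD_PRICE[kind]:.2f} standard")

print("break-even multiplier:", breakeven_multiplier(INTRO_PRICE["input"], STANDARD_PRICE["input"]))
```

At the worst-case 1.35 multiplier, the introductory rate works out to about $2.70 and $13.50 per million old-tokenizer tokens, still under $3/$15. Under this assumption, the discount absorbs any token inflation up to 1.5 times.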

Interest in this launch had been building. Developers had been discussing how Anthropic allowed Opus 4.6 to lose its competitive edge, a phenomenon referred to as AI shrinkflation, and pointed to reduced capabilities as evidence. Anthropic has denied any intention to degrade its models. Some speculated that the pattern might repeat with Sonnet, with older models left to lag so that newer ones look more advanced by comparison.

Sonnet 5 is released without the complications associated with Anthropic's highest-tier models. Fable 5 and Mythos 5 have been suspended for foreign nationals since June 12 due to a U.S. export control directive linked to a disputed jailbreak finding. Sonnet 5 was not trained on cybersecurity tasks and scored 0% on developing a functional Firefox exploit, allowing it to have fewer restrictions compared to Fable's stringent safeguards.

According to Anthropic's system card, Sonnet 5 is designed to deliver performance close to Opus level while maintaining Sonnet pricing for tasks in coding, agent work, and daily operations. It also highlights a unique point: “This model is the first to question its Constitutional rule to adhere to strict constraints even when it considers those constraints unethical,” the research team noted, indicating an area worth monitoring.

While we won't suggest this is the beginning of Skynet, it does raise intriguing questions.

We Conducted a Quick Test

We challenged Sonnet 5 with a zero-shot prompt to create a small browser game, mirroring a previous test conducted with Sonnet 4.5 last year.

This time, our typing game executed successfully on the first attempt, showcasing improved visuals and tighter logic compared to what Sonnet 4.6 produced with the same prompt.

However, it took a considerable amount of time (approximately 30 minutes of reasoning) and consumed a large number of tokens: that single iteration used 90% of our 5-hour usage limit on the Claude Pro plan.

You can play the final game on our itch.io site.

In a more complex multi-step coding task, Sonnet 5's performance was comparable to Opus 4.8, depending on the effort level, and running the same prompt in a multi-shot format was noticeably cheaper than executing the equivalent task on Opus or Fable.

Sonnet 5's versioning is significant as well. Every previous major version jump in Claude's lineage has marked the introduction of a new generation: version 1 launched in March 2023, version 2 followed four months later, version 3 appeared eight months after that, and version 4 debuted 14 months later in May 2025. Sonnet 5 arrives 13 months after version 4, a similar gap that likely reflects the intense competition, particularly as Chinese models are rapidly closing the gap.
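The intervals above can be checked with a few lines of month arithmetic. This is a minimal sketch: the helper name `add_months` is hypothetical, and it works at month granularity only, since the article gives no exact release days.

```python
def add_months(year: int, month: int, delta: int) -> tuple[int, int]:
    """Return the (year, month) that falls delta calendar months later."""
    index = year * 12 + (month - 1) + delta
    return index // 12, index % 12 + 1


# Starting point and gaps (in months) as stated in the article.
start = (2023, 3)  # version 1: March 2023
gaps = {"version 2": 4, "version 3": 8, "version 4": 14, "Sonnet 5": 13}

timeline = {"version 1": start}
current = start
for name, gap in gaps.items():
    current = add_months(*current, gap)
    timeline[name] = current

for name, (year, month) in timeline.items():
    print(f"{name}: {year}-{month:02d}")
```

The stated gaps add up: 4 + 8 + 14 months from March 2023 lands on May 2025 for version 4, matching the date given in the text.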

However, the generational leap may not seem as remarkable as the transition from Claude 3 to Claude 4. This also illustrates how quickly large AI firms are pushing out new models, regardless of the extent of improvement.

If Anthropic adheres to its previous release pattern, Sonnet leads, followed by the smaller and cheaper Haiku and then the state-of-the-art Opus. The most recent cycle ran at one-month intervals: Sonnet 4.5 debuted in September 2025, Haiku 4.5 arrived in October, and Opus 4.5 capped that generation in November.

On that optimistic timeline, Haiku 5 and Opus 5 could both arrive within this year. However, Anthropic's release schedule has not been consistent: the interval between Haiku 4.5 and Sonnet 4.6 exceeded three months, so those eager to test Opus 5 may need to be patient.
