Anthropic's Claude Opus 4 and Sonnet 4 Set a New Benchmark in AI Coding



In Short

Anthropic has released two new AI models in the Claude 4 series: Claude Opus 4 and Claude Sonnet 4.
Anthropic says that Claude Opus 4 is the "world's best coding model," outperforming OpenAI Codex-1 and Gemini 2.5 Pro on SWE-bench.
Claude 4 models are rolling out to all paid plans, and free users can access the Claude Sonnet 4 model without extended thinking mode.

On Thursday, Anthropic launched two new AI models in the Claude 4 series: Claude Opus 4 and Claude Sonnet 4. Anthropic says Claude Opus 4 is the “world’s best coding model” and that it offers sustained performance on long-horizon, agentic workflows. Claude Sonnet 4, meanwhile, brings better coding and reasoning performance than Claude 3.7 Sonnet.

First, let’s talk about the Claude Opus 4 AI model. On the SWE-bench Verified benchmark, which measures performance on real software engineering tasks, Claude Opus 4 achieves 72.5%, slightly higher than OpenAI’s best coding model, Codex-1, which scored 72.1%. With parallel test-time compute, which appears similar to the Deep Think mode in Gemini 2.5 Pro, Opus 4 reached 79.4%.
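The article does not spell out how parallel test-time compute works. One common reading is best-of-n sampling: run several independent attempts at once and keep the one that scores highest. The sketch below illustrates that idea only; it is not Anthropic's implementation. `best_of_n`, `toy_generate`, and `toy_score` are hypothetical names, and the toy functions stand in for a real model call and a real test-suite scorer.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def best_of_n(
    generate: Callable[[int], str],
    score: Callable[[str], float],
    n: int = 4,
) -> str:
    """Run n independent attempts in parallel and return the highest-scoring one."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        # Each attempt gets its own seed so the candidates differ.
        candidates: List[str] = list(pool.map(generate, range(n)))
    return max(candidates, key=score)


# Hypothetical stand-ins: a real system would call a model here
# and score each candidate patch by running a test suite.
def toy_generate(seed: int) -> str:
    return "patch-" + "x" * seed


def toy_score(patch: str) -> float:
    return float(len(patch))


best = best_of_n(toy_generate, toy_score, n=4)
print(best)  # prints "patch-xxx"
```

The trade-off is cost: n attempts use roughly n times the compute of a single attempt, which is why single-attempt and parallel scores are reported separately.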

Interestingly, Claude Sonnet 4 achieves 72.7% on SWE-bench and, with parallel test-time compute, reaches 80.2%, edging out the larger Opus 4 model on this benchmark.

Anthropic says the Claude Sonnet 4 model “balances performance and efficiency for internal and external use cases, with enhanced steerability for greater control over implementations. While not matching Opus 4 in most domains, it delivers an optimal mix of capability and practicality.”


Claude Opus 4 excels in complex, long-running tasks and agentic workflows, while Claude Sonnet 4 combines strong coding performance and efficiency. Both models are hybrid reasoning models, meaning they can offer near-instant responses and extended thinking for deeper reasoning.

Anthropic also notes that when given access to local files, Claude Opus 4 maintains key information in a memory file. For example, while playing Pokémon, Claude Opus 4 created a navigation guide file to improve its gameplay.
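To make the memory-file idea concrete, here is a minimal sketch of an agent-style helper that keeps notes in a local file and reads them back later. It is an illustration under stated assumptions, not a description of how Claude Opus 4 does it: the JSON format, the file name `navigation_guide.json`, and the `load_memory` and `remember` helpers are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load_memory(path: Path) -> dict:
    """Return saved notes, or an empty dict if no memory file exists yet."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {}


def remember(path: Path, key: str, note: str) -> dict:
    """Add or overwrite one note and write the whole file back to disk."""
    memory = load_memory(path)
    memory[key] = note
    path.write_text(json.dumps(memory, indent=2), encoding="utf-8")
    return memory


# Demo in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    memory_path = Path(tmp) / "navigation_guide.json"  # hypothetical file name
    remember(memory_path, "route_1", "Head north to reach the next town.")
    remember(memory_path, "gym", "Stock up on potions before entering.")
    notes = load_memory(memory_path)

print(sorted(notes))  # prints ['gym', 'route_1']
```

Because the notes live on disk rather than in the conversation, they remain available after earlier steps of a long task are no longer in view.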

Finally, on the safety front, Anthropic has activated AI Safety Level 3 (ASL-3) protections for the first time, applying them to Claude Opus 4 in line with its Responsible Scaling Policy (RSP). The company has implemented Constitutional Classifiers and other defenses to guard against jailbreaking techniques.

Claude 4 models are rolling out to all paid users on the Pro, Max, Team, and Enterprise plans. Claude Sonnet 4 is also available to free users, but without extended thinking.
