On the evening of April 16, 2026, AI company Anthropic announced the release of its latest large model, Claude Opus 4.7. The model is now available across all Claude products, the official API, and the three major cloud platforms: Amazon, Google, and Microsoft. Pricing is unchanged from the previous-generation Opus 4.6: $5 per million input tokens and $25 per million output tokens.
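At those rates, the cost of a single request scales linearly with token counts. A minimal sketch (prices taken from the announcement; the token counts are illustrative):

```python
# Opus 4.7 list prices per the announcement:
# $5 per million input tokens, $25 per million output tokens.
INPUT_PRICE_PER_MTOK = 5.00
OUTPUT_PRICE_PER_MTOK = 25.00

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at Opus 4.7 list prices."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_MTOK + \
           (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_MTOK

# Example: a 1M-token input with a 200K-token response.
print(request_cost(1_000_000, 200_000))  # → 10.0
```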
Core Upgrades for Claude Opus 4.7
Opus 4.7 demonstrates enhanced performance on complex software engineering tasks: it handles long-running work more reliably, adheres more strictly to user instructions during execution, and self-verifies its outputs before returning results.
On the multimodal side, the model now supports images with a longest edge of up to 2,576 pixels (roughly 3.75 megapixels), more than three times the limit of previous Claude models. Opus 4.7 performs consistently across a wide range of benchmarks, ranking in the top tier overall, with strong coding, reasoning, and multi-domain capabilities.
On memory, Opus 4.7 improves the file-system-based memory mechanism, allowing it to retain key notes across sessions during long tasks. In third-party evaluations such as GDPval-AA and Finance Agent, it achieved state-of-the-art scores.
New Features and Changes
Opus 4.7 introduces an xhigh (extra-high) reasoning-effort tier, positioned between high and max, letting users trade reasoning depth against response latency more finely on hard problems. In Claude Code, the default tier for all plans has been upgraded to xhigh.
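The tiers described above imply a simple ordering, with xhigh slotted between high and max. The sketch below uses only the tier names mentioned in the article; the request shape and field names are hypothetical illustrations, not Anthropic's documented API:

```python
# Effort tiers named in the article, ordered lowest to highest.
# (Request shape and field names below are hypothetical.)
EFFORT_TIERS = ("high", "xhigh", "max")

def build_request(prompt: str, effort: str = "xhigh") -> dict:
    """Sketch of a request that selects a reasoning-effort tier."""
    if effort not in EFFORT_TIERS:
        raise ValueError(f"unknown effort tier: {effort}")
    return {
        "model": "claude-opus-4-7",
        "effort": effort,  # xhigh sits between high and max
        "messages": [{"role": "user", "content": prompt}],
    }
```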
The API adds a "Task Budget" feature (in public beta), allowing developers to set an approximate budget for token consumption so the model knows where to spend more and where to save during long tasks. Claude Code adds the /ultrareview command, specifically for code review, which carefully reads through changes to identify bugs and design issues.
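As described, a Task Budget amounts to attaching an approximate token budget to a request so the model can decide where to spend and where to save. Since the article does not specify the beta API's shape, the field name `task_budget_tokens` below is an assumption for illustration only:

```python
def build_budgeted_request(prompt: str, budget_tokens: int) -> dict:
    """Hypothetical sketch: attach an approximate token budget to a
    request for a long-running task (field names are assumptions)."""
    if budget_tokens <= 0:
        raise ValueError("budget must be positive")
    return {
        "model": "claude-opus-4-7",
        "task_budget_tokens": budget_tokens,  # approximate, not a hard cap
        "messages": [{"role": "user", "content": prompt}],
    }

request = build_budgeted_request("Refactor the payment module", 500_000)
```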
However, two changes in Opus 4.7 affect token usage. First, it adopts an updated tokenizer that improves how the model processes text, at the cost of a higher token count for the same input: roughly 1.0 to 1.35 times the previous count, depending on content type. Second, higher thinking-intensity tiers produce more thinking, especially in subsequent turns of agentic scenarios.
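The practical effect of the tokenizer change on input cost can be worked through directly. The 1.0–1.35x inflation factor comes from the article; the prompt size is illustrative:

```python
INPUT_PRICE_PER_MTOK = 5.00  # $ per million input tokens, unchanged from Opus 4.6

def inflated_input_cost(old_tokens: int, inflation: float = 1.35) -> float:
    """Input cost after tokenizer inflation (1.0–1.35x per the article).
    Default shows the worst case."""
    return old_tokens * inflation / 1_000_000 * INPUT_PRICE_PER_MTOK

# A prompt that was 100K tokens under the old tokenizer:
old_cost = inflated_input_cost(100_000, 1.0)   # $0.50, no inflation
worst_case = inflated_input_cost(100_000)      # $0.675 at the 1.35x ceiling
```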
Performance Benchmark Comparison
According to benchmark data, Opus 4.7 scored 64.3% on the SWE-bench Pro programming test, up from 53.4% for version 4.6, a single-generation gain of nearly 11 percentage points that surpasses GPT-5.4's 57.7% and Gemini 3.1 Pro's 54.2%. In visual reasoning, its CharXiv score rose from 69.1% to 82.1%, benefiting from the newly supported 2,576-pixel longest-edge recognition. On the large-scale tool-calling evaluation MCP-Atlas, Opus 4.7 reached 77.3%, exceeding GPT-5.4's 68.1% and Gemini's 73.9%.
However, on the agentic-search evaluation BrowseComp, Opus 4.7's score fell from 83.7% to 79.3%, behind GPT-5.4's 89.3% and Gemini's 85.9%.
Detailed Comparison of Four Models
| Feature | Claude Opus 4.7 | GPT-5.4 | Claude Opus 4.6 | Gemini 3.1 Pro |
|---|---|---|---|---|
| Release Date | April 16, 2026 | March 6, 2026 | February 5, 2026 | February 19, 2026 |
| Developer | Anthropic | OpenAI | Anthropic | Google |
| Core Features | Enhanced complex software engineering, higher resolution image support, self-verifying output | Native computer usage capability, thought process preview, 1M token context | 1M token context window, adaptive thinking, agent task persistence | Three-layer thinking mode, 2M token context, strengthened core reasoning |
| Coding Ability | SWE-bench Pro: 64.3% | SWE-bench Pro: 57.7% | SWE-bench Pro: 53.4% | SWE-bench Pro: 54.2% |
| Multimodal Capability | Supports 2576 pixel image processing (~3.75MP) | Improved visual perception and document parsing | Standard image processing capability | Powerful multimodal understanding capability |
| Context Window | 200K tokens / 1M tokens (beta) | Up to 1M tokens | 200K tokens / 1M tokens (beta) | Up to 2M tokens |
| Pricing | Input $5/MTok, Output $25/MTok | Not specified (usually billed by usage) | Input $5/MTok, Output $25/MTok | Tiered pricing, same as previous generation |
| Special Features | xhigh mode, Task Budget, /ultrareview command | Thought process preview, native computer control, tool search | Adaptive thinking, compressed API, 128K output tokens | Three-layer thinking mode, Deep Think technology downgraded |
| Security Features | Project Glasswing cybersecurity protection | Continues existing security protections and introduces new open-source evaluations | Overall security is good | Hallucination control AA-Omniscience Index reaches 30 |
User Feedback and Industry Impact
User reviews of Opus 4.7 are somewhat polarized. Most users acknowledge the improvement in coding ability but have many complaints about writing and conversational communication. Some users noted that while the official announcement touted visual improvements, token consumption rose significantly: testing the same design draft, Opus 4.7's input tokens spiked to over three times those of Opus 4.6.
In long-context retrieval, Opus 4.6 scored 78.3%, while Opus 4.7 plummeted to 32.2%. Anthropic explained that the new model reports an error when information is missing rather than hallucinating as before. Actual user tests show, however, that even when the information is clearly present in the context, the model can still fail to find it.
Conclusion
Claude Opus 4.7 represents a significant advance for Anthropic in complex software engineering and multimodal processing, surpassing its major competitors on programming benchmarks in particular. However, the increase in token consumption and the regressions in certain areas (such as long-context retrieval) show that this was not a painless upgrade. For heavy coding workloads, Opus 4.7 offers clear value; for broader application scenarios, users may need to weigh costs against benefits.
With the successive releases of GPT-5.4, Gemini 3.1, and Claude Opus 4.7, the 2026 large model competition has entered a heated stage, with manufacturers seeking the best balance between specialized capabilities, cost control, and user experience.