Mar 22, 2026

Claude Opus 4.6 Makes 1M Token Context Window GA at Standard Pricing: No More Premium

Anthropic drops the long-context premium on Opus 4.6 and Sonnet 4.6, making the full 1M token context window available at standard per-token rates with 600-media support.

Tags: Claude, Anthropic, Opus 4.6, 1M Context, Long Context

Key Takeaways

Anthropic announced on March 13, 2026 that the 1 million token context window is now generally available for Claude Opus 4.6 and Claude Sonnet 4.6 at standard pricing. This eliminates the premium that previously applied to prompts exceeding 200,000 tokens during the beta period, where Opus 4.6 charged $10 per million input tokens and $37.50 per million output tokens for long-context requests. Under the new pricing, a 900,000-token request is billed at the same per-token rate as a 9,000-token request: $5/$25 per million tokens for Opus 4.6 and $3/$15 for Sonnet 4.6.

Alongside the pricing change, Anthropic raised the media limit from 100 to 600 images or PDF pages per request, expanded availability across Claude Platform, Microsoft Foundry, and Google Cloud's Vertex AI, and removed the requirement for a beta header on requests exceeding 200,000 tokens.

Feature Overview

1. Pricing: The Premium Is Gone

The most significant change is economic. During the beta period, using the 1M context window carried a substantial surcharge:

| Model | Beta Pricing (200K+) | GA Standard Pricing | Savings |
| --- | --- | --- | --- |
| Opus 4.6 Input | $10/M tokens | $5/M tokens | 50% |
| Opus 4.6 Output | $37.50/M tokens | $25/M tokens | 33% |
| Sonnet 4.6 Input | $6/M tokens | $3/M tokens | 50% |
| Sonnet 4.6 Output | $22.50/M tokens | $15/M tokens | 33% |

For enterprise applications that routinely process large documents, codebases, or extended conversations, the savings are substantial. A single 800,000-token Opus 4.6 request that previously cost $8 in input tokens alone now costs $4.
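The pricing math above can be sketched in a few lines. The per-million-token rates come from the article's table; the request sizes are illustrative:

```python
# Per-million-token rates (input, output) in dollars, from the pricing table.
BETA_RATES = {"opus-4.6": (10.00, 37.50), "sonnet-4.6": (6.00, 22.50)}
GA_RATES = {"opus-4.6": (5.00, 25.00), "sonnet-4.6": (3.00, 15.00)}

def request_cost(model: str, input_tokens: int, output_tokens: int, rates: dict) -> float:
    """Dollar cost of a single request at the given per-million-token rates."""
    in_rate, out_rate = rates[model]
    return input_tokens / 1_000_000 * in_rate + output_tokens / 1_000_000 * out_rate

# The 800K-token example from the text: input cost drops from $8 to $4.
beta = request_cost("opus-4.6", 800_000, 0, BETA_RATES)  # 8.0
ga = request_cost("opus-4.6", 800_000, 0, GA_RATES)      # 4.0
print(f"beta ${beta:.2f} -> GA ${ga:.2f} ({1 - ga / beta:.0%} saved)")
```

Because GA billing is flat, the same rate applies whether a request uses 9,000 or 900,000 tokens; no tiered surcharge kicks in above 200K.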

2. Performance: Leading Frontier Models in Long-Context Tasks

Opus 4.6 scores 78.3% on MRCR v2 (Multi-Round Co-reference Resolution), the highest score among frontier models at the 1M token context length. On the more demanding 8-needle 1M variant of MRCR v2, a needle-in-a-haystack benchmark that tests retrieval accuracy across the full context window, Opus 4.6 scores 76% while the previous-generation Sonnet 4.5 manages just 18.5%.

These numbers reflect a generational improvement. Each successive Claude model generation has improved long-context retrieval accuracy, and Opus 4.6 represents the first time a model has maintained strong performance across the full 1 million token window.

3. Media Processing: 6x Capacity Increase

The media limit increase from 100 to 600 images or PDF pages per request is a practical upgrade for enterprise use cases. Legal teams can now process entire contract portfolios in a single request. Research teams can analyze hundreds of academic papers simultaneously. Development teams can load comprehensive codebases for analysis without splitting requests.

This 6x increase also applies to mixed media requests, meaning users can combine PDFs, images, and text within the same 1M token context window.
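A minimal sketch of how a client might assemble such a mixed-media request and enforce the per-request cap before sending. The content-block shapes follow the Anthropic Messages API's "document" and "image" types; `MEDIA_LIMIT`, the base64 payloads, and the prompt are placeholders, and counting each PDF page as one item is a simplifying assumption:

```python
MEDIA_LIMIT = 600  # GA per-request media cap per the article (previously 100)

def build_content(pdf_b64_pages: list, image_b64_list: list, prompt: str) -> list:
    """Assemble a mixed-media content list, rejecting requests over the cap."""
    blocks = [{"type": "document",
               "source": {"type": "base64", "media_type": "application/pdf", "data": p}}
              for p in pdf_b64_pages]
    blocks += [{"type": "image",
                "source": {"type": "base64", "media_type": "image/png", "data": i}}
               for i in image_b64_list]
    if len(blocks) > MEDIA_LIMIT:
        raise ValueError(f"{len(blocks)} media items exceeds the {MEDIA_LIMIT}-item limit")
    blocks.append({"type": "text", "text": prompt})  # text block does not count toward the cap
    return blocks

# 400 contract pages plus 150 design mockups now fit in one request.
content = build_content(["..."] * 400, ["..."] * 150, "Summarize every contract.")
```

Under the old 100-item limit, the same workload would have required splitting across six or more requests, each losing cross-document context.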

4. Claude Code Integration

For Claude Code users on Max, Team, and Enterprise subscriptions, the full 1M context window is now available automatically in Opus 4.6 sessions. This enables developers to load entire codebases, including all dependencies and documentation, into a single conversation. The practical impact is that Claude Code can now reason about large-scale software projects holistically rather than working with fragments.

5. Extended Output: Up to 128K Tokens

Opus 4.6 supports outputs of up to 128,000 tokens, which is the longest output capability among the current Claude model lineup. Combined with the 1M input context, this creates a pipeline where massive inputs can produce comprehensive outputs. Use cases include generating detailed analysis reports from large datasets, producing complete code refactors from full codebase reviews, and writing comprehensive summaries of extensive document collections.

Usability Analysis

The practical impact of this change is most significant for three user groups. Enterprise API customers who process large documents will see immediate cost savings of 33-50% on their long-context workloads. Development teams using Claude Code gain the ability to reason about entire codebases without chunking. And research teams working with large paper collections or datasets can now run analyses that were previously cost-prohibitive.

The removal of the beta header requirement is a small but meaningful developer experience improvement. Previously, accessing the 1M context window required explicitly opting in through API headers. Now it is available by default, reducing friction for applications that occasionally need extended context.
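The before/after difference can be sketched as follows. The `anthropic-beta: context-1m-...` header shown is the opt-in mechanism Anthropic used for earlier 1M-context betas; assuming a similar header applied here, a long-context call now needs no special flags (the model ID string is also an assumption):

```python
def request_kwargs(model: str, messages: list, long_context_beta: bool = False) -> dict:
    """Build keyword arguments for a Messages API call."""
    kwargs = {"model": model, "max_tokens": 4096, "messages": messages}
    if long_context_beta:
        # Pre-GA: prompts over 200K tokens required an explicit opt-in header.
        kwargs["extra_headers"] = {"anthropic-beta": "context-1m-2025-08-07"}
    return kwargs

# At GA, a >200K-token request is just a standard call, no header needed:
kwargs = request_kwargs("claude-opus-4-6", [{"role": "user", "content": "..."}])
assert "extra_headers" not in kwargs
```

Defaulting to the full window means applications that only occasionally exceed 200K tokens no longer need separate code paths for long-context requests.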

For Claude Desktop and web users on Pro and Max plans, the 1M context window translates to the ability to upload large documents and maintain very long conversations without hitting limits. The 600-image support means that visual-heavy workflows, such as analyzing slide decks or reviewing design mockups, are now far more practical.

Pros

  1. 50% input cost reduction for long-context workloads compared to beta pricing, making enterprise-scale document processing significantly more affordable
  2. 78.3% MRCR v2 score is the highest among frontier models, demonstrating genuine capability rather than just a large context window
  3. 600-media support enables practical enterprise workflows like full contract portfolio analysis and comprehensive codebase review
  4. No beta header required reduces developer friction and makes 1M context available by default
  5. 128K token output paired with 1M input creates the longest input-to-output pipeline among current frontier models

Limitations

  1. Cost remains substantial at standard pricing for very large requests, with an 800K-token Opus 4.6 input still costing $4 per request
  2. Latency increases with context length, and million-token requests require patience for initial processing
  3. Not all use cases benefit from extreme context, and many applications work effectively within 200K tokens
  4. Availability is limited to Opus 4.6 and Sonnet 4.6, with older model versions not receiving the 1M context upgrade

Outlook

The move to standard pricing for 1M context is a competitive signal. Google's Gemini models have offered long context windows, but Anthropic is now making the economic case that large context should not carry a premium. This pricing decision could pressure other providers to follow suit, potentially accelerating the industry toward treating extended context as a baseline capability rather than a premium feature.

For the broader AI industry, the performance data is perhaps more significant than the pricing. A 78.3% score on MRCR v2 at 1M tokens demonstrates that models are becoming genuinely useful at extreme context lengths, not just capable of accepting long inputs. As benchmark scores continue to improve, the practical applications of million-token context will expand from niche enterprise use cases to mainstream development workflows.

Anthropic's simultaneous expansion of media limits and output length suggests the company is building toward a future where Claude functions as a comprehensive analysis engine: ingest everything, process holistically, output in detail. This positions Claude for agentic workflows where autonomous systems need to understand large, complex information spaces to make decisions.

Conclusion

Claude Opus 4.6's 1M context window moving to general availability at standard pricing removes the most significant barrier to enterprise adoption of long-context AI. The combination of industry-leading retrieval accuracy, 6x media capacity, and 128K output length creates a practical tool for teams that need to process, analyze, and act on large information sets. For developers, researchers, and enterprise teams working with substantial data, this update makes Claude the most cost-effective frontier model for long-context workloads.


Key Features

1. 1M token context window now generally available at standard pricing: $5/$25 per million tokens for Opus 4.6, no long-context premium
2. Opus 4.6 scores 78.3% on MRCR v2, the highest among frontier models at 1M context length
3. Media limit increased 6x from 100 to 600 images or PDF pages per request
4. Outputs up to 128K tokens supported, the longest among current Claude models
5. No beta header required for requests exceeding 200K tokens, available on Claude Platform, Microsoft Foundry, and Vertex AI

Key Insights

  • The 50% input cost reduction for long-context workloads makes enterprise-scale document processing significantly more affordable
  • Opus 4.6's 78.3% MRCR v2 score demonstrates genuine capability improvement, not just context window expansion
  • The 76% score on 8-needle 1M MRCR v2 versus Sonnet 4.5's 18.5% shows the generational leap in long-context retrieval
  • 600-media support enables full contract portfolio analysis and comprehensive codebase review in single requests
  • Standard pricing for 1M context could pressure competitors to eliminate their own long-context premiums
  • Claude Code integration gives developers the ability to reason about entire large-scale projects holistically
  • The combined 1M input and 128K output pipeline positions Claude for comprehensive analysis workflows
  • This pricing move signals Anthropic views extended context as a baseline capability rather than a premium feature
