Claude Opus 4.6 Makes 1M Token Context Window GA at Standard Pricing: No More Premium
Anthropic drops the long-context premium on Opus 4.6 and Sonnet 4.6, making the full 1M token context window available at standard per-token rates and raising the per-request media limit to 600 images or PDF pages.
Key Takeaways
Anthropic announced on March 13, 2026, that the 1 million token context window is now generally available for Claude Opus 4.6 and Claude Sonnet 4.6 at standard pricing. This eliminates the premium that previously applied to prompts exceeding 200,000 tokens during the beta period, when Opus 4.6 charged $10 per million input tokens and $37.50 per million output tokens for long-context requests. Under the new pricing, a 900,000-token request is billed at the same per-token rate as a 9,000-token request: $5/$25 per million input/output tokens for Opus 4.6 and $3/$15 for Sonnet 4.6.
Alongside the pricing change, Anthropic raised the media limit from 100 to 600 images or PDF pages per request, expanded availability across Claude Platform, Microsoft Foundry, and Google Cloud's Vertex AI, and removed the requirement for a beta header on requests exceeding 200,000 tokens.
Feature Overview
1. Pricing: The Premium Is Gone
The most significant change is economic. During the beta period, using the 1M context window carried a substantial surcharge:
| Model | Beta Pricing (200K+) | GA Standard Pricing | Savings |
|---|---|---|---|
| Opus 4.6 Input | $10/M tokens | $5/M tokens | 50% |
| Opus 4.6 Output | $37.50/M tokens | $25/M tokens | 33% |
| Sonnet 4.6 Input | $6/M tokens | $3/M tokens | 50% |
| Sonnet 4.6 Output | $22.50/M tokens | $15/M tokens | 33% |
For enterprise applications that routinely process large documents, codebases, or extended conversations, the savings are substantial. A single 800,000-token Opus 4.6 request that previously cost $8 in input tokens alone now costs $4.
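The savings above can be checked with a short calculation. This sketch uses the Opus 4.6 rates quoted in the article; the request sizes are illustrative.

```python
# Compare the former beta long-context pricing with the new flat GA
# rates for Opus 4.6. Rates are taken from the article; the 800K-token
# request below is an illustrative example.

def request_cost(input_tokens, output_tokens, in_rate, out_rate):
    """Cost in dollars at the given per-million-token rates."""
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# Opus 4.6 rates ($ per million tokens)
GA_IN, GA_OUT = 5.00, 25.00        # flat rate at any context length
BETA_IN, BETA_OUT = 10.00, 37.50   # former rate for prompts over 200K tokens

# An 800K-token input with a 20K-token response:
ga = request_cost(800_000, 20_000, GA_IN, GA_OUT)
beta = request_cost(800_000, 20_000, BETA_IN, BETA_OUT)
print(f"GA: ${ga:.2f}  beta: ${beta:.2f}  saved: ${beta - ga:.2f}")
# GA: $4.50  beta: $8.75  saved: $4.25
```

The input portion alone matches the article's figures: $8 under beta pricing, $4 at GA rates.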
2. Performance: Leading the Frontier in Long-Context Tasks
Opus 4.6 scores 78.3% on MRCR v2 (Multi-Round Co-reference Resolution) at the 1M token context length, the highest score among frontier models at that length. On the more demanding 8-needle 1M variant of MRCR v2, a needle-in-a-haystack benchmark that tests retrieval accuracy across the full context window, Opus 4.6 scores 76% while Sonnet 4.5 manages just 18.5%.
These numbers reflect a generational improvement. Each successive Claude model generation has improved long-context retrieval accuracy, and Opus 4.6 represents the first time a model has maintained strong performance across the full 1 million token window.
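The needle-in-a-haystack setup these benchmarks use can be sketched in a few lines. This is a simplified harness in the spirit of such tests, not MRCR itself: it plants eight key-value "needles" in filler text and scores how many a model's answers recover. The `ask_model` stand-in is a placeholder; a real run would send the full haystack to the model.

```python
import random

# Minimal sketch of an 8-needle retrieval check. We plant 8 key-value
# "needles" at random positions in filler text, then score how many a
# question-answering function recovers.

def build_haystack(needles, filler_paragraphs=1000, seed=0):
    """Return filler text with one needle sentence inserted per key."""
    rng = random.Random(seed)
    paras = [f"Filler paragraph {i}." for i in range(filler_paragraphs)]
    for key, value in needles.items():
        paras.insert(rng.randrange(len(paras)), f"The code for {key} is {value}.")
    return "\n".join(paras)

def score(needles, ask_model):
    """Fraction of needles whose value appears in the model's answer."""
    hits = sum(1 for k, v in needles.items()
               if v in ask_model(f"What is the code for {k}?"))
    return hits / len(needles)

needles = {f"needle-{i}": f"X{i}42" for i in range(8)}
haystack = build_haystack(needles)

# Stub "model" that retrieves perfectly, for demonstration only:
perfect = lambda q: needles[q.split("for ")[1].rstrip("?")]
print(score(needles, perfect))  # 1.0
```

A score of 76% on the real 8-needle benchmark means the model recovers roughly three of every four planted facts across the full million-token window.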
3. Media Processing: 6x Capacity Increase
The media limit increase from 100 to 600 images or PDF pages per request is a practical upgrade for enterprise use cases. Legal teams can now process entire contract portfolios in a single request. Research teams can analyze hundreds of academic papers simultaneously. Development teams can load comprehensive codebases for analysis without splitting requests.
This 6x increase also applies to mixed media requests, meaning users can combine PDFs, images, and text within the same 1M token context window.
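For document sets that exceed even the new cap, requests still need to be batched. A minimal sketch, assuming a simple sequential split (the file names are illustrative):

```python
# Split a large set of PDF pages or images into requests that respect
# the 600-media-per-request cap described in the article.

MAX_MEDIA_PER_REQUEST = 600  # up from 100

def batch_media(items, limit=MAX_MEDIA_PER_REQUEST):
    """Yield successive lists of at most `limit` media items."""
    for start in range(0, len(items), limit):
        yield items[start:start + limit]

# 1,450 contract pages now fit in 3 requests instead of 15:
pages = [f"contract_{i}.pdf" for i in range(1450)]
batches = list(batch_media(pages))
print([len(b) for b in batches])  # [600, 600, 250]
```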
4. Claude Code Integration
For Claude Code users on Max, Team, and Enterprise subscriptions, the full 1M context window is now available automatically in Opus 4.6 sessions. This enables developers to load entire codebases, including all dependencies and documentation, into a single conversation. The practical impact is that Claude Code can now reason about large-scale software projects holistically rather than working with fragments.
5. Extended Output: Up to 128K Tokens
Opus 4.6 supports outputs of up to 128,000 tokens, which is the longest output capability among the current Claude model lineup. Combined with the 1M input context, this creates a pipeline where massive inputs can produce comprehensive outputs. Use cases include generating detailed analysis reports from large datasets, producing complete code refactors from full codebase reviews, and writing comprehensive summaries of extensive document collections.
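In practice, pairing a long input with a long output budget just means raising `max_tokens` in a Messages API request. A minimal sketch of the payload; the model ID shown is an assumption, so adjust it to the IDs your account exposes:

```python
# Sketch of a Messages API payload pairing a large input with a 128K
# output budget. "claude-opus-4-6" is an assumed model ID for
# illustration; the 128K cap is taken from the article.

def build_request(prompt, model="claude-opus-4-6", max_tokens=128_000):
    return {
        "model": model,
        "max_tokens": max_tokens,  # output budget, up to 128K per the article
        "messages": [{"role": "user", "content": prompt}],
    }

req = build_request("Summarize the attached 900K-token corpus in detail.")
print(req["max_tokens"])  # 128000
```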
Usability Analysis
The practical impact of this change is most significant for three user groups. Enterprise API customers who process large documents will see immediate cost savings of 33-50% on their long-context workloads. Development teams using Claude Code gain the ability to reason about entire codebases without chunking. And research teams working with large paper collections or datasets can now run analyses that were previously cost-prohibitive.
The removal of the beta header requirement is a small but meaningful developer experience improvement. Previously, accessing the 1M context window required explicitly opting in through API headers. Now it is available by default, reducing friction for applications that occasionally need extended context.
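The before-and-after difference is small but concrete. A sketch of the header change, assuming the `context-1m-2025-08-07` beta flag that the earlier 1M beta used (shown here as an illustration of the opt-in pattern):

```python
# During the beta, prompts over 200K tokens needed an explicit opt-in
# header; at GA no extra header is required. The beta flag name below
# reflects the earlier 1M-context beta and is shown for illustration.

def request_headers(api_key, long_context_beta=False):
    headers = {
        "x-api-key": api_key,
        "anthropic-version": "2023-06-01",
    }
    if long_context_beta:
        # Previously required for >200K-token prompts; now unnecessary.
        headers["anthropic-beta"] = "context-1m-2025-08-07"
    return headers

before = request_headers("sk-...", long_context_beta=True)  # beta-era request
after = request_headers("sk-...")                           # GA request
print("anthropic-beta" in before, "anthropic-beta" in after)  # True False
```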
For Claude Desktop and web users on Pro and Max plans, the 1M context window translates to the ability to upload large documents and maintain very long conversations without hitting limits. The 600-image support means that visual-heavy workflows, such as analyzing slide decks or reviewing design mockups, are now far more practical.
Pros
- 50% input cost reduction for long-context workloads compared to beta pricing, making enterprise-scale document processing significantly more affordable
- 78.3% MRCR v2 score is the highest among frontier models, demonstrating genuine capability rather than just a large context window
- 600-media support enables practical enterprise workflows like full contract portfolio analysis and comprehensive codebase review
- No beta header required reduces developer friction and makes 1M context available by default
- 128K token output paired with 1M input creates the longest input-to-output pipeline among current frontier models
Limitations
- Cost remains substantial at standard pricing for very large requests, with an 800K-token Opus 4.6 input still costing $4 per request
- Latency increases with context length, and million-token requests require patience for initial processing
- Not all use cases benefit from extreme context, and many applications work effectively within 200K tokens
- Availability is limited to Opus 4.6 and Sonnet 4.6, with older model versions not receiving the 1M context upgrade
Outlook
The move to standard pricing for 1M context is a competitive signal. Google's Gemini models have offered long context windows, but Anthropic is now making the economic case that large context should not carry a premium. This pricing decision could pressure other providers to follow suit, potentially accelerating the industry toward treating extended context as a baseline capability rather than a premium feature.
For the broader AI industry, the performance data is perhaps more significant than the pricing. A 78.3% score on MRCR v2 at 1M tokens demonstrates that models are becoming genuinely useful at extreme context lengths, not just capable of accepting long inputs. As benchmark scores continue to improve, the practical applications of million-token context will expand from niche enterprise use cases to mainstream development workflows.
Anthropic's simultaneous expansion of media limits and output length suggests the company is building toward a future where Claude functions as a comprehensive analysis engine: ingest everything, process holistically, output in detail. This positions Claude for agentic workflows where autonomous systems need to understand large, complex information spaces to make decisions.
Conclusion
Claude Opus 4.6's 1M context window moving to general availability at standard pricing removes the most significant barrier to enterprise adoption of long-context AI. The combination of industry-leading retrieval accuracy, 6x media capacity, and 128K output length creates a practical tool for teams that need to process, analyze, and act on large information sets. For developers, researchers, and enterprise teams working with substantial data, this update makes Claude the most cost-effective frontier model for long-context workloads.
Key Features
1. 1M token context window now generally available at standard pricing: $5/$25 per million tokens for Opus 4.6, no long-context premium
2. Opus 4.6 scores 78.3% on MRCR v2, the highest among frontier models at 1M context length
3. Media limit increased 6x from 100 to 600 images or PDF pages per request
4. Outputs up to 128K tokens supported, the longest among current Claude models
5. No beta header required for requests exceeding 200K tokens, available on Claude Platform, Microsoft Foundry, and Vertex AI
Key Insights
- The 50% input cost reduction for long-context workloads makes enterprise-scale document processing significantly more affordable
- Opus 4.6's 78.3% MRCR v2 score demonstrates genuine capability improvement, not just context window expansion
- The 76% score on 8-needle 1M MRCR v2 versus Sonnet 4.5's 18.5% shows the generational leap in long-context retrieval
- 600-media support enables full contract portfolio analysis and comprehensive codebase review in single requests
- Standard pricing for 1M context could pressure competitors to eliminate their own long-context premiums
- Claude Code integration gives developers the ability to reason about entire large-scale projects holistically
- The combined 1M input and 128K output pipeline positions Claude for comprehensive analysis workflows
- This pricing move signals Anthropic views extended context as a baseline capability rather than a premium feature
