OpenAI Launches GPT-5.4: Computer Use, 1M Token Context, and Tool Search
OpenAI releases GPT-5.4 with native computer control, a 1-million-token context window, and a new Tool Search system that cuts token usage by 47%.
OpenAI's Most Capable Model Yet
On March 5, 2026, OpenAI released GPT-5.4, a new foundation model the company describes as its most capable and efficient frontier model for professional work. It is available in three variants: standard GPT-5.4, GPT-5.4 Thinking (a reasoning-focused version), and GPT-5.4 Pro (optimized for maximum performance). The release consolidates capabilities that were previously spread across separate models into a single unified system.
GPT-5.4 is rolling out to ChatGPT Plus, Team, and Pro subscribers, as well as through the OpenAI API. The model advances three areas: native computer use, a one-million-token context window (the largest OpenAI has offered), and a new approach to tool management that cut token usage by 47% in OpenAI's testing.
Native Computer Use: A First for OpenAI
GPT-5.4 is the first general-purpose OpenAI model that can take direct control of a computer. The model can click, type, and navigate software applications using screenshots and mouse/keyboard commands, without relying on a separate specialized model.
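The screenshot-and-input cycle described above can be illustrated with a small simulation. This is only a sketch of the general observe-decide-act pattern, not OpenAI's API: `FakeDesktop`, `Action`, `scripted_policy`, and `run_loop` are all hypothetical names, and the "policy" is a hard-coded stand-in for the model's decisions.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    kind: str          # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


class FakeDesktop:
    """Hypothetical stand-in for a desktop: records inputs instead of performing them."""

    def __init__(self) -> None:
        self.focused = False
        self.typed = ""
        self.log: List[str] = []

    def screenshot(self) -> str:
        # A real harness would return image bytes; a text summary is enough here.
        return f"focused={self.focused};typed={self.typed}"

    def apply(self, action: Action) -> None:
        if action.kind == "click":
            self.focused = True
            self.log.append(f"click@{action.x},{action.y}")
        elif action.kind == "type":
            self.typed += action.text
            self.log.append(f"type:{action.text}")


def scripted_policy(screenshot: str) -> Action:
    """Hard-coded decisions standing in for the model (an assumption of this sketch)."""
    if "focused=False" in screenshot:
        return Action("click", x=100, y=200)   # focus the text field first
    if screenshot.endswith("typed="):
        return Action("type", text="hello")    # then enter the text
    return Action("done")                      # nothing left to do


def run_loop(env: FakeDesktop, policy: Callable[[str], Action], max_steps: int = 10) -> int:
    """Observe, decide, act; stop when the policy says "done" or the step budget runs out."""
    applied = 0
    for _ in range(max_steps):
        action = policy(env.screenshot())
        if action.kind == "done":
            break
        env.apply(action)
        applied += 1
    return applied


if __name__ == "__main__":
    desktop = FakeDesktop()
    print(run_loop(desktop, scripted_policy), desktop.log)
```

The `max_steps` cap is a design choice of the sketch: any loop that lets a model drive inputs benefits from a hard budget so a confused policy cannot run indefinitely.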
This capability positions GPT-5.4 as a direct competitor to Anthropic's Claude computer use feature, which launched in late 2024. The difference is that GPT-5.4 integrates computer control natively into the same model that handles conversation, coding, and reasoning, rather than requiring a separate tool or model.
On the OSWorld-Verified benchmark, which measures real-world computer use tasks, GPT-5.4 scores 75.0%. This not only exceeds GPT-5.2's score of 47.3% but also surpasses the measured human baseline of 72.4%. On WebArena Verified, another computer use benchmark, GPT-5.4 also sets a new record.
One Million Token Context Window
The API version of GPT-5.4 supports context windows of up to one million tokens, the largest context window ever offered by OpenAI. This is a substantial increase from the 128,000-token limit of GPT-4 Turbo and positions the model for enterprise workflows that require processing large codebases, lengthy legal documents, or extensive research corpora.
The expanded context window is particularly significant for agentic applications, where models need to plan, execute, and verify tasks across long horizons while maintaining coherent state across many interactions.
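To get a feel for what one million tokens means in practice, the sketch below estimates whether a set of documents would fit in a single request. The four-characters-per-token ratio is a rough rule of thumb for English text and an assumption here, not a property of GPT-5.4's tokenizer; `estimate_tokens` and `fits_in_context` are hypothetical helper names.

```python
from typing import Iterable, Tuple

CONTEXT_LIMIT = 1_000_000   # token limit quoted in the article
CHARS_PER_TOKEN = 4         # rough heuristic (assumption); real tokenizers vary


def estimate_tokens(text: str) -> int:
    """Approximate token count using ceiling division on character length."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(docs: Iterable[str], reserve_for_output: int = 0) -> Tuple[int, bool]:
    """Return (estimated input tokens, whether input plus reserved output fits)."""
    total = sum(estimate_tokens(d) for d in docs)
    return total, total + reserve_for_output <= CONTEXT_LIMIT


if __name__ == "__main__":
    corpus = ["x" * 1_200_000, "y" * 800_000]   # two synthetic documents
    print(fits_in_context(corpus, reserve_for_output=50_000))
```

For a real budget check, an exact tokenizer count should replace the heuristic; the point of the sketch is only that the output reservation has to be subtracted from the window before filling it with input.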
Tool Search: A New Approach to Efficiency
Perhaps the most technically innovative feature of GPT-5.4 is Tool Search, a new system for managing tool calling that rethinks how models interact with APIs and external services.
Traditionally, all tool definitions are included in every API request, consuming significant tokens even when most tools are not needed. With Tool Search, GPT-5.4 receives only a lightweight list of available tools along with a search capability. When the model needs to use a specific tool, it dynamically looks up that tool's full definition and appends it to the conversation on demand.
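The deferred-lookup idea described above can be sketched in a few lines. This is an illustration of the general pattern only, not OpenAI's implementation or API: the tool catalog, `lightweight_index`, `search_tools`, and `Conversation` are all hypothetical, and character counts of the serialized JSON stand in for token counts.

```python
import json
from typing import Dict, List

# Hypothetical tool catalog; the names and schemas are invented for illustration.
FULL_DEFINITIONS: Dict[str, dict] = {
    "get_weather": {
        "description": "Look up the current weather for a city.",
        "parameters": {"type": "object", "properties": {
            "city": {"type": "string", "description": "City name"},
            "units": {"type": "string", "enum": ["metric", "imperial"]}},
            "required": ["city"]},
    },
    "send_email": {
        "description": "Send an email message to a recipient.",
        "parameters": {"type": "object", "properties": {
            "to": {"type": "string", "description": "Recipient address"},
            "subject": {"type": "string"},
            "body": {"type": "string"}},
            "required": ["to", "subject", "body"]},
    },
    "create_ticket": {
        "description": "Create a support ticket in the tracker.",
        "parameters": {"type": "object", "properties": {
            "title": {"type": "string"},
            "priority": {"type": "string", "enum": ["low", "medium", "high"]}},
            "required": ["title"]},
    },
}


def lightweight_index(defs: Dict[str, dict]) -> List[dict]:
    """Names and one-line summaries only; this is all that is sent up front."""
    return [{"name": n, "summary": d["description"]} for n, d in defs.items()]


def search_tools(query: str, defs: Dict[str, dict]) -> List[str]:
    """Naive keyword match over names and descriptions (a real system would rank)."""
    words = query.lower().split()
    return sorted(
        name for name, d in defs.items()
        if any(w in name.lower() or w in d["description"].lower() for w in words)
    )


class Conversation:
    """Tracks which full definitions have been appended to the context so far."""

    def __init__(self, defs: Dict[str, dict]) -> None:
        self.defs = defs
        self.loaded: Dict[str, dict] = {}

    def load_tool(self, name: str) -> dict:
        # Append the full definition only the first time it is requested.
        if name not in self.loaded:
            self.loaded[name] = self.defs[name]
        return self.loaded[name]


if __name__ == "__main__":
    convo = Conversation(FULL_DEFINITIONS)
    for match in search_tools("weather", FULL_DEFINITIONS):
        convo.load_tool(match)
    upfront = len(json.dumps(lightweight_index(FULL_DEFINITIONS)))
    everything = len(json.dumps(FULL_DEFINITIONS))
    print(list(convo.loaded), upfront, everything)
```

With only three tools the saving is small; the pattern matters when the catalog holds dozens or hundreds of definitions and most requests touch only a few of them.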
The results are substantial. In testing on 250 tasks from Scale's MCP Atlas benchmark with 36 MCP servers enabled, the Tool Search configuration reduced total token usage by 47% while maintaining accuracy. For developers building complex agentic systems with many tool integrations, this translates directly into lower API costs and faster response times.
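As a back-of-the-envelope illustration of what a 47% token reduction could mean for a bill, consider the arithmetic below. The reduction figure comes from the benchmark result quoted above and may not carry over to other workloads; the baseline token volume and the per-million-token price are made-up placeholders, since this article does not give pricing.

```python
REDUCTION = 0.47  # benchmark-level reduction reported in the article


def tokens_after(baseline_tokens: int, reduction: float = REDUCTION) -> int:
    """Tokens remaining after applying a relative reduction."""
    return round(baseline_tokens * (1 - reduction))


def cost(tokens: int, price_per_million: float) -> float:
    """Cost at a flat per-million-token price (hypothetical pricing model)."""
    return tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    baseline = 10_000_000          # placeholder monthly token volume
    price = 2.00                   # placeholder price per million tokens
    reduced = tokens_after(baseline)
    print(reduced, cost(baseline, price), cost(reduced, price))
```

Actual savings depend on how much of each request was tool definitions to begin with, so a workload with few tools would see far less than the benchmark figure.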
Benchmark Performance
Beyond computer use, GPT-5.4 delivers broad improvements across professional benchmarks:
| Benchmark | GPT-5.4 | GPT-5.2 | Change |
|---|---|---|---|
| OSWorld-Verified | 75.0% | 47.3% | +27.7 points |
| GDPval (knowledge work) | 83.0% | N/A | Record score |
| Per-claim errors | 33% fewer | Baseline | Relative reduction vs. GPT-5.2 |
| Per-response errors | 18% fewer | Baseline | Relative reduction vs. GPT-5.2 |
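The table mixes two kinds of change: the OSWorld-Verified row is a difference in percentage points, while the error figures are relative reductions. The short sketch below shows the difference. The 10% baseline error rate is a hypothetical value chosen for illustration, since the article gives no absolute error rates; the function names are likewise invented.

```python
def point_gain(new_score: float, old_score: float) -> float:
    """Absolute difference between two percentage scores, in points."""
    return round(new_score - old_score, 1)


def reduced_error_rate(baseline_rate: float, relative_reduction: float) -> float:
    """Error rate after a relative reduction (e.g. 0.33 means 33% fewer errors)."""
    return baseline_rate * (1 - relative_reduction)


if __name__ == "__main__":
    print(point_gain(75.0, 47.3))          # GPT-5.4 vs. GPT-5.2 on OSWorld-Verified
    print(point_gain(75.0, 72.4))          # GPT-5.4 vs. the human baseline
    print(reduced_error_rate(0.10, 0.33))  # hypothetical 10% baseline error rate
```

Under that hypothetical baseline, a 33% relative reduction moves the error rate from 10% to roughly 6.7%, which is a drop of about 3.3 percentage points rather than 33.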
The model combines the coding strengths of GPT-5.3-Codex with improved reasoning and new agentic capabilities for autonomous desktop, browser, and application navigation.
Three Model Variants
OpenAI is offering GPT-5.4 in three configurations to serve different use cases:
GPT-5.4 (Standard): The default model for ChatGPT subscribers, balancing capability with speed for everyday tasks including conversation, coding, analysis, and now computer use.
GPT-5.4 Thinking: A reasoning-focused variant that applies extended chain-of-thought processing to complex problems. Designed for tasks requiring multi-step logic, mathematical proofs, or scientific reasoning.
GPT-5.4 Pro: Optimized for maximum performance on the most demanding professional tasks. Available for users who need the highest accuracy on complex enterprise workflows.
Pros
- Native computer use surpasses the human baseline on OSWorld-Verified at 75.0%, making autonomous software navigation practically viable
- Tool Search reduces token usage by 47% in multi-tool scenarios, directly lowering API costs for developers building agentic systems
- One-million-token context window enables processing of entire codebases, legal documents, and research corpora in a single request
- Consolidates coding, reasoning, and agentic capabilities into one model instead of requiring separate specialized models
- 33% reduction in per-claim errors compared to GPT-5.2 demonstrates meaningful progress on hallucination reduction
Cons
- Computer use capabilities remain in early stages, and real-world reliability across diverse software environments is unproven at scale
- The one-million-token context window is API-only, with ChatGPT subscribers likely receiving a smaller limit
- Three model variants (Standard, Thinking, Pro) add complexity for users deciding which version to use
- Pricing details for the Pro variant and extended context windows have not been fully disclosed
Outlook
GPT-5.4 represents OpenAI's clearest statement yet that the future of AI is agentic. By combining computer use, massive context, and efficient tool management in a single model, OpenAI is building the foundation for AI systems that can autonomously complete complex multi-step workflows.
The Tool Search innovation is particularly worth watching. As the AI ecosystem moves toward standardized tool protocols like MCP, the ability to efficiently manage hundreds or thousands of tool definitions becomes a critical infrastructure challenge. GPT-5.4's approach of dynamic tool retrieval could become the standard pattern.
The competitive landscape is intensifying. Anthropic's Claude already offers computer use capabilities, and Google's Gemini is pushing agentic features through Pixel devices. GPT-5.4's benchmark-leading performance on computer use tasks gives OpenAI a strong position, but the real test will be reliability in production deployments.
Conclusion
GPT-5.4 is a significant release that advances the state of the art in three important dimensions: autonomous computer control, context length, and tool efficiency. The model's ability to exceed human performance on computer use benchmarks while simultaneously reducing operational costs through Tool Search makes it compelling for both individual developers and enterprise customers. For teams building agentic AI applications, GPT-5.4 is the most complete single-model solution currently available from any major provider.
Key Features
OpenAI launched GPT-5.4 on March 5, 2026, introducing native computer use that scores 75.0% on OSWorld-Verified (surpassing the 72.4% human baseline), a 1-million-token context window (the largest in OpenAI's history), and Tool Search which reduces token usage by 47% in multi-tool scenarios. The model is available in three variants: Standard, Thinking (reasoning-focused), and Pro (maximum performance). GPT-5.4 consolidates coding capabilities from GPT-5.3-Codex with improved reasoning and agentic desktop navigation, achieving 33% fewer per-claim errors than GPT-5.2.
Key Insights
- GPT-5.4 is the first OpenAI model with native computer use, scoring 75.0% on OSWorld-Verified and surpassing the 72.4% human baseline
- Tool Search dynamically retrieves tool definitions on demand, reducing token usage by 47% across 250 tasks on Scale's MCP Atlas benchmark
- The 1-million-token API context window is the largest OpenAI has ever offered, positioning GPT-5.4 for enterprise-scale document processing
- GPT-5.4 consolidates coding, reasoning, and agentic capabilities that were previously split across GPT-5.3-Codex and other specialized models
- Per-claim error rates dropped 33% compared to GPT-5.2, with overall response errors down 18%
- Three model variants (Standard, Thinking, Pro) allow users to optimize for speed, reasoning depth, or maximum accuracy
- The model scored a record 83% on GDPval, OpenAI's benchmark for knowledge work tasks
- Computer use integration positions GPT-5.4 as a direct competitor to Anthropic Claude's computer use feature
