Google Opens Gemini Deep Research Agent to Developers via New Interactions API
Google makes its autonomous Deep Research Agent available to developers through the Interactions API, powered by Gemini 3.1 Pro with web search and private data capabilities.
Deep Research Leaves the Consumer Sandbox
Google has opened its Gemini Deep Research Agent to developers through a new Interactions API, marking the first time the company's most advanced autonomous research capabilities are available outside of its consumer products. Previously limited to the Gemini App and Google Search, the Deep Research Agent can now be embedded directly into third-party applications.
The agent is powered by Gemini 3.1 Pro and is designed for complex, multi-step information gathering tasks that go beyond simple question-and-answer interactions. It autonomously plans research strategies, executes searches, reads and synthesizes content, and produces cited reports.
How the Deep Research Agent Works
Unlike standard language model API calls that return immediate responses, the Deep Research Agent operates asynchronously. Developers send a research prompt, and the agent independently determines how to investigate the topic, executing multiple rounds of search and analysis before delivering a final report.
The workflow follows a four-stage pattern:
| Stage | Description |
|---|---|
| Planning | Agent analyzes the research prompt and creates an investigation strategy |
| Searching | Agent queries Google Search and reads web pages, iterating as needed |
| Synthesis | Agent combines findings from multiple sources into a coherent analysis |
| Reporting | Agent delivers a detailed report with source citations |
This iterative approach means the agent may perform 80 to 160 web searches per task, reading and processing up to 900,000 input tokens for complex research questions. The entire process can take up to 60 minutes for the most demanding queries.
The Interactions API: A New Interface Pattern
The Interactions API represents a departure from Google's existing Gemini API patterns. Standard API calls use synchronous generate_content methods, but the Deep Research Agent requires background execution through a background=true parameter.
Developers create an interaction, receive an interaction ID, and then either poll for results or stream progress updates. The API provides thinking summaries during execution, giving applications visibility into what the agent is currently investigating.
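The create-then-poll flow described above can be sketched as follows. This is a minimal illustration, not Google's SDK: `FakeInteractionsClient`, `Interaction`, `wait_for_report`, the status strings, and the 10-second poll interval are all assumptions chosen so the example runs offline. Only the agent identifier, the background flag, and the 60-minute ceiling come from the article.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional

AGENT_ID = "deep-research-pro-preview-12-2025"  # identifier quoted in the article
MAX_RUNTIME_S = 60 * 60  # 60-minute ceiling quoted in the article


@dataclass
class Interaction:
    """Hypothetical record shape; the real API's field names may differ."""
    id: str
    status: str  # assumed values: "in_progress", "completed", "failed"
    report: Optional[str] = None


class FakeInteractionsClient:
    """In-memory stand-in for the service so the sketch runs without a network."""

    def __init__(self, polls_until_done: int = 3) -> None:
        self._polls_until_done = polls_until_done
        self._remaining: Dict[str, int] = {}

    def create(self, prompt: str, agent: str, background: bool) -> Interaction:
        if not background:
            raise ValueError("this sketch assumes the agent runs in the background")
        interaction_id = f"int-{len(self._remaining) + 1}"
        self._remaining[interaction_id] = self._polls_until_done
        return Interaction(id=interaction_id, status="in_progress")

    def get(self, interaction_id: str) -> Interaction:
        # Each poll moves the fake task one step closer to completion.
        self._remaining[interaction_id] -= 1
        if self._remaining[interaction_id] > 0:
            return Interaction(id=interaction_id, status="in_progress")
        return Interaction(id=interaction_id, status="completed",
                           report="(cited report text)")


def wait_for_report(client: FakeInteractionsClient,
                    interaction_id: str,
                    poll_interval_s: float = 10.0,
                    timeout_s: float = MAX_RUNTIME_S,
                    sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the interaction completes, fails, or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        current = client.get(interaction_id)
        if current.status == "completed":
            return current.report or ""
        if current.status == "failed":
            raise RuntimeError(f"interaction {interaction_id} failed")
        sleep(poll_interval_s)
    raise TimeoutError(f"interaction {interaction_id} did not finish in time")


if __name__ == "__main__":
    demo_client = FakeInteractionsClient(polls_until_done=3)
    started = demo_client.create("Summarize recent TPU developments",
                                 agent=AGENT_ID, background=True)
    # A no-op sleep keeps the demo instant; real code would actually wait.
    print(wait_for_report(demo_client, started.id, sleep=lambda _s: None))
```

Passing `sleep` as a parameter is a design choice for testability: production code would keep the default and wait between polls, while tests can substitute a no-op.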
Key technical specifications include:
- Agent identifier: deep-research-pro-preview-12-2025
- Maximum execution time: 60 minutes
- Input types: Text, images, PDFs, and video (audio is not supported in the preview)
- Output: Detailed reports with inline source citations
- Streaming: Real-time progress updates available
- Reconnection: Support for network resilience using last_event_id tracking
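The reconnection item above can be illustrated with a small resume loop. This sketch assumes a hypothetical event shape of `(event_id, payload)` with integer IDs and a hypothetical `open(last_event_id)` call that replays only events newer than the given ID; the real stream format and parameter handling may differ. `FlakyStream` simulates a connection that drops after a fixed number of events.

```python
from typing import Callable, Iterator, List, Optional, Tuple

Event = Tuple[int, str]  # (event_id, payload); hypothetical event shape


class FlakyStream:
    """Simulated progress stream that drops the connection every few events."""

    def __init__(self, events: List[Event], drop_every: int) -> None:
        self.events = events
        self.drop_every = drop_every
        self.connections = 0

    def open(self, last_event_id: Optional[int]) -> Iterator[Event]:
        self.connections += 1
        # Replay only the events the caller has not yet acknowledged.
        pending = [e for e in self.events
                   if last_event_id is None or e[0] > last_event_id]
        for sent, event in enumerate(pending):
            if sent == self.drop_every:
                raise ConnectionError("simulated network drop")
            yield event


def consume_with_resume(open_stream: Callable[[Optional[int]], Iterator[Event]],
                        max_reconnects: int = 5) -> List[str]:
    """Read the stream to the end, resuming from the last event seen."""
    last_event_id: Optional[int] = None
    received: List[str] = []
    for _attempt in range(max_reconnects + 1):
        try:
            for event_id, payload in open_stream(last_event_id):
                received.append(payload)
                # Record progress only after the event has been handled.
                last_event_id = event_id
            return received
        except ConnectionError:
            continue  # reconnect and resume after last_event_id
    raise ConnectionError("gave up after repeated disconnects")


if __name__ == "__main__":
    stream = FlakyStream([(1, "planning"), (2, "searching"), (3, "reading"),
                          (4, "synthesizing"), (5, "report ready")],
                         drop_every=2)
    print(consume_with_resume(stream.open))
    print("connections used:", stream.connections)
```

Updating `last_event_id` only after an event has been processed means a drop in the middle of handling causes that event to be replayed rather than lost.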
The API does not currently support custom Function Calling tools or Model Context Protocol (MCP) servers, limiting integration options for applications that need the agent to interact with proprietary systems.
Private Data Integration
One of the most significant capabilities is the File Search tool, which allows developers to feed their own documents into the research process. The agent can combine public web research with analysis of private PDFs, reports, and datasets.
This opens use cases in competitive intelligence, due diligence, academic research, and market analysis where the most valuable insights come from connecting public information with proprietary data.
Pricing and Cost Structure
Google has adopted a pay-as-you-go pricing model based on underlying model usage and tool costs:
| Research Complexity | Estimated Cost | Typical Searches | Input Tokens |
|---|---|---|---|
| Standard | $2-3 per task | ~80 searches | ~250K tokens |
| Complex | $3-5 per task | ~160 searches | ~900K tokens |
At $2-5 per research task, the agent is positioned as a tool for high-value information work rather than casual queries. For comparison, a human analyst performing equivalent research would typically spend 2-4 hours on a single task.
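For budgeting, the per-task ranges in the table translate into a simple monthly estimate. The sketch below uses only the article's estimated figures ($2-3 standard, $3-5 complex); the tier names, the `monthly_cost_range` helper, and the example workload are illustrative assumptions, and actual billing depends on real token and tool usage.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Tier:
    low_usd: float   # lower bound of the article's per-task estimate
    high_usd: float  # upper bound of the article's per-task estimate


# Estimated per-task costs from the pricing table above.
TIERS: Dict[str, Tier] = {
    "standard": Tier(low_usd=2.0, high_usd=3.0),
    "complex": Tier(low_usd=3.0, high_usd=5.0),
}


def monthly_cost_range(tasks_per_tier: Dict[str, int]) -> Tuple[float, float]:
    """Return (low, high) USD totals for a given number of tasks per tier."""
    low = sum(TIERS[name].low_usd * count for name, count in tasks_per_tier.items())
    high = sum(TIERS[name].high_usd * count for name, count in tasks_per_tier.items())
    return low, high


if __name__ == "__main__":
    # Hypothetical workload: 100 standard and 20 complex tasks per month.
    low, high = monthly_cost_range({"standard": 100, "complex": 20})
    print(f"Estimated monthly spend: ${low:.0f} to ${high:.0f}")
```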
DeepSearchQA: A New Benchmark
Alongside the API launch, Google open-sourced DeepSearchQA, a benchmark containing 900 "causal chain" tasks designed to evaluate agent comprehensiveness. Unlike traditional fact-retrieval benchmarks that test whether an agent can find a specific answer, DeepSearchQA tests whether an agent can follow multi-step reasoning chains across multiple sources.
This is significant because existing search benchmarks do not adequately measure the kind of synthesis work that the Deep Research Agent performs. By open-sourcing the benchmark, Google is inviting the research community to evaluate competing approaches against a common standard.
Integration Across Google Services
Google is simultaneously integrating the Deep Research Agent across its own product portfolio:
- Google Search: Enhanced research results for complex queries
- Google Finance: Automated financial analysis and due diligence
- Gemini App: Consumer-facing deep research feature
- NotebookLM: Research assistant for document analysis
This multi-surface deployment means the same underlying agent powers both consumer and developer use cases, with the API providing programmatic access to capabilities that Google users access through product interfaces.
Current Limitations
The preview release has several constraints that developers should consider:
- No custom tools: Cannot call developer-defined functions or MCP servers
- No structured output: Reports are delivered as free-form text, not structured JSON
- No human-in-the-loop planning: The agent determines its own research strategy without developer approval
- Audio input not supported: Despite accepting multimodal inputs, audio files are excluded
- Preview status: The API may change before general availability
The lack of custom tool support is particularly notable, as it means the agent cannot perform actions like querying internal databases, calling proprietary APIs, or interacting with external services during its research process.
Competitive Position
The Deep Research Agent places Google in a unique position among AI providers. While OpenAI, Anthropic, and others offer powerful language models through APIs, none currently provide a comparable autonomous research agent as a developer-accessible service.
Perplexity AI offers search-augmented AI responses, but its approach is more focused on real-time question answering than extended autonomous research. The Deep Research Agent's willingness to spend up to 60 minutes on a single query represents a fundamentally different product category.
What This Means for Developers
The Interactions API opens a new class of AI-powered applications where the value proposition is not instant responses but thorough research. Potential use cases include:
- Legal research platforms that synthesize case law and regulatory information
- Investment analysis tools that compile comprehensive company profiles
- Academic research assistants that conduct literature reviews
- Competitive intelligence systems that monitor and analyze market developments
- Journalism tools that assist with background research and fact-checking
For these applications, the $2-5 per query cost is trivial compared to the value of the output, making the Deep Research Agent commercially viable for professional use cases.
Editor's Verdict
Google's Gemini Deep Research Agent, as exposed through the Interactions API, earns a solid recommendation within the Gemini ecosystem.
The strongest case for paying attention is that autonomous multi-step research eliminates the need for manual query chaining and result synthesis, which raises the bar for what developers should now expect from competing offerings. Reinforcing that, private data integration, which combines proprietary documents with public web research, adds practical value rather than just headline appeal. The broader signal is straightforward: the Deep Research Agent is the first autonomous research agent made available as a developer API from a major AI provider. On the other side of the ledger, the lack of custom Function Calling or MCP server support limits integration with proprietary systems; that is a real constraint, not a marketing footnote, and it should factor into any serious decision. On top of that, preview status means API specifications may change before general availability, which narrows the set of teams for whom this is an obvious yes.
For Google Cloud and Workspace integrators, multimodal-first teams, and Gemini API adopters, this is a serious evaluation candidate, not just a curiosity to bookmark. For everyone else, the safer posture is to monitor coverage and revisit once the use cases that matter to your team are demonstrated in the wild.
Pros
- Autonomous multi-step research eliminates the need for manual query chaining and result synthesis
- Private data integration allows combining proprietary documents with public web research
- Streaming progress updates provide transparency into the agent's research process
- Open-sourced DeepSearchQA benchmark enables objective comparison across competing approaches
- Multi-surface deployment across Google Search, Finance, and NotebookLM validates real-world utility
Cons
- No custom Function Calling or MCP server support limits integration with proprietary systems
- Preview status means API specifications may change before general availability
- 60-minute execution time is unsuitable for real-time or low-latency applications
- No structured output format makes programmatic processing of results more difficult
Key Features
Google's Gemini Deep Research Agent is now available to developers via the new Interactions API in preview. Powered by Gemini 3.1 Pro, it autonomously plans, searches, and synthesizes multi-step research tasks with up to 160 web searches and 900K input tokens per query. It supports multimodal inputs including images, PDFs, and video, and can integrate private data via File Search. Google also open-sourced DeepSearchQA, a 900-task benchmark for evaluating research agent comprehensiveness.
Key Insights
- The Deep Research Agent is the first autonomous research agent made available as a developer API from a major AI provider
- At $2-5 per research task, it is priced for professional applications rather than casual consumer queries
- The Interactions API introduces a new asynchronous pattern that differs from standard synchronous LLM API calls
- DeepSearchQA's 900 causal-chain tasks establish a new evaluation standard for research agent comprehensiveness
- Private data integration via File Search enables combining public web research with proprietary document analysis
- The 60-minute maximum execution time positions this as a fundamentally different product category than real-time AI assistants
- Lack of custom tool support limits integration with proprietary systems in the current preview
