Google Opens Gemini Deep Research Agent to Developers via New Interactions API
Google makes its autonomous Deep Research Agent available to developers through the Interactions API, powered by Gemini 3.1 Pro with web search and private data capabilities.
Deep Research Leaves the Consumer Sandbox
Google has opened its Gemini Deep Research Agent to developers through a new Interactions API, marking the first time the company's most advanced autonomous research capabilities are available outside of its consumer products. Previously limited to the Gemini App and Google Search, the Deep Research Agent can now be embedded directly into third-party applications.
The agent is powered by Gemini 3.1 Pro and is designed for complex, multi-step information gathering tasks that go beyond simple question-and-answer interactions. It autonomously plans research strategies, executes searches, reads and synthesizes content, and produces cited reports.
How the Deep Research Agent Works
Unlike standard language model API calls that return immediate responses, the Deep Research Agent operates asynchronously. Developers send a research prompt, and the agent independently determines how to investigate the topic, executing multiple rounds of search and analysis before delivering a final report.
The workflow follows a four-stage pattern:
| Stage | Description |
|---|---|
| Planning | Agent analyzes the research prompt and creates an investigation strategy |
| Searching | Agent queries Google Search and reads web pages, iterating as needed |
| Synthesis | Agent combines findings from multiple sources into a coherent analysis |
| Reporting | Agent delivers a detailed report with source citations |
This iterative approach means the agent may perform 80 to 160 web searches per task, reading and processing up to 900,000 input tokens for complex research questions. The entire process can take up to 60 minutes for the most demanding queries.
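The four-stage loop can be sketched in Python. This is a conceptual illustration only: the real agent runs this logic server-side, and the stub functions below stand in for the model and search calls.

```python
# Conceptual sketch of the plan -> search -> synthesize -> report loop.
# All stubs are illustrative; none of this is Google's implementation.

def plan(prompt):
    # Stage 1: break the research prompt into concrete search questions.
    return [f"{prompt}: background", f"{prompt}: recent developments"]

def search(query):
    # Stage 2: stand-in for a Google Search call; returns page snippets.
    return [f"snippet for '{query}'"]

def synthesize(snippets):
    # Stage 3: merge findings from multiple sources into one analysis.
    return " ".join(snippets)

def report(analysis, sources):
    # Stage 4: attach source citations to the final answer.
    return {"report": analysis, "citations": sources}

def deep_research(prompt):
    queries = plan(prompt)
    findings, sources = [], []
    for q in queries:                 # iterate: search, read, repeat as needed
        findings.extend(search(q))
        sources.append(q)
    return report(synthesize(findings), sources)
```

In the real agent the searching stage iterates many times (80 to 160 searches per task), feeding results back into planning before synthesis begins.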
The Interactions API: A New Interface Pattern
The Interactions API represents a departure from Google's existing Gemini API patterns. Standard API calls use synchronous generate_content methods, but the Deep Research Agent requires background execution through a background=true parameter.
Developers create an interaction, receive an interaction ID, and then either poll for results or stream progress updates. The API provides thinking summaries during execution, giving applications visibility into what the agent is currently investigating.
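The create-then-poll pattern might look like the sketch below. The method and field names (get_status, the "status" and "report" keys) are assumptions for illustration, not the documented Interactions API surface; a stubbed backend stands in for the real service so the polling logic is visible.

```python
import time

def poll_until_done(get_status, interaction_id, interval_s=0.0, max_polls=100):
    """Poll a background interaction until it reports completion or failure."""
    for _ in range(max_polls):
        state = get_status(interaction_id)
        if state["status"] in ("completed", "failed"):
            return state
        time.sleep(interval_s)        # real code would wait tens of seconds
    raise TimeoutError(f"interaction {interaction_id} still running")

# Stub standing in for the real API; completes on the third poll.
_calls = {"n": 0}
def fake_get_status(interaction_id):
    _calls["n"] += 1
    done = _calls["n"] >= 3
    return {"status": "completed" if done else "running",
            "report": "cited report text" if done else None}
```

Given the 60-minute ceiling on execution time, a realistic client would poll on a long interval or subscribe to the streaming progress updates instead.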
Key technical specifications include:
- Agent identifier: deep-research-pro-preview-12-2025
- Maximum execution time: 60 minutes
- Input types: Text, images, PDFs, and video (audio is not supported in the preview)
- Output: Detailed reports with inline source citations
- Streaming: Real-time progress updates available
- Reconnection: Support for network resilience via last_event_id tracking
The API does not currently support custom Function Calling tools or Model Context Protocol (MCP) servers, limiting integration options for applications that need the agent to interact with proprietary systems.
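The last_event_id reconnection mechanism can be sketched as follows. The event shape ("id" and "text" fields) and the open_stream callable are assumptions for illustration; a deliberately flaky stub shows how a client resumes from the last event it saw after a network drop.

```python
# Hedged sketch of resumable streaming using last_event_id tracking.

def consume_stream(open_stream, last_event_id=None, max_retries=3):
    """Read progress events, resuming from the last seen id on a drop."""
    events = []
    retries = 0
    while retries <= max_retries:
        try:
            for event in open_stream(last_event_id):
                events.append(event)
                last_event_id = event["id"]   # remember position for resume
            return events                      # stream ended normally
        except ConnectionError:
            retries += 1                       # reopen from last_event_id
    raise RuntimeError("stream failed after retries")

# Stub stream that drops once mid-way, to exercise the resume path.
def flaky_stream(last_event_id):
    all_events = [{"id": i, "text": f"thinking step {i}"} for i in range(5)]
    start = 0 if last_event_id is None else last_event_id + 1
    for event in all_events[start:]:
        if event["id"] == 2 and last_event_id is None:
            raise ConnectionError("network drop")
        yield event
```

Because a research task can run for up to an hour, surviving transient disconnects without restarting the task is essential for any production client.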
Private Data Integration
One of the most significant capabilities is the File Search tool, which allows developers to feed their own documents into the research process. The agent can combine public web research with analysis of private PDFs, reports, and datasets.
This opens use cases in competitive intelligence, due diligence, academic research, and market analysis where the most valuable insights come from connecting public information with proprietary data.
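A request that mixes public web research with private documents might be composed as below. The field names (tools, google_search, file_search, file_ids) are illustrative assumptions, not the documented schema; only the agent identifier and the background requirement come from the article.

```python
# Hedged sketch: pairing the web-research agent with a File Search store.

def build_research_request(prompt, private_file_ids):
    return {
        "agent": "deep-research-pro-preview-12-2025",
        "background": True,                  # required for async execution
        "input": prompt,
        "tools": [
            {"type": "google_search"},       # public web research
            {"type": "file_search",          # private PDFs, reports, datasets
             "file_ids": list(private_file_ids)},
        ],
    }

req = build_research_request(
    "Compare our Q3 results against public competitor filings",
    ["files/q3-internal-report"],
)
```

A due-diligence tool, for example, could upload a data-room PDF and ask the agent to cross-check its claims against public filings in a single task.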
Pricing and Cost Structure
Google has adopted a pay-as-you-go pricing model based on underlying model usage and tool costs:
| Research Complexity | Estimated Cost | Typical Searches | Input Tokens |
|---|---|---|---|
| Standard | $2-3 per task | ~80 searches | ~250K tokens |
| Complex | $3-5 per task | ~160 searches | ~900K tokens |
At $2-5 per research task, the agent is positioned as a tool for high-value information work rather than casual queries. For comparison, a human analyst performing equivalent research would typically spend 2-4 hours on a single task.
DeepSearchQA: A New Benchmark
Alongside the API launch, Google open-sourced DeepSearchQA, a benchmark containing 900 "causal chain" tasks designed to evaluate agent comprehensiveness. Unlike traditional fact-retrieval benchmarks that test whether an agent can find a specific answer, DeepSearchQA tests whether an agent can follow multi-step reasoning chains across multiple sources.
This is significant because existing search benchmarks do not adequately measure the kind of synthesis work that the Deep Research Agent performs. By open-sourcing the benchmark, Google is inviting the research community to evaluate competing approaches against a common standard.
Integration Across Google Services
Google is simultaneously integrating the Deep Research Agent across its own product portfolio:
- Google Search: Enhanced research results for complex queries
- Google Finance: Automated financial analysis and due diligence
- Gemini App: Consumer-facing deep research feature
- NotebookLM: Research assistant for document analysis
This multi-surface deployment means the same underlying agent powers both consumer and developer use cases, with the API providing programmatic access to capabilities that Google users access through product interfaces.
Current Limitations
The preview release has several constraints that developers should consider:
- No custom tools: Cannot call developer-defined functions or MCP servers
- No structured output: Reports are delivered as free-form text, not structured JSON
- No human-in-the-loop planning: The agent determines its own research strategy without developer approval
- Audio input not supported: Despite accepting multimodal inputs, audio files are excluded
- Preview status: The API may change before general availability
The lack of custom tool support is particularly notable, as it means the agent cannot perform actions like querying internal databases, calling proprietary APIs, or interacting with external services during its research process.
Competitive Position
The Deep Research Agent places Google in a unique position among AI providers. While OpenAI, Anthropic, and others offer powerful language models through APIs, none currently provide a comparable autonomous research agent as a developer-accessible service.
Perplexity AI offers search-augmented AI responses, but its approach is more focused on real-time question answering than extended autonomous research. The Deep Research Agent's willingness to spend up to 60 minutes on a single query represents a fundamentally different product category.
What This Means for Developers
The Interactions API opens a new class of AI-powered applications where the value proposition is not instant responses but thorough research. Potential use cases include:
- Legal research platforms that synthesize case law and regulatory information
- Investment analysis tools that compile comprehensive company profiles
- Academic research assistants that conduct literature reviews
- Competitive intelligence systems that monitor and analyze market developments
- Journalism tools that assist with background research and fact-checking
For these applications, the $2-5 per query cost is trivial compared to the value of the output, making the Deep Research Agent commercially viable for professional use cases.
Pros
- Autonomous multi-step research eliminates the need for manual query chaining and result synthesis
- Private data integration allows combining proprietary documents with public web research
- Streaming progress updates provide transparency into the agent's research process
- Open-sourced DeepSearchQA benchmark enables objective comparison across competing approaches
- Multi-surface deployment across Google Search, Finance, and NotebookLM validates real-world utility
Cons
- No custom Function Calling or MCP server support limits integration with proprietary systems
- Preview status means API specifications may change before general availability
- 60-minute execution time is unsuitable for real-time or low-latency applications
- No structured output format makes programmatic processing of results more difficult
Key Features
Google's Gemini Deep Research Agent is now available to developers via the new Interactions API in preview. Powered by Gemini 3.1 Pro, it autonomously plans, searches, and synthesizes multi-step research tasks with up to 160 web searches and 900K input tokens per query. It supports multimodal inputs including images, PDFs, and video, and can integrate private data via File Search. Google also open-sourced DeepSearchQA, a 900-task benchmark for evaluating research agent comprehensiveness.
Key Insights
- The Deep Research Agent is the first autonomous research agent made available as a developer API from a major AI provider
- At $2-5 per research task, it is priced for professional applications rather than casual consumer queries
- The Interactions API introduces a new asynchronous pattern that differs from standard synchronous LLM API calls
- DeepSearchQA's 900 causal-chain tasks establish a new evaluation standard for research agent comprehensiveness
- Private data integration via File Search enables combining public web research with proprietary document analysis
- The 60-minute maximum execution time positions this as a fundamentally different product category than real-time AI assistants
- Lack of custom tool support limits integration with proprietary systems in the current preview