Mar 01, 2026

Google Opens Gemini Deep Research Agent to Developers via New Interactions API

Google makes its autonomous Deep Research Agent available to developers through the Interactions API, powered by Gemini 3.1 Pro with web search and private data capabilities.

#Gemini #Google #Deep Research #Interactions API #Gemini 3.1 Pro

Deep Research Leaves the Consumer Sandbox

Google has opened its Gemini Deep Research Agent to developers through a new Interactions API, marking the first time the company's most advanced autonomous research capabilities are available outside of its consumer products. Previously limited to the Gemini App and Google Search, the Deep Research Agent can now be embedded directly into third-party applications.

The agent is powered by Gemini 3.1 Pro and is designed for complex, multi-step information gathering tasks that go beyond simple question-and-answer interactions. It autonomously plans research strategies, executes searches, reads and synthesizes content, and produces cited reports.

How the Deep Research Agent Works

Unlike standard language model API calls that return immediate responses, the Deep Research Agent operates asynchronously. Developers send a research prompt, and the agent independently determines how to investigate the topic, executing multiple rounds of search and analysis before delivering a final report.

The workflow follows a four-stage pattern:

  • Planning: The agent analyzes the research prompt and creates an investigation strategy
  • Searching: The agent queries Google Search and reads web pages, iterating as needed
  • Synthesis: The agent combines findings from multiple sources into a coherent analysis
  • Reporting: The agent delivers a detailed report with source citations

This iterative approach means the agent may perform 80 to 160 web searches per task, reading and processing up to 900,000 input tokens for complex research questions. The entire process can take up to 60 minutes for the most demanding queries.
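The four stages above amount to a control loop the agent runs server-side. The following is an illustrative in-process simulation of that pattern only, not Google's implementation; plan_research, run_search, and synthesize are invented stand-ins:

```python
# Illustrative simulation of the plan -> search -> synthesize -> report loop.
# All functions are stand-ins; the real agent runs server-side at Google.

def plan_research(prompt):
    # Planning: turn the prompt into sub-questions to investigate.
    return [f"{prompt}: angle {i}" for i in range(3)]

def run_search(query):
    # Searching: stand-in for a Google Search call plus page reading.
    return f"findings for '{query}'"

def synthesize(findings):
    # Synthesis: combine findings from multiple sources into one analysis.
    return " | ".join(findings)

def deep_research(prompt):
    plan = plan_research(prompt)                    # Planning
    findings = [run_search(q) for q in plan]        # Searching (iterative)
    analysis = synthesize(findings)                 # Synthesis
    return {"report": analysis, "citations": plan}  # Reporting

report = deep_research("EV battery supply chain")
print(len(report["citations"]))  # 3
```

In the real agent, the Searching stage loops dozens of times, which is where the 80-160 searches per task come from.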

The Interactions API: A New Interface Pattern

The Interactions API represents a departure from Google's existing Gemini API patterns. Standard API calls use the synchronous generate_content method; the Deep Research Agent instead runs in the background, requested via a background=true parameter.

Developers create an interaction, receive an interaction ID, and then either poll for results or stream progress updates. The API provides thinking summaries during execution, giving applications visibility into what the agent is currently investigating.
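The create-then-poll flow might look like the sketch below. The function names, field names, and "completed" status are assumptions for illustration, and the service is simulated in-process so the polling logic can run standalone:

```python
import itertools
import time

# In-process stand-in for the Interactions service: each poll advances
# the simulated job one phase, then it stays "completed".
_PHASES = itertools.chain(["planning", "searching", "synthesizing"],
                          itertools.repeat("completed"))

def create_interaction(prompt, background=True):
    # Assumed shape: the client gets back an ID it can poll with.
    return {"id": "int_123", "status": "planning"}

def get_interaction(interaction_id):
    return {"id": interaction_id, "status": next(_PHASES),
            "report": "Cited findings..."}

# Client side: create the interaction, then poll until the agent finishes.
job = create_interaction("Map the EV battery supply chain")
while True:
    job = get_interaction(job["id"])
    if job["status"] == "completed":
        break
    time.sleep(0)  # a real client would back off for seconds or minutes

print(job["report"])
```

Since tasks can run up to 60 minutes, a production client would poll with generous backoff or use the streaming interface instead.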

Key technical specifications include:

  • Agent identifier: deep-research-pro-preview-12-2025
  • Maximum execution time: 60 minutes
  • Input types: Text, images, PDFs, and video (audio is not yet supported)
  • Output: Detailed reports with inline source citations
  • Streaming: Real-time progress updates available
  • Reconnection: Support for network resilience using last_event_id tracking
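The last_event_id reconnection listed above follows the familiar resumable-stream pattern: the client records the ID of each event it processes and, after a dropped connection, asks the server to replay from that point. A minimal in-process sketch, with the event shape invented for illustration:

```python
# Server side: an append-only event log keyed by increasing IDs.
EVENTS = [
    {"id": 1, "type": "thinking", "text": "Planning search strategy"},
    {"id": 2, "type": "thinking", "text": "Reading source 3 of 12"},
    {"id": 3, "type": "report", "text": "Final cited report"},
]

def stream_events(last_event_id=0):
    # Replay every event after last_event_id, as a resumable stream would.
    for event in EVENTS:
        if event["id"] > last_event_id:
            yield event

# Client side: consume one event, "drop" the connection, then resume.
seen = []
stream = stream_events()
seen.append(next(stream))   # got event 1, then the connection drops
last_id = seen[-1]["id"]

for event in stream_events(last_event_id=last_id):  # reconnect and resume
    seen.append(event)

print([e["id"] for e in seen])  # [1, 2, 3]
```

The key property is that nothing is lost or duplicated across the reconnect, which matters for jobs that can run an hour.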

The API does not currently support custom Function Calling tools or Model Context Protocol (MCP) servers, limiting integration options for applications that need the agent to interact with proprietary systems.

Private Data Integration

One of the most significant capabilities is the File Search tool, which allows developers to feed their own documents into the research process. The agent can combine public web research with analysis of private PDFs, reports, and datasets.

This opens use cases in competitive intelligence, due diligence, academic research, and market analysis where the most valuable insights come from connecting public information with proprietary data.
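Conceptually, File Search adds a second retrieval source alongside web search, and the report's citations can mix both. A toy sketch of that merge; the corpora and search function are invented, not the File Search API:

```python
# Two retrieval sources: public web results and a private document store.
WEB = {"ev batteries": ["reuters.com/ev-supply", "iea.org/battery-report"]}
FILES = {"ev batteries": ["internal/2025-supplier-audit.pdf"]}

def search(corpus, query):
    return [{"source": src, "query": query} for src in corpus.get(query, [])]

def research(query):
    # Combine public web findings with private File Search hits,
    # tagging each citation with its origin.
    web_hits = [dict(hit, kind="web") for hit in search(WEB, query)]
    file_hits = [dict(hit, kind="file") for hit in search(FILES, query)]
    return web_hits + file_hits

citations = research("ev batteries")
print(sum(1 for c in citations if c["kind"] == "file"))  # 1
```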

Pricing and Cost Structure

Google has adopted a pay-as-you-go pricing model based on underlying model usage and tool costs:

Research Complexity   Estimated Cost   Typical Searches   Input Tokens
Standard              $2-3 per task    ~80 searches       ~250K tokens
Complex               $3-5 per task    ~160 searches      ~900K tokens

At $2-5 per research task, the agent is positioned as a tool for high-value information work rather than casual queries. For comparison, a human analyst performing equivalent research would typically spend 2-4 hours on a single task.
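Using the ranges in the table above, a rough budgeting helper might look like this. The per-complexity cost ranges come straight from the table; the helper itself is illustrative:

```python
# Cost ranges per task in USD, taken from the pricing table above.
COST_RANGES = {
    "standard": (2.0, 3.0),   # ~80 searches, ~250K input tokens
    "complex":  (3.0, 5.0),   # ~160 searches, ~900K input tokens
}

def monthly_budget(tasks_per_day, complexity="standard", days=30):
    """Return the (low, high) estimated monthly spend for a task volume."""
    low, high = COST_RANGES[complexity]
    total = tasks_per_day * days
    return (total * low, total * high)

# Ten complex research tasks per day lands between $900 and $1,500 a month.
print(monthly_budget(10, complexity="complex"))  # (900.0, 1500.0)
```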

DeepSearchQA: A New Benchmark

Alongside the API launch, Google open-sourced DeepSearchQA, a benchmark containing 900 "causal chain" tasks designed to evaluate agent comprehensiveness. Unlike traditional fact-retrieval benchmarks that test whether an agent can find a specific answer, DeepSearchQA tests whether an agent can follow multi-step reasoning chains across multiple sources.

This is significant because existing search benchmarks do not adequately measure the kind of synthesis work that the Deep Research Agent performs. By open-sourcing the benchmark, Google is inviting the research community to evaluate competing approaches against a common standard.

Integration Across Google Services

Google is simultaneously integrating the Deep Research Agent across its own product portfolio:

  • Google Search: Enhanced research results for complex queries
  • Google Finance: Automated financial analysis and due diligence
  • Gemini App: Consumer-facing deep research feature
  • NotebookLM: Research assistant for document analysis

This multi-surface deployment means the same underlying agent powers both consumer and developer use cases, with the API providing programmatic access to capabilities that Google users access through product interfaces.

Current Limitations

The preview release has several constraints that developers should consider:

  • No custom tools: Cannot call developer-defined functions or MCP servers
  • No structured output: Reports are delivered as free-form text, not structured JSON
  • No human-in-the-loop planning: The agent determines its own research strategy without developer approval
  • Audio input not supported: Despite accepting multimodal inputs, audio files are excluded
  • Preview status: The API may change before general availability

The lack of custom tool support is particularly notable, as it means the agent cannot perform actions like querying internal databases, calling proprietary APIs, or interacting with external services during its research process.

Competitive Position

The Deep Research Agent places Google in a unique position among AI providers. While OpenAI, Anthropic, and others offer powerful language models through APIs, none currently provide a comparable autonomous research agent as a developer-accessible service.

Perplexity AI offers search-augmented AI responses, but its approach is more focused on real-time question answering than extended autonomous research. The Deep Research Agent's willingness to spend up to 60 minutes on a single query represents a fundamentally different product category.

What This Means for Developers

The Interactions API opens a new class of AI-powered applications where the value proposition is not instant responses but thorough research. Potential use cases include:

  • Legal research platforms that synthesize case law and regulatory information
  • Investment analysis tools that compile comprehensive company profiles
  • Academic research assistants that conduct literature reviews
  • Competitive intelligence systems that monitor and analyze market developments
  • Journalism tools that assist with background research and fact-checking

For these applications, the $2-5 per query cost is trivial compared to the value of the output, making the Deep Research Agent commercially viable for professional use cases.

Pros

  • Autonomous multi-step research eliminates the need for manual query chaining and result synthesis
  • Private data integration allows combining proprietary documents with public web research
  • Streaming progress updates provide transparency into the agent's research process
  • Open-sourced DeepSearchQA benchmark enables objective comparison across competing approaches
  • Multi-surface deployment across Google Search, Finance, and NotebookLM validates real-world utility

Cons

  • No custom Function Calling or MCP server support limits integration with proprietary systems
  • Preview status means API specifications may change before general availability
  • 60-minute execution time is unsuitable for real-time or low-latency applications
  • No structured output format makes programmatic processing of results more difficult


Key Features

Google's Gemini Deep Research Agent is now available to developers via the new Interactions API in preview. Powered by Gemini 3.1 Pro, it autonomously plans, searches, and synthesizes multi-step research tasks with up to 160 web searches and 900K input tokens per query. It supports multimodal inputs including images, PDFs, and video, and can integrate private data via File Search. Google also open-sourced DeepSearchQA, a 900-task benchmark for evaluating research agent comprehensiveness.

Key Insights

  • The Deep Research Agent is the first autonomous research agent made available as a developer API from a major AI provider
  • At $2-5 per research task, it is priced for professional applications rather than casual consumer queries
  • The Interactions API introduces a new asynchronous pattern that differs from standard synchronous LLM API calls
  • DeepSearchQA's 900 causal-chain tasks establish a new evaluation standard for research agent comprehensiveness
  • Private data integration via File Search enables combining public web research with proprietary document analysis
  • The 60-minute maximum execution time positions this as a fundamentally different product category than real-time AI assistants
  • Lack of custom tool support limits integration with proprietary systems in the current preview
