Google's New AI Agent vs. OpenAI GPT-5.2: A Deep Dive

Google Unveils Enhanced Gemini Deep Research Agent

On Thursday, Google launched a redesigned version of its research agent, Gemini Deep Research. This updated agent is powered by the advanced Gemini 3 Pro foundation model.

Expanding Research Capabilities

The new agent’s functionality extends beyond simply generating research reports. It now enables developers to integrate Google’s sophisticated research capabilities directly into their own applications.

This integration is facilitated by Google’s new Interactions API. This API provides developers with greater control as the field of agentic AI continues to evolve.

Deep Research: Synthesizing Complex Information

Gemini Deep Research is designed as an agent capable of processing and synthesizing vast amounts of information. It can effectively manage large context dumps within prompts.

Google states that its customers are using the tool for tasks ranging from due diligence to drug toxicity safety research.

Integration with Google Services

Google plans to integrate this new deep research agent into several of its core services. These include Google Search, Google Finance, the Gemini App, and NotebookLM.

The move is part of a broader preparation for a future in which AI agents, rather than humans typing queries, do the information seeking.

Minimizing Hallucinations with Gemini 3 Pro

Deep Research leverages the strengths of Gemini 3 Pro, which Google identifies as its “most factual” model. This model is specifically trained to reduce the occurrence of hallucinations during complex tasks.

AI hallucinations, where large language models fabricate information, are particularly problematic in long-running agentic tasks. Such tasks chain together many autonomous decisions, and a single hallucinated one can invalidate the entire output.
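
To see why long-running tasks are so exposed, consider a rough back-of-the-envelope model. It assumes each step hallucinates independently and at the same rate, which is our simplification and not something Google claims; the function name and the example numbers are likewise illustrative.

```python
def chance_of_any_hallucination(per_step_rate: float, steps: int) -> float:
    """Probability that at least one of `steps` independent decisions goes wrong.

    Assumes (hypothetically) a fixed, independent error rate per step.
    """
    if not 0.0 <= per_step_rate <= 1.0:
        raise ValueError("per_step_rate must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    # All steps are clean with probability (1 - p)^n; the complement is
    # the chance that at least one step hallucinated.
    return 1.0 - (1.0 - per_step_rate) ** steps


if __name__ == "__main__":
    # Illustrative numbers only, not measurements of any real model.
    for steps in (1, 10, 100):
        risk = chance_of_any_hallucination(0.01, steps)
        print(f"{steps:>3} steps at a 1% per-step rate: {risk:.0%} chance of at least one error")
```

Under these assumed numbers, a 1% per-step rate grows to roughly a 63% chance of at least one error across 100 steps, which suggests why small gains in factuality matter more for agents than for single-turn chat.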

Introducing the DeepSearchQA Benchmark

To demonstrate its advancements, Google has introduced a new benchmark called DeepSearchQA. This benchmark is designed to evaluate agents on complex, multi-step information retrieval tasks.

The DeepSearchQA benchmark has been released as an open-source resource.

Performance Evaluation on Existing Benchmarks

Google also tested Deep Research on Humanity’s Last Exam, a challenging benchmark of general knowledge, and BrowseComp, a benchmark focused on browser-based agentic tasks.

As anticipated, Google’s agent performed best on its own benchmark and on Humanity’s Last Exam. However, OpenAI’s ChatGPT 5 Pro proved to be a close competitor, even surpassing Google on BrowseComp.

OpenAI Responds with GPT-5.2 “Garlic”

Google’s announcement coincided with OpenAI’s launch of GPT-5.2, codenamed “Garlic”.

OpenAI asserts that its newest model outperforms rivals, Google among them, on a suite of standard benchmarks, one of which is OpenAI’s own proprietary evaluation.

Strategic Timing of Announcements

The timing of Google’s announcement appears strategic: the company seems to have anticipated the widespread attention that the release of OpenAI’s Garlic would draw.
