OpenAI Launches General Purpose Agent in ChatGPT

OpenAI is releasing a general-purpose AI agent built directly into ChatGPT that is designed to carry out a wide range of computer-based tasks on a user’s behalf.

According to OpenAI, the agent can automatically manage a user’s schedule, create editable presentations and slideshows, and run code.

Key Capabilities of ChatGPT Agent

The new tool, called ChatGPT agent, combines capabilities from OpenAI’s earlier agentic tools. It has the website navigation skills of Operator and the information synthesis capabilities of Deep Research, which can condense data from numerous websites into a concise report.

Users will be able to interact with the agent through simple, natural language prompts within the ChatGPT interface.

ChatGPT agent begins rolling out on Thursday to subscribers of OpenAI’s Pro, Plus, and Team plans. Users can activate it by selecting “agent mode” from the tool dropdown menu in ChatGPT.

A Significant Step Towards Agentic AI

This launch signifies OpenAI’s most ambitious endeavor yet to transform ChatGPT into a fully agentic product. The goal is to enable the AI to proactively undertake actions and relieve users of tasks, rather than merely providing answers to queries.

In recent years, numerous Silicon Valley companies, including OpenAI, Google, and Perplexity, have introduced AI agents with similar promises. However, initial iterations of these agents have often struggled with complex tasks, falling short of the ambitious vision presented by tech leaders.

OpenAI asserts that ChatGPT agent represents a substantial improvement in capability compared to its previous agent offerings.

Enhanced Functionality Through Connectors and APIs

The new agent can use ChatGPT connectors, which let users link applications such as Gmail and GitHub, so that it can pull in information relevant to a user’s prompt.

Furthermore, ChatGPT agent has access to a terminal and can utilize APIs to interact with specific applications.

OpenAI says users can ask ChatGPT agent to handle tasks such as “planning and procuring ingredients for a Japanese breakfast for four,” or “analyzing three competitors and generating a slide deck.”

Completing these tasks requires the agent to interpret websites, formulate a plan of action, and use several tools, a level of complexity that OpenAI has not previously attempted with its agents.

Performance Benchmarks

According to OpenAI, the underlying model powering ChatGPT agent demonstrates state-of-the-art performance on several key benchmarks.

The agent achieves a score of 41.6% on Humanity’s Last Exam (pass@1), a challenging test encompassing thousands of questions across over one hundred subjects. This is approximately double the scores attained by OpenAI’s o3 and o4-mini models.
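
For readers unfamiliar with the metric, pass@1 is the share of questions a model answers correctly on its first attempt. The sketch below shows the calculation on a made-up list of five results; the function name and the data are illustrative assumptions, not part of OpenAI’s evaluation code.

```python
def pass_at_1(first_attempt_correct: list[bool]) -> float:
    """Fraction of questions answered correctly on the first attempt."""
    if not first_attempt_correct:
        raise ValueError("need at least one graded question")
    return sum(first_attempt_correct) / len(first_attempt_correct)


# Hypothetical grading results for five questions (not real benchmark data).
results = [True, False, False, True, False]
print(f"pass@1 = {pass_at_1(results):.1%}")  # prints "pass@1 = 40.0%"
```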

On FrontierMath, a notoriously difficult mathematical benchmark, ChatGPT agent scores 27.4% when equipped with tools, including a terminal for code execution. This surpasses the previous state-of-the-art score of 6.3% achieved by o4-mini.
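
To put the FrontierMath gap in perspective, the two reported scores can be compared both as a percentage-point difference and as a ratio. The snippet below uses only the figures quoted above; the helper name is an assumption made for illustration.

```python
# Scores as reported by OpenAI, in percent.
FRONTIERMATH_AGENT = 27.4
FRONTIERMATH_O4_MINI = 6.3


def compare(new: float, old: float) -> tuple[float, float]:
    """Return (percentage-point gain, ratio of new score to old score)."""
    return new - old, new / old


gain_pts, ratio = compare(FRONTIERMATH_AGENT, FRONTIERMATH_O4_MINI)
# prints "FrontierMath: +21.1 points, 4.3x the previous best"
print(f"FrontierMath: +{gain_pts:.1f} points, {ratio:.1f}x the previous best")
```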

Safety Considerations and Safeguards

OpenAI emphasizes that ChatGPT agent was developed with a strong focus on safety, acknowledging the potential for misuse due to its enhanced capabilities.

In a safety report, OpenAI designates the model as “high capability” in the domains of biological and chemical weapons, as defined by its Preparedness Framework. This categorization indicates the potential for the model to “amplify existing pathways to severe harm.”

While OpenAI states it has no direct evidence of such risks, it is adopting a precautionary approach and implementing new safeguards.

Real-Time Monitoring and Memory Disablement

These safeguards include a real-time monitor that analyzes every prompt entered into ChatGPT agent. The system identifies requests related to biology and then assesses whether the agent’s response could be used to create a biological threat.
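
The two-stage flow described above, first flagging biology-related prompts and then assessing the response, can be sketched roughly as follows. This is not OpenAI’s implementation, which has not been published in this form; the keyword lists, function names, and `Verdict` type are all hypothetical, and a production monitor would presumably rely on trained classifiers rather than keyword matching.

```python
import re
from dataclasses import dataclass

# Hypothetical keyword lists standing in for real classifiers.
BIOLOGY_TERMS = {"pathogen", "virus", "toxin", "bacteria"}
RISK_TERMS = {"synthesize", "weaponize", "aerosolize"}


@dataclass
class Verdict:
    biology_related: bool
    blocked: bool


def _words(text: str) -> set[str]:
    """Lowercase the text and split it into alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def monitor(prompt: str, draft_response: str) -> Verdict:
    """Stage 1: is the prompt about biology? Stage 2: is the response risky?"""
    if not _words(prompt) & BIOLOGY_TERMS:
        # Non-biology prompts skip the second stage entirely.
        return Verdict(biology_related=False, blocked=False)
    risky = bool(_words(draft_response) & RISK_TERMS)
    return Verdict(biology_related=True, blocked=risky)


print(monitor("How does a virus replicate?", "It uses the host cell's machinery."))
```

The point of the split is that the second, more expensive check runs only on the subset of prompts the first stage flags.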

To further mitigate risks, OpenAI has disabled ChatGPT’s memory feature for this agent. The move is intended to prevent prompt injection attacks in which malicious actors could use the feature to exfiltrate sensitive data. The company may consider re-enabling the feature in the future.

Despite the promising features of ChatGPT agent, its real-world effectiveness remains to be fully evaluated. Agent technology has historically proven fragile when interacting with complex, real-world scenarios.

However, OpenAI believes it has developed a more capable model that can deliver on the long-held promise of AI agents.

This story has been updated to include additional details.
