
OpenAI Launches Open AI Reasoning Models - New AI Release

August 5, 2025

OpenAI Releases New Open-Weight AI Models

OpenAI announced on Tuesday two new open-weight AI reasoning models with performance comparable to its o-series. Both models are available for free download on the Hugging Face developer platform, the company said.

OpenAI characterizes these models as representing a “state of the art” advancement when evaluated against established benchmarks for open-source models.

Model Specifications and Capabilities

The released models are offered in two distinct sizes. A larger, more powerful gpt-oss-120b model is designed to operate effectively on a single Nvidia GPU.

Additionally, a lighter-weight gpt-oss-20b model is available, capable of running on a standard consumer laptop equipped with 16GB of memory.

A Return to Open Source

This launch signifies OpenAI’s first foray into ‘open’ language models since the release of GPT-2 over five years ago.

During a recent briefing, OpenAI explained that these open models can hand off complex requests to AI models hosted in the cloud.

If the open model encounters a task beyond its capabilities, such as image processing, developers can seamlessly connect it to one of OpenAI’s more advanced, closed-source models.

Shift in Strategy

While OpenAI initially embraced open-source AI models, the company subsequently adopted a predominantly proprietary development approach.

This strategy proved instrumental in establishing a substantial business centered around providing access to its AI models via an API for both enterprises and developers.

However, CEO Sam Altman expressed in January a belief that OpenAI had been “on the wrong side of history” regarding open sourcing its technologies.

Competitive Landscape and External Pressure

OpenAI now faces increasing competition from Chinese AI labs, including DeepSeek, Alibaba’s Qwen, and Moonshot AI, which have developed highly capable and widely adopted open models.

Meta, a previous leader in the open AI sector, has seen its Llama AI models fall behind in performance over the past year.

Furthermore, the Trump administration urged U.S. AI developers in July to increase open-source contributions to foster global adoption of AI aligned with American principles.

Strategic Implications

With the introduction of gpt-oss, OpenAI aims to strengthen its relationships with both developers and the Trump administration.

Both groups have observed the growing prominence of Chinese AI labs within the open-source community.

“Going back to when we started in 2015, OpenAI’s mission is to ensure AGI that benefits all of humanity,” stated Altman in a communication to TechCrunch.

“To that end, we are excited for the world to be building on an open AI stack created in the United States, based on democratic values, available for free to all and for wide benefit.”

Performance Evaluation of the Models

OpenAI set out to make its newly released open-weight models the leaders of the open-source AI landscape, and the company asserts that this goal has been achieved.

Evaluations conducted on the Codeforces platform, utilizing available tools, reveal that gpt-oss-120b attained a score of 2622, while gpt-oss-20b achieved 2516. These results surpass those of DeepSeek’s R1 model, although they fall short of the performance demonstrated by o3 and o4-mini.

When assessed using Humanity’s Last Exam (HLE), a rigorous test encompassing a diverse range of crowdsourced questions (with tools enabled), gpt-oss-120b and gpt-oss-20b registered scores of 19% and 17.3%, respectively.

These scores indicate a performance level below that of o3, yet they exceed the capabilities of prominent open models developed by DeepSeek and Qwen.

A key observation is that OpenAI’s open models exhibit a higher propensity for hallucinations compared to its more recent AI reasoning models, specifically o3 and o4-mini.

Hallucination rates have been rising in OpenAI’s latest AI reasoning models, a trend the company has acknowledged without fully understanding its underlying causes.

According to a published white paper, OpenAI anticipates this behavior, stating that “smaller models inherently possess less comprehensive world knowledge than larger, more advanced models and are therefore more prone to generating inaccurate or fabricated information.”

Testing with PersonQA, OpenAI’s internal benchmark for evaluating a model’s factual accuracy regarding individuals, revealed that gpt-oss-120b and gpt-oss-20b hallucinated in response to 49% and 53% of the questions, respectively.

This represents a more than threefold increase in the hallucination rate when compared to OpenAI’s o1 model (16%) and a higher rate than observed with its o4-mini model (36%).

Developing the New Generation of Models

OpenAI reports that the training methodologies employed for its newly released open models closely mirror those utilized in the development of its proprietary systems. A key aspect of these models is the implementation of a mixture-of-experts (MoE) architecture. This allows for efficient operation by activating only a subset of the total parameters when responding to a specific query.

Specifically, the gpt-oss-120b model, boasting a total of 117 billion parameters, only utilizes approximately 5.1 billion parameters for each token processed. This selective activation contributes to enhanced computational efficiency.
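The MoE idea can be illustrated with a toy sketch: a gating network scores all experts for each token, but only the top-k experts actually compute. The sizes below are chosen purely for illustration and are not the actual gpt-oss architecture.

```python
# Toy mixture-of-experts (MoE) routing sketch. Only the top-k experts
# (a small subset of total parameters) run for each token; the rest
# stay idle, which is where the computational savings come from.
# All sizes here are hypothetical, for illustration only.
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 8   # total expert networks
TOP_K = 2         # experts activated per token
D_MODEL = 16      # hidden dimension

# Each "expert" is just a linear layer in this sketch.
experts = [rng.normal(size=(D_MODEL, D_MODEL)) for _ in range(NUM_EXPERTS)]
router = rng.normal(size=(D_MODEL, NUM_EXPERTS))  # gating network

def moe_forward(token: np.ndarray) -> np.ndarray:
    logits = token @ router                # score every expert
    top = np.argsort(logits)[-TOP_K:]      # select the top-k experts
    weights = np.exp(logits[top])
    weights /= weights.sum()               # softmax over the chosen experts
    # Only the selected experts compute; the others contribute nothing.
    return sum(w * (token @ experts[i]) for w, i in zip(weights, top))

out = moe_forward(rng.normal(size=D_MODEL))
print(out.shape)
```

In a real model each expert is a full feed-forward block and routing happens per layer, but the principle is the same: total parameter count can be large while per-token compute stays small.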

Furthermore, OpenAI leveraged high-compute reinforcement learning (RL) during post-training. This process refines the models' behavior in simulated environments, using substantial Nvidia GPU clusters to teach the models desirable conduct.

Similarities to OpenAI’s o-series

This RL approach is consistent with the techniques used to train OpenAI’s o-series models. The open models also exhibit a comparable chain-of-thought reasoning process, requiring additional processing time and resources to formulate comprehensive answers.

As a direct consequence of this rigorous post-training, OpenAI asserts that its open AI models are particularly well-suited for powering AI agents. These agents can effectively utilize external tools, such as web search functionalities or Python code execution, as integral components of their reasoning process.
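The agent pattern described above can be sketched as a simple loop: the model decides whether a tool is needed, the tool runs, and its result feeds back into the answer. Everything here (the `fake_model` stub, the tool registry) is hypothetical scaffolding, standing in for a real LLM such as gpt-oss.

```python
# Minimal agent-loop sketch: a model may call an external tool (here, a
# toy "Python execution" tool) as part of answering. The model itself is
# a stub; in practice an LLM would produce the tool-call decision.

def python_eval(expr: str) -> str:
    """Toy code-execution tool: evaluate a simple arithmetic expression."""
    return str(eval(expr, {"__builtins__": {}}))

TOOLS = {"python": python_eval}  # hypothetical tool registry

def fake_model(question: str) -> dict:
    """Stub standing in for the LLM: asks for the python tool on math."""
    if any(ch.isdigit() for ch in question):
        return {"tool": "python", "input": question}
    return {"answer": "I can only do arithmetic in this sketch."}

def agent(question: str) -> str:
    decision = fake_model(question)
    if "tool" in decision:
        result = TOOLS[decision["tool"]](decision["input"])
        return f"Tool result: {result}"
    return decision["answer"]

print(agent("2 + 3 * 4"))  # Tool result: 14
```

A production agent would loop until the model emits a final answer and would sandbox any code execution, but the control flow is the same.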

However, it’s important to note that these open models are currently limited to text-based operations. They lack the capacity to process or generate images and audio, a distinction from other models within OpenAI’s portfolio.

Licensing and Data Transparency

OpenAI is distributing both gpt-oss-120b and gpt-oss-20b under the Apache 2.0 license. This is a highly permissive license, granting enterprises the freedom to commercialize these open models without incurring royalty payments or seeking explicit permission from OpenAI.

Training Data Considerations

Despite this open licensing approach, OpenAI has decided against releasing the training data used to create these models. This decision aligns with ongoing legal challenges faced by several AI model providers, including OpenAI, concerning potential copyright infringements in their training datasets.

Addressing Safety Concerns

The release of OpenAI’s open models experienced several postponements, partly due to a focus on safety evaluations. Beyond standard safety protocols, OpenAI conducted investigations, detailed in a white paper, to assess the potential for malicious actors to fine-tune the gpt-oss models for harmful purposes.

These investigations specifically examined the possibility of adapting the models to assist in cyberattacks or the development of biological and chemical weapons.

Evaluation Results

Testing conducted by OpenAI and independent evaluators indicated a marginal increase in potential biological capabilities following fine-tuning. However, no evidence was found to suggest that the open models could reach a “high capability” threshold for danger in these sensitive domains.

Looking Ahead

While OpenAI’s models currently represent a leading edge in open-source AI, the developer community anticipates the release of DeepSeek R2, the next iteration of AI reasoning models. Additionally, a new open model from Meta’s Superintelligence Lab is also eagerly awaited.
