GPT-4.5 'Orion': OpenAI Unveils Its Most Powerful AI Model

GPT-4.5: OpenAI Launches New AI Model

Update, 2:40 pm PT: OpenAI has revised the documentation for GPT-4.5. The updated white paper no longer includes a line from the original version stating that “GPT-4.5 is not a frontier AI model.” A link to the original white paper is available for reference.

Introducing GPT-4.5, Code-named Orion

On Thursday, OpenAI officially announced the release of GPT-4.5, the highly anticipated AI model previously known as Orion. This represents OpenAI’s largest model to date, having been trained with significantly more computational resources and data than any of its predecessors.

Not Classified as a Frontier Model

Despite its substantial size and capabilities, OpenAI clarifies in a published white paper that GPT-4.5 is not currently categorized as a frontier model.

Access and Availability

Subscribers to ChatGPT Pro, OpenAI’s $200-a-month premium plan, will get access to GPT-4.5 in ChatGPT starting Thursday as part of a research preview. Developers on paid tiers of the OpenAI API can also use the model starting Thursday.

ChatGPT Plus and ChatGPT Team subscribers should receive access sometime next week, an OpenAI spokesperson told TechCrunch.

The Significance of Orion

The unveiling of Orion has garnered considerable attention within the industry, with many viewing it as an indicator of the continued effectiveness of conventional AI training methodologies. GPT-4.5’s development mirrored the approach used for GPT-4, GPT-3, GPT-2, and GPT-1 – namely, a substantial increase in computing power and data during an initial “pre-training” phase utilizing unsupervised learning.

Scaling and Performance

Historically, each successive GPT generation has demonstrated significant performance improvements across various domains, including mathematics, writing, and coding, as a direct result of scaling. OpenAI reports that the increased scale of GPT-4.5 has resulted in “a deeper world knowledge” and “higher emotional intelligence.”

However, evidence suggests that the benefits derived from scaling data and computational resources may be diminishing. On several established AI benchmarks, GPT-4.5’s performance is surpassed by newer AI models focused on “reasoning” from companies like DeepSeek (a Chinese AI startup), Anthropic, and even OpenAI itself.

Cost Considerations

OpenAI acknowledges that operating GPT-4.5 is exceptionally expensive. The company is currently evaluating the long-term feasibility of offering GPT-4.5 through its API. The API pricing for GPT-4.5 is set at $75 per million input tokens (approximately 750,000 words) and $150 per million output tokens.

GPT-4o, by contrast, costs $2.50 per million input tokens and $10 per million output tokens, which makes GPT-4.5 30 times more expensive for input and 15 times more expensive for output.
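
To make the price gap concrete, the short Python sketch below applies the per-million-token prices quoted above to a sample workload. The workload size (1 million input tokens and 200,000 output tokens) and the names `PRICES_PER_MILLION` and `request_cost` are illustrative assumptions, not figures or code from OpenAI.

```python
# Illustrative cost comparison using the per-million-token prices quoted in
# the article. The workload below is a made-up assumption, not OpenAI data.

PRICES_PER_MILLION = {
    "GPT-4.5": {"input": 75.00, "output": 150.00},
    "GPT-4o": {"input": 2.50, "output": 10.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of a workload at the quoted API prices."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 1M input tokens and 200k output tokens.
workload = {"input_tokens": 1_000_000, "output_tokens": 200_000}

for model in PRICES_PER_MILLION:
    print(f"{model}: ${request_cost(model, **workload):.2f}")

# Price ratios between the two models, per token type.
input_ratio = PRICES_PER_MILLION["GPT-4.5"]["input"] / PRICES_PER_MILLION["GPT-4o"]["input"]
output_ratio = PRICES_PER_MILLION["GPT-4.5"]["output"] / PRICES_PER_MILLION["GPT-4o"]["output"]
print(f"Input price ratio: {input_ratio:.0f}x, output price ratio: {output_ratio:.0f}x")
```

Because input and output tokens are priced differently, the overall multiple for a given workload depends on its mix of the two, and falls somewhere between the two ratios.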

A Research Preview

“We’re sharing GPT‐4.5 as a research preview to better understand its strengths and limitations,” OpenAI stated in a blog post shared with TechCrunch. “We’re still exploring what it’s capable of and are eager to see how people use it in ways we might not have expected.”

Varied Performance Characteristics

OpenAI clarifies that GPT-4.5 isn't intended as a direct substitute for GPT-4o, the foundational model powering the majority of its API services and ChatGPT. While GPT-4.5 incorporates functionalities such as file and image uploading, and compatibility with ChatGPT’s canvas feature, it currently lacks support for the realistic, bidirectional voice capabilities found in GPT-4o.

However, GPT-4.5 outperforms GPT-4o, as well as many other models.

On OpenAI’s SimpleQA benchmark, which tests AI models on straightforward factual questions, GPT-4.5 is more accurate than GPT-4o and OpenAI’s reasoning models o1 and o3-mini. The company states that GPT-4.5 also hallucinates less often, which in theory makes it less likely to fabricate responses.

One of OpenAI’s highest-performing reasoning models, deep research, was not included in the SimpleQA evaluation. An OpenAI representative told TechCrunch that deep research’s performance on this benchmark has not been publicly reported and that the company does not consider it a meaningful comparison. Notably, Perplexity’s Deep Research model, which performs similarly to OpenAI’s deep research on other benchmarks, outperforms GPT-4.5 on this test of factual recall.

When evaluated on a selection of coding challenges using the SWE-Bench Verified benchmark, GPT-4.5 achieves performance levels comparable to GPT-4o and o3-mini, but remains behind OpenAI’s deep research and Anthropic’s Claude 3.7 Sonnet. On the SWE-Lancer benchmark, which assesses an AI’s ability to create complete software features, GPT-4.5 exceeds the performance of both GPT-4o and o3-mini, though it doesn’t surpass deep research.

GPT-4.5 doesn’t quite match the capabilities of prominent AI reasoning models like o3-mini, DeepSeek’s R1, and Claude 3.7 Sonnet (a hybrid model) on challenging academic benchmarks such as AIME and GPQA. However, it equals or surpasses leading non-reasoning models on these same tests, indicating strong performance in mathematical and scientific problem-solving.

OpenAI also asserts that GPT-4.5 exhibits qualitative advantages over other models in areas not easily measured by benchmarks, specifically in understanding user intent. The model is described as responding with a more engaging and natural tone, and excelling in creative applications like content creation and design.

During an informal evaluation, OpenAI asked GPT-4.5, GPT-4o, and o3-mini to draw a unicorn in SVG, an image format that describes graphics with formulas and code. GPT-4.5 was the only model to produce an image that recognizably resembled a unicorn.

In a further test, the models were prompted with the statement, “I’m going through a tough time after failing a test.” While both GPT-4o and o3-mini provided helpful information, GPT-4.5’s response was deemed the most socially sensitive and appropriate.

“We anticipate a more comprehensive understanding of GPT-4.5’s capabilities as it is released,” OpenAI stated in its blog post, “recognizing that standard academic benchmarks don’t always reflect practical utility.”

The Questioning of Scaling Laws in AI Development

OpenAI claims that GPT-4.5 sits at the leading edge of unsupervised learning. That may be true, but the model’s limitations also appear to support experts’ predictions that conventional pre-training scaling laws will eventually stop holding.

Ilya Sutskever, a co-founder and previously the chief scientist at OpenAI, stated in December that a saturation point regarding data has been reached. He further indicated that pre-training, as currently practiced, is inevitably approaching its conclusion.

These statements align with anxieties voiced by AI investors, founders, and researchers to TechCrunch in a November report, concerning the challenges inherent in pre-training. The limitations encountered are prompting a shift in strategy.

As a response to these pre-training obstacles, the AI sector – including OpenAI – is increasingly focusing on reasoning models. These models, while slower in task completion compared to their non-reasoning counterparts, generally demonstrate greater reliability.

A key strategy involves allocating increased time and computational resources to allow AI reasoning models to thoroughly analyze and resolve complex problems. This approach is expected to yield substantial improvements in overall model performance.

OpenAI intends to integrate its GPT model series with its "o" reasoning series, with the anticipated release of GPT-5 later this year marking the beginning of this integration.

GPT-4.5 was reportedly very expensive to train, was delayed several times, and failed to fully satisfy internal benchmarks, and it may not reach the highest levels of AI performance on its own. However, OpenAI likely views it as a crucial step toward more advanced capabilities.
