
AI2 Small AI Model Beats Google & Meta - Performance Comparison

May 1, 2025

Recent Release of the Olmo 2 1B AI Model

This week has seen a surge of smaller AI model releases.

On Thursday, the nonprofit AI research institute Ai2 announced the release of Olmo 2 1B, a 1-billion-parameter model that Ai2 says outperforms comparable models from Google, Meta, and Alibaba across multiple benchmarks.

Model Details and Accessibility

Parameters, often called weights, are the internal values a model learns during training; they determine how it behaves.

Olmo 2 1B is available under the permissive Apache 2.0 license on the AI development platform Hugging Face. A key distinction of Olmo 2 1B is its replicability; Ai2 has released the code and datasets – Olmo-mix-1124 and Dolmino-mix-1124 – used in its development.

Advantages of Smaller Models

While smaller models may not match the capabilities of larger ones, they offer a significant advantage: they do not require powerful hardware to run.

This increased accessibility makes them ideal for developers and enthusiasts working with limited resources or standard consumer-grade machines.
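To make "limited resources" concrete, a rough back-of-the-envelope calculation (an illustrative sketch, not a figure published by Ai2) shows why a 1-billion-parameter model fits comfortably on consumer hardware:

```python
# Approximate memory footprint of model weights at common numeric precisions.
# Illustrative estimate only: real usage also includes activations,
# the KV cache, and framework overhead.

def weight_memory_gb(num_params: int, bytes_per_param: int) -> float:
    """Approximate size of the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9

PARAMS = 1_000_000_000  # a 1-billion-parameter model such as Olmo 2 1B

for label, nbytes in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1)]:
    print(f"{label}: ~{weight_memory_gb(PARAMS, nbytes):.0f} GB")
```

At 16-bit precision the weights alone occupy roughly 2 GB, well within the memory of an ordinary laptop, which is why models of this size can run without specialized hardware.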

Recent Launches and Capabilities

Numerous small model launches have occurred recently, including Microsoft's Phi 4 reasoning family and Alibaba's Qwen 2.5 Omni 3B.

Like Olmo 2 1B, most of these models can function efficiently on contemporary laptops or even mobile devices.

Training Data and Token Count

Ai2 reports that Olmo 2 1B was trained using a dataset of 4 trillion tokens. These tokens were sourced from publicly available materials, AI-generated content, and manually created sources.

A token represents a fundamental unit of data processed by the model; one million tokens is roughly equivalent to 750,000 words.
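Using that rough conversion, the scale of the training set works out as follows (a simple arithmetic illustration based on the figures quoted above):

```python
# Convert the reported training-set size from tokens to an approximate
# word count, using the rough ratio of 750,000 words per million tokens.

TOKENS_TRAINED = 4_000_000_000_000   # 4 trillion tokens, as reported by Ai2
WORDS_PER_MILLION_TOKENS = 750_000   # rough conversion quoted above

approx_words = TOKENS_TRAINED // 1_000_000 * WORDS_PER_MILLION_TOKENS
print(f"~{approx_words:,} words")  # about 3 trillion words
```

In other words, the 4-trillion-token dataset corresponds to roughly 3 trillion words of text.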

Benchmark Performance

On the GSM8K benchmark, which assesses arithmetic reasoning, Olmo 2 1B achieved a higher score than Google’s Gemma 3 1B, Meta’s Llama 3.2 1B, and Alibaba’s Qwen 2.5 1.5B.

Furthermore, Olmo 2 1B demonstrated superior performance on TruthfulQA, a test designed to evaluate factual accuracy, when compared to the same three models.

Potential Risks and Limitations

Ai2 cautions that Olmo 2 1B is not without potential risks.

Like all AI models, it is capable of generating “problematic outputs,” including content that is harmful, sensitive, or factually incorrect. Consequently, Ai2 advises against deploying Olmo 2 1B in commercial applications.

Tags: AI2, small AI model, Google, Meta, AI performance, machine learning