LOGO

Deepseek R1 AI Model: Runs on a Single GPU

May 29, 2025
Deepseek R1 AI Model: Runs on a Single GPU

DeepSeek's New AI Models: R1 and its Distilled Version

While the updated R1 reasoning AI model from DeepSeek is currently attracting significant attention within the AI community, the Chinese AI lab has also unveiled a more compact, “distilled” iteration of its R1. This smaller version, known as DeepSeek-R1-0528-Qwen3-8B, is asserted by DeepSeek to outperform models of similar size on specific benchmarks.

Performance Benchmarks of DeepSeek-R1-0528-Qwen3-8B

Built upon the Qwen3-8B model initially released by Alibaba in May, this streamlined R1 demonstrates superior performance compared to Google’s Gemini 2.5 Flash when evaluated on AIME 2025. AIME 2025 is a challenging suite of mathematical problems designed to test reasoning capabilities.

Furthermore, DeepSeek-R1-0528-Qwen3-8B achieves results that are nearly on par with Microsoft’s recently launched Phi 4 reasoning plus model during assessments of mathematical skills using the HMMT test.

Distilled Models: Trade-offs Between Size and Capability

Generally, distilled models, such as DeepSeek-R1-0528-Qwen3-8B, exhibit reduced capabilities when contrasted with their larger counterparts. However, a key advantage lies in their significantly lower computational requirements.

According to NodeShift, a cloud platform, running Qwen3-8B necessitates a GPU equipped with 40GB to 80GB of RAM, like an Nvidia H100. In contrast, the full-sized, updated R1 demands approximately twelve 80GB GPUs.

Training and Availability

DeepSeek developed DeepSeek-R1-0528-Qwen3-8B through a process of fine-tuning the Qwen3-8B model using text generated by the updated R1.

On its Hugging Face page dedicated to the model, DeepSeek positions DeepSeek-R1-0528-Qwen3-8B as suitable “for both academic research on reasoning models and industrial development focused on small-scale models.”

The model is released under a permissive MIT license, allowing for unrestricted commercial use. Currently, several platforms, including LM Studio, provide access to the model via an API.

Key Features

  • Model Name: DeepSeek-R1-0528-Qwen3-8B
  • Foundation Model: Qwen3-8B (Alibaba)
  • License: MIT License
  • Applications: Academic research and industrial development
#Deepseek#R1#AI model#GPU#artificial intelligence#distilled model