Sky-T1: Open Source AI Model for Reasoning - Cost Under $450

The Increasing Accessibility of Reasoning AI Models
The development of sophisticated reasoning AI models is becoming progressively simpler and less expensive.
Recently, NovaSky, a research group affiliated with UC Berkeley’s Sky Computing Lab, unveiled Sky-T1-32B-Preview. This model demonstrates performance comparable to an earlier iteration of OpenAI’s o1 across several crucial benchmarks.
A Truly Open Source Approach
Sky-T1 is noteworthy as potentially the first genuinely open source reasoning model, capable of complete replication from its foundational elements. The team has made both the training dataset and the associated training code publicly available.
The researchers highlighted that Sky-T1-32B-Preview was trained at a cost of under $450, illustrating the feasibility of replicating advanced reasoning capabilities in a cost-effective manner.
The Impact of Synthetic Data on Costs
While $450 is hardly nothing, it represents a substantial reduction compared to the multi-million dollar price tags previously associated with training models of similar caliber. The utilization of synthetic training data – data generated by other AI models – has been instrumental in lowering these costs.
For example, Palmyra X 004, a model recently launched by AI company Writer, was trained predominantly on synthetic data and reportedly required a development investment of only $700,000.
Benefits of Reasoning Models
Unlike many conventional AI systems, reasoning models possess an inherent ability to self-verify information, mitigating some of the common errors encountered in other models.
Although reasoning models typically take longer – generally seconds to minutes – to reach a conclusion, they offer increased reliability in specialized fields such as mathematics, physics, and other sciences.
Sky-T1’s Development Process
The NovaSky team initially employed Alibaba’s QwQ-32B-Preview to generate the preliminary training data for Sky-T1.
Subsequently, they refined the data composition and utilized OpenAI’s GPT-4o-mini to restructure the data into a more manageable format. The training of the 32-billion-parameter Sky-T1 model consumed approximately 19 hours on a cluster of 8 Nvidia H100 GPUs. (The number of parameters is often indicative of a model’s problem-solving capacity.)
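The reported budget is easy to sanity-check with back-of-the-envelope arithmetic: 8 GPUs running for roughly 19 hours is 152 GPU-hours, and at an assumed cloud rate of about $3 per H100 GPU-hour (a hypothetical figure for illustration, not one NovaSky published), that lands right around $450:

```python
# Rough cost estimate for Sky-T1's reported training run.
# The hourly rate is an assumption (typical cloud H100 pricing),
# not a number from the NovaSky team.
NUM_GPUS = 8          # Nvidia H100s, per the report
WALL_CLOCK_HOURS = 19  # approximate training time
ASSUMED_RATE_USD = 2.96  # hypothetical cost per GPU-hour

gpu_hours = NUM_GPUS * WALL_CLOCK_HOURS        # 152 GPU-hours
estimated_cost = gpu_hours * ASSUMED_RATE_USD  # ~$450

print(f"{gpu_hours} GPU-hours at ${ASSUMED_RATE_USD}/hr ≈ ${estimated_cost:.0f}")
```

Under those assumptions the claimed figure is entirely consistent with on-demand cloud pricing, which is the point: fine-tuning on curated synthetic data, rather than pretraining from scratch, is what keeps the bill this small.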
Performance Benchmarks
According to NovaSky, Sky-T1 surpasses the performance of an early preview version of o1 on MATH500, a challenging collection of mathematical problems.
The model also outperforms the o1 preview on a set of complex tasks from LiveCodeBench, a coding evaluation platform.
Areas for Improvement
However, Sky-T1’s performance falls short of the o1 preview on GPQA-Diamond, a dataset of PhD-level questions in physics, biology, and chemistry.
It’s also crucial to acknowledge that OpenAI’s generally available (GA) release of o1 is a more powerful model than the preview version, and OpenAI plans to introduce an even more advanced reasoning model, o3, in the near future.
Future Directions
Despite these points, the NovaSky team views Sky-T1 as a foundational step in their ongoing efforts to develop open source models with enhanced reasoning capabilities.
“Our future work will concentrate on creating more efficient models that sustain robust reasoning performance and investigating innovative techniques to improve model efficiency and accuracy during testing,” the team stated. “Further updates on these initiatives will be shared as progress is made.”