DeepSeek AI Releases New Image Model Family

DeepSeek Launches New Multimodal AI Models

DeepSeek, the rapidly growing AI company, has unveiled a new suite of multimodal AI models that it claims outperform OpenAI’s DALL-E 3.

The models form a new family named Janus-Pro and are available for download from the AI development platform Hugging Face. They range in size from 1 billion to 7 billion parameters.

Understanding Model Parameters

A model’s parameter count is a rough indicator of its problem-solving capacity: models with more parameters generally perform better than those with fewer.

Janus-Pro is released under an MIT license, which permits commercial use.


Janus-Pro Capabilities and Benchmarks

Described by DeepSeek as a “novel autoregressive framework,” Janus-Pro is capable of both analyzing existing images and generating new ones.

According to the company’s reports, the largest model within the Janus-Pro family, Janus-Pro-7B, surpasses DALL-E 3 on two key AI evaluation benchmarks: GenEval and DPG-Bench.

It also outperforms models like PixArt-alpha, Emu3-Gen, and Stability AI’s Stable Diffusion XL.

Some of the compared models are older, and most Janus-Pro models can currently analyze images only at resolutions up to 384 x 384. Even so, the results are noteworthy given the models’ relatively small size.

“Janus-Pro surpasses previous unified model approaches and achieves performance on par with, or exceeding, task-specific models,” DeepSeek states in its Hugging Face post.

“Its simplicity, flexibility, and effectiveness position it as a strong contender for the next generation of unified multimodal models.”

DeepSeek's Rise and Implications

DeepSeek, a Chinese AI laboratory primarily funded by High-Flyer Capital Management, a quantitative trading firm, gained significant attention this week.

Its chatbot application quickly climbed to the top of the Apple App Store charts.

The company’s language models, developed using computationally efficient methods, have prompted discussions among Wall Street analysts and technologists regarding the United States’ continued leadership in the AI field.

Concerns have also been raised about the sustainability of demand for AI chips.

Correction: A previous iteration of this article incorrectly stated that Janus-Pro models were limited to outputting images at 384 x 384 resolution. This was inaccurate, and we apologize for the mistake.
