New Models
Qwen2.5-Max
Alibaba launches Qwen2.5-Max flagship model with hidden video gen
Alibaba's Qwen team released Qwen2.5-Max, a large MoE flagship model available through the Qwen Chat interface and API, claiming competitive results against DeepSeek V3 and other frontier models. The chat app also quietly shipped a video generation capability powered by Alibaba's Tongyi Wanxiang.
New Models | Open weights
Qwen2.5-VL
Alibaba ships Qwen2.5-VL open vision-language model family
Alibaba's Qwen team released Qwen2.5-VL, open-weights vision-language models up to 72B that handle images, documents, video understanding, and on-screen agentic grounding. The 72B Instruct model was immediately available on Hugging Face and in Qwen Chat.
72B Largest variant
New Models | Open weights
Tulu 3 405B
Allen Institute releases Tulu 3 405B open post-trained model
The Allen Institute for AI (Ai2) scaled its fully open Tulu 3 post-training recipe to a 405B-parameter model based on Llama 3.1 405B. The release demonstrates that Ai2's open RLVR (reinforcement learning with verifiable rewards) post-training pipeline works at frontier scale, with both weights and recipe released openly.
405B Parameters
New Models | Open weights
Janus Pro
DeepSeek Janus Pro: open multimodal models in 1.5B and 7B
Amid the R1 frenzy, DeepSeek also released Janus Pro, a pair of unified multimodal models at 1.5B and 7B parameters that handle both image understanding and image generation. The open release extended a week in which DeepSeek dominated AI news headlines.
1.5B / 7B Model sizes
New Models | Open weights
YuE 7B
YuE 7B: open-source Suno-style music generation model
The Multimodal Art Projection (M-A-P) team released YuE, a 7B open-source music generation model dubbed the 'open Suno' on the show, capable of generating full songs with vocals from lyrics. Weights are on Hugging Face with code on GitHub and a hosted demo on fal.ai.
7B Parameters
New Models | Open weights
Mistral Small 2501
Mistral Small 2501: 24B open-weights model under Apache 2.0
Mistral AI released Mistral Small 2501, a 24B-parameter instruct model under the permissive Apache 2.0 license. Announced as breaking news during the show, it continues Mistral's tradition of strong small open models suitable for fine-tuning and local deployment.
24B Parameters
New Models | Open weights
Eagle 2
NVIDIA releases Eagle 2 open vision-language models
NVIDIA published Eagle 2, a family of open vision-language models with an accompanying paper, model weights on Hugging Face, and a live demo. The release is fully transparent about its training data strategy and recipes, and the models are competitive with much larger vision models.
New Models | Open weights
UI-TARS
ByteDance UI-TARS: open computer-use models that control your PC
ByteDance released UI-TARS, open computer-use models in 7B and 72B parameter sizes that can control a Mac or PC, with desktop apps for both platforms. ByteDance claims they beat GPT-4-class models on GUI/computer-control benchmarks.
7B / 72B Model sizes
New Models | Open weights
DeepSeek R1
DeepSeek R1: MIT-licensed open-source reasoning model rivals o1
DeepSeek released R1, a state-of-the-art open-source reasoning model under a permissive MIT license. It matches or beats OpenAI's o1 on key reasoning benchmarks while shipping with fully open weights, and DeepSeek also released a family of smaller distilled models. The show called this the hottest week open-source AI has ever had.
New Models
Gemini 2.0 Flash Thinking 01-21
Google ships updated Gemini Flash Thinking with 1M context
Google released an updated Gemini Flash Thinking model (01-21) with a 1-million-token context window, built-in code execution, and better eval results than the previous Thinking release. It pushes Google's reasoning-model line forward in the same week DeepSeek R1 landed.
1M Context window (tokens)
New Models | Open weights
SmolVLM (256M)
Hugging Face SmolVLM: tiny vision-language models run on WebGPU
Hugging Face released SmolVLM, a family of tiny vision-language models including a 256M-parameter version small enough to run entirely in the browser via WebGPU. It demonstrates how far efficient multimodal models have shrunk while remaining usable.
256M Parameters (smallest VLM)
New Models | Open weights
Hunyuan3D 2.0
Tencent Hunyuan3D 2.0: SOTA open-source 3D generation
Tencent released Hunyuan3D 2.0, a state-of-the-art open-source 3D asset generation model, on Hugging Face. It produces high-quality 3D shapes and textures and advances open-weights models in the 3D generation category.