Everything AI Released in January 2025

27 releases covered live on the show — every model, product, paper and tool that mattered, with links and our analysis.

All months February 2025 →

🧠 New Models 12

Alibaba (Qwen) Jan 30, 2025

New Models

Qwen2.5-Max

Alibaba launches Qwen2.5-Max flagship model with hidden video gen

Alibaba's Qwen team released Qwen2.5-Max, a large MoE flagship model available through the Qwen Chat interface and API, claiming competitive results against DeepSeek V3 and other frontier models. The chat app also quietly shipped a video generation capability powered by Alibaba's Tongyi Wanxiang.

X announcement ↗Try it (Qwen Chat) ↗Tongyi Wanxiang ↗

🎙️ Hear our coverage →

#frontier-models #video-gen

Alibaba (Qwen) Jan 30, 2025

New ModelsOpen weights

Qwen2.5-VL

Alibaba ships Qwen2.5-VL open vision-language model family

Alibaba's Qwen team released Qwen2.5-VL, open-weights vision-language models up to 72B that handle images, documents, video understanding, and on-screen agentic grounding. The 72B Instruct model was immediately available on Hugging Face and in Qwen Chat.

72B Largest variant

Project blog ↗Hugging Face ↗GitHub ↗Try it (Qwen Chat) ↗

🎙️ Hear our coverage →

#vision #open-source #multimodal

Allen Institute for AI (Ai2) Jan 30, 2025

New ModelsOpen weights

Tulu 3 405B

Allen Institute releases Tulu 3 405B open post-trained model

The Allen Institute for AI scaled its fully open Tulu 3 post-training recipe to a 405B-parameter model based on Llama 3.1 405B. It demonstrates that Ai2's open RLVR post-training pipeline works at frontier scale, with weights and recipe released openly.

405B Parameters

Blog ↗Hugging Face collection ↗

🎙️ Hear our coverage →

#open-source #training

DeepSeek Jan 30, 2025

New ModelsOpen weights

Janus Pro

DeepSeek Janus Pro: open multimodal models in 1.5B and 7B

Amid the R1 frenzy, DeepSeek also released Janus Pro, unified multimodal models at 1.5B and 7B parameters that handle both image understanding and image generation. The open release added to DeepSeek's week of dominating AI news headlines.

1.5B / 7B Model sizes

GitHub ↗Try it (HF Space) ↗

🎙️ Hear our coverage →

#open-source #image-gen #multimodal

M M-A-P (Multimodal Art Projection) Jan 30, 2025

New ModelsOpen weights

YuE 7B

YuE 7B: open-source Suno-style music generation model

The Multimodal Art Projection (M-A-P) team released YuE, a 7B open-source music generation model dubbed the 'open Suno' on the show, capable of generating full songs with vocals from lyrics. Weights are on Hugging Face with code on GitHub and a hosted demo on fal.ai.

7B Parameters

Demo (fal.ai) ↗Hugging Face ↗GitHub ↗

🎙️ Hear our coverage →

#voice-ai #audio #open-source

Mistral AI Jan 30, 2025

New ModelsOpen weights

Mistral Small 2501

Mistral Small 2501: 24B open-weights model under Apache 2.0

Mistral AI released Mistral Small 2501, a 24B-parameter instruct model under the permissive Apache 2.0 license. Announced as breaking news during the show, it continues Mistral's tradition of strong small open models suitable for fine-tuning and local deployment.

24B Parameters

Hugging Face ↗

🎙️ Hear our coverage →

#open-source #on-device

NVIDIA Jan 30, 2025

New ModelsOpen weights

Eagle 2

NVIDIA releases Eagle 2 open vision-language models

NVIDIA published Eagle 2, a family of open vision-language models with an accompanying paper, model weights on Hugging Face, and a live demo. It is a fully transparent VLM release covering training data strategy and recipes, competitive with much larger vision models.

Paper ↗Models (HF collection) ↗Demo ↗

🎙️ Hear our coverage →

#vision #open-source #multimodal

ByteDance Jan 23, 2025

New ModelsOpen weights

UI-TARS

ByteDance UI-TARS: open computer-use models that control your PC

ByteDance released UI-TARS, open computer-use models in 7B and 72B parameter sizes that can control a Mac or PC, with desktop apps for both platforms. ByteDance claims they beat GPT-4-class models on GUI/computer-control benchmarks.

7B / 72B Model sizes

UI-TARS-7B-SFT on Hugging Face ↗UI-TARS desktop on GitHub ↗

🎙️ Hear our coverage →

#agents #open-source

DeepSeek Jan 23, 2025

New ModelsOpen weights

DeepSeek R1

DeepSeek R1: MIT-licensed open source reasoning model rivals o1

DeepSeek released R1, a state-of-the-art open source reasoning model under a permissive MIT license. It matches or beats OpenAI's o1 on key reasoning benchmarks while being fully open weights, and DeepSeek also shipped a family of distilled smaller models. The show called this the hottest week open source AI has ever had.

DeepSeek on Hugging Face ↗Combine DeepSeek R1 reasoning with GPT-3.5 Turbo (egghead) ↗Run DeepSeek with more thinking (Gist) ↗

🎙️ Hear our coverage →

#open-source #reasoning

Google DeepMind Jan 23, 2025

New Models

Gemini 2.0 Flash Thinking 01-21

Google ships updated Gemini Flash Thinking with 1M context

Google released an updated Gemini Flash Thinking model (01-21) with a 1 million token context window, built-in code execution, and improved evals over the previous Thinking release. It pushes Google's reasoning-model line forward in the same week DeepSeek R1 landed.

1M Context window (tokens)

Noam Shazeer announcement on X ↗

🎙️ Hear our coverage →

#reasoning #architecture

Hugging Face Jan 23, 2025

New ModelsOpen weights

SmolVLM (256M)

Hugging Face SmolVLM: tiny vision-language models run on WebGPU

Hugging Face released SmolVLM, a family of tiny vision-language models including a 256M-parameter version small enough to run entirely in the browser via WebGPU. It demonstrates how far efficient multimodal models have shrunk while remaining usable.

256M Parameters (smallest VLM)

SmolVLM-256M WebGPU demo on Hugging Face ↗

🎙️ Hear our coverage →

#vision #open-source #on-device

Tencent Jan 23, 2025

New ModelsOpen weights

Hunyuan3D 2.0

Tencent Hunyuan3D 2.0: SOTA open source 3D generation

Tencent released Hunyuan3D 2.0, a state-of-the-art open source 3D asset generation model on Hugging Face. It produces high-quality 3D shapes and textures and pushes open weights forward in the 3D generation category.

Hunyuan3D-2 on Hugging Face ↗

🎙️ Hear our coverage →

#world-models #open-source

🚀 Products & Apps 2

Riffusion Jan 30, 2025

Products & Apps

Fuzz

Riffusion launches Fuzz music generation, free for now

Riffusion (written as 'Refusion' in the show notes) launched Fuzz, a hosted AI music generation product that is free to use during its initial period. It was highlighted in the voice and audio segment alongside YuE as part of a wave of new AI music tools.

Fuzz (free for now) ↗

🎙️ Hear our coverage →

#audio #voice-ai

OpenAI Jan 23, 2025

Products & Apps

Operator

OpenAI launches Operator, an agentic browser for ChatGPT Pro

OpenAI launched Operator, an agentic browser-use product that performs tasks for you on the web, available to ChatGPT Pro subscribers at operator.chatgpt.com. As Sam Altman framed it on the launch stream: you give agents a task and they go off and do it.

operator.chatgpt.com ↗

🎙️ Hear our coverage →

✨ Major Features & Updates 2

Exa Jan 30, 2025

Major Features & Updates

Exa DeepSeek Chat

Exa ships free DeepSeek R1 chat demo with web search

Exa integrated DeepSeek R1 into a free hosted chat demo that combines the reasoning model with Exa's web search. Mentioned in the tools section as a no-cost way to try R1 grounded with live search results.

🎙️ Hear our coverage →

#reasoning #search #agents

Perplexity Jan 30, 2025

Major Features & Updates

Perplexity Pro with R1

Perplexity adds DeepSeek R1 as a Pro reasoning model option

Perplexity integrated DeepSeek R1 into its Pro search product, letting subscribers choose R1 as the reasoning model behind answers. It was one of several tools that raced to host R1 on Western infrastructure within days of the model's release.

🎙️ Hear our coverage →

#reasoning #search #agents

🔌 APIs & Platforms 2

Anthropic Jan 23, 2025

APIs & Platforms

Citations (Claude API)

Anthropic adds Citations to the Claude API

Anthropic launched a Citations capability in the Claude API, letting Claude ground its answers in provided source documents and return precise citations. It targets RAG and document-QA use cases where verifiable sourcing matters.

Anthropic Citations docs ↗

🎙️ Hear our coverage →

Perplexity Jan 23, 2025

APIs & Platforms

Sonar Pro Search API

Perplexity ships Sonar Pro search API and an Android AI assistant

Perplexity released its Sonar Pro search-grounded API, giving developers programmatic access to Perplexity-style web-grounded answers, and also launched an AI assistant for Android. Two shipping moves that push Perplexity beyond its consumer answer engine.

Perplexity announcement on X ↗

🎙️ Hear our coverage →

#search #api #consumer-ai

🛠️ Dev Tools 4

Block Jan 30, 2025

Dev ToolsOpen weights

Goose

Block open-sources Goose, a local AI agent framework

Block (the company behind Square) released Goose, an open-source local agent framework that runs on your machine and can use any LLM to execute tasks with tools. It was a centerpiece of the show's agents discussion as an open alternative for building autonomous workflows locally.

X announcement ↗GitHub / docs ↗

🎙️ Hear our coverage →

#agents #open-source #coding

Browser Use Jan 30, 2025

Dev ToolsOpen weights

Browser-use

Browser-use: open-source alternative to OpenAI's Operator

Browser-use is an open-source library that lets LLM agents control a real web browser, positioned on the show as the OSS counterpart to OpenAI's Operator. It enables anyone to build browsing agents with their model of choice instead of a closed hosted product.

🎙️ Hear our coverage →

#agents #open-source

ByteDance Jan 23, 2025

Dev Tools

Trae

ByteDance launches Trae, an AI IDE competing with Cursor

ByteDance launched Trae, an AI-powered code editor positioned as a Cursor competitor. It is ByteDance's second shipping move of the week alongside the UI-TARS computer-use models.

🎙️ Hear our coverage →

P Pietro Schirano Jan 23, 2025

Dev ToolsOpen weights

RAT (Retrieval Augmented Thinking)

RAT: pipe DeepSeek R1 reasoning into other models

Guest Pietro Schirano released RAT (Retrieval Augmented Thinking), a technique and tool that extracts DeepSeek R1's reasoning traces and feeds them to a cheaper, faster model like GPT-3.5 Turbo for the final answer. It showcases the new pattern of mixing open reasoning traces with closed completion models.

RAT announcement on X ↗Combine DeepSeek R1 reasoning with GPT-3.5 Turbo (egghead) ↗

🎙️ Hear our coverage →

#reasoning #coding

📄 Papers & Research 1

UC Berkeley Jan 30, 2025

Papers & ResearchOpen weights

TinyZero & RAGEN

Berkeley TinyZero and RAGEN replicate DeepSeek R1-Zero

Berkeley researchers released TinyZero and RAGEN, open replications of DeepSeek's R1-Zero reinforcement-learning recipe on small models. The projects showed that R1-style emergent reasoning behavior can be reproduced cheaply, with training runs logged publicly on Weights & Biases.

GitHub ↗W&B logs ↗

🎙️ Hear our coverage →

#reasoning #training #open-source

📦 Datasets 1

O Open Thoughts Jan 30, 2025

DatasetsOpen weights

OpenThoughts-114k

Open Thoughts releases OpenThoughts-114k reasoning dataset

An open reasoning dataset with 114k examples released by the Open Thoughts project to fuel open replication of reasoning models like DeepSeek R1. It gives the open-source community high-quality chain-of-thought training data for distilling and fine-tuning reasoning LLMs.

X announcement ↗Hugging Face ↗

🎙️ Hear our coverage →

#open-source #reasoning #training

📊 Benchmarks & Evals 1

Center for AI Safety & Scale AI Jan 23, 2025

Benchmarks & Evals

Humanity's Last Exam (HLE)

Humanity's Last Exam: a deliberately unsaturated frontier benchmark

Humanity's Last Exam (HLE) launched as a new, very hard benchmark designed to stay unsaturated as models max out MMLU and math evals. It crowdsourced expert-level questions to measure frontier model capability where existing benchmarks are at 98-99% saturation.

Humanity's Last Exam website ↗

🎙️ Hear our coverage →

#benchmarks #reasoning

💰 Funding 1

OpenAI (with SoftBank & Oracle) Jan 23, 2025

Funding

Stargate Project

Stargate Project: $500B AI infrastructure investment announced

OpenAI, SoftBank (Masayoshi Son's Vision Fund), and Oracle (Larry Ellison) announced the Stargate Project, a planned $500 billion investment in US AI infrastructure. The announcement, made alongside the White House, was framed on the show as an AI 'Manhattan Project'-scale buildout of datacenters and compute.

$500B Planned investment

OpenAI: Announcing the Stargate Project ↗

🎙️ Hear our coverage →

#infrastructure #industry

🌀 Also Released 1

Weights & Biases Jan 23, 2025

Also Released

W&B SWE-bench Verified SOTA agent

W&B programming agent breaks SOTA on SWE-bench Verified

Weights & Biases announced a state-of-the-art AI programming agent built with OpenAI's o1 that broke the SOTA score on SWE-bench Verified. The work was developed and tracked with W&B Weave, the team's LLM observability toolkit.

W&B SOTA programming agent report ↗W&B Weave ↗

🎙️ Hear our coverage →

#coding #agents #benchmarks

All months February 2025 →