Open Source & Open Weights

Open-weight model releases, open datasets, and the open-source AI ecosystem. — 282 releases covered on the show.

July 2026

Mistral AI Jul 8, 2026

New ModelsOpen weights

Robostral Navigate

Mistral releases Robostral Navigate, its first embodied-navigation model

An 8B robotics model that guides robots through natural-language task instructions using a single RGB camera, claiming state of the art on the R2R-CE benchmark. Mistral's first move into embodied AI, and one of the week's most-discussed releases on Hacker News.

8B ParametersSOTA R2R-CE benchmark

X announcement ↗Blog ↗

🎙️ Hear our coverage →

#robotics #open-source

P PyTorch Jul 8, 2026

Dev ToolsOpen weights

PyTorch 2.13

PyTorch 2.13 lands FlexAttention on Apple Silicon and big memory wins

3,328 commits from 526 contributors: FlexAttention on Apple Silicon at roughly 12x over SDPA for sparse patterns, a deterministic CUDA backward path, nn.LinearCrossEntropyLoss with up to 4x peak-memory reduction, torchcomms for large-cluster training, and expanded ROCm/Arm/XPU support.

~12x FlexAttention on Apple Silicon vs SDPA3,328 Commits from 526 contributors

X announcement ↗

🎙️ Hear our coverage →

#open-source #training #infrastructure

Cohere Jul 7, 2026

New ModelsOpen weights

Transcribe Arabic

Cohere open-sources Transcribe Arabic, topping the Arabic ASR leaderboard

A 2B-parameter Apache 2.0 speech-to-text model that leads the Hugging Face Arabic ASR leaderboard at 25.87 WER — about 11 points better than Whisper Large V3 — with human evaluators preferring it in roughly 96% of head-to-head tests. Handles dialect variety, code-switching and Arabic-English bilingual speech, with day-0 mlx-audio support.

25.87 WER (leaderboard #1)2B Parameters, Apache 2.096% Human preference vs Whisper

X announcement ↗

🎙️ Hear our coverage →

#voice-ai #open-source #multilingual

Liquid AI Jul 7, 2026

Papers & ResearchOpen weights

Antidoom

Liquid AI open-sources Antidoom, removing the reasoning doom-loop

An open method that suppresses the failure mode where reasoning models spiral into repetitive degenerate output: doom-loop rates dropped from 22.9% to 1% on Qwen3.5-4B and from 10.2% to 1.4% on an LFM2.5 checkpoint, with eval scores improving across the board.

22.9%→1% Doom-loop rate, Qwen3.5-4B

X announcement ↗

🎙️ Hear our coverage →

#reasoning #open-source #training

S Shanghai AI Lab Jul 7, 2026

New ModelsOpen weights

Agents-A1

Shanghai AI Lab releases Agents-A1, an Apache 2.0 agentic MoE

A 35B MoE built on Qwen3.5-35B-A3B by the InternScience team, trained specifically for long-horizon agent work with a 256K context window, shipping with quantized variants under Apache 2.0.

35B MoE parameters256K Context window

X announcement ↗

🎙️ Hear our coverage →

#agents #open-source

E Exo Labs Jul 2, 2026

Products & Apps

local.ai

Exo Labs launches local.ai to track the local-AI frontier

Announced live on ThursdAI at AI Engineer: local.ai tracks the best model for your hardware, the performance trade versus the cloud, and whether running local beats API-token pricing. Early access is live with signup codes, and the Exo CLI — 'vLLM for consumer devices, with the configs figured out for you' — ships in the coming weeks.

71% Terminal Bench 2.1, REAP-pruned GLM 5.2550B Nemotron-3 Ultra running on 4 NVIDIA Sparks

🎙️ Hear our coverage →

#on-device #open-source #infrastructure

Meituan Jul 2, 2026

New ModelsOpen weights

LongCat-2.0

Meituan reveals LongCat-2.0, a 1.6T MoE trained entirely on Chinese ASICs

Meituan disclosed LongCat-2.0, a 1.6-trillion-parameter MoE trained entirely on Chinese ASICs without NVIDIA hardware. It scores 59.5 on SWE-bench Pro and runs at $0.038 per million tokens with free cache hits. The model had been serving anonymously as 'Owl Alpha' and ranks among OpenRouter's top models by volume — part of a surge that puts Chinese open-weight models at ~30% of global usage, up from 1.2% eleven months ago.

1.6T MoE parameters, no NVIDIA in training59.5 SWE-bench Pro$0.038 per 1M tokens, free cache hits

🎙️ Hear our coverage →

#open-source #frontier-models

June 2026

Moonshot AI Jun 18, 2026

New ModelsOpen weights

Kimi K2.7 Code

Moonshot AI open-sources Kimi K2.7 Code for agentic coding

Moonshot AI open-sourced Kimi K2.7 Code, a trillion-parameter MoE coding model with benchmark jumps over K2.6 and fewer reasoning tokens. On the show it landed as the second half of the open-source coding wave beside GLM-5.2.

1T MoE parameters30% fewer reasoning tokens

Kimi announcement on X ↗Kimi K2.7 Code on Hugging Face ↗Kimi Code beta ↗

🎙️ Hear our coverage →

#open-source #coding #agents

Z.ai (Zhipu AI) Jun 18, 2026

New ModelsOpen weights

GLM-5.2

Z.ai releases GLM-5.2, a 753B open MoE with 1M context

Z.ai released GLM-5.2 as a major open-source coding and agentic model: a 753B-parameter MoE, MIT-licensed, with a one-million-token context window. The episode treated it as the open-source model that arrived exactly as Fable access disappeared, with strong coding and agentic performance close to the frontier.

753B parameters1M context windowMIT license

Z.ai announcement on X ↗GLM-5.2 blog ↗GLM-5.2 on Hugging Face ↗GLM-5.2 docs ↗

🎙️ Hear our coverage →

#open-source #coding #agents

Google DeepMind Jun 4, 2026

New ModelsOpen weights

Gemma 4 12B

Google drops Gemma 4 12B, an encoder-free multimodal local model

Google released Gemma 4 12B, an encoder-free multimodal model under Apache 2.0 that targets 16GB VRAM local setups. Instead of bolting separate vision or audio encoders onto a language model, it uses one unified network, which LDJ and Yam argued makes smaller multimodal models cheaper, cleaner, and easier to run locally.

X announcement ↗Hugging Face ↗

🎙️ Hear our coverage →

#open-source #multimodal #on-device

H Company Jun 4, 2026

New ModelsOpen weights

Holo 3.1

H Company launches Holo 3.1 local computer-use agent models

H Company released Holo 3.1, a family of local computer-use agent models ranging from 0.8B to 35B parameters with new quantized checkpoints. The lineup targets running screen-driving agents on local hardware rather than in the cloud.

X announcement ↗Blog ↗

🎙️ Hear our coverage →

#agents #open-source

Ideogram Jun 4, 2026

New ModelsOpen weights

Ideogram 4.0

Ideogram 4.0 becomes the top open-weight text-to-image model

Ideogram released Ideogram 4.0, a 9.3B-parameter text-to-image model with open weights under a non-commercial license. It leads open-weight image models on typography and layout, with bounding-box/layout-style prompting that trades casual generation ease for precise structured control.

9.3B Ideogram 4 parameters

Blog ↗Hugging Face Collection ↗Hugging Face (FP8) ↗X announcement ↗

🎙️ Hear our coverage →

#image-gen #open-source

JetBrains Jun 4, 2026

New ModelsOpen weights

Mellum 2

JetBrains open-sources Mellum 2, a 12B MoE coding model

JetBrains released Mellum 2, a 12B mixture-of-experts coding model with only 2.5B active parameters, trained from scratch by a small team using a three-stage curriculum over 10T tokens. The panel read it as IDE companies converting years of developer-workflow context into model advantage; it is also available on CoreWeave Inference.

Blog ↗Hugging Face ↗X announcement ↗CoreWeave Inference ↗

🎙️ Hear our coverage →

#coding #open-source

Nous Research Jun 4, 2026

Products & Apps

Hermes Desktop

Nous Research launches Hermes Desktop agent app for Mac/Win/Linux

Nous Research launched Hermes Desktop, packaging the Hermes Agent harness into a native desktop app for Mac, Windows, and Linux. Karan previewed chat, permissions, tool-call visibility, reasoning traces, and admin controls aimed at small teams, startups, and personal agent fleets.

X announcement ↗Site ↗

🎙️ Hear our coverage →

#agents #coding #open-source

NVIDIA Jun 4, 2026

New ModelsOpen weights

Nemotron 3.5 ASR

NVIDIA ships Nemotron 3.5 ASR, a 600M streaming speech model

NVIDIA released Nemotron 3.5 ASR, a 600M-parameter open multilingual streaming speech-to-text model aimed at voice agents. It supports 40 languages and reportedly delivers 17x more throughput than Parakeet-style baselines at half the size, pushing the latency/accuracy frontier for open voice-agent infrastructure.

17x Nemotron ASR throughput

Hugging Face ↗X announcement ↗STT Benchmark ↗Voice Agent Repo ↗

🎙️ Hear our coverage →

#voice-ai #open-source

NVIDIA Jun 4, 2026

New ModelsOpen weights

Nemotron 3 Ultra

NVIDIA releases Nemotron 3 Ultra, a 550B open-weight MoE for agents

NVIDIA dropped Nemotron 3 Ultra the day of the show, a 550B-parameter sparse MoE with 55B active parameters built for long-running agentic harnesses like OpenCode, Hermes, and OpenClaw. Chris Alexiuk joined to explain the hybrid Mamba/Transformer architecture and the unusually complete open release: weights, training data, recipes, a GenRM reward model, and an NVFP4 quantized checkpoint.

550B Nemotron 3 Ultra parameters55B Active parameters

Announcement ↗Technical Report ↗Hugging Face (post-trained BF16) ↗X announcement ↗

🎙️ Hear our coverage →

#open-source #agents #reasoning

May 2026

OpenBMB May 28, 2026

New ModelsOpen weights

MiniCPM5-1B

OpenBMB MiniCPM5-1B: new SOTA 1B open-weights model

OpenBMB released MiniCPM5-1B, a state-of-the-art 1B-parameter open-weights model for efficient local and on-device use that runs on a phone. It scores 17.9 on the Artificial Analysis Intelligence Index, 7.4 points ahead of its size class, while using roughly 31x fewer output tokens than Qwen3.5 2B.

17.9 AAII (1B model)

OpenBMB MiniCPM5-1B on Hugging Face ↗MiniCPM5-1B paper ↗Artificial Analysis on MiniCPM5-1B ↗OpenBMB announcement ↗

🎙️ Hear our coverage →

#open-source #on-device

O OpenMOSS May 28, 2026

New ModelsOpen weights

MOSS-TTS-v1.5

MOSS-TTS-v1.5: open-source 8B TTS with 31 languages

OpenMOSS shipped MOSS-TTS-v1.5, an 8B open-source text-to-speech model supporting 31 languages with pause control, released under Apache 2.0. It is one of the larger fully open TTS models available.

MOSS-TTS-v1.5 on Hugging Face ↗MOSS-TTS GitHub ↗MOSS-TTS paper ↗MOSS announcement ↗

🎙️ Hear our coverage →

#voice-ai #open-source

P PrismML May 28, 2026

New ModelsOpen weights

Bonsai Image 4B

PrismML's 1-bit Bonsai Image 4B runs local image gen under 1GB

PrismML released 1-bit and ternary versions of Bonsai Image 4B, a sub-1GB diffusion transformer for local image generation. The quantized model even runs in-browser via WebGPU and ships with an iOS app and a Hugging Face demo.

PrismML Bonsai Image 4B — blog ↗PrismML Bonsai on Hugging Face ↗Bonsai Image demo ↗Bonsai Studio iOS app ↗

🎙️ Hear our coverage →

#image-gen #on-device #infrastructure

Tencent May 28, 2026

New ModelsOpen weights

Hy-MT2

Tencent open-sources Hy-MT2 translation models under Apache 2.0

Tencent released the Hy-MT2 family of translation models under Apache 2.0, including a tiny 1.8B model that beats paid translation APIs like Microsoft's Translator, plus a larger 30B-A3B MoE variant. A small, free, locally-runnable model outperforming commercial translation services was one of the open-source wins of the week.

Tencent Hy-MT2 1.8B ↗Tencent Hy-MT2 30B-A3B ↗Hy-MT2 paper ↗Tencent Hunyuan announcement ↗

🎙️ Hear our coverage →

#open-source #multilingual

Cohere May 21, 2026

New ModelsOpen weights

Command A+

Cohere releases Command A+, a 218B Apache 2.0 MoE with 25B active params

Cohere released Command A+, a 218B-parameter mixture-of-experts model with 25B active parameters, shipping open weights under Apache 2.0. It was the week's headline open-source release, available on Hugging Face in both W4A4 quantized and BF16 variants.

218B Command A+ parameters25B active parameters

Cohere blog ↗Nick Frosst ↗HF W4A4 ↗HF BF16 ↗

🎙️ Hear our coverage →

#open-source #architecture

Nous Research May 21, 2026

Papers & ResearchOpen weights

Lighthouse Attention

Nous Research publishes Lighthouse Attention for fast long-context pretraining

Nous Research released Lighthouse Attention, a sparse attention method for long-context pretraining that delivers major speedups. The release includes a blog post, an arXiv paper and an open-source GitHub implementation.

Blog ↗Nous Research on X ↗arXiv ↗GitHub ↗

🎙️ Hear our coverage →

#research #architecture #open-source

F Fastino Labs May 14, 2026

New ModelsOpen weights

GLiGuard

Fastino Labs GLiGuard: 300M open guardrail model matches SOTA safety models

Fastino Labs released GLiGuard, a 300M-parameter open source guardrail model that matches state-of-the-art safety models 23-90x its size while delivering 16x higher throughput. It ships under Apache 2.0, making small, fast, deployable guardrails available to everyone.

300M parameters

X announcement ↗GitHub ↗

🎙️ Hear our coverage →

#open-source #safety

Meta AI May 14, 2026

New ModelsOpen weights

Sapiens2

Meta Sapiens2: family of 6 human-centric vision models (0.1B-5B)

Meta released Sapiens2, a family of six ViT models ranging from 0.1B to 5B parameters trained on 1 billion human images. The models set SOTA on human-centric vision tasks including pose estimation, segmentation, surface normals, and pointmaps, with weights on Hugging Face.

X announcement ↗Hugging Face collection ↗

🎙️ Hear our coverage →

#vision #open-source

April 2026

DeepSeek Apr 30, 2026

New ModelsOpen weights

DeepSeek V4

DeepSeek V4: 1.6T MoE with CSA+HCA attention and 1M context

DeepSeek released the V4 paper and models (V4-Pro and V4-Flash on Hugging Face), a 1.6T-parameter MoE featuring CSA+HCA attention that fits 1M tokens of context in just 5.7GB of KV cache. It is possibly the first frontier model trained across multiple datacenters, and DeepSeek is offering API tokens at an 80% discount on already much cheaper pricing.

1M context window5.7GB KV cache at 1M context

DeepSeek announcement on X ↗Arxiv paper ↗DeepSeek-V4-Pro on Hugging Face ↗DeepSeek-V4-Flash on Hugging Face ↗

🎙️ Hear our coverage →

#open-source #architecture #training

IBM Apr 30, 2026

New ModelsOpen weights

Granite 4.1

IBM Granite 4.1: dense non-thinking models with top tool calling

IBM released the Granite 4.1 family (3B/8B/30B), dense non-thinking models under Apache 2.0 with best-in-class tool calling, scoring 73 on BFCL with just 8B parameters. IBM claims 20x token efficiency over Qwen3.5 9B, and the models are live on W&B Inference at $0.05/$0.10 per million input/output tokens with 128K context.

IBM Granite blog ↗Hugging Face ↗W&B Inference ↗

🎙️ Hear our coverage →

#open-source #agents #industry

Mistral AI Apr 30, 2026

New ModelsOpen weights

Mistral Medium 3.5

Mistral Medium 3.5: 128B dense flagship with 256K context

Mistral launched Medium 3.5, a 128B dense flagship model with 256K context and configurable reasoning, released with weights on Hugging Face. Alongside it Mistral shipped a Vibe coding agent.

Mistral blog ↗Hugging Face ↗Mistral Vibe on X ↗

🎙️ Hear our coverage →

#open-source #reasoning #coding

NVIDIA Apr 30, 2026

New ModelsOpen weights

Nemotron 3 Nano Omni

NVIDIA Nemotron 3 Nano Omni: hybrid Transformer-Mamba MoE

NVIDIA released Nemotron 3 Nano Omni, a 30B-total/3B-active hybrid Transformer-Mamba MoE with 256K context. It delivers 9x throughput on consumer hardware.

NVIDIA blog ↗

🎙️ Hear our coverage →

#open-source #multimodal #architecture

SenseTime Apr 30, 2026

New ModelsOpen weights

SenseNova U1

SenseTime open-sources SenseNova U1 unified multimodal MoE

SenseTime open-sourced SenseNova U1, a unified multimodal MoE model with 8B total and 3B active parameters that handles understanding and generation with no separate encoder or VAE. The architecture builds on a paper the team presented at ICLR last year.

8B total parameters (3B active MoE)

SenseTime announcement on X ↗Hugging Face collection ↗GitHub ↗Try it ↗

🎙️ Hear our coverage →

#open-source #multimodal #architecture

T Talkie (Alec Radford & David Duvenaud) Apr 30, 2026

New ModelsOpen weights

Talkie

Talkie: 13B open-weight LLM trained only on pre-1930 text

Alec Radford and David Duvenaud released Talkie, a 13B open-weight LLM trained exclusively on pre-1930 text. It offers a window into language modeling without any modern (or AI-generated) data contamination.

talkie-lm.com ↗

🎙️ Hear our coverage →

#open-source #research

Alibaba (Qwen) Apr 23, 2026

New ModelsOpen weights

Qwen3.6-27B

Qwen3.6-27B: dense Apache-2.0 model beats Alibaba's own 400B flagship

Alibaba shipped Qwen3.6-27B, a dense 27B-parameter model under Apache 2.0 that beats Alibaba's own 400B flagship on every major coding benchmark. Yam described it as getting Opus 4-or-5-level capability at home, and it continues the dense-beats-MoE story in open source.

27B dense Qwen3.6

Qwen3.6-27B release ↗Qwen3.6-27B on Hugging Face ↗

🎙️ Hear our coverage →

#open-source #coding

B Brex Apr 23, 2026

Dev ToolsOpen weights

CrabTrap

Brex open-sources CrabTrap, an LLM-as-judge proxy for agent security

Brex's CEO pair-programmed with Codex and open-sourced CrabTrap, an LLM-as-judge HTTP proxy that intercepts outbound agent requests and blocks risky activity using natural-language rule definitions. Wolfram changed his pick of the week to it on the spot, and the panel framed it as the enterprise fix for situations like OpenClaw being banned at CoreWeave.

Brex CrabTrap ↗

🎙️ Hear our coverage →

#agents #safety #open-source

Moonshot AI Apr 23, 2026

New ModelsOpen weights

Kimi K2.6

Kimi K2.6: 1T MoE open-source SOTA on SWE-Bench Pro

Moonshot AI released Kimi K2.6, a 1-trillion-parameter MoE with 32B active parameters, 384 experts, MLA attention, and a 256K context window under a modified MIT license. It claims open-source state of the art on SWE-Bench Pro at 58.6, and Wolfram called it the best open-source model he has ever tested on his private wolf-bench.

1T MoE Kimi K2.6

Kimi K2.6 release ↗Kimi K2.6 on Hugging Face ↗

🎙️ Hear our coverage →

#open-source #coding #agents

OpenAI Apr 23, 2026

New ModelsOpen weights

Privacy Filter

OpenAI open-sources a 1.5B privacy/PII filter that runs in the browser

OpenAI open-sourced a tiny 1.5B MoE model with only 50M active parameters under Apache 2.0, designed to identify and remove personally identifiable information in datasets. It runs fully in the browser on WebGPU via Xenova's Transformers.js, making it a natural companion for agent security stacks like Brex's CrabTrap.

OpenAI Privacy Filter ↗Privacy Filter on Hugging Face ↗Privacy Filter WebGPU demo ↗

🎙️ Hear our coverage →

#open-source #safety

0 0xSero Apr 16, 2026

New ModelsOpen weights

Gemma 4 21B REAP

Gemma 4 21B REAP: 20% expert-pruned Gemma 4 26B MoE

Community researcher 0xSero released Gemma 4 21B-A4B REAP, a 20% expert-pruned version of the Gemma 4 26B MoE created using Cerebras' REAP pruning technique. It shrinks the model for cheaper local inference while preserving most of its quality.

gemma-4-21b-a4b-it-REAP on Hugging Face ↗

🎙️ Hear our coverage →

#open-source #architecture #on-device

Alibaba (Qwen) Apr 16, 2026

New ModelsOpen weights

Qwen 3.6-35B-A3B

Qwen 3.6-35B-A3B: Apache 2.0 MoE with 3B active hits 73.4% SWE-Verified

Alibaba Qwen open-sourced Qwen 3.6-35B-A3B under Apache 2.0 the same morning Opus 4.7 dropped: a 35B MoE with only 3B active parameters that scores 73.4% on SWE-bench Verified, rivaling models 10x its size. It is natively multimodal with 262K context extensible to 1M, and the crew called it the strongest mid-size LLM on nearly all benchmarks, putting to rest doubts about Qwen's open-source commitment after Junyang Ling's departure.

73.4% SWE-bench Verified

Qwen 3.6 announcement (X) ↗Qwen3.6-35B-A3B on Hugging Face ↗Qwen blog: Qwen 3.6-35B-A3B ↗

🎙️ Hear our coverage →

#open-source #architecture #coding

Baidu Apr 16, 2026

New ModelsOpen weights

ERNIE-Image

Baidu ERNIE-Image: 8B DiT ranks #1 on GenEval among open models

Baidu released ERNIE-Image, an 8B diffusion transformer that ranks #1 on GenEval among open models and features precise multilingual text rendering. It is part of this week's wave of Chinese open releases in image and 3D generation.

ERNIE-Image on Hugging Face ↗

🎙️ Hear our coverage →

#image-gen #architecture #open-source

Daily (Pipecat) Apr 16, 2026

Products & AppsOpen weights

Gradient Bang

Gradient Bang: first massively multiplayer fully LLM-driven voice game

Kwindla Kramer's 'side project that broke containment' is a fully LLM-driven multiplayer voice-based space game inspired by BBS-era Trade Wars, built on a new Pipecat Sub-Agents library with a class-based event bus that works locally and over the network. A Deepgram plus GPT-4.1 voice agent always responds in under 1.5 seconds while GPT-5.2 medium-thinking task agents do the work, and the React frontend is rendered from LLM-generated JSON as dynamic UI. The team also open-sourced GB Benchmarks for evaluating agent task execution.

Play Gradient Bang ↗gradient-bang on GitHub ↗Kwindla on Gradient Bang (X) ↗

🎙️ Hear our coverage →

#voice-ai #agents #open-source

J Jiunsong (@songjunkr) Apr 16, 2026

New ModelsOpen weights

Super Gemma 4 26B Uncensored v2

Super Gemma 4 26B Uncensored v2 trends on HF with 0/100 refusals

Community fine-tuner @songjunkr released Super Gemma 4 26B Uncensored v2, which is trending on Hugging Face with 0/100 refusals and fixed tool calling. It ships in GGUF and MLX 4-bit variants for local inference.

Super Gemma 4 26B Uncensored GGUF v2 (HF) ↗Super Gemma 4 26B Uncensored MLX 4bit v2 (HF) ↗@songjunkr on X ↗

🎙️ Hear our coverage →

#open-source #on-device

Marimo Apr 16, 2026

Dev ToolsOpen weights

Marimo Pair

Marimo Pair drops coding agents inside reactive Python notebooks

Marimo released Marimo Pair, which embeds Claude Code, Codex, or OpenCode agents directly inside its reactive, dependency-graph-aware Python notebooks. Founding engineer Trevor Manz joined the show to explain why reactive notebooks are a natural verification surface for agent-written code; the launch trended on Hacker News this week and was featured as part of This Week's Buzz (Marimo is in the CoreWeave family).

Marimo blog: Marimo Pair ↗marimo-pair on GitHub ↗

🎙️ Hear our coverage →

#coding #agents #open-source

NVIDIA Apr 16, 2026

New ModelsOpen weights

Lyra 2.0

NVIDIA Lyra 2.0: single image to explorable 3D worlds, Apache 2.0

NVIDIA released Lyra 2.0 under Apache 2.0, generating persistent, explorable 3D worlds from a single image. Together with Baidu ERNIE-Image and Tencent HYWorld 2.0, it rounds out a week of open releases in the 3D-world-from-single-image race.

Lyra 2.0 project page ↗Lyra-2.0 on Hugging Face ↗

🎙️ Hear our coverage →

#world-models #open-source

Tencent Apr 16, 2026

New ModelsOpen weights

HYWorld 2.0

Tencent HYWorld 2.0 turns a single image into editable 3D scenes

Tencent released HYWorld 2.0, which converts a single image into editable 3D Gaussian Splats and meshes that are ready for Unity, Unreal, and Isaac Sim. It is one of three single-image-to-3D-world releases this week, essentially an open-source equivalent of what Fei-Fei Li's World Labs is building.

HY-World 2.0 on GitHub ↗

🎙️ Hear our coverage →

#world-models #open-source

Weights & Biases Apr 16, 2026

Major Features & Updates

Gemma 4 on W&B Inference

Gemma 4 goes live on W&B Inference with LoRA inference support

Weights & Biases put Gemma 4 live on W&B Inference, running on CoreWeave infrastructure with LoRA inference support. Replying to the W&B announcement post on X with the code 'Gem Drop' gets $20 in free inference credits.

W&B Inference ↗W&B announcement post (X) ↗

🎙️ Hear our coverage →

#infrastructure #open-source

Arena (formerly LMArena) Apr 9, 2026

DatasetsOpen weights

Arena historical leaderboard & prompt datasets

Arena releases 3 years of leaderboard data and prompts on Hugging Face

Arena (formerly LMArena) released three years of historical leaderboard data plus the actual user prompts as datasets on Hugging Face. Peter Gostev, who previously scraped the site by hand into Google Sheets for his charts, now builds his Compute Wars and model-trend analyses straight from the data.

Peter Gostev on X ↗

🎙️ Hear our coverage →

#benchmarks #open-source

M MemPalace (Ben Sigman & Milla Jovovich) Apr 9, 2026

Dev ToolsOpen weights

MemPalace

MemPalace open-source AI memory system goes viral with 26K stars

MemPalace, the open-source AI memory system from Milla Jovovich and Ben Sigman, went viral with 26K GitHub stars in 2 days and claimed top memory-benchmark scores. The team then transparently walked back the overstated benchmark claims in a public correction thread, which the show called a refreshingly honest arc.

MemPalace on GitHub ↗Ben Sigman launch post on X ↗Ben Sigman's transparent correction thread ↗Memory Palace web frontend on GitHub ↗

🎙️ Hear our coverage →

#agents #open-source

Nous Research Apr 9, 2026

New ModelsOpen weights

Hermes 27B

Nous Research ships Hermes 27B, paired with the Hermes harness

Nisten's pick of the week: Hermes 27B, an open model trained specifically to be paired with the Hermes harness and allegedly distilled from the Opus API. Model and harness ship together as a portable unit, a notable take on the harness-engineering trend Swyx discussed.

🎙️ Hear our coverage →

#open-source #agents

OpenClaw Apr 9, 2026

Dev ToolsOpen weights

OpenClaw 2026.4.5

OpenClaw 2026.4.5 ships /dreaming memory consolidation

OpenClaw's biggest release since 4.0: /dreaming goes GA with Light/Deep/REM memory consolidation phases that defrag agent memory into a human-readable Dream Diary (DREAMS.md). The release also adds built-in video and music generation across 4 backends, GPT-5.4 as the new default model, prompt-cache reuse improvements, and Control UI plus docs in 12 new languages. Maintainer Vincent Koc says the ~1.5M-line codebase was refactored into a plugin architecture in nine days.

1.5M lines OpenClaw codebase

OpenClaw v2026.4.5 release notes ↗Vincent Koc announcement on X ↗Dreaming docs ↗Turing Post FOD#147: Can your OpenClaw dream ↗

🎙️ Hear our coverage →

#agents #open-source

Z.ai (Zhipu AI) Apr 9, 2026

New ModelsOpen weights

GLM-5.1

GLM-5.1 takes #1 open-source spot on SWE-Bench Pro at 58.4%

Z.ai released GLM-5.1, now the #1 open-source model on SWE-Bench Pro at 58.4%. It can run autonomously for 8 hours with 1,700+ agent steps, and is already live on W&B Inference. Open weights are up on Hugging Face alongside an arXiv paper.

Z.ai announcement on X ↗GLM-5.1 weights on Hugging Face ↗GLM-5.1 paper on arXiv ↗

🎙️ Hear our coverage →

#open-source #agents #coding

Alibaba (Qwen) Apr 2, 2026

New ModelsOpen weights

Qwen3.5-Omni

Alibaba open-sources Qwen3.5-Omni, a 397B native omni-modal model

Qwen3.5-Omni is Alibaba's natively omni-modal open model handling text, image, audio, and video, with 397B total parameters and 17B active. It extends the Qwen family's open-source momentum into unified multimodal workloads.

Announcement (X) ↗Qwen blog ↗

🎙️ Hear our coverage →

#open-source #multimodal

Google DeepMind Apr 2, 2026

New ModelsOpen weights

Gemma 4

Google releases Gemma 4 open-weights family under Apache 2.0

Google DeepMind's Gemma 4 launch crossed 10M+ downloads with over 1,000 Gemma-4-based fine-tunes on Hugging Face; the Gemma family totals 500M+ downloads. Omar Sanseviero says Gemma is the foundation for the next generation of Gemini Nano shipping on Pixel and Samsung, with the AI Edge gallery letting people run it locally on Android and iOS. It punched above its size on Arena's Pareto curve and is now live on W&B Inference.

Hugging Face Collection ↗Try in AI Studio ↗Omar Sanseviero on X ↗

🎙️ Hear our coverage (+1 follow-up) →

#open-source #agents #on-device

Liquid AI Apr 2, 2026

New ModelsOpen weights

LFM2.5-350M

Liquid AI ships LFM2.5-350M with agentic tool calling at 350M params

Liquid AI released LFM2.5-350M, a 350M-parameter open model that does agentic tool calling and fits under 500MB quantized. It targets edge and on-device agent workloads where tiny deployable models matter.

Announcement (X) ↗Hugging Face ↗Liquid AI blog ↗

🎙️ Hear our coverage →

#open-source #on-device #agents

P PrismML Apr 2, 2026

New ModelsOpen weights

Bonsai

PrismML releases Bonsai 1-bit models, an 8B model in 1.15 GB

PrismML released Bonsai, a family of 1-bit quantized open models fitting an 8B model into 1.15 GB and claiming 10x intelligence density, built on decades of compression research. The panel discussed one-bit quantization as a cost/performance lever for cheap local inference.

Announcement (X) ↗Hugging Face ↗PrismML site ↗

🎙️ Hear our coverage →

#open-source #infrastructure #on-device

R Ryan Carson Apr 2, 2026

Dev ToolsOpen weights

Claw Chief

Ryan Carson open-sources Claw Chief, an AI chief of staff

Co-host Ryan Carson open-sourced Claw Chief, an AI chief-of-staff setup with skills, crons, and scheduling. It packages his agent workflow patterns into a reusable open-source repo.

🎙️ Hear our coverage →

#agents #consumer-ai #open-source

U Ultraworkers (Sigrid Jin & Bellman) Apr 2, 2026

Dev ToolsOpen weights

claw-code

Claw-code clean-room rewrite becomes fastest repo to 100K GitHub stars

After Claude Code's source leaked via npm, Sigrid Jin and Bellman published claw-code, a clean-room rewrite that became the fastest GitHub repo to pass 100K stars, hitting the mark in roughly 24 hours. Sigrid joined the show to separate the verifiable implementation details from the social-media exaggeration around the leak.

100K+ GitHub stars in 24h

🎙️ Hear our coverage →

#coding #agents #open-source

March 2026

A Aratako Mar 26, 2026

New ModelsOpen weights

Irodori-TTS-500M

Irodori-TTS-500M: open Japanese TTS with emoji emotion control

Irodori-TTS-500M is a 500M-parameter open-weights Japanese text-to-speech model released on Hugging Face, notable for controlling emotional delivery through emojis in the input text. It landed as part of the week's wave of voice and audio releases.

Announcement (X) ↗Irodori-TTS-500M on Hugging Face ↗

🎙️ Hear our coverage →

#voice-ai #open-source

Cohere Mar 26, 2026

New ModelsOpen weights

Cohere Transcribe

Cohere Transcribe: open-source 2B ASR tops Open ASR Leaderboard at 5.42% WER

Cohere entered the ASR game with Transcribe, a 2-billion-parameter Apache 2.0 speech recognition model that immediately took the number-one spot on Hugging Face's Open ASR Leaderboard with a 5.42% word error rate versus Whisper Large v3's 7.44%. It wins 61% of human evaluations on average and 64% head-to-head against Whisper, making it a credible local-inference Whisper replacement for regulated industries.

2B Cohere Transcribe ASR size5.42% Word error rate on Open ASR Leaderboard

Cohere announcement (X) ↗Cohere blog: Transcribe ↗Open ASR Leaderboard (Hugging Face) ↗

🎙️ Hear our coverage →

#voice-ai #open-source

MiniMax Mar 26, 2026

New ModelsOpen weights

MiniMax 2.7

MiniMax 2.7 open-source weights discussed as small-model momentum continues

The panel covered MiniMax 2.7 and its open-weights release in the context of small, efficient models becoming genuinely practical for local and specialized agent workflows. The segment focused on capability momentum and how open-weights expectations keep shaping adoption sentiment.

🎙️ Hear our coverage →

#open-source #agents

Mistral AI Mar 26, 2026

New ModelsOpen weights

Voxtral TTS

Mistral drops Voxtral TTS, a 3B open-weight text-to-speech model

Mistral released Voxtral TTS, its first text-to-speech model, as breaking news during the live show: 3 billion parameters, open weights, with emotion controls for neutral, happy, and frustrated voices. Mistral claims it beats ElevenLabs Flash v2.5 in human preference tests with a 58% win rate on flagship voices and 68% on zero-shot voice cloning, though Alex's live test found it decent rather than stunning.

3B Mistral Voxtral TTS size

Mistral AI announcement (X) ↗Mistral blog: Voxtral TTS ↗

🎙️ Hear our coverage →

#voice-ai #open-source

Reka AI Mar 26, 2026

New ModelsOpen weights

Reka Edge

Reka AI ships Edge, a 7B multimodal VLM for sub-second on-device inference

Reka AI launched Reka Edge, a 7B-parameter multimodal vision-language model built for sub-second latency on edge devices. Weights are on Hugging Face and the model is available through OpenRouter, with the panel highlighting it as a notable efficient multimodal release for real-world deployment.

Reka AI announcement (X) ↗Reka Edge on Hugging Face ↗Reka Edge on OpenRouter ↗Reka AI blog ↗

🎙️ Hear our coverage →

#open-source #vision #on-device

H Company Mar 19, 2026

New ModelsOpen weights

Holotron-12B

H Company's Holotron-12B: hybrid SSM computer-use model at 8.9k tok/s

H Company released Holotron-12B, an open-source hybrid SSM model built for computer-use agents. It claims 8,900 tokens/sec generation speed and jumps the WebVoyager benchmark from 35.1% to 80.5%, continuing the trend of hybrid SSM architectures for long-context agent workloads.

8,900 tok/s H Company Holotron 12B

Hugging Face ↗H Company blog ↗H Company on X ↗BricksAI on X ↗

🎙️ Hear our coverage →

#open-source #agents

Hugging Face Mar 19, 2026

Also Released

State of Open Source Spring 2026 Report

Hugging Face report: China passes US in LLM count, Qwen tops 1B downloads

Hugging Face published its Spring 2026 State of Open Source report showing China surpassing the US in number of LLMs for the first time, with Chinese models taking 41% of all downloads. Alibaba's Qwen family crossed 1 billion total downloads (about 1 million per day), overtaking Llama as the most downloaded model family, on a platform now hosting 11M users and 2M+ models.

Hugging Face blog ↗Irene Solaiman on X ↗AeonCorridor on X ↗

🎙️ Hear our coverage →

#open-source #industry

MiniMax Mar 19, 2026

New Models

MiniMax M2.7

MiniMax M2.7: first self-evolving model hits 56% on SWE-Bench Pro

MiniMax dropped M2.7, billed as the first self-evolving model: it ran 100+ autonomous RL optimization loops and wrote its own agent scaffolding, built by one engineer over four days with zero lines of human code. It scores 56.22% on SWE-Bench Pro, within one point of Opus 4.6's 57.3%, and WolfBench shows it roughly matching Sonnet 4.6 on OpenClaw agent tasks. Not yet open weights, though rumors suggest a release is coming.

56% MiniMax 2.7 SWE-bench Pro

MiniMax announcement ↗MiniMax on X ↗TestingCatalog on X ↗MiniMax M2.7 announcement (X) ↗

🎙️ Hear our coverage (+1 follow-up) →

#coding #agents #reasoning

Mistral AI Mar 19, 2026

New ModelsOpen weights

Mistral Small 4

Mistral Small 4: 119B MoE with 6B active unifies vision, coding, reasoning

Mistral returned to open source with Small 4, a 119B-parameter MoE with 128 experts and only 6B active per token, released under Apache 2.0. It unifies the previous Pixtral (vision), Devstral (coding), and Magistral (reasoning) lines into one model and can fit on a single H100 when compressed. Early WolfBench results are sobering at ~17% on OpenClaw agent tasks, roughly on par with similarly sized Nemotron.

119B Mistral Small 4 total params

Mistral blog ↗Hugging Face ↗X announcement ↗

🎙️ Hear our coverage →

#open-source #architecture #multimodal

S State Spaces (Albert Gu et al.) Mar 19, 2026

Papers & ResearchOpen weights

Mamba-3

Mamba-3 lands with three SSM innovations for inference-first linear models

Mamba-3 dropped with three SSM-centric innovations: trapezoidal discretization, complex-valued states, and a MIMO formulation aimed at inference-first linear models. It extends the state-space model line that underpins the growing wave of hybrid SSM architectures for long-context and agentic workloads.

Arxiv paper ↗GitHub ↗Albert Gu on X ↗

🎙️ Hear our coverage →

#research #architecture #open-source

Unsloth AI Mar 19, 2026

Dev ToolsOpen weights

Unsloth Studio

Unsloth Studio: web UI for local fine-tuning with 2x speed, 70% less VRAM

Unsloth launched Studio, an open-source web UI for local LLM training and inference claiming 2x speed and 70% less VRAM, supporting 500+ models across text, vision, audio, and embeddings. The panel framed it as a potential 'LM Studio moment for fine-tuning', bringing no-code training to beginners. Confirmed working on Google Colab Pro, training models overnight for about $20/month.

Unsloth Studio docs ↗X announcement ↗GitHub ↗Daniel Han announcement (X) ↗

🎙️ Hear our coverage (+1 follow-up) →

#training #open-source #coding

Fish Audio Mar 13, 2026

New ModelsOpen weights

Fish Audio S2

Fish Audio S2 open TTS hits sub-150ms latency

Fish Audio S2 is a fully open-source TTS model with inline emotion control via free-text bracket tags like gasp, laughter, and long pause. Alex demoed it live with an OpenClaw skill that let his 5-year-old talk to a voice clone of 'Rocky' from Project Hail Mary; Wolfram called it 'ElevenLabs V3 for free.'

<150ms Fish Audio S2 TTS latency

Fish Audio S2 on X ↗Fish Speech 2 on HuggingFace ↗fish.audio ↗

🎙️ Hear our coverage (+1 follow-up) →

#voice-ai #open-source

Lightricks Mar 13, 2026

New ModelsOpen weights

LTX Video 2.3

Lightricks ships open-source LTX Video 2.3, runs on an RTX 3090

Lightricks released LTX Video 2.3, an open-source video generation model with improved motion, audio, and quality that runs on a single RTX 3090. It is available on GitHub and Hugging Face.

LTX-Video on GitHub ↗LTX-Video on HuggingFace ↗

🎙️ Hear our coverage →

#video-gen #open-source

MiroMind Mar 13, 2026

New ModelsOpen weights

MiroThinker-1.7

MiroThinker-1.7 open-source research agent hits SOTA

MiroMind released MiroThinker-1.7, an open-source deep-research agent model that reaches state of the art on deep research benchmarks. It was covered alongside NVIDIA's Nemotron launch in the open-source segment.

MiroThinker-1.7 on X ↗MiroThinker-1.7 on HuggingFace ↗

🎙️ Hear our coverage →

#agents #open-source #research

NVIDIA Mar 13, 2026

New ModelsOpen weights

Nemotron 3 Super 120B

NVIDIA releases Nemotron 3 Super 120B with $26B open-source bet

NVIDIA launched Nemotron 3 Super, a 120B Hybrid Mamba-Transformer MoE model with 12B active parameters, a 1M-token context window, and 450 tok/s throughput. It shipped with BF16/FP8/NVFP4 weights, a base checkpoint, SFT and pre-training data, and the full training recipe, alongside a $26B 5-year open-source commitment. It is available on W&B Inference at $0.20/M input and $0.80/M output.

120B Nemotron 3 Super total parameters12B Nemotron 3 Super active parameters (MoE)1M Nemotron 3 Super context window (tokens)

NVIDIA on X ↗Nemotron 3 Super blog post ↗Nemotron 3 Super on HuggingFace ↗W&B Inference (Nemotron) ↗

🎙️ Hear our coverage →

#open-source #architecture #reasoning

P Paperclip Mar 13, 2026

Dev ToolsOpen weights

Paperclip.ing

Paperclip.ing: open-source agent orchestration for zero-human companies

Anonymous builder DOTTA presented Paperclip.ing, an open-source agent orchestration framework for 'zero human companies' where an AI CEO recursively hires more agents. It hit 20K GitHub stars in its first week, with a heartbeat system driving agent autonomy and a Memento-style memory architecture keeping agents coherent across tasks.

20K Paperclip GitHub stars in first week

Paperclip on GitHub ↗Paperclip.ing website ↗

🎙️ Hear our coverage →

#agents #open-source

T Templar Mar 13, 2026

New ModelsOpen weights

Covenant-72B

Covenant-72B: a decentralized-trained open 72B LLM

Covenant-72B is a decentralized 72B-parameter open LLM, released and shared via Hugging Face. It was highlighted in the open-source segment as an example of decentralized model training.

Covenant-72B on X ↗Covenant-72B on HuggingFace ↗

🎙️ Hear our coverage →

#open-source #training

Alibaba (Qwen) Mar 5, 2026

New ModelsOpen weights

Qwen3.5 Small Series

Alibaba releases Qwen3.5 small models (2B, 4B, 9B) for local use

Alibaba released the Qwen3.5 small model series with 2B, 4B, and 9B variants, which the panel found highly usable on consumer hardware. The release landed alongside leadership turbulence as Junyang Lin and Binyuan Hui departed Qwen, though the panel expects Alibaba's open-source momentum to continue.

Qwen3.5 small models announcement ↗Qwen3.5-9B on Hugging Face ↗Qwen3.5-4B on Hugging Face ↗Qwen3.5-2B on Hugging Face ↗

🎙️ Hear our coverage →

#open-source #on-device

I IEIT (Yuan AI Lab) Mar 5, 2026

New ModelsOpen weights

Yuan 3.0 Ultra

Yuan AI Lab releases Yuan 3.0 Ultra open-weights model

Yuan AI Lab (IEIT) released Yuan 3.0 Ultra, a new open-weights model published on Hugging Face under the IEITYuan org. It was covered in the open-source LLM roundup as part of a busy week for Chinese open model releases.

Yuan 3.0 Ultra announcement ↗Yuan Lab blog ↗IEITYuan on Hugging Face ↗

🎙️ Hear our coverage →

StepFun Mar 5, 2026

New ModelsOpen weights

Step 3.5 Flash Base

StepFun open-sources Step 3.5 Flash Base with its training stack

StepFun released Step 3.5 Flash Base and Midtrain checkpoints, an unusually open release that includes training artifacts and the SteptronOSS training stack alongside the weights. The panel praised the Apache-2 orientation and called the continuation-pretraining flexibility a major practical unlock for builders.

StepFun announcement ↗Step-3.5-Flash-Base on Hugging Face ↗SteptronOSS training stack on GitHub ↗Step 3.5 Flash paper on arXiv ↗

🎙️ Hear our coverage →

#open-source #research

February 2026

Alibaba (Qwen) Feb 26, 2026

New ModelsOpen weights

Qwen 3.5

Qwen 3.5 lands: 35B/3B-active Medium outperforms the old 235B flagship

Alibaba released the Qwen 3.5 family of open-weight models, headlined by Qwen3.5-35B-A3B, a 35B model with only 3B active parameters that outperforms their previous 235B flagship. Variants include a 122B-A10B and a dense 27B, with the panel highlighting the hybrid state-space (Mamba-layer) architecture and strong practical coding and agent performance at a tiny active-parameter footprint.

35B / 3B active Qwen 3.5 Medium

Qwen announcement on X ↗Qwen3.5-35B-A3B on Hugging Face ↗Qwen3.5-122B-A10B on Hugging Face ↗Qwen 3.5 blog post ↗

🎙️ Hear our coverage →

#open-source #architecture #coding

Liquid AI Feb 26, 2026

New ModelsOpen weights

LFM2-24B-A2B

Liquid AI releases LFM2-24B-A2B, a laptop-friendly 24B MoE

Liquid AI released LFM2-24B-A2B, a 24B mixture-of-experts model with only 2.3B active parameters that runs on consumer laptops. The panel highlighted its speed and surprisingly strong non-coding reasoning, reinforcing the trend of efficient low-active-parameter open models for local use.

Liquid AI announcement on X ↗LFM2-24B-A2B on Hugging Face ↗Liquid AI blog post ↗

🎙️ Hear our coverage →

#open-source #architecture #on-device

Perplexity Feb 26, 2026

New ModelsOpen weights

pplx-embed

Perplexity launches pplx-embed SOTA embedding models

Perplexity released pplx-embed, a family of state-of-the-art embedding models built for web-scale retrieval. The models are available on Hugging Face and through Perplexity's API with quickstart docs.

pplx-embed research blog ↗pplx-embed Hugging Face collection ↗Perplexity embeddings API quickstart ↗

🎙️ Hear our coverage →

#search #open-source

Weights & Biases Feb 26, 2026

Major Features & Updates

W&B Inference: MiniMax 2.5 & Kimi K2.5

W&B Inference adds MiniMax 2.5 and Kimi K2.5

Weights & Biases added MiniMax M2.5 and Kimi K2.5 to its CoreWeave-backed Inference service. The panel emphasized price/performance, with MiniMax 2.5 presented as roughly 10x cheaper than premium alternatives in some tiers and Kimi K2.5 praised for practical function calling and image-in-loop use cases.

MiniMax M2.5 on W&B Inference ↗

🎙️ Hear our coverage →

#infrastructure #api #open-source

Alibaba (Qwen) Feb 19, 2026

New ModelsOpen weights

Qwen3.5-397B-A17B

Alibaba opens Qwen 3.5: 397B-param multimodal MoE with only 17B active

Alibaba released Qwen3.5-397B-A17B, billed as the first open-weight native multimodal MoE model, with 397B total parameters, just 17B active, 512 experts, and 262K native context extendable to 1M. It delivers 8.6-19x faster inference than Qwen3-Max and continues Qwen's strength in multilingual and medical tasks, scoring 52.5% on Terminal Bench, third place among open-source models. Nisten found coding still trails GLM-5.

397B Qwen 3.5 Parameters

Qwen 3.5 announcement (X) ↗Qwen3.5-397B-A17B on Hugging Face ↗

🎙️ Hear our coverage →

#open-source #architecture #multilingual

Cohere Labs Feb 19, 2026

New ModelsOpen weights

Tiny Aya

Cohere Labs releases Tiny Aya, a 3.35B multilingual model for 70+ languages

Cohere Labs released Tiny Aya, a 3.35B-parameter multilingual model family supporting 70+ languages that is small enough to run locally on phones. It extends Cohere's Aya line of open multilingual models, bringing broad language coverage to on-device deployments.

Tiny Aya announcement (X) ↗Tiny Aya collection on Hugging Face ↗Tiny Aya Global on Hugging Face ↗

🎙️ Hear our coverage →

#open-source #multilingual #on-device

Weights & Biases Feb 19, 2026

Major Features & Updates

Kimi K2.5 on W&B Inference

W&B adds Kimi K2.5 to its inference service

Weights & Biases launched Kimi K2.5 on its inference service, making Moonshot AI's model available to W&B users. In Wolfram's Terminal Bench deep dive for W&B, Kimi K2.5 achieved a 67.4% ceiling score across multiple runs, among the strongest open-model results he measured.

W&B Inference ↗

🎙️ Hear our coverage →

#infrastructure #open-source

Zyphra Feb 19, 2026

New ModelsOpen weights

ZUNA

Zyphra opens ZUNA, a 380M-param EEG brain-computer interface model

Zyphra released ZUNA, a 380M-parameter open-source BCI foundation model that translates EEG brain signals into text, reconstructing clinical-grade brain signals from sparse, noisy data. Dubbed 'thought to text' by the community, it works with roughly $500 non-invasive EEG headsets, likely needs personalized training per user, and is small enough to run in real time on a consumer gaming GPU. It is Apache licensed.

ZUNA announcement (X) ↗Zyphra blog: ZUNA ↗ZUNA on GitHub ↗

🎙️ Hear our coverage →

#research #open-source

MiniMax Feb 12, 2026

New ModelsOpen weights

MiniMax M-2.5

MiniMax M-2.5 hits 80.2% SWE-Bench Verified with 10B active params

MiniMax dropped M-2.5 thirty minutes before the show: a 200B-total, 10B-active open-weights model scoring 80.2% on SWE-Bench Verified, approaching Opus 4.6 at roughly 1/20th the cost (~15 cents per task with a 57% win rate over Opus). Trained with MiniMax's decoupled Forge RL framework and optimized for end-to-end task time with fewer tool calls and thinking tokens. Senior researcher Olive Song joined live and revealed the model was still training — they cut a checkpoint for early release.

80.2% SWE-Bench Verified15¢ Cost per task

MiniMax M2.5 benchmarks on X ↗

🎙️ Hear our coverage →

#open-source #coding #agents

Weights & Biases Feb 12, 2026

Major Features & Updates

W&B Inference (GLM-5 & Kimi K2.5)

W&B Inference adds day-zero GLM-5 and Kimi K2.5 support

Weights & Biases launched day-zero GLM-5 support on its CoreWeave-powered W&B Inference service, alongside Kimi K2.5, with MiniMax 2.5 coming soon. Alex announced $50 in free credits for listeners to test the new open-weights models.

W&B announcement on X ↗W&B Inference ↗

🎙️ Hear our coverage →

#infrastructure #open-source

Zhipu AI (Z.ai) Feb 12, 2026

New ModelsOpen weights

GLM-5

Z.ai launches GLM-5, the open-weights agentic coding crown

Z.ai released GLM-5, a 744B-parameter MoE model (40B active) trained on 28.5 trillion tokens that takes the #1 open-source ranking for agentic coding with 77.8% SWE-bench Verified. It introduces the SLIM asynchronous RL framework for post-training, adopts DeepSeek's sparse attention to cut deployment cost, and was trained on Huawei chips rather than NVIDIA. Lou from Z.ai joined the show live and summed it up as bigger, faster, better, and cheaper.

744B GLM-5 Parameters28.5T Training tokens

Z.ai announcement on X ↗GLM-5 on Hugging Face ↗W&B Inference day-zero support ↗

🎙️ Hear our coverage →

#open-source #coding #agents

A ACE Step Feb 5, 2026

New ModelsOpen weights

ACE-Step 1.5

ACE-Step 1.5: open-source 'Suno at home' music generation under MIT

ACE-Step 1.5 is an MIT-licensed AI music generator that produces full songs in under 10 seconds on consumer GPUs and runs on a MacBook. The panel demoed it live via Pinocchio, generating a ThursdAI song on the spot, and it is available for one-click install.

X announcement ↗GitHub ↗Hugging Face ↗Project page ↗

🎙️ Hear our coverage →

#audio #open-source

Alibaba (Qwen) Feb 5, 2026

New ModelsOpen weights

Qwen3-Coder-Next

Qwen3-Coder-Next hits 70.6% SWE-Bench Verified with 3B active params

Alibaba's Qwen3-Coder-Next is an 80B MoE coding agent model with only 3B active parameters that scores 70.6% on SWE-Bench Verified and 44% on the much harder SWE-Bench Pro. It was trained on 7.5T tokens with 20,000 parallel RL environments and runs under 48GB of RAM with GGUF quantization, making near-frontier agentic coding feasible on local hardware.

70.6% SWE-Bench Verified44% SWE-Bench Pro

X announcement ↗Qwen blog ↗Hugging Face collection ↗

🎙️ Hear our coverage →

#open-source #coding #agents

Ant Group Feb 5, 2026

New ModelsOpen weights

LingBot-World

LingBot-World: open-source world model challenges Google Genie 3

Ant Group released LingBot-World, an open-source world model that generates 10-minute playable environments at 16fps. It positions open weights as a direct challenger to Google's closed Genie 3 in interactive world generation.

X thread ↗Hugging Face ↗

🎙️ Hear our coverage →

#world-models #video-gen #open-source

InternLM (Shanghai AI Lab) Feb 5, 2026

New ModelsOpen weights

Intern-S1-Pro

Intern-S1-Pro: 1 trillion parameter open MoE for scientific reasoning

InternLM released Intern-S1-Pro, a 1 trillion parameter open-source MoE model targeting SOTA scientific reasoning across chemistry, biology, materials, and earth sciences. The panel noted it beats frontier models on science benchmarks, a massive compute investment for an open release.

X announcement ↗Hugging Face ↗Arxiv ↗ModelScope ↗

🎙️ Hear our coverage →

#open-source #reasoning #research

Mistral AI Feb 5, 2026

New ModelsOpen weights

Voxtral Transcribe 2

Mistral's Voxtral Transcribe 2 dethrones Whisper as SOTA transcription

Mistral AI launched Voxtral Transcribe 2, state-of-the-art speech-to-text with sub-200ms latency, native diarization support, and open weights under Apache 2.0. The panel called it the first model to dethrone Whisper after roughly three years, and Alex used it to transcribe this very episode.

X announcement ↗Mistral blog ↗Docs ↗Demo ↗

🎙️ Hear our coverage →

#voice-ai #open-source

OpenBMB Feb 5, 2026

New ModelsOpen weights

MiniCPM-o 4.5

MiniCPM-o 4.5: first open-source full-duplex omni model

OpenBMB released MiniCPM-o 4.5, the first open-source full-duplex omni-modal LLM that can see, listen, and speak simultaneously. It can listen while speaking and even interrupt the user, bringing real-time conversational behavior to open weights.

X announcement ↗Hugging Face ↗GitHub ↗

🎙️ Hear our coverage →

#open-source #voice-ai #multimodal

StepFun Feb 5, 2026

New ModelsOpen weights

Step 3.5 Flash

StepFun Step 3.5 Flash: frontier reasoning claims at 11B active params

StepFun released Step 3.5 Flash, a 196B sparse MoE model with only 11B active parameters, claiming frontier-level reasoning while generating at 100-350 tokens per second. It continues the trend of sparse Chinese MoE models delivering high speed at low active parameter counts.

X announcement ↗Hugging Face ↗

🎙️ Hear our coverage →

#open-source #reasoning

Zhipu AI (Z.ai) Feb 5, 2026

New ModelsOpen weights

GLM-OCR

Z.ai GLM-OCR: 0.9B model takes #1 on OmniDocBench

Z.ai released GLM-OCR, a tiny 0.9B parameter document understanding model that achieves the #1 ranking on OmniDocBench V1.5. It shows that strong OCR and document parsing no longer require large models.

X announcement ↗Hugging Face ↗Announcement ↗

🎙️ Hear our coverage →

#open-source #vision

January 2026

Alibaba (Tongyi Lab) Jan 29, 2026

New ModelsOpen weights

Z-Image

Tongyi Lab releases Z-Image generation model

Alibaba's Tongyi Lab released Z-Image, a new image generation model, with support landing in the open-source DiffSynth-Studio toolkit on GitHub. Covered in the AI Art segment alongside HunyuanImage 3.0.

Announcement (X) ↗GitHub (DiffSynth-Studio) ↗

🎙️ Hear our coverage →

#image-gen #open-source

Arcee AI Jan 29, 2026

New ModelsOpen weights

Trinity Large

Arcee AI ships Trinity Large: 400B MOE trained in 33 days for $20M

Arcee AI's Trinity Large is a 400B-parameter MOE with 13B active parameters, trained on 17T tokens across 2000 B300 GPUs in 33 days for $20M. It has 512K native context (twice Kimi K2.5), is free on OpenRouter until February 2026, and the panel called it the largest Western open-source lab model.

400B Arcee Trinity Large512K Trinity native context

Announcement (X) ↗Blog ↗Hugging Face (Preview) ↗Hugging Face (Base) ↗

🎙️ Hear our coverage →

#open-source #architecture

Jan AI Jan 29, 2026

New ModelsOpen weights

Jan v3

Jan AI releases Jan v3, a 4B model built for fast local inference

Jan v3 is a 4B-parameter open model optimized for local inference, hitting 132 tokens/sec with a 262K context window and a 40% improvement on coding. The Jan desktop app it powers has reached 5M downloads.

4B Jan v3 parameters

Announcement (X) ↗Hugging Face ↗Hugging Face (GGUF) ↗Jan.ai ↗

🎙️ Hear our coverage →

#open-source #on-device #coding

Moonshot AI Jan 29, 2026

New ModelsOpen weights

Kimi K2.5

Moonshot AI releases Kimi K2.5, the new open-source king

Moonshot AI's Kimi K2.5 takes the open-source crown, becoming the most-used model on OpenRouter and topping open-source leaderboards. The panel highlighted its strong agentic coding performance and tool use.

Announcement (X) ↗Hugging Face ↗

🎙️ Hear our coverage →

#open-source #agents #coding

NVIDIA Jan 29, 2026

New ModelsOpen weights

PersonaPlex-7B

NVIDIA releases PersonaPlex-7B voice model

NVIDIA released PersonaPlex-7B, an open voice/audio model published on Hugging Face with code on GitHub. Listed in the week's Voice & Audio releases.

Announcement (X) ↗Hugging Face ↗GitHub ↗

🎙️ Hear our coverage →

#voice-ai #open-source

Alibaba (Qwen) Jan 22, 2026

New ModelsOpen weights

Qwen3-TTS

Qwen3-TTS: open-source TTS family with 97ms latency and voice cloning

Alibaba's Qwen team released Qwen3-TTS, a full open-source text-to-speech family under Apache 2 that dropped 30 minutes before the show. It spans 5 models from 0.6B to 1.7B parameters, with 97ms latency, voice cloning from just 3 seconds of audio, voice description prompting, and 10-language support.

97ms Latency

Qwen3-TTS announcement (X) ↗Qwen3-TTS on Hugging Face ↗Qwen3-TTS on GitHub ↗

🎙️ Hear our coverage →

#voice-ai #open-source

F FlashLabs Jan 22, 2026

New ModelsOpen weights

Chroma 1.0

FlashLabs Chroma 1.0: open-source real-time speech-to-speech under 150ms

FlashLabs released Chroma 1.0, billed as the world's first open-source end-to-end real-time speech-to-speech model with voice cloning under 150ms latency. The 4B parameter model is built on Qwen 2.5 Omni and released under Apache 2; its live demo with RAG and document upload impressed the whole panel.

FlashLabs Chroma 1.0 announcement (X) ↗FlashLabs Chroma-4B on Hugging Face ↗Chroma paper (arXiv) ↗FlashLabs Voice Agents demo ↗

🎙️ Hear our coverage →

#voice-ai #open-source

Liquid AI Jan 22, 2026

New ModelsOpen weights

LFM2.5-1.2B-Thinking

Liquid AI's LFM2.5-1.2B-Thinking: on-device reasoning under 900MB

Liquid AI released LFM2.5-1.2B-Thinking, a 1.2B parameter reasoning model that runs entirely on-device with under 900MB of memory. Its hybrid architecture with gated convolutions delivers 239 tokens/sec on an AMD CPU and 82 tokens/sec on a mobile NPU, making it practical for edge devices, Raspberry Pi, and older iPhones.

1.2B Parameters, under 900MB memory

LFM2.5-1.2B-Thinking announcement (X) ↗LFM2.5-1.2B-Thinking on Hugging Face ↗LFM2.5-1.2B-Thinking on Liquid LEAP ↗

🎙️ Hear our coverage →

#open-source #reasoning #on-device

P Peter Steinberger Jan 22, 2026

Dev ToolsOpen weights

Clawdbot

Clawdbot: open-source self-improving personal AI assistant for macOS

Clawdbot, created by Peter Steinberger, is an open-source personal AI assistant that runs locally on your Mac and connects via WhatsApp, Telegram, or Discord. Its killer feature is self-improvement: ask it to learn something and it writes its own skill files, giving a single chat conversation control over multiple agents, persistent memory, voice messages, image generation, and browser automation on your actual computer.

Clawdbot by Peter Steinberger (X post) ↗Clawdbot review on MacStories ↗clawd.bot — Official site ↗

🎙️ Hear our coverage →

#agents #consumer-ai #open-source

Z.AI (Zhipu) Jan 22, 2026

New ModelsOpen weights

GLM-4.7-Flash

GLM-4.7-Flash: 30B MoE local coding agent with only 3B active params

Z.AI released GLM-4.7-Flash, a 30B parameter MoE model with only 3B active parameters, designed as the ultimate local coding and agent assistant. It hits 59% on SWE-Bench Verified (approaching Sonnet 4's 64%) and runs at 120 tokens/sec on a stock Mac Studio M3 Ultra, fast enough to run RALF autonomous coding loops even on CPU.

59% SWE-Bench Verified120 tps Speed on Mac Studio M3 Ultra

GLM-4.7-Flash announcement (X) ↗GLM-4.7 Technical Blog ↗GLM-4.7-Flash on Hugging Face ↗

🎙️ Hear our coverage →

#open-source #coding #agents

Black Forest Labs Jan 15, 2026

New ModelsOpen weights

Flux 2 Klein

Black Forest Labs drops Flux 2 Klein, fast open-weights image model

Wolfram broke the news mid-show: Black Forest Labs released Flux 2 Klein, a fast 4B/9B image generation model with open weights under Apache 2.0. It is designed for near-real-time editing and style iteration, and Alex used it minutes later in his live Claude Cowork demo.

🎙️ Hear our coverage →

#image-gen #open-source

B Byte Jan 15, 2026

New ModelsOpen weights

M3

M3: 235B open-source medical LLM claims to beat GPT 5.2 on HealthBench

Byte released M3, a 235B parameter medical LLM fine-tuned from Qwen3 and licensed Apache 2.0. With only 22B active parameters, it is runnable at usable speeds on an M3 Ultra, and it claims to beat GPT 5.2 on HealthBench. Nisten suggested pairing it with smaller imaging models like MedGemma rather than treating them as substitutes.

235B M3 Medical LLM

🎙️ Hear our coverage →

#open-source #research

C Chorus Jan 15, 2026

Major Features & UpdatesOpen weights

Chorus Skills Support

Chorus adds agent skills support for every LLM via OpenRouter

Alex used a Ralph loop with Claude Code to add full agent skills support to Chorus, the open-source app that compares answers across multiple LLMs, in about 3.5 hours. The work added a settings panel, filesystem skill discovery, front-matter parsing, and cross-model skill injection, letting the same Claude-style skills run on GPT 5.2 Codex, Gemini, and any OpenRouter model.

🎙️ Hear our coverage →

#agents #open-source

Google DeepMind Jan 15, 2026

New ModelsOpen weights

MedGemma 1.5

Google releases MedGemma 1.5 for offline medical imaging

Google released MedGemma 1.5, a small (4B-class) open model for medical use cases, compact enough to run offline for medical imaging. The panel stressed it is a different model class from Byte's giant M3 medical LLM and that the two pair well together rather than replacing each other.

🎙️ Hear our coverage →

#research #open-source #vision

Meituan (LongCat) Jan 15, 2026

New ModelsOpen weights

LongCat Flash Thinking

Meituan's LongCat Flash Thinking: 560B MoE with 27B active, MIT licensed

Meituan released LongCat Flash Thinking, an open-source reasoning MoE with 560B total parameters and only 27B active, under an MIT license. It continued the run of large sparse Chinese open-weights models offering frontier-style reasoning at low active-parameter cost.

560B/27B LongCat Flash

🎙️ Hear our coverage →

#open-source #reasoning

Lightricks Jan 8, 2026

New ModelsOpen weights

LTX-2

Lightricks open-sources LTX-2 synchronized audio-video model

Lightricks open-sourced LTX-2, billed as the first truly open audio-video generation model with synchronized audio and video output, releasing full training code alongside the weights. A distilled version is available to try on Replicate.

LTX-2 on GitHub ↗LTX-2 Paper ↗LTX-2 on Replicate ↗

🎙️ Hear our coverage →

#video-gen #open-source #audio

Liquid AI Jan 8, 2026

New ModelsOpen weights

LFM 2.5

Liquid AI LFM 2.5: 1B on-device family with end-to-end audio

Liquid AI released LFM 2.5, a family of ~1.2B parameter on-device models spanning text, vision, and audio, announced at CES alongside AMD's Lisa Su. The models hit 239 tokens/sec on AMD CPU and 100 tokens/sec on iPhone 16 Pro Max, and include a revolutionary end-to-end audio model that skips the traditional ASR-LLM-TTS pipeline entirely, running in as little as 8GB of RAM.

Liquid AI LFM 2.5 on X ↗LFM 2.5 on Hugging Face ↗

🎙️ Hear our coverage →

#open-source #on-device #voice-ai

MiroMind AI Jan 8, 2026

New ModelsOpen weights

MiroThinker 1.5

MiroThinker 1.5: 30B search agent beats trillion-param models

MiroMind AI released MiroThinker 1.5, a 30B parameter open source search agent that achieves 56.1% on BrowseComp and 66.8% on BrowseComp Chinese, outperforming trillion-parameter models. It introduces 'interactive scaling' as a third scaling dimension beyond parameters and context, and is a fine-tune of Qwen 3 Thinking with 147K open training samples.

MiroThinker 1.5 on X ↗MiroThinker 1.5 on Hugging Face ↗MiroThinker on GitHub ↗

🎙️ Hear our coverage →

#open-source #agents #search

Nous Research Jan 8, 2026

New ModelsOpen weights

NousCoder 14B

NousCoder 14B: 7% LiveCodeBench jump in 4 days of RL training

Nous Research released NousCoder 14B, an open source competitive programming model that achieved a 7% jump on LiveCodeBench accuracy in just four days of RL training on 48 NVIDIA B200 GPUs. Training used 24,000 verifiable problems, and the release ships under a full Apache 2 license with training code and a benchmark harness.

NousCoder 14B on X ↗NousCoder W&B Dashboard ↗NousCoder Atropos on GitHub ↗

🎙️ Hear our coverage →

#open-source #coding #training

NVIDIA Jan 8, 2026

New ModelsOpen weights

Alpha Mayo

NVIDIA Alpha Mayo: open source reasoning self-driving models

NVIDIA announced Alpha Mayo at CES, a family of open source reasoning-based self-driving AI models. The models perform end-to-end autonomous driving with explicit reasoning steps, like identifying jaywalkers and stopping accordingly, demoed in a Mercedes-Benz.

NVIDIA CES 2026 News ↗

🎙️ Hear our coverage →

#robotics #reasoning #open-source

NVIDIA Jan 8, 2026

New ModelsOpen weights

Nemotron Speech ASR

Nemotron Speech ASR: 600M streaming model with 24ms latency

NVIDIA released Nemotron Speech ASR, a 600M parameter open source streaming speech recognition model with 24ms median latency and support for 900 concurrent streams on a single H100. Kwindla Hultman Kramer of Daily.co demoed sub-500ms voice-to-voice latency using a three-model pipeline of Nemotron ASR, Nemotron Nano LLM, and Magpie TTS.

24ms Nemotron Speech latency

NVIDIA Nemotron AI Dev on X ↗Nemotron Speech on Hugging Face ↗Nemotron Speech ASR Blog ↗

🎙️ Hear our coverage →

#voice-ai #open-source

Upstage Jan 8, 2026

New ModelsOpen weights

Solar Open 100B

Upstage Solar Open 100B: 102B MoE trained on 19.7T tokens

Upstage released Solar Open 100B, a 102B parameter MoE model with only 12B active parameters per token (129 experts, top-8 activation), trained on 19.7 trillion tokens including 4.5T synthetic via a 'data factory' approach. It outperforms GLM 4.5 Air on many benchmarks, features the SNAP PO reinforcement learning technique with a 50% training speedup, and delivers best-in-class Korean language performance.

102B Solar Open params

Solar Open 100B on X ↗Solar Open 100B on Hugging Face ↗Solar Open Tech Report ↗

🎙️ Hear our coverage →

#open-source #architecture #multilingual

December 2025

Alibaba (Qwen) Dec 25, 2025

New ModelsOpen weights

Qwen 3 Coder

Qwen 3 Coder posts insane scores in the race for the coding crown

Alibaba's Qwen 3 Coder landed in July with what the crew called insane benchmark scores for an open-weights coding model. Together with Kimi K2 and GLM 4.5 it made July the peak month for Chinese open source.

🎙️ Hear our coverage →

#open-source #coding

Alibaba (Qwen) Dec 25, 2025

New ModelsOpen weights

Qwen speech-to-speech model

Qwen launches speech-to-speech model with emotion handling

Qwen released a speech-to-speech model in March with internal emotion handling, joining the wave of voice-native models. It was part of the Qwen team's relentless 2025 release cadence across modalities.

Mar 27 Episode ↗

🎙️ Hear our coverage →

#voice-ai #open-source

DeepSeek Dec 25, 2025

New ModelsOpen weights

DeepSeek R1

DeepSeek R1: the open reasoning model that crashed NVIDIA's stock

DeepSeek's open-weights reasoning model dropped January 23rd and matched OpenAI's o1 at roughly 50x cheaper pricing, with an alleged training cost of just $5.5M. It crashed NVIDIA stock 17% — a $560B single-day loss, the largest single-company monetary loss in history — and made Chinese AI a household topic. The crew named it the earthquake that shattered assumptions about who leads AI.

$560B NVIDIA stock loss$5.5M DeepSeek R1 training cost

Jan 24 Episode ↗Jan 30 Episode ↗

🎙️ Hear our coverage →

#open-source #reasoning

DeepSeek Dec 25, 2025

New ModelsOpen weights

DeepSeek V3.1 Terminus

DeepSeek V3.1 Terminus lands amid September's relentless pace

DeepSeek resurfaced in September with V3.1 Terminus, another strong open-weights release that arrived just as the crew was barely keeping up with the weekly firehose. Nisten noted that missing a single week in this period left you completely lost.

🎙️ Hear our coverage →

#open-source #reasoning

H Hexgrad (Kokoro) Dec 25, 2025

New ModelsOpen weights

Kokoro TTS

Kokoro TTS: 82M-param Apache 2 model hits #1 on TTS Arena

Kokoro, a tiny 82M parameter text-to-speech model, went viral in January after hitting #1 on TTS Arena. Released under Apache 2.0 and small enough to run in the browser, it showed that high-quality speech synthesis no longer required huge models.

Jan 10 Episode ↗

🎙️ Hear our coverage →

#voice-ai #open-source

MiniMax (Hailuo) Dec 25, 2025

New Models

Hailuo 2.3

MiniMax drops Hailuo 2.3 in November

MiniMax released Hailuo 2.3 (referred to as 'Hailuo LLM 2.3' on the show) in November, cited as another strong release from the Chinese labs. It closed out a year in which MiniMax shipped everything from 4M-context LLMs to media models.

🎙️ Hear our coverage →

MiniMax (Hailuo) Dec 25, 2025

New ModelsOpen weights

MiniMax-01

MiniMax-01: open model with a 4M token context window

MiniMax (Hailuo) released MiniMax-01 in January with a 4 million token context window, by far the largest context of any open-weights model at the time. It was an early sign of the Chinese-lab open source dominance that defined 2025.

Jan 17 Episode ↗

🎙️ Hear our coverage →

#open-source #architecture

Moonshot AI (Kimi) Dec 25, 2025

New ModelsOpen weights

Kimi K2

Kimi K2: the Chinese open model that earned mainstream respect

Moonshot AI's Kimi K2 dropped in July and earned serious mainstream recognition, marking peak Chinese-lab dominance of open source. It was named in the show's TL;DR as one of the defining open-weights releases of 2025.

🎙️ Hear our coverage →

#open-source #agents

Tencent (Hunyuan) Dec 25, 2025

New ModelsOpen weights

Hunyuan open weights

Tencent enters the open weights race

In July, Tencent's Hunyuan team (rendered as 'HO One' in the episode) joined Huawei in entering the open-weights model race. It widened the field of Chinese labs shipping serious open models beyond DeepSeek, Qwen, and Moonshot.

🎙️ Hear our coverage →

Zhipu AI (GLM) Dec 25, 2025

New ModelsOpen weights

GLM 4.5

GLM 4.5 runs on Cerebras fast enough to win hackathons

Zhipu's GLM 4.5 came out in July and was the first open model that ran on Cerebras hardware fast enough that hackathon competitors were winning with it. It set up GLM's quiet rise as a business workhorse later in the year.

🎙️ Hear our coverage →

#open-source #infrastructure

Zhipu AI (GLM) Dec 25, 2025

New ModelsOpen weights

GLM 4.6

GLM 4.6 quietly becomes the model businesses actually use

Zhipu's GLM 4.6 arrived in October and, per Nisten, quietly became a go-to model that many businesses still run today. It continued GLM's trajectory from hackathon favorite to production workhorse.

🎙️ Hear our coverage →

#open-source #coding

Allen AI Dec 18, 2025

New ModelsOpen weights

BOLMO

Allen AI's BOLMO reaches byte-level parity with tokenized models

Allen AI released BOLMO, described as the first byte-level language model to reach parity with regular tokenization-based models. The panel framed it as a research breakthrough that could eventually remove tokenizers from the LLM stack.

BOLMO announcement ↗

🎙️ Hear our coverage →

#open-source #research #architecture

Allen AI Dec 18, 2025

New ModelsOpen weights

OLMO 2 (multimodal)

Allen AI adds video-input multimodal OLMO models in 4B/7B/8B sizes

Allen AI extended its OLMO family with multimodal models that accept video input, released in 4B, 7B, and 8B sizes. It continues Allen AI's fully open approach to model development alongside the BOLMO byte-level work.

OLMO multimodal announcement ↗

🎙️ Hear our coverage →

#open-source #multimodal #vision

Google DeepMind Dec 18, 2025

New ModelsOpen weights

FunctionGemma

FunctionGemma: Google's 270M function-calling model for edge agents

Google released FunctionGemma, a tiny 270M-parameter open model specialized for function calling on-device. With a roughly 500MB RAM footprint and strong gains after fine-tuning for mobile actions, it points toward privacy-first local agents on constrained hardware.

FunctionGemma docs ↗FunctionGemma blog ↗FunctionGemma announcement on X ↗

🎙️ Hear our coverage →

#on-device #agents #open-source

Meta AI Dec 18, 2025

New ModelsOpen weights

SAM Audio

Meta SAM Audio brings promptable source separation to audio

Meta released SAM Audio, an audio source separation model that extends the Segment Anything concept to sound. It supports multimodal prompting via text, visual, and temporal cues to isolate sources from audio, with weights on Hugging Face and code on GitHub.

Meta SAM Audio (GitHub) ↗SAM Audio (HF) ↗SAM Audio announcement ↗

🎙️ Hear our coverage →

#audio #open-source

NVIDIA Dec 18, 2025

New ModelsOpen weights

Nemotron 3 Nano

NVIDIA ships Nemotron 3 Nano, a 30B hybrid Mamba-MoE with full recipes

NVIDIA released Nemotron 3 Nano, a 30B-parameter hybrid Mamba-MoE model with only 3B active parameters for efficient inference. The panel called it the most consequential open release of the week because NVIDIA shipped not just weights but technical reports, training recipes, and details on the 25T-token training data.

30B (3B active) Nemotron 3 Nano parameters

NVIDIA Nemotron 3 Nano announcement ↗NVIDIA Nemotron 3 Nano (HF BF16) ↗NVIDIA Nemotron 3 Nano (HF FP8) ↗

🎙️ Hear our coverage →

#open-source #architecture #infrastructure

Resemble AI Dec 18, 2025

New ModelsOpen weights

Chatterbox Turbo

Resemble AI open-sources Chatterbox Turbo, a 350M MIT-licensed TTS

Resemble AI released Chatterbox Turbo, an MIT-licensed 350M-parameter open text-to-speech model. The company claims it beats ElevenLabs in blind listening tests, pushing high-quality TTS into fully open, accessible territory.

Resemble Chatterbox Turbo (GitHub) ↗Chatterbox Turbo (HF) ↗Chatterbox Turbo blog ↗Chatterbox Turbo on X ↗

🎙️ Hear our coverage →

#voice-ai #open-source

Arcee AI Dec 4, 2025

New ModelsOpen weights

Arcee Trinity

Arcee Trinity launches US-trained open MoE family

Arcee AI introduced Trinity, a family of US-trained open mixture-of-experts models built from scratch, starting with Trinity-Mini and Trinity-Nano-Preview. CTO Lukas Atkins joined the show to discuss the training approach and previewed Trinity-Large for January 2026. The release positions Arcee as a domestic alternative in an open-weights field dominated by Chinese labs.

Arcee Trinity Manifesto ↗Trinity-Mini (Hugging Face) ↗Trinity-Nano-Preview (Hugging Face) ↗Lukas Atkins announcement on X ↗

🎙️ Hear our coverage →

#open-source #architecture

DeepSeek Dec 4, 2025

New ModelsOpen weights

DeepSeek V3.2 / V3.2-Speciale

DeepSeek V3.2 and V3.2-Speciale post gold-medal reasoning under MIT license

DeepSeek released V3.2 and the reasoning-first V3.2-Speciale, a 685B-parameter MoE under MIT license. Speciale posted gold-medal-level olympiad results and 96% on AIME (versus GPT-5 High at 94%), with V3.2 hitting 73.1% on SWE-Bench Verified. Aggressive pricing around 28 cents per 1M tokens on OpenRouter pushes open models closer to top closed-model capability.

96% AIME73.1% SWE-Bench Verified685B Total parameters (MoE)

DeepSeek V3.2 (Hugging Face) ↗DeepSeek V3.2-Speciale (Hugging Face) ↗DeepSeek V3.2 announcement ↗DeepSeek announcement on X ↗

🎙️ Hear our coverage →

#open-source #reasoning #coding

Microsoft Dec 4, 2025

New ModelsOpen weights

VibeVoice-Realtime-0.5B

Microsoft shares VibeVoice-Realtime-0.5B with ~300ms latency TTS

Microsoft published VibeVoice-Realtime-0.5B on Hugging Face, a small realtime text-to-speech model claiming roughly 300ms latency. The show framed it as more evidence that sub-second audio response is becoming table stakes for production voice agents.

~300ms Claimed TTS latency0.5B Parameters

Microsoft VibeVoice-Realtime-0.5B (Hugging Face) ↗Community post on X ↗

🎙️ Hear our coverage →

#voice-ai #open-source

Mistral AI Dec 4, 2025

New ModelsOpen weights

Mistral 3 (Large 3 + Ministral 3)

Mistral returns to Apache 2.0 with Mistral Large 3 and Ministral 3

Mistral relaunched its model family under permissive Apache 2.0 licensing with Mistral Large 3 and the small Ministral 3 edge models. Large 3 ships a 256K context window and strong open-model coding positioning. The licensing shift reignited discussion around open model portability and deployability.

256K Mistral Large 3 context window

Mistral 3 blog ↗Mistral Large 3 (Hugging Face collection) ↗Ministral 3 (Hugging Face collection) ↗Mistral announcement on X ↗

🎙️ Hear our coverage →

#open-source #on-device #coding

Nous Research Dec 4, 2025

New ModelsOpen weights

Hermes 4.3

Nous Research ships Hermes 4.3 36B with decentralized training

Nous Research released Hermes 4.3-36B, highlighted on the show for being trained with decentralized infrastructure and for state-of-the-art RefusalBench performance. The release continues the Hermes line of open, steerable instruction-tuned models.

Hermes 4.3-36B (Hugging Face) ↗Nous Research on X ↗

🎙️ Hear our coverage →

#open-source #training

November 2025

Alibaba (Tongyi) Nov 27, 2025

New ModelsOpen weights

Z-Image Turbo

Tongyi's Z-Image Turbo brings sub-second open image generation

Alibaba's Tongyi lab released Z-Image Turbo, a 6B-parameter open image generation model that produces images in under a second. It pushes open-source image generation toward real-time speeds at a fraction of the size of competing models.

6B Parameters

Z-Image Turbo on HuggingFace ↗Z-Image on GitHub ↗

🎙️ Hear our coverage →

#image-gen #open-source #architecture

Black Forest Labs Nov 27, 2025

New ModelsOpen weights

FLUX.2

Black Forest Labs releases FLUX.2, a 32B multi-reference image model

Black Forest Labs released FLUX.2, a 32B-parameter image model with open weights (FLUX.2-dev) that supports multi-reference image editing. It lets users combine multiple reference images and prompt edits with variables, a step up in controllable image editing.

32B Parameters

FLUX.2 on HuggingFace ↗FLUX.2 Blog ↗FLUX.2 Announcement on X ↗

🎙️ Hear our coverage →

#image-gen #open-source

DeepSeek Nov 27, 2025

New ModelsOpen weights

DeepSeek Math V2

DeepSeek Math V2: 685B open-weights model with IMO gold-level math

DeepSeek surfaced DeepSeek Math V2, a 685B-parameter Apache-2.0 model that reaches IMO gold-level math reasoning. It is the first open-weights math champion at this level, dropped quietly on HuggingFace during the week.

685B Parameters

DeepSeek Math V2 on HuggingFace ↗

🎙️ Hear our coverage →

#open-source #reasoning

Microsoft Nov 27, 2025

New ModelsOpen weights

Fara-7B

Microsoft ships Fara-7B, a 7B on-device computer use agent

Microsoft Research released Fara-7B, a best-in-class 7B-parameter vision-language model for computer use that runs on-device. It scores 73.5% on WebVoyager, beating OpenAI's computer-use preview while being small enough to run locally.

73.5% WebVoyager

Fara-7B on HuggingFace ↗Fara-7B Blog ↗Fara-7B Announcement on X ↗Fara on GitHub ↗

🎙️ Hear our coverage →

#open-source #agents #on-device

Prime Intellect Nov 27, 2025

New ModelsOpen weights

INTELLECT-3

Prime Intellect releases INTELLECT-3, a 106B open MoE model

Prime Intellect released INTELLECT-3, a 106B-parameter mixture-of-experts model with 12B active parameters that scores 90% on AIME 2024/2025. The lab fully open-sourced the training stack alongside the weights, showing a small lab can train frontier-scale models.

106B Total parameters (12B active)90% AIME 2024/2025

INTELLECT-3 on HuggingFace ↗INTELLECT-3 Blog ↗INTELLECT-3 Announcement on X ↗Try INTELLECT-3 ↗

🎙️ Hear our coverage →

#open-source #reasoning #architecture

Tencent (Hunyuan) Nov 27, 2025

New ModelsOpen weights

HunyuanOCR

Tencent's 1B HunyuanOCR beats 72B models on OCRBench

Tencent released HunyuanOCR, a 1B-parameter OCR model that scores 860 on OCRBench, beating models as large as Qwen3-VL-72B. It is a striking example of task-specialized small models outperforming generalist giants.

1B Parameters860 OCRBench score

HunyuanOCR on HuggingFace ↗HunyuanOCR on GitHub ↗HunyuanOCR Announcement on X ↗Hunyuan Vision Blog ↗

🎙️ Hear our coverage →

#vision #open-source #on-device

Tencent (Hunyuan) Nov 27, 2025

New ModelsOpen weights

HunyuanVideo 1.5

Tencent releases HunyuanVideo 1.5, a lightweight open video model

Tencent released HunyuanVideo 1.5, a lightweight DiT-based open-source video generation model. It brings capable video generation to a smaller footprint, continuing the trend of open video models closing the gap with closed offerings.

HunyuanVideo on HuggingFace ↗HunyuanVideo on GitHub ↗HunyuanVideo 1.5 Announcement on X ↗

🎙️ Hear our coverage →

#video-gen #open-source #architecture

Allen Institute for AI (Ai2) Nov 20, 2025

New ModelsOpen weights

OLMo 3

OLMo 3: Allen AI's fully open 32B model with complete recipe

Allen AI released OLMo 3, a fully open 32B dense model where the dataset, training recipe, and hyperparameters are all public — not just the weights. LDJ contrasted it with open-weights-only releases from Qwen and DeepSeek, which have never published a fully open recipe.

32B Dense parameters, fully open dataset and recipe

🎙️ Hear our coverage →

Meta AI Nov 20, 2025

New ModelsOpen weights

SAM 3

Meta SAM 3: open-vocabulary segmentation and tracking in video

Meta's Segment Anything Model 3 adds open-vocabulary segmentation with text and exemplar prompts, letting you click or type to segment and track any object across images and video. The panel demoed it live on golden retriever videos, and it ships openly as part of Meta's open-source push.

🎙️ Hear our coverage →

#vision #open-source

Meta AI Nov 20, 2025

New ModelsOpen weights

SAM 3D

SAM 3D turns single photos into 3D objects and human bodies

Released alongside SAM 3, SAM 3D reconstructs 3D objects and full human bodies from a single image with surprisingly high quality. It extends the Segment Anything family from 2D segmentation into single-image 3D reconstruction.

🎙️ Hear our coverage →

#vision #world-models #open-source

Baidu Nov 13, 2025

New ModelsOpen weights

ERNIE-4.5-VL-28B-A3B-Thinking

Baidu open-sources ERNIE-4.5-VL-28B-A3B-Thinking visual reasoning model

Baidu released ERNIE-4.5-VL-28B-A3B-Thinking, an Apache 2.0 open-weights visual reasoning MoE with only 3B active parameters that claims to rival much larger models like GPT-5 High on vision tasks. It features image zooming, spatial grounding, and reasoning, with strong small-model performance attributed to GSPO training from the Qwen team.

3B Active Parameters

Baidu announcement on X ↗Hugging Face model page ↗GitHub repo ↗Ernie blog post ↗

🎙️ Hear our coverage →

#open-source #vision #reasoning

H Company Nov 13, 2025

New ModelsOpen weights

Holo2

H Company open-sources Holo2 multimodal computer-use agent family

Dropped live during the show: H Company open-sourced Holo2, a next-generation multimodal agent family fine-tuned on Qwen3-VL for grounding, navigation, and reasoning across web, desktop, and mobile. It posts SOTA results on computer-use and web-navigation benchmarks like OSWorld-G and ships in 4B, 8B, and 30B variants under Apache 2.0.

🎙️ Hear our coverage →

#agents #open-source

Meta AI Nov 13, 2025

New ModelsOpen weights

Omnilingual ASR

Meta releases Omnilingual ASR covering 1,600+ languages

Meta released Omnilingual ASR, an Apache 2.0 speech recognition family supporting over 1,600 languages, including 500+ never before served by any ASR system, with character error rate under 10% for 78 languages. The release includes an open corpus of 500k+ rows of transcribed audio, and the 1B model was praised as a near drop-in state-of-the-art replacement on Hugging Face.

1600+ Languages Supported

AI at Meta announcement on X ↗Meta blog post ↗Research paper ↗Omnilingual ASR corpus on Hugging Face ↗

🎙️ Hear our coverage →

#voice-ai #open-source

W WeiboAI Nov 13, 2025

New ModelsOpen weights

VibeThinker-1.5B

WeiboAI releases VibeThinker-1.5B open reasoning model

Weibo's AI team open-sourced VibeThinker-1.5B, a tiny reasoning model that reportedly outperforms much larger models like DeepSeek R1 on select reasoning benchmarks. Part of a week where small open-weights models from Chinese labs kept punching above their weight.

WeiboLLM announcement on X ↗Hugging Face model page ↗Arxiv paper ↗VentureBeat coverage ↗

🎙️ Hear our coverage →

#open-source #reasoning #on-device

Allen Institute for AI (Ai2) Nov 6, 2025

New ModelsOpen weights

OlmoEarth

Ai2 launches OlmoEarth foundation models and open Earth-intelligence platform

Ai2 launched OlmoEarth, a family of foundation models plus an open, end-to-end platform for fast, high-resolution Earth intelligence. It applies the lab's open-model approach to geospatial and remote-sensing data, making Earth observation workloads accessible without proprietary stacks.

🎙️ Hear our coverage →

#open-source #vision #frontier-models

Hugging Face Nov 6, 2025

Also ReleasedOpen weights

Smol Training Playbook

Hugging Face publishes the Smol Training Playbook for LLM pretraining

Hugging Face published the Smol Training Playbook, a 200+ page end-to-end guide to reliably pretraining and operating LLMs. It distills the team's practical experience from the SmolLM line into an open resource for anyone training their own models.

X ↗Announcement ↗

🎙️ Hear our coverage →

#open-source #training

M Maya Research Nov 6, 2025

New ModelsOpen weights

Maya-1

Maya-1 open-source voice generation model released

Maya-1 is a new open-source voice generation model that was demoed on the show as part of the week's voice AI wave. The panel highlighted how quickly open voice model quality is improving, with expressive output that holds up against commercial systems.

🎙️ Hear our coverage →

#voice-ai #open-source

Meituan (LongCat) Nov 6, 2025

New ModelsOpen weights

LongCat Flash Omni

Meituan releases LongCat Flash Omni, a 560B (27B active) omni model

Meituan's LongCat team released LongCat Flash Omni, a 560B-parameter mixture-of-experts model with roughly 27B active parameters that accepts text, audio, and video input. It extends the open LongCat Flash line into omni-modal territory from a lab better known for food delivery than frontier models.

X ↗HF ↗Announcement ↗

🎙️ Hear our coverage →

#open-source #multimodal

Moonshot AI Nov 6, 2025

New ModelsOpen weights

Kimi K2 Thinking

Moonshot AI releases Kimi K2 Thinking, an open 1T-param reasoning MoE

Moonshot AI released Kimi K2 Thinking, an open-source 1-trillion-parameter mixture-of-experts reasoning agent with 256K context and large-scale tool-calling capacity. The panel treated it as the open-source centerpiece of the week, focusing on its reasoning quality and coding utility rather than just benchmark screenshots, and as a sign open models keep closing the usability gap with frontier closed models.

X ↗HF ↗Tech Blog ↗Arxiv ↗

🎙️ Hear our coverage →

#open-source #reasoning #agents

October 2025

IBM Oct 30, 2025

New ModelsOpen weights

Granite 4.0 Nano

IBM Granite 4.0 Nano: ultra-efficient tiny models for edge deployment

IBM released Granite 4.0 Nano, a set of ultra-efficient tiny open models aimed at edge deployment. The release continues the trend of capable sub-billion-to-few-billion parameter models that can run locally on constrained hardware.

Artificial Analysis on X ↗Artificial Analysis: Granite ↗

🎙️ Hear our coverage →

#open-source #on-device

InclusionAI (Ant Group) Oct 30, 2025

New ModelsOpen weights

Ming-flash-omni Preview

Ming-flash-omni Preview: sparse MoE omni-modal open model

Ant Group's InclusionAI team released Ming-flash-omni Preview, a sparse mixture-of-experts omni-modal model on Hugging Face. It handles multiple input and output modalities in a single open-weights model, adding to the wave of Chinese open omni-modal releases.

X announcement ↗Hugging Face ↗

🎙️ Hear our coverage →

#open-source #multimodal #architecture

MiniMax Oct 30, 2025

New ModelsOpen weights

MiniMax M2

MiniMax M2: open-source agentic model at 8% of Claude's price, 2x speed

MiniMax released M2, an open-source agentic model positioned at roughly 8% of Claude's price while running about twice as fast. Head of Engineering Skyler Miao joined the show for a deep dive, framing M2 as both a model story and a speed story, and the panel read it as part of a broader open-model pressure wave on frontier labs.

8% of Claude's price2x speed vs comparable frontier models

X announcement ↗Hugging Face ↗

🎙️ Hear our coverage →

#open-source #agents #coding

Moonshot AI (Kimi) Oct 30, 2025

New ModelsOpen weights

Kimi Linear

Kimi Linear: 48B open model with linear attention and 1M context

Moonshot AI released Kimi Linear, a 48B parameter (A3B active) instruct model that uses linear attention to reach a 1M token context window. It is an open-weights bet on efficient long-context architectures from the Kimi team.

48B parameters (3B active)1M token context window

Hugging Face ↗

🎙️ Hear our coverage →

#open-source #architecture

OpenAI Oct 30, 2025

New ModelsOpen weights

GPT-OSS-Safeguard

OpenAI ships GPT-OSS-Safeguard, first open-weight safety reasoning models

OpenAI released GPT-OSS-Safeguard, its first open-weight safety reasoning models, built on the GPT-OSS family. The models let developers apply custom safety policies via reasoning rather than fixed classifiers, extending OpenAI's open-weights push into the trust-and-safety layer.

X announcement ↗Hugging Face collection ↗

🎙️ Hear our coverage →

#open-source #safety #reasoning

Alibaba (Qwen) Oct 23, 2025

New ModelsOpen weights

Qwen3-VL 2B & 32B

Qwen3-VL adds compact 2B and 32B multimodal models

Alibaba's Qwen team extended the Qwen3-VL family with newly updated 2B and 32B checkpoints. The 2B is a generic VLM (OCR-capable) that holds up against its 4B and 8B siblings from prior weeks, while the 32B reportedly outperforms GPT-5 mini and Claude 4 Sonnet on benchmarks.

X ↗Hugging Face ↗

🎙️ Hear our coverage →

#open-source #vision #multimodal

Allen Institute for AI (Ai2) Oct 23, 2025

New ModelsOpen weights

olmOCR 2 7B

Ai2 releases olmOCR 2 7B open OCR model

The Allen Institute for AI updated its open OCR line with olmOCR 2 at 7B (released as an FP8 checkpoint), landing in the same week as DeepSeek-OCR, Qwen3-VL, and Liquid's LFM2-VL. Another sign that document understanding became this week's hottest open-model category.

🎙️ Hear our coverage →

#vision #open-source

DeepSeek Oct 23, 2025

New ModelsOpen weights

DeepSeek-OCR

DeepSeek-OCR turns text into compressed vision tokens for massive contexts

DeepSeek open-sourced DeepSeek-OCR, a 3B model (~570M active parameters) that is less an OCR model and more a context-compression breakthrough: it renders text as images, compresses it up to 10x while retaining 97% decoding accuracy (60% even at 20x), and reads it back with a tiny vision decoder. The approach suggests text tokenization is far from optimal and points at vastly cheaper long-context processing; alphaXiv reportedly OCR'd all of arXiv for $1000 versus $7500 with MistralOCR, and a single H100 can process up to 200K pages.

97% decoding accuracy at 10x compression~570M active parameters (3B total)200K pages scannable on a single H100

X ↗HF ↗Paper ↗

🎙️ Hear our coverage →

#vision #open-source #search

Krea AI Oct 23, 2025

New ModelsOpen weights

Krea Realtime Video

Krea open-sources a 14B real-time video generation model

Krea AI open-sourced a 14-billion-parameter real-time video model, with weights on Hugging Face. It joins the week's clear trend of generative video racing toward live, interactive experiences rather than offline rendering.

14B parameters

🎙️ Hear our coverage →

#video-gen #voice-ai #open-source

Lightricks Oct 23, 2025

New ModelsOpen weights

LTX-2

LTX-2: native 4K audio+video generation engine from Lightricks

Lightricks announced LTX-2 as breaking news on the show: a video generation engine producing native 4K video (no upscaling) with synchronized audio, positioned as a fast, efficient open alternative to closed models like Sora. It is billed as open-source with weights coming this fall.

4K native generation resolution, no upscaling

X ↗Website ↗GitHub ↗

🎙️ Hear our coverage →

#video-gen #open-source #audio

Liquid AI Oct 23, 2025

New ModelsOpen weights

LFM2-VL-3B

Liquid AI ships LFM2-VL-3B tiny multilingual vision-language model

Liquid AI released LFM2-VL-3B, a tiny multilingual vision-language model, part of a wave of OCR-and-VLM releases this week. It targets efficient on-device and edge vision-language workloads at the 3B scale.

🎙️ Hear our coverage →

#vision #open-source #on-device

Pokee AI Oct 23, 2025

New ModelsOpen weights

PokeeResearch-7B

PokeeResearch-7B: open-source SOTA deep research agent model

Pokee AI released PokeeResearch-7B, an open-source 7B deep research agent model claiming state-of-the-art results for its size. Weights, code, a paper, and a hosted deep-research preview all shipped together.

X ↗HF ↗ArXiv ↗GitHub ↗

🎙️ Hear our coverage →

#open-source #agents #search

Alibaba (Qwen) Oct 16, 2025

New ModelsOpen weights

Qwen3-VL 3B/8B

Qwen3-VL adds compact 3B and 8B open vision-language models

Alibaba's Qwen team released smaller Qwen3-VL vision-language models in 3B and 8B sizes, bringing the flagship VL capabilities down to edge- and laptop-friendly scales. Weights are open on Hugging Face as part of the Qwen3-VL collection.

X announcement ↗Hugging Face collection ↗

🎙️ Hear our coverage →

#open-source #vision #multimodal

Google DeepMind Oct 16, 2025

New ModelsOpen weights

C2S-Scale 27B

Google's C2S-Scale 27B validates a cancer hypothesis in living cells

Google released C2S-Scale 27B, a Gemma-based single-cell biology model that generated a novel cancer therapy hypothesis later validated in living cells. The show called this a bombshell example of AI contributing to real scientific discovery rather than just benchmarks.

Sundar Pichai on X ↗Google Blog ↗Paper (bioRxiv) ↗

🎙️ Hear our coverage →

#research #open-source

KAIST Oct 16, 2025

New ModelsOpen weights

KORMo 10B

KAIST releases KORMo, a bilingual Korean/English 10B open model

KAIST published KORMo, a 10B parameter fully open bilingual model for Korean and English, with weights on Hugging Face and an accompanying paper. It continues the trend of strong national-language open models coming out of Korean labs.

Hugging Face ↗Paper ↗

🎙️ Hear our coverage →

#open-source #multilingual

September 2025

Alibaba (Qwen) Sep 25, 2025

New ModelsOpen weights

Qwen3-Omni

Qwen3-Omni ships open-weights any-to-any audio, vision, and text

Alongside Qwen3-VL, Alibaba released Qwen3-Omni, an end-to-end omni-modal open-weights model that takes text, image, audio, and video input and can respond with streaming speech. The show treated it as direct evidence of how fast open multimodal systems are improving, with weights on Hugging Face, a GitHub repo, demos, and availability in Qwen Chat and the Model Studio API.

HF ↗GitHub ↗Qwen Chat ↗Demo ↗

🎙️ Hear our coverage →

#open-source #multimodal #voice-ai

Alibaba (Qwen) Sep 25, 2025

New ModelsOpen weights

Qwen3-VL

Alibaba releases Qwen3-VL open-weights vision-language flagship

Alibaba's Qwen team shipped Qwen3-VL, its new flagship open-weights vision-language family, headlining the episode's 'Qwen-mas' barrage. The panel discussed it as a practical workflow tool for visual understanding and agentic GUI tasks, not just another model card, with weights, a blog post, and a Hugging Face demo all available at launch.

X ↗HF ↗Blog ↗Demo ↗

🎙️ Hear our coverage →

#open-source #vision #multimodal

Alibaba (Wan) Sep 25, 2025

New ModelsOpen weights

Wan 2.2 Animate

Wan Animate brings open-weights character animation and replacement

Alibaba's Wan team released Wan 2.2 Animate, an open-weights model that animates a character image from a performance video, replicating motion and expressions, or swaps a character into existing footage. It landed in the episode's closing run of video releases showing multimodal product quality climbing across the board.

🎙️ Hear our coverage →

#video-gen #open-source

DeepSeek Sep 25, 2025

New ModelsOpen weights

DeepSeek V3.1 Terminus

DeepSeek V3.1 Terminus refines agents and bilingual output

DeepSeek released V3.1 Terminus, an update to V3.1 with cleaner bilingual output, stronger agentic tool use, and cheaper long-context handling. The open weights are available on Hugging Face, continuing DeepSeek's cadence of iterative open releases.

🎙️ Hear our coverage →

#open-source #agents #reasoning

IBM Sep 25, 2025

New ModelsOpen weights

Granite Docling 258M

IBM releases Granite Docling 258M compact document-parsing VLM

IBM published Granite Docling 258M, an ultra-compact open-source vision-language model for document understanding that converts documents into structured output. At just 258M parameters it reinforced the show's point that tiny specialized models are becoming genuinely useful workflow tools.

🎙️ Hear our coverage →

#vision #on-device #open-source

Liquid AI Sep 25, 2025

New ModelsOpen weights

Liquid Nanos

Liquid AI ships Liquid Nanos, tiny task-specific on-device models

Liquid AI released Liquid Nanos, a family of very small task-specific models built for jobs like extraction, translation, RAG, and tool calling that can run on-device. The collection landed on Hugging Face, fitting the episode's theme of small-but-capable models powering real products.

🎙️ Hear our coverage →

#open-source #on-device

Meta AI Sep 25, 2025

New ModelsOpen weights

Code World Model (CWM)

Meta releases 32B Code World Model for agentic code reasoning

Meta released CWM, a 32B open-weights research model trained to internally model code execution, aimed at agentic code reasoning rather than plain code completion. The weights are on Hugging Face under facebook/cwm, giving the open-source community a new approach to code world modeling.

🎙️ Hear our coverage →

#open-source #coding #agents

Moondream AI Sep 25, 2025

New ModelsOpen weights

Moondream 3

Moondream 3 preview punches above its weight in the tiny-VLM race

Moondream released a preview of Moondream 3, a small open vision-language model that punches well above its size class. CTO and co-founder Vik Korrapati joined the show to explain why small, capable vision models matter for real product building, framing Moondream 3 as a practical tool rather than a benchmark flex.

🎙️ Hear our coverage →

#vision #on-device #open-source

Alibaba (Tongyi Lab) Sep 18, 2025

New ModelsOpen weights

Tongyi DeepResearch 30B-A3B

Tongyi DeepResearch: open-source A3B web agent rivals OpenAI Deep Research

Alibaba's Tongyi Lab open-sourced Tongyi DeepResearch, a 30B mixture-of-experts web research agent with only 3B active parameters. The lab claims parity with OpenAI's Deep Research on agentic search and report-writing tasks, and the weights are available on Hugging Face.

🎙️ Hear our coverage →

#open-source #agents #search

ByteDance / Tsinghua Sep 18, 2025

New ModelsOpen weights

HuMo

HuMo: human-centric multimodal video generation from ByteDance/Tsinghua

ByteDance research and Tsinghua released HuMo, a human-centric video generation model that conditions on multimodal inputs (text, image, and audio) to produce videos of people. The weights are available on Hugging Face.

🎙️ Hear our coverage →

#video-gen #open-source

Mistral AI Sep 18, 2025

New ModelsOpen weights

Magistral-Small-2509

Mistral updates its open reasoning model with Magistral-Small-2509

Mistral published Magistral-Small-2509, an updated checkpoint of its small open-weights reasoning model. The refresh keeps Mistral's open reasoning line current as the open-model competitive baseline moves quickly.

🎙️ Hear our coverage →

#open-source #reasoning

Moondream Sep 18, 2025

New ModelsOpen weights

Moondream 3 (Preview)

Moondream 3 Preview: 9B MoE VLM with 2B active parameters

Moondream released a preview of Moondream 3, a 9B mixture-of-experts vision-language model with only 2B active parameters. It targets frontier-level visual reasoning at small-model cost, continuing Moondream's run of efficient open vision models.

🎙️ Hear our coverage →

#vision #open-source #architecture

P Perceptron AI Sep 18, 2025

New ModelsOpen weights

Isaac 0.1

Perceptron AI introduces Isaac 0.1, a 2B perceptive-language model

Perceptron AI released Isaac 0.1, a 2B parameter perceptive-language model with open weights on Hugging Face. Despite its small size, the show notes highlight that it 'points better than GPT', excelling at visual grounding and pointing tasks relative to much larger models.

X ↗HF ↗Blog ↗

🎙️ Hear our coverage →

#open-source #vision #multimodal

Alibaba (Tongyi Lab) Sep 4, 2025

New ModelsOpen weights

WebWatcher-32B

Alibaba's Tongyi Lab open-sources WebWatcher vision-language research agent

Alibaba's Tongyi Lab open-sourced WebWatcher, a vision-language deep research agent that sets new state-of-the-art results on agentic browsing and research tasks. The 32B model combines visual understanding with web research capabilities and is available on Hugging Face.

🎙️ Hear our coverage →

#open-source #agents #search

Apple Sep 4, 2025

New ModelsOpen weights

FastVLM-7B

Apple's FastVLM-7B lands with a speed-first vision encoder, 85x faster TTFT

Apple released FastVLM-7B, a vision-language model built around a speed-first vision encoder that delivers up to 85x faster time-to-first-token than peer VLMs. Quantized variants (7B-int4, 1.5B-int8) on Hugging Face make it practical for on-device and real-time vision use, anchoring the show's fast-VLM discussion.

X ↗HF ↗HF (1.5B int8) ↗

🎙️ Hear our coverage →

#vision #on-device #open-source

Google DeepMind Sep 4, 2025

New ModelsOpen weights

EmbeddingGemma

Google releases EmbeddingGemma, a 300M-param SOTA embedding model for RAG

Google released EmbeddingGemma, a 300M-parameter open embedding model that achieves state-of-the-art results for its size, aimed at RAG and on-device semantic search. It dropped as breaking news during the show, with browser-based demos like Semantic Galaxy showing it running fully client-side.

X ↗HF ↗Try It ↗

🎙️ Hear our coverage →

#search #open-source #on-device

Nous Research Sep 4, 2025

New ModelsOpen weights

Hermes 4 14B

Nous Research releases Hermes 4 14B compact hybrid reasoning model

Nous Research launched Hermes 4 at 14B, a compact hybrid reasoning model with tool calling designed for both local and cloud use. It extends the Hermes 4 family down to a size practical for local deployment while keeping reasoning and tool-use capabilities, with a full tech report published on arXiv.

X ↗HF ↗Tech Report ↗

🎙️ Hear our coverage →

#open-source #reasoning #agents

S Swiss AI Initiative Sep 4, 2025

New ModelsOpen weights

Apertus-8B / Apertus-70B

Switzerland launches Apertus-8B and 70B, fully open multilingual LLMs

The Swiss AI Initiative launched Apertus-8B and Apertus-70B, fully open multilingual LLMs trained on 15T tokens covering more than 1,800 languages. The release stands out for full openness (weights, data recipe, and training transparency) and unusually broad language coverage from a national effort.

🎙️ Hear our coverage →

#open-source #multilingual

Tencent Sep 4, 2025

New ModelsOpen weights

Hunyuan-MT-7B

Tencent open-sources Hunyuan-MT-7B translation model after sweeping WMT2025

Tencent open-sourced Hunyuan-MT-7B, a 7B-parameter machine translation model, after it swept the WMT2025 translation competition. It gives the open-weights community a small, focused translation model that punches well above its size class.

🎙️ Hear our coverage →

#open-source #multilingual

July 2025

Agentica Jul 3, 2025

New ModelsOpen weights

DeepSWE-Preview

DeepSWE-Preview hits 59% SWE-Bench Verified with pure RL on Qwen3-32B

Agentica and collaborators (with guest Michael Luo of UC Berkeley) released DeepSWE-Preview, a fully open-sourced RL-trained coding agent built on Qwen3-32B that reached 59% on SWE-Bench Verified, a top open result in a benchmark dominated by closed systems. The team published training methodology and weights, emphasizing reproducible reward design and verification over sealed benchmark numbers.

59% SWE-Bench Verified

Training write-up (Notion) ↗Hugging Face model ↗

🎙️ Hear our coverage →

#open-source #coding #agents

Baidu Jul 3, 2025

New ModelsOpen weights

ERNIE 4.5

Baidu open-sources ERNIE 4.5, a 10-model multimodal family

Baidu open-sourced the ERNIE 4.5 series, a family of 10 models ranging from 424B down to 0.3B parameters with multimodal capabilities, reportedly beating o1 on DocVQA. The release marks a sharp reversal from Baidu's previous anti-open-source posture and another sign that Chinese labs are setting the pace in open source.

10 ERNIE 4.5 models

X announcement ↗Hugging Face ↗Technical report (PDF) ↗

🎙️ Hear our coverage →

#open-source #multimodal #multilingual

Huawei Jul 3, 2025

New ModelsOpen weights

Pangu Pro MoE

Huawei's Pangu Pro MoE: 72B model trained entirely on Ascend NPUs

Huawei released Pangu Pro, a 72B-parameter MoE trained on its own Ascend NPUs rather than Nvidia or AMD hardware, hitting 1,528 tokens/sec and pretrained on 13T tokens. The panel framed it as the geopolitical open-model story of the week, showing how far Chinese compute stacks have advanced under sanctions.

X coverage ↗Hugging Face ↗

🎙️ Hear our coverage →

#open-source #architecture #infrastructure

Kyutai Jul 3, 2025

New ModelsOpen weights

Kyutai TTS

Kyutai releases open low-latency TTS for English and French

Kyutai Labs released an open 1.6B-parameter text-to-speech model with low latency and high voice similarity in English and French. It was one of two TTS launches closing out the episode, underscoring how quickly multimodal product quality is rising.

X announcement ↗Hugging Face model ↗

🎙️ Hear our coverage →

#voice-ai #open-source

Tencent Jul 3, 2025

New ModelsOpen weights

Hunyuan-A13B-Instruct

Tencent ships Hunyuan-A13B: 80B MoE with only 13B active params

Tencent released Hunyuan-A13B-Instruct, an 80B-parameter MoE that activates only 13B parameters at inference while keeping a 256K context window. Built by the team with WizardLM lineage, it posts strong reasoning benchmarks and feels unusually practical for its class, though the panel flagged its license limits.

13B Hunyuan active params

X announcement ↗Hugging Face ↗Try it ↗

🎙️ Hear our coverage →

#open-source #architecture #reasoning

May 2025

DeepSeek May 29, 2025

New ModelsOpen weights

DeepSeek-R1-0528

DeepSeek drops R1-0528, an updated open reasoning model with big gains

DeepSeek released R1-0528 out of nowhere, an update to their open-weights reasoning model with serious performance jumps: AIME 91, LiveCodeBench 73, and SWE-bench Verified 57.6. They also shipped an 8B distilled version based on Qwen3 that can run on a laptop, keeping it among the best open-weight models available.

91 AIME score, beating previous R1 by a mile8B Distilled Qwen3-based version runnable on a laptop

🎙️ Hear our coverage →

#open-source #reasoning

Haize Labs May 29, 2025

New ModelsOpen weights

j1-nano & j1-micro

Haize Labs releases j1-nano and j1-micro tiny reward models

Haize Labs shipped j1-nano (600M params) and j1-micro (1.7B params), tiny open reward models for judging LLM outputs. Despite their small size, j1-micro scores 80.7% on RewardBench, making capable reward modeling accessible on modest hardware.

Tweet ↗GitHub ↗HF j1-micro ↗HF j1-nano ↗

🎙️ Hear our coverage →

#open-source #training #benchmarks

Resemble AI May 29, 2025

New ModelsOpen weights

Chatterbox

Resemble AI open-sources Chatterbox voice cloning with emotion control

Resemble AI released Chatterbox, an open-source voice cloning model with emotion control. Weights and code are public on GitHub and Hugging Face, bringing controllable, expressive voice cloning to the open ecosystem.

GitHub ↗Hugging Face ↗

🎙️ Hear our coverage →

#voice-ai #open-source

A A-M Team May 15, 2025

New ModelsOpen weights

AM-Thinking v1

AM-Thinking v1: 32B dense reasoning model beats bigger MoEs at math and code

A 32B dense open-weights reasoning LLM from a new Chinese team that takes on much larger mixture-of-experts models and comes out on top for math and code, hitting 85.3% on AIME 2024, 70.3% on LiveCodeBench v5, and 92.5% on Arena-Hard. It supports a /think reasoning toggle, ships with a permissive license, is tooled for vLLM, LM Studio, and Ollama, and runs at 25 tokens/sec on a single 80GB GPU with INT4 quantization. A multilingual RLHF pass and 128k context window are in the works.

32B dense parameters85.3% AIME 202425 tokens/sec on a single 80GB GPU with INT4

Hugging Face ↗Paper ↗Project page ↗

🎙️ Hear our coverage →

#open-source #reasoning

Alibaba May 15, 2025

New ModelsOpen weights

Wan 2.1

Alibaba's Wan 2.1: open-source diffusion-transformer text-to-video suite

Alibaba, the team behind the Qwen LLMs, released Wan 2.1, a full stack of open-source diffusion-transformer text-to-video foundation models. Amid the show's discussion of video-model fatigue, this was called out as a release that cuts through the noise, with weights on Hugging Face and code on GitHub.

Hugging Face ↗GitHub ↗Announcement tweet ↗Try it ↗

🎙️ Hear our coverage →

#video-gen #open-source #architecture

Nous Research May 15, 2025

Products & AppsOpen weights

Psyche

Nous Research launches Psyche, a decentralized cooperative-training network

Psyche is Nous Research's decentralized cooperative-training network that lets distributed participants jointly train large models over the internet. The launch includes open code on GitHub and a live dashboard tracking the first run, a 40B model called Consilience. COO Dillon Rolnick joined the show to explain the decentralized training push.

Website ↗GitHub ↗Announcement tweet ↗Consilience 40B dashboard ↗

🎙️ Hear our coverage →

#training #open-source #infrastructure

Stability AI May 15, 2025

New ModelsOpen weights

Stable Audio Open Small

Stability AI and Arm release Stable Audio Open Small for on-device audio

Stability AI, together with Arm, released Stable Audio Open Small, a 341M-parameter open text-to-audio model built for real-world on-device deployment. The show framed it as part of a small comeback for Stability, with weights on Hugging Face and an accompanying paper.

Blog ↗Paper ↗Hugging Face ↗Announcement on X ↗

🎙️ Hear our coverage →

#audio #on-device #open-source

StepFun May 15, 2025

New ModelsOpen weights

Step1X-3D

StepFun's Step1X-3D: open two-stage framework for textured 3D assets

StepFun released Step1X-3D, an open two-stage framework for high-fidelity, controllable generation of textured 3D assets: it first synthesizes watertight geometry, then generates view-consistent textures. Trained on 2M curated meshes, the release also includes a curated dataset of 800K assets and a Hugging Face demo.

Hugging Face ↗Demo ↗Dataset ↗

🎙️ Hear our coverage →

#world-models #open-source #training

Technology Innovation Institute (TII) May 15, 2025

New ModelsOpen weights

Falcon-Edge

Falcon-Edge: ternary BitNet LLMs for edge deployment under 1GB VRAM

TII's Falcon-Edge project releases ternary BitNet LLMs (1B and 3B base models) that slash memory and compute requirements, enabling inference on less than 1GB of VRAM. Fine-tuners get pre-quantized checkpoints and a clear path to 1-bit LLMs.

Blog ↗Falcon-E-1B on Hugging Face ↗Falcon-E-3B on Hugging Face ↗

🎙️ Hear our coverage →

#open-source #on-device #infrastructure

Alibaba (Qwen) May 1, 2025

New ModelsOpen weights

Qwen 2.5 Omni

Qwen 2.5 Omni gets an update

Alongside the Qwen 3 launch, Alibaba updated its Qwen 2.5 Omni multimodal model line. Mentioned briefly in the open-source roundup as part of the week's Qwen ecosystem push.

Alibaba Qwen announcement (X) ↗

🎙️ Hear our coverage →

#open-source #multimodal

Alibaba (Qwen) May 1, 2025

New ModelsOpen weights

Qwen 3

Alibaba open-weights the full Qwen 3 family under Apache 2.0

Alibaba released the entire Qwen 3 stack: two MoE models (235B total/22B active and 30B/3B active) plus six dense siblings from 32B down to 0.6B, all Apache 2.0 with day-one support in LM Studio, Ollama, vLLM, MLX and llama.cpp. The headline feature is a runtime hybrid 'thinking' toggle (/think and /no_think) that trades latency for reasoning depth. Trained on ~36T tokens with 128K context and 119-language coverage, the 235B MoE rivals DeepSeek-R1, o1, o3-mini and Gemini 2.5 Pro on coding and math.

235 B Flagship MoE total parameters (22B active)30 B Qwen3-30B-A3B hit 57 tok/s on a Mac with speculative decoding36 Trillions of pre-training tokens (2x Qwen 2.5)

Qwen 3 blog post ↗GitHub ↗Hugging Face collection ↗HF demo ↗

🎙️ Hear our coverage →

#open-source #reasoning #architecture

HiDream May 1, 2025

New ModelsOpen weights

HiDream E1

HiDream E1: open-weights image model with standout Ghibli style

HiDream released E1, an open-weights image editing/generation model (Apache 2.0-style licensing) noted for beautiful Ghibli-style outputs. It ranks #4 on the Artificial Analysis image arena leaderboard, sitting among top contenders like Google Imagen and ReCraft.

Hugging Face: HiDream-E1-Full ↗

🎙️ Hear our coverage →

#image-gen #open-source

JetBrains May 1, 2025

New ModelsOpen weights

Mellum-4b-base

JetBrains open-sources Mellum-4b, its code completion focal model

JetBrains published Mellum-4b-base on Hugging Face, a 4B-parameter model specialized for code completion that powers its IDE AI features. Listed in the episode's open-source links roundup.

Hugging Face: Mellum-4b-base ↗

🎙️ Hear our coverage →

#open-source #coding

Kyutai May 1, 2025

New ModelsOpen weights

Helium-1

Kyutai releases Helium-1, a 2B European-language model plus dactory pipeline

Kyutai released Helium-1, a 2B-parameter model distilled from Gemma-2-9B and purpose-built for Europe's 24 official languages, under CC-BY 4.0. It sets a new state of the art for its size class on MMLU-EU, ARC-EU and FLORES translation while fitting in under 2GB VRAM for edge and phone deployment. They also open-sourced 'dactory' (MIT), their full Common Crawl data-processing pipeline that scores, dedups and tags webpages.

Blog post ↗Hugging Face: helium-1-2b ↗Dactory pipeline (GitHub) ↗

🎙️ Hear our coverage →

#open-source #multilingual #on-device

Meta AI May 1, 2025

New ModelsOpen weights

Llama Guard 4

Meta ships Llama protection suite: Llama Guard 4, Firewall, Prompt Guard 2

Meta's LlamaCon security drop included Llama Guard 4 (text + image protection), Llama Firewall (stops prompt hacks and risky code), Prompt Guard 2 (faster jailbreak defense), CyberSecEval 4, and a new Defender Program for security researchers.

AI at Meta LlamaCon announcements (X) ↗

🎙️ Hear our coverage →

#safety #open-source

Microsoft May 1, 2025

New ModelsOpen weights

Phi-4-reasoning

Microsoft ships Phi-4-reasoning and Phi-4-reasoning-plus (14B, MIT)

Microsoft fine-tuned the 14B Phi-4 on 1.4M curated chain-of-thought traces (SFT) and added a small RL stage (Plus variant) to create two MIT-licensed reasoning models. They punch far above their weight: Phi-4-reasoning-plus outperforms DeepSeek-R1-Distill-70B on AIME 25 (78% vs 51%) and sits within a few points of the full 671B DeepSeek-R1, while running on a single GPU with explicit <think> scaffolding.

ArXiv paper ↗Tech report ↗Hugging Face: Phi-4-reasoning ↗Suriya's thread ↗

🎙️ Hear our coverage →

#open-source #reasoning #on-device

OpenPipe May 1, 2025

New ModelsOpen weights

ART·E

OpenPipe's ART·E: RL-trained open email agent that beats o3

OpenPipe released ART·E, an Apache 2.0 email research agent built on a 14B Qwen 2.5 backbone, trained on 500K Enron emails plus synthetic Q&A and refined with reinforcement learning. It tops o3 on accuracy (96% vs 90%) while running 5x faster (1.1s median) and 64x cheaper ($0.85 per 1,000 queries), using a simple three-tool loop.

Launch thread (X) ↗Blog post ↗GitHub: OpenPipe/ART ↗

🎙️ Hear our coverage →

#agents #training #open-source

Xiaomi May 1, 2025

New ModelsOpen weights

MiMo-7B

Xiaomi enters open weights with MiMo-7B, MIT-licensed reasoning family

Xiaomi's first open-weights release is a 7B dense family (Base, SFT, RL, RL-Zero) trained from scratch on 25T tokens with a multi-token-prediction objective and rule-verifiable reinforcement learning. The RL variant matches OpenAI o1-mini on benchmark suites despite being far smaller, scoring 55.4% on AIME 2025 and 49.3% on LiveCodeBench v6, all under an MIT license with vLLM-ready weights.

Hugging Face model hub ↗

🎙️ Hear our coverage →

#open-source #reasoning #training

April 2025

Daily (Pipecat) Apr 24, 2025

New ModelsOpen weights

Smart-Turn VAD

Pipecat releases Smart-Turn, an open source semantic VAD model

The Pipecat team (from Daily) released Smart-Turn, an open source semantic voice activity detection model that understands when a speaker has actually finished their turn rather than just detecting silence. Kwindla Kramer joined the show to break down how semantic VAD makes voice agent conversations feel far more natural, with a community training effort at turn-training.pipecat.ai.

GitHub ↗HF Model ↗Fal.ai Playground ↗Try It Demo ↗

🎙️ Hear our coverage →

#voice-ai #open-source #agents

Google DeepMind Apr 24, 2025

New ModelsOpen weights

Gemma 3 QAT

Google ships Quantization-Aware Trained Gemma 3 models for consumer GPUs

Google released Quantization-Aware Training (QAT) versions of the Gemma 3 family, dramatically cutting memory requirements while preserving quality. The 27B model drops from a hefty 54GB to just 14.1GB, and even the 1B model goes from 2GB to about half a gig, making state-of-the-art open models runnable on consumer GPUs. Wolfram took the 4B QAT model for a spin in LM Studio on the show.

27B Gemma 3 27B QAT: 54GB down to 14.1GB1B Gemma 3 1B QAT: 2GB down to ~0.5GB4B 4B QAT model tested in LM Studio

X Post ↗Blog ↗Reddit thread ↗

🎙️ Hear our coverage →

#open-source #infrastructure #on-device

HumanLayer Apr 24, 2025

Dev ToolsOpen weights

12-Factor Agents

Dex Horthy publishes 12-Factor Agents, a guide to production-ready agents

HumanLayer founder Dex Horthy published 12-Factor Agents, an open GitHub repo and essay distilling common patterns and pitfalls for building reliable, production-ready AI agents. Drawing on his experience building agent SDKs, it argues that serious teams end up writing large parts from scratch and lays out principles for robust agent design, discussed in depth on the show.

GitHub Repo ↗Webinar Recording ↗

🎙️ Hear our coverage →

#agents #coding #open-source

L Lvmin Zhang (lllyasviel) Apr 24, 2025

New ModelsOpen weights

FramePack

FramePack generates 120-second videos on just 6GB of VRAM

FramePack, from ControlNet creator Lvmin Zhang (lllyasviel), is an open source next-frame prediction approach for long video generation that runs on consumer hardware. It can generate videos up to 120 seconds long on as little as 6GB of VRAM by packing input frame context into a fixed length.

120s Max video length6GB Minimum VRAM

Project Page ↗GitHub ↗

🎙️ Hear our coverage →

#video-gen #open-source #on-device

Nari Labs Apr 24, 2025

New ModelsOpen weights

Dia-1.6B

Nari Labs' Dia: a wild 1.6B open source TTS model that blew up Twitter

Nari Labs released Dia, a 1.6B parameter open-weights text-to-speech model that absolutely blew up Twitter with its expressive, emotional dialogue generation, including laughs, coughs, and multi-speaker conversations. Built by a tiny team, it punches far above its weight against commercial TTS systems and supports voice cloning, with demos available on Fal.ai.

1.6B Parameters

X Post Highlight ↗HF Model ↗GitHub ↗Fal.ai Voice Clone Demo ↗

🎙️ Hear our coverage →

#voice-ai #open-source

NVIDIA Apr 24, 2025

New ModelsOpen weights

Describe Anything (DAM-3B)

NVIDIA releases DAM-3B for region-based image and video captioning

NVIDIA dropped the Describe Anything Model (DAM-3B), a 3 billion parameter multimodal model for region-based image and video captioning. You can point it at a specific region of an image or video and it generates a detailed description of just that area. NVIDIA also published an accompanying DescribeAnything dataset and a Hugging Face demo.

3B Parameters

X Post ↗HF Model ↗HF Demo ↗HF Dataset ↗

🎙️ Hear our coverage →

#vision #multimodal #open-source

Sand AI Apr 24, 2025

New ModelsOpen weights

MAGI-1

Sand AI surprises with MAGI-1, a 24B streaming autoregressive video model

Sand AI released MAGI-1, a 24B autoregressive diffusion model for long-form, streaming video generation with remarkable character consistency, often the Achilles' heel of AI video. It predicts video in 24-frame chunks with causal attention between them, enabling real-time streaming generation where compute doesn't scale with length. Nisten speculated it could be a major step toward usable AI-generated movies by solving the face/character consistency problem.

24B Parameters24 Frames per autoregressive chunk

X Post ↗GitHub ↗PDF Report ↗HF Repo ↗

🎙️ Hear our coverage →

#video-gen #open-source #architecture

Microsoft Apr 17, 2025

New ModelsOpen weights

BitNet b1.58

Microsoft releases BitNet 1.58-bit model weights on Hugging Face

Microsoft published BitNet (listed in the show notes as BitNet v1.5), its native 1.58-bit quantized LLM, as open weights on Hugging Face. The ternary-weight approach targets extremely efficient CPU inference at a fraction of the memory of standard models.

Hugging Face ↗

🎙️ Hear our coverage →

#open-source #infrastructure

OpenAI Apr 17, 2025

Dev ToolsOpen weights

Codex CLI

OpenAI debuts Codex CLI, an open source terminal coding agent

OpenAI released Codex CLI, an open source coding tool for the terminal. It ships with hardened security, using Apple Seatbelt on macOS to limit execution to the current directory plus temp files.

🎙️ Hear our coverage →

#coding #agents #open-source

Prime Intellect Apr 17, 2025

New ModelsOpen weights

INTELLECT-2

Prime Intellect launches INTELLECT-2, a 32B globally-distributed RL run

Prime Intellect released INTELLECT-2, a 32B reasoning model trained with globally decentralized reinforcement learning, a follow-up to the INTELLECT-1 decentralized pretraining run covered on the show in December. The release includes open weights on Hugging Face, a tech report, and the PRIME-RL training code.

Blog ↗X ↗Blog ↗Tech report ↗

🎙️ Hear our coverage (+1 follow-up) →

#open-source #training #reasoning

Zhipu AI (Z.ai) Apr 17, 2025

New ModelsOpen weights

GLM-4-0414

Z.ai (formerly chatGLM) releases the GLM-4-0414 open-source family

Z.ai, the rebranded Zhipu AI / chatGLM team, released the GLM-4-0414 family of open-source models. The drop includes base, reasoning and rumination variants published on Hugging Face and GitHub.

X ↗HF Collection ↗GitHub ↗

🎙️ Hear our coverage →

#open-source #reasoning

D Deep Cogito Apr 10, 2025

New ModelsOpen weights

Cogito v1 Preview (3B-70B)

Deep Cogito debuts Cogito v1 Preview models from 3B to 70B, beating DeepSeek 70B

New lab Deep Cogito released the Cogito v1 Preview family of open models ranging from 3B to 70B parameters, claiming SOTA results at each size and beating DeepSeek's 70B distill. The models are available on Hugging Face, giving local AI enthusiasts the small-to-mid sizes Llama 4 skipped.

3B-70B Model size range

Deep Cogito research blog: Cogito v1 Preview ↗Hugging Face: cogito-v1-preview-llama-70B ↗

🎙️ Hear our coverage →

#open-source #reasoning

G GitMCP (Liad Yosef & Ido Salomon) Apr 10, 2025

Dev ToolsOpen weights

GitMCP

GitMCP turns any GitHub repo into an MCP server instantly

Creators Liad Yosef and Ido Salomon launched GitMCP, a free tool that turns any GitHub repository into an MCP server by simply swapping the domain (gitmcp.io/user/repo). It lets AI assistants ground themselves in a repo's docs and code, and the creators joined the show to demo it.

🎙️ Hear our coverage →

#agents #coding #open-source

Google Apr 10, 2025

Also ReleasedOpen weights

Agent2Agent (A2A) protocol

Google announces A2A, an open agent-to-agent communication protocol

Google announced the Agent2Agent (A2A) protocol at Cloud Next, an open spec for agents from different vendors to discover and communicate with each other. The spec was published on GitHub with a long list of launch partners, including Weights & Biases.

Google Developers blog: A2A ↗A2A spec on GitHub ↗W&B partnership blog ↗

🎙️ Hear our coverage →

#agents #open-source

HiDream AI Apr 10, 2025

New ModelsOpen weights

HiDream-I1-Dev

HiDream-I1-Dev: 17B MIT-licensed image model surpasses Flux 1.1 [pro]

HiDream released HiDream-I1-Dev, a 17B parameter open-weights image generation model under an MIT license. It became the new leading open-weights image generator, surpassing Flux 1.1 [pro] on quality benchmarks.

17B Parameters, MIT license

Hugging Face collection: HiDream-I1 ↗

🎙️ Hear our coverage →

#image-gen #open-source

Jina AI Apr 10, 2025

New ModelsOpen weights

Jina Reranker M0

Jina Reranker M0: SOTA multilingual, multimodal document reranker

Jina AI released Jina Reranker M0, a state-of-the-art multimodal and multilingual document reranker model. It reranks documents that include both text and images, targeting retrieval and RAG pipelines, with weights available on Hugging Face.

Jina blog: Reranker M0 ↗Hugging Face: jina-reranker-m0 ↗

🎙️ Hear our coverage →

#search #open-source #multimodal

Meta AI Apr 10, 2025

New ModelsOpen weights

Llama 4 (Scout & Maverick)

Meta drops Llama 4 Scout (109B) and Maverick (400B) open-weights MoE models

Meta released the long-awaited Llama 4 family in a chaotic Saturday drop: Scout (17B active / ~109B total, 16 experts) and Maverick (17B active / ~400B total, 128 experts), with a 2T-parameter Behemoth still in training. The models are multimodal, multilingual MoE architectures trained on ~30T tokens with FP8 and interleaved attention (iRoPE), claiming 10M context for Scout and 1M for Maverick. The release was marred by drama: the LMArena version differed from the released model, and the community criticized the lack of small local-friendly sizes.

10M Stated context window for Llama 4 Scout288B Active parameters of unreleased Behemoth (2T total)17B Active parameters for both Scout and Maverick

Meta blog: Llama 4 multimodal intelligence ↗Hugging Face: meta-llama ↗Try it at meta.ai ↗

🎙️ Hear our coverage →

#open-source #architecture #multimodal

Moonshot AI (Kimi) Apr 10, 2025

New ModelsOpen weights

Kimi-VL & Kimi-VL-Thinking

Moonshot drops Kimi-VL and Kimi-VL-Thinking, tiny A3B open vision models

Moonshot AI released Kimi-VL and Kimi-VL-Thinking, compact vision-language models with only ~3B active parameters (A3B MoE). The thinking variant adds reasoning to a tiny VLM, and both are available openly on Hugging Face.

A3B ~3B active parameters (MoE)

Hugging Face collection: Kimi-VL-A3B ↗

🎙️ Hear our coverage →

#open-source #vision #reasoning

NVIDIA Apr 10, 2025

New ModelsOpen weights

Llama-3.1-Nemotron-Ultra-253B

NVIDIA ships Nemotron Ultra, a 253B pruned and distilled Llama 3.1-405B

NVIDIA released Nemotron Ultra, a pruned and distilled finetune of Llama 3.1-405B at roughly half the parameters (253B). Its benchmarks even included Llama 4 comparisons, showing the older finetuned Llama beating the new models on AIME, GPQA and more. It supports 128K context and fits on a single 8xH100 node for inference.

253B Parameters (pruned from Llama 3.1-405B)128K Context window

Hugging Face: Llama-3_1-Nemotron-Ultra-253B-v1 ↗Announcement on X ↗

🎙️ Hear our coverage →

#open-source #training #reasoning

Together AI & Agentica (UC Berkeley) Apr 10, 2025

New ModelsOpen weights

DeepCoder-14B-Preview

DeepCoder-14B: open RL-finetuned coder beats DeepSeek R1 and o3-mini on coding

Together AI and Agentica (UC Berkeley Sky Computing Lab) released DeepCoder-14B-Preview, a reasoning model finetuned with RL that beats DeepSeek R1 and even o3-mini on several coding benchmarks. The project aims to democratize RL: the team open-sourced the model, the training dataset, the Weights & Biases logs, and the eval logs. Guest Michael Luo from Agentica joined the show to discuss the release.

14B Model parameters

Together AI blog: DeepCoder ↗Announcement on X ↗Hugging Face: DeepCoder-14B-Preview ↗Hugging Face dataset: DeepCoder-Preview-Dataset ↗

🎙️ Hear our coverage →

#open-source #coding #reasoning

All Hands AI Apr 3, 2025

New ModelsOpen weights

OpenHands LM 32B

OpenHands LM 32B: MIT-licensed coding agent model hits 37.2% SWE-Bench

All Hands AI (formerly OpenDevin) released OpenHands LM 32B, an MIT-licensed Qwen finetune that scores 37.2% on SWE-Bench Verified, competing with much larger models on real-world repo tasks. The OpenHands agent also took the #2 spot on the new Live SWE-Bench leaderboard, and the 32B model runs locally on a single RTX 3090. A hosted OpenHands Cloud version is also available; guest Xingyao Wang joined the show to discuss it.

37.2% SWE-Bench Verified score#2 Live SWE-Bench leaderboard (OpenHands agent)

Introducing OpenHands LM 32B (blog) ↗Model on Hugging Face (MIT license) ↗OpenHands Cloud ↗

🎙️ Hear our coverage →

#open-source #coding #agents

Nomic AI Apr 3, 2025

New ModelsOpen weights

Nomic Embed Multimodal

Nomic Embed Multimodal: SOTA embeddings for visual documents

Nomic AI released Nomic Embed Multimodal, new 3B and 7B parameter embedding models built on Alibaba's Qwen2.5-VL. They achieve SOTA on visual document retrieval by embedding interleaved text-image sequences, ideal for PDFs and complex webpages. The 7B model ships under Apache 2.0 with open weights, code, and data; guest Zach Nussbaum discussed the release on the show.

3B parameters (smaller model)7B parameters (Apache 2.0 model)

Nomic Embed Multimodal blog post ↗Models on Hugging Face ↗

🎙️ Hear our coverage →

#search #multimodal #open-source

March 2025

Alibaba (Qwen) Mar 27, 2025

New ModelsOpen weights

Qwen2.5-Omni-7B

Qwen launches Omni 7B: sees, hears, reads, and talks back

Qwen released Qwen2.5-Omni-7B, an open-weights omni-modal model that perceives text, images, audio, and video, and generates both text and speech. It packs end-to-end multimodal perception and spoken output into a 7B parameter model available on Hugging Face.

7B parameters

Hugging Face ↗

🎙️ Hear our coverage →

#open-source #multimodal #voice-ai

DeepSeek Mar 27, 2025

New ModelsOpen weights

DeepSeek-V3-0324

DeepSeek silently drops V3-0324, 685B params under MIT license

DeepSeek silently updated their V3 base model with DeepSeek-V3-0324, a 685B parameter MoE released on Hugging Face under the MIT license. This is not R1 (their reasoning model) but the powerful base model R1 was built on, and supposedly the base for a future R2.

685B parameters

X announcement ↗Hugging Face ↗

🎙️ Hear our coverage →

#open-source #frontier-models

M MLX Community (Prince Canuma) Mar 27, 2025

Dev ToolsOpen weights

MLX-Audio v0.0.3

Prince Canuma releases MLX-Audio v0.0.3 for speech on Apple Silicon

Prince Canuma, creator of MLX-VLM, FastMLX, and MLX Embeddings, released MLX-Audio v0.0.3, an open-source library bringing speech and audio models to Apple Silicon via MLX. It makes powerful open-source TTS and audio models accessible locally on Mac hardware.

GitHub repo ↗Prince Canuma on X ↗

🎙️ Hear our coverage →

#voice-ai #open-source #on-device

C Canopy Labs Mar 20, 2025

New ModelsOpen weights

Orpheus 3B

Canopy Labs drops Orpheus 3B natural-sounding speech model

Canopy Labs released Orpheus, an open speech language model that produces natural, human-sounding speech, headlined by a 3B model with smaller variants (1B, 500M, 150M) in the family. Weights are on Hugging Face with a Colab for trying it out, discussed on the show with Daily.co CEO Kwindla Kramer in the voice AI segment.

Blog ↗HF ↗Colab ↗

🎙️ Hear our coverage →

#voice-ai #open-source

LG AI Research Mar 20, 2025

New ModelsOpen weights

EXAONE Deep 32B

LG open sources EXAONE and EXAONE Deep 32B reasoning model

LG AI Research open sourced its EXAONE family, headlined by EXAONE Deep 32B, a thinking/reasoning model. The release puts a large Korean lab's reasoning model in open weights on Hugging Face, and Alex published a live reaction video to the launch.

LG Blog ↗HuggingFace page ↗Alex Reaction Video ↗

🎙️ Hear our coverage →

#open-source #reasoning

Mistral AI Mar 20, 2025

New ModelsOpen weights

Mistral Small 3.1

Mistral Small 3.1 24B: open-weights multimodal model

Mistral released Mistral Small 3.1, a 24B-parameter open-weights model that adds multimodal (vision) capabilities to the Small line. Both instruct and base checkpoints were published on Hugging Face, making it a strong local multimodal option at the 24B size class.

Blog Post ↗HuggingFace page ↗Base Model on HF ↗

🎙️ Hear our coverage →

#open-source #multimodal #vision

NVIDIA Mar 20, 2025

New ModelsOpen weights

Canary 1B/180M Flash

NVIDIA Canary Flash: Apache 2 speech recognition and translation

NVIDIA released Canary 1B Flash and 180M Flash, Apache 2.0 licensed speech recognition and translation models built as Llama finetunes. The permissive license makes them freely usable for commercial ASR and translation workloads.

🎙️ Hear our coverage →

#voice-ai #multilingual #open-source

NVIDIA Mar 20, 2025

New ModelsOpen weights

Llama-Nemotron (Super 49B, Nano 8B)

NVIDIA drops Llama-Nemotron reasoning models plus training dataset

NVIDIA released the Llama-Nemotron family, including Super 49B and Nano 8B reasoning models, announced around GTC. Alongside the open weights, NVIDIA published the Llama-Nemotron post-training dataset, giving the community both the models and the data recipe behind them.

Announcement ↗X ↗Llama-Nemotron HuggingFace Collection ↗Dataset ↗

🎙️ Hear our coverage →

#open-source #reasoning #training

Roboflow Mar 20, 2025

New ModelsOpen weights

RF-DETR

Roboflow drops RF-DETR, a SOTA open-source object detection model

Roboflow released RF-DETR, a state-of-the-art real-time object detection model, announced as breaking news on the show by CEO Joseph Nelson. The model is fully open source on GitHub and targets practical, deployable computer vision workloads.

RF-DETR Blog Post ↗RF-DETR Github ↗

🎙️ Hear our coverage →

#vision #open-source

StepFun Mar 20, 2025

New ModelsOpen weights

Step-Video-TI2V

StepFun releases Step-Video-TI2V image-to-video model

Chinese lab StepFun dropped Step-Video-TI2V, an open text/image-to-video generation model. Weights are on Hugging Face with code on GitHub, adding another open-weights option to the fast-moving video generation space.

TI2V HuggingFace Space ↗TI2V Github ↗

🎙️ Hear our coverage →

#video-gen #open-source

Tencent Mar 20, 2025

New ModelsOpen weights

Hunyuan3D 2.0 MV & Turbo

Tencent updates Hunyuan3D 2.0 with MultiView and Turbo variants

Tencent updated its Hunyuan3D 2.0 image-to-3D model with an MV (MultiView) version that conditions on multiple input views, plus a faster Turbo variant. The show highlighted it as new SOTA for 3D generation, available to try in a Hugging Face space.

Hunyuan3D-2mv HF Space ↗

🎙️ Hear our coverage →

#world-models #open-source

Allen Institute for AI (Ai2) Mar 13, 2025

New ModelsOpen weights

OLMo 2 32B

AllenAI ships OLMo 2 32B, a fully open GPT-4-class model

The Allen Institute for AI released OLMo 2 32B, its biggest fully open model yet, with weights, code, and dataset all published under Apache 2.0. Announced by Nathan Lambert as a last-second addition, it reportedly beats GPT-3.5 and GPT-4o mini as well as leading open-weight models like Qwen and Mistral at its size.

X announcement ↗Blog ↗Try It ↗Follow-up tweet ↗

🎙️ Hear our coverage →

#open-source #research

Cohere Mar 13, 2025

New ModelsOpen weights

Command A

Cohere Command A: 111B enterprise model with 256K context on just 2 GPUs

Cohere announced Command A, a 111B parameter open-weights model with a 256K context window, presented on the show by Cohere's Sandra Kublik. It runs on only two GPUs where models of this size typically require around 32, and is built for enterprise use: agentic tasks, tool use, multilingual performance, and secure private deployments.

🎙️ Hear our coverage →

#open-source #industry #agents

E EuroBERT team Mar 13, 2025

New ModelsOpen weights

EuroBERT

EuroBERT: multilingual encoder models from 210M to 2.1B parameters

EuroBERT is a new family of multilingual encoder models ranging from 210M to 2.1B parameters, trained on a 5 trillion-token dataset across 15 languages with 8K context support. It targets European and global language NLP tasks like retrieval and RAG, where properly encoding non-English character sets matters.

🎙️ Hear our coverage →

#open-source #search #multilingual

Google DeepMind Mar 13, 2025

New ModelsOpen weights

Gemma 3

Google open sources Gemma 3, 1B-27B multimodal family with 128K context

Google released Gemma 3, an open-weights model family spanning 1B to 27B parameters with multimodal (text, image, video) capabilities, support for over 140 languages, and a 128K context window. The 27B model runs on a single GPU, with Sundar Pichai claiming competitors need roughly 10x the compute for similar performance. It shipped with day-one open source ecosystem support (Hugging Face, Ollama, Kaggle) plus ShieldGemma 2 for content moderation.

Blog ↗AI Studio ↗HF Collection ↗Hugging Face (27B) ↗

🎙️ Hear our coverage →

#open-source #multimodal #on-device

H HPC-AI Tech Mar 13, 2025

New ModelsOpen weights

Open-Sora 2.0

OpenSora 2.0: 11B open-source video model trained for $200K

OpenSora 2.0 is an 11B parameter open-source video generation model that claims state-of-the-art results while costing only about $200,000 to train. The team claims performance approaching OpenAI's Sora on some benchmarks, underscoring how fast open-source video generation is improving.

🎙️ Hear our coverage →

#video-gen #open-source

Nous Research Mar 13, 2025

New ModelsOpen weights

DeepHermes 3 (24B / 3B)

Nous Research releases DeepHermes 24B and 3B hybrid reasoning models

Nous Research released DeepHermes hybrid reasoners at 24B (Mistral-based) and 3B sizes, models that can toggle between standard chat responses and long chain-of-thought reasoning. The 24B preview is available on Hugging Face as part of the week's wave of open-source reasoning model releases.

X announcement ↗Hugging Face ↗

🎙️ Hear our coverage →

#open-source #reasoning

Reka AI Mar 13, 2025

New ModelsOpen weights

Reka Flash 3

Reka Flash 3: 21B open-source reasoning model under Apache 2.0

Reka AI open sourced Reka Flash 3, a 21B parameter reasoning model released under an Apache 2.0 license and trained with the REINFORCE Leave One-Out (RLOO) reinforcement learning technique. It excels at chat, coding, instruction following, and function calling, with Nisten calling it possibly one of the best ~20B models available.

Blog ↗Hugging Face ↗X announcement ↗

🎙️ Hear our coverage →

#open-source #reasoning

R Remade AI Mar 13, 2025

New ModelsOpen weights

Wan 2.1 14B I2V LoRA video effects

Remade AI releases 8 open LoRA video effects for Wan 2.1

Remade AI published eight LoRA video effects for Alibaba's Wan 2.1 14B image-to-video model, including effects like squish, inflate, deflate, and cakeify. The open release shows video effects becoming trainable and customizable via LoRAs on top of open video models.

Hugging Face collection ↗

🎙️ Hear our coverage →

#video-gen #open-source

AI21 Labs Mar 6, 2025

New ModelsOpen weights

Jamba 1.6 Large & Mini

AI21 releases Jamba 1.6 Large and Jamba 1.6 Mini open-weights models

AI21 Labs released Jamba 1.6 in Large and Mini sizes, updating its hybrid SSM-Transformer (Mamba-based) model family with open weights on Hugging Face. The Jamba architecture targets long-context efficiency compared to pure transformer models.

Announcement (X) ↗Hugging Face ↗

🎙️ Hear our coverage →

#open-source #architecture

Alibaba (Qwen) Mar 6, 2025

New ModelsOpen weights

QwQ-32B

Qwen releases QwQ-32B reasoning model that matches R1 on some evals

Alibaba's Qwen team released QwQ-32B, an open-weights reasoning model that matches DeepSeek R1 on several evals despite being roughly 20x smaller at 32B parameters. Qwen tech lead Junyang Lin joined the show to announce it, and the episode dubbed it Alibaba's 'R1 killer' for bringing strong reasoning to a size that runs on consumer hardware.

Announcement (X) ↗Blog ↗Hugging Face ↗Chat Demo ↗

🎙️ Hear our coverage →

#open-source #reasoning

Cohere For AI Mar 6, 2025

New ModelsOpen weights

Aya Vision

Cohere For AI releases Aya Vision 8B and 32B open multilingual vision models

Cohere For AI released Aya Vision in 8B and 32B sizes, extending the multilingual Aya family with open-weights vision-language capabilities. The models target multilingual multimodal understanding across many languages.

Announcement (X) ↗Hugging Face Collection ↗

🎙️ Hear our coverage →

#open-source #vision #multilingual

E ElectricAlexis (research) Mar 6, 2025

New ModelsOpen weights

NotaGen

NotaGen open symbolic music model generates classical sheet music

NotaGen is an open symbolic music generation model that produces high-quality classical sheet music rather than raw audio. The release includes code on GitHub, weights on Hugging Face, and a browser demo.

GitHub ↗Demo ↗Hugging Face ↗

🎙️ Hear our coverage →

#audio #open-source

Tencent Mar 6, 2025

New ModelsOpen weights

HunyuanVideo-I2V

Tencent releases HunyuanVideo-I2V open image-to-video model

Tencent finally shipped the long-awaited image-to-video version of HunyuanVideo, with open weights on Hugging Face and a hosted try-it experience. It lets users animate still images using one of the strongest open video generation models.

Announcement (X) ↗Hugging Face ↗Try It ↗

🎙️ Hear our coverage →

#video-gen #open-source

Zhipu AI (GLM) Mar 6, 2025

New ModelsOpen weights

CogView 4 (6B)

Zhipu AI open-sources CogView 4, a 6B text-to-image model

Zhipu AI released CogView 4, a 6B-parameter open text-to-image model in the CogView family, with code available on GitHub. It is notable as an open-weights image generation option with strong Chinese and English prompt support.

Announcement (X) ↗GitHub ↗

🎙️ Hear our coverage →

#image-gen #open-source

February 2025

DeepSeek Feb 27, 2025

Dev ToolsOpen weights

Open Source Week infra releases

DeepSeek open-sources its infra stack during Open Source Week

DeepSeek ran its Open Source Week, releasing a series of production infrastructure repos (including FlashMLA, DeepEP, and DeepGEMM) that power its training and inference stack. The drops gave the open-source community a rare look at the low-level kernels and communication libraries behind DeepSeek's efficient frontier models.

🎙️ Hear our coverage →

#open-source #infrastructure

Microsoft Feb 27, 2025

New ModelsOpen weights

Phi-4-multimodal

Microsoft releases Phi-4-multimodal and Phi-4-mini open weights

Microsoft expanded the Phi family with Phi-4-multimodal-instruct, a small open-weights model that handles text, vision, and audio in a single model, alongside a compact Phi-4-mini. The weights shipped on Hugging Face, continuing Microsoft's push for capable small models that can run on-device.

Blog ↗HuggingFace ↗

🎙️ Hear our coverage →

#open-source #on-device #multimodal

Arc Institute & NVIDIA Feb 20, 2025

New ModelsOpen weights

Evo 2

Arc Institute and NVIDIA release Evo 2, a 40B state-of-the-art genomics model

Arc Institute and NVIDIA introduced Evo 2, a state-of-the-art genomics model with around 40 billion parameters trained on 9.3 trillion nucleotides. It uses the StripedHyena architecture to process genetic sequences up to 1 million nucleotides, enabling prediction of genetic mutation effects and even design of entire genomes. Fully open: two papers, weights, data, and training and inference codebases.

Announcement on X ↗

🎙️ Hear our coverage →

#research #open-source #architecture

Haize Labs Feb 20, 2025

Dev ToolsOpen weights

Verdict

Haize Labs open-sources Verdict, a framework for composing LLM judges

Haize Labs released Verdict, an open-source framework for composing LLM judges that tackles core LLM-as-a-judge problems: self-preference bias, prompt sensitivity, and meta-evaluation. Verdict combines simpler judging primitives into more robust and efficient evaluators ('judge-time compute scaling'), achieving near state-of-the-art results on benchmarks like ExpertQA at a fraction of the cost, fast enough to use as a real-time guardrail. Co-founders Leonard Tang and Nimit joined the show to discuss it.

Whitepaper ↗GitHub ↗Thread on X ↗

🎙️ Hear our coverage →

#benchmarks #open-source

H Hao AI Lab Feb 20, 2025

Dev ToolsOpen weights

FastVideo

Hao AI Lab's FastVideo makes HunyuanVideo 3x faster with no extra training

Hao AI Lab released FastVideo, a method that makes HunyuanVideo (HY-Video) three times faster with no additional training, using a technique called Sliding Tile Attention that outperforms even flash attention for this workload. Faster inference makes open-source video models far more practical, and it supports HY-Video LoRAs for fine-tuned applications.

🎙️ Hear our coverage →

#video-gen #infrastructure #open-source

Hugging Face Feb 20, 2025

Also ReleasedOpen weights

Ultra Scale Playbook

Hugging Face publishes the Ultra Scale Playbook for training on GPU clusters

Hugging Face released the Ultra Scale Playbook, a guide to building and scaling AI models on large GPU clusters. The team ran 4,000 scaling experiments on up to 512 GPUs to distill practical guidance for labs training big models.

Hugging Face ↗

🎙️ Hear our coverage →

#training #infrastructure #open-source

Perplexity Feb 20, 2025

New ModelsOpen weights

R1-1776

Perplexity releases R1-1776, a censorship-free DeepSeek R1 fine-tune

Perplexity open-sourced R1-1776, a fine-tuned version of DeepSeek R1 designed to remove Chinese government censorship on topics like Tiananmen Square and Taiwanese independence. They used human experts to identify around 300 sensitive topics and built a censorship classifier to train the bias out, claiming no significant impact on standard eval performance. The name 1776 is a nod to American independence.

Hugging Face ↗Blog post ↗

🎙️ Hear our coverage →

#open-source #reasoning #safety

StepFun Feb 20, 2025

New ModelsOpen weights

Step-Video-T2V

StepFun open-sources Step-Video-T2V, a SOTA 30B text-to-video model

StepFun released Step-Video-T2V (plus a T2V Turbo variant), a 30 billion parameter state-of-the-art text-to-video model under an MIT license. Results impressed especially on text integration, such as rendering 'We will open source' on a scroll as a character unfurls it, marking one of the strongest open-source video drops of the week.

Paper ↗Hugging Face ↗GitHub ↗Try it ↗

🎙️ Hear our coverage →

#video-gen #open-source

January 2025

Alibaba (Qwen) Jan 30, 2025

New ModelsOpen weights

Qwen2.5-VL

Alibaba ships Qwen2.5-VL open vision-language model family

Alibaba's Qwen team released Qwen2.5-VL, open-weights vision-language models up to 72B that handle images, documents, video understanding, and on-screen agentic grounding. The 72B Instruct model was immediately available on Hugging Face and in Qwen Chat.

72B Largest variant

Project blog ↗Hugging Face ↗GitHub ↗Try it (Qwen Chat) ↗

🎙️ Hear our coverage →

#vision #open-source #multimodal

Allen Institute for AI (Ai2) Jan 30, 2025

New ModelsOpen weights

Tulu 3 405B

Allen Institute releases Tulu 3 405B open post-trained model

The Allen Institute for AI scaled its fully open Tulu 3 post-training recipe to a 405B-parameter model based on Llama 3.1 405B. It demonstrates that Ai2's open RLVR post-training pipeline works at frontier scale, with weights and recipe released openly.

405B Parameters

Blog ↗Hugging Face collection ↗

🎙️ Hear our coverage →

#open-source #training

Block Jan 30, 2025

Dev ToolsOpen weights

Goose

Block open-sources Goose, a local AI agent framework

Block (the company behind Square) released Goose, an open-source local agent framework that runs on your machine and can use any LLM to execute tasks with tools. It was a centerpiece of the show's agents discussion as an open alternative for building autonomous workflows locally.

X announcement ↗GitHub / docs ↗

🎙️ Hear our coverage →

#agents #open-source #coding

Browser Use Jan 30, 2025

Dev ToolsOpen weights

Browser-use

Browser-use: open-source alternative to OpenAI's Operator

Browser-use is an open-source library that lets LLM agents control a real web browser, positioned on the show as the OSS counterpart to OpenAI's Operator. It enables anyone to build browsing agents with their model of choice instead of a closed hosted product.

🎙️ Hear our coverage →

#agents #open-source

DeepSeek Jan 30, 2025

New ModelsOpen weights

Janus Pro

DeepSeek Janus Pro: open multimodal models in 1.5B and 7B

Amid the R1 frenzy, DeepSeek also released Janus Pro, unified multimodal models at 1.5B and 7B parameters that handle both image understanding and image generation. The open release added to DeepSeek's week of dominating AI news headlines.

1.5B / 7B Model sizes

GitHub ↗Try it (HF Space) ↗

🎙️ Hear our coverage →

#open-source #image-gen #multimodal

M M-A-P (Multimodal Art Projection) Jan 30, 2025

New ModelsOpen weights

YuE 7B

YuE 7B: open-source Suno-style music generation model

The Multimodal Art Projection (M-A-P) team released YuE, a 7B open-source music generation model dubbed the 'open Suno' on the show, capable of generating full songs with vocals from lyrics. Weights are on Hugging Face with code on GitHub and a hosted demo on fal.ai.

7B Parameters

Demo (fal.ai) ↗Hugging Face ↗GitHub ↗

🎙️ Hear our coverage →

#voice-ai #audio #open-source

Mistral AI Jan 30, 2025

New ModelsOpen weights

Mistral Small 2501

Mistral Small 2501: 24B open-weights model under Apache 2.0

Mistral AI released Mistral Small 2501, a 24B-parameter instruct model under the permissive Apache 2.0 license. Announced as breaking news during the show, it continues Mistral's tradition of strong small open models suitable for fine-tuning and local deployment.

24B Parameters

Hugging Face ↗

🎙️ Hear our coverage →

#open-source #on-device

NVIDIA Jan 30, 2025

New ModelsOpen weights

Eagle 2

NVIDIA releases Eagle 2 open vision-language models

NVIDIA published Eagle 2, a family of open vision-language models with an accompanying paper, model weights on Hugging Face, and a live demo. It is a fully transparent VLM release covering training data strategy and recipes, competitive with much larger vision models.

Paper ↗Models (HF collection) ↗Demo ↗

🎙️ Hear our coverage →

#vision #open-source #multimodal

O Open Thoughts Jan 30, 2025

DatasetsOpen weights

OpenThoughts-114k

Open Thoughts releases OpenThoughts-114k reasoning dataset

An open reasoning dataset with 114k examples released by the Open Thoughts project to fuel open replication of reasoning models like DeepSeek R1. It gives the open-source community high-quality chain-of-thought training data for distilling and fine-tuning reasoning LLMs.

X announcement ↗Hugging Face ↗

🎙️ Hear our coverage →

#open-source #reasoning #training

UC Berkeley Jan 30, 2025

Papers & ResearchOpen weights

TinyZero & RAGEN

Berkeley TinyZero and RAGEN replicate DeepSeek R1-Zero

Berkeley researchers released TinyZero and RAGEN, open replications of DeepSeek's R1-Zero reinforcement-learning recipe on small models. The projects showed that R1-style emergent reasoning behavior can be reproduced cheaply, with training runs logged publicly on Weights & Biases.

GitHub ↗W&B logs ↗

🎙️ Hear our coverage →

#reasoning #training #open-source

ByteDance Jan 23, 2025

New ModelsOpen weights

UI-TARS

ByteDance UI-TARS: open computer-use models that control your PC

ByteDance released UI-TARS, open computer-use models in 7B and 72B parameter sizes that can control a Mac or PC, with desktop apps for both platforms. ByteDance claims they beat GPT-4-class models on GUI/computer-control benchmarks.

7B / 72B Model sizes

UI-TARS-7B-SFT on Hugging Face ↗UI-TARS desktop on GitHub ↗

🎙️ Hear our coverage →

#agents #open-source

DeepSeek Jan 23, 2025

New ModelsOpen weights

DeepSeek R1

DeepSeek R1: MIT-licensed open source reasoning model rivals o1

DeepSeek released R1, a state-of-the-art open source reasoning model under a permissive MIT license. It matches or beats OpenAI's o1 on key reasoning benchmarks while being fully open weights, and DeepSeek also shipped a family of distilled smaller models. The show called this the hottest week open source AI has ever had.

DeepSeek on Hugging Face ↗Combine DeepSeek R1 reasoning with GPT-3.5 Turbo (egghead) ↗Run DeepSeek with more thinking (Gist) ↗

🎙️ Hear our coverage →

#open-source #reasoning

Hugging Face Jan 23, 2025

New ModelsOpen weights

SmolVLM (256M)

Hugging Face SmolVLM: tiny vision-language models run on WebGPU

Hugging Face released SmolVLM, a family of tiny vision-language models including a 256M-parameter version small enough to run entirely in the browser via WebGPU. It demonstrates how far efficient multimodal models have shrunk while remaining usable.

256M Parameters (smallest VLM)

SmolVLM-256M WebGPU demo on Hugging Face ↗

🎙️ Hear our coverage →

#vision #open-source #on-device

Tencent Jan 23, 2025

New ModelsOpen weights

Hunyuan3D 2.0

Tencent Hunyuan3D 2.0: SOTA open source 3D generation

Tencent released Hunyuan3D 2.0, a state-of-the-art open source 3D asset generation model on Hugging Face. It produces high-quality 3D shapes and textures and pushes open weights forward in the 3D generation category.

Hunyuan3D-2 on Hugging Face ↗

🎙️ Hear our coverage →

#world-models #open-source