New Models
FLUX.1 Kontext
Black Forest Labs drops FLUX.1 Kontext, SOTA image editing
Black Forest Labs, creators of Flux, released Kontext: three models (Pro, Max, and a 12B open-weights Dev in private preview) for consistent, context-aware text and image editing. Unlike GPT-image or VEO-style regeneration, Kontext keeps identity consistent across edits, adding what you ask for without changing your face every generation. Broke as news during the show.
New ModelsOpen weights
DeepSeek-R1-0528
DeepSeek drops R1-0528, an updated open reasoning model with big gains
DeepSeek released R1-0528 out of nowhere, an update to their open-weights reasoning model with serious performance jumps: AIME 91, LiveCodeBench 73, and SWE-bench Verified 57.6. They also shipped an 8B distilled version based on Qwen3 that can run on a laptop, keeping it among the best open-weight models available.
91 AIME score, beating previous R1 by a mile8B Distilled Qwen3-based version runnable on a laptop
New ModelsOpen weights
j1-nano & j1-micro
Haize Labs releases j1-nano and j1-micro tiny reward models
Haize Labs shipped j1-nano (600M params) and j1-micro (1.7B params), tiny open reward models for judging LLM outputs. Despite their small size, j1-micro scores 80.7% on RewardBench, making capable reward modeling accessible on modest hardware.
New ModelsOpen weights
Chatterbox
Resemble AI open-sources Chatterbox voice cloning with emotion control
Resemble AI released Chatterbox, an open-source voice cloning model with emotion control. Weights and code are public on GitHub and Hugging Face, bringing controllable, expressive voice cloning to the open ecosystem.
New Models
HunyuanPortrait
Tencent's HunyuanPortrait animates portraits from a single photo
Tencent's Hunyuan team published HunyuanPortrait, a model for high-fidelity portrait video generation from a single photo. It animates a still portrait into realistic talking-head video, with an accompanying paper.
New ModelsOpen weights
HunyuanVideo-Avatar
Tencent releases HunyuanVideo-Avatar for audio-driven avatars
Tencent Hunyuan released HunyuanVideo-Avatar, an audio-driven full-body avatar animation model. Feed it audio and a reference image and it animates a full-body avatar in sync, pushing AI-generated humans further toward indistinguishable.
New ModelsOpen weights
AM-Thinking v1
AM-Thinking v1: 32B dense reasoning model beats bigger MoEs at math and code
A 32B dense open-weights reasoning LLM from a new Chinese team that takes on much larger mixture-of-experts models and comes out on top for math and code, hitting 85.3% on AIME 2024, 70.3% on LiveCodeBench v5, and 92.5% on Arena-Hard. It supports a /think reasoning toggle, ships with a permissive license, is tooled for vLLM, LM Studio, and Ollama, and runs at 25 tokens/sec on a single 80GB GPU with INT4 quantization. A multilingual RLHF pass and 128k context window are in the works.
32B dense parameters85.3% AIME 202425 tokens/sec on a single 80GB GPU with INT4
New ModelsOpen weights
Wan 2.1
Alibaba's Wan 2.1: open-source diffusion-transformer text-to-video suite
Alibaba, the team behind the Qwen LLMs, released Wan 2.1, a full stack of open-source diffusion-transformer text-to-video foundation models. Amid the show's discussion of video-model fatigue, this was called out as a release that cuts through the noise, with weights on Hugging Face and code on GitHub.
New Models
Seed1.5-VL
ByteDance publishes Seed1.5-VL, a 20B vision-language thinking model
ByteDance's Seed team published the technical report for Seed1.5-VL, a 20B-parameter vision-language model with thinking capabilities. It was covered among the big-company releases of the week, with the tech report shared on GitHub.
New Models
LTX Video (distilled)
LTX distilled model enables near real-time video generation
Lightricks shared a distilled version of its LTX video model that generates video at near real-time speeds. It was highlighted in the vision and video segment as a notable speed milestone for video generation.
New ModelsOpen weights
Stable Audio Open Small
Stability AI and Arm release Stable Audio Open Small for on-device audio
Stability AI, together with Arm, released Stable Audio Open Small, a 341M-parameter open text-to-audio model built for real-world on-device deployment. The show framed it as part of a small comeback for Stability, with weights on Hugging Face and an accompanying paper.
New ModelsOpen weights
Step1X-3D
StepFun's Step1X-3D: open two-stage framework for textured 3D assets
StepFun released Step1X-3D, an open two-stage framework for high-fidelity, controllable generation of textured 3D assets: it first synthesizes watertight geometry, then generates view-consistent textures. Trained on 2M curated meshes, the release also includes a curated dataset of 800K assets and a Hugging Face demo.
New ModelsOpen weights
Falcon-Edge
Falcon-Edge: ternary BitNet LLMs for edge deployment under 1GB VRAM
TII's Falcon-Edge project releases ternary BitNet LLMs (1B and 3B base models) that slash memory and compute requirements, enabling inference on less than 1GB of VRAM. Fine-tuners get pre-quantized checkpoints and a clear path to 1-bit LLMs.
New ModelsOpen weights
Qwen 2.5 Omni
Qwen 2.5 Omni gets an update
Alongside the Qwen 3 launch, Alibaba updated its Qwen 2.5 Omni multimodal model line. Mentioned briefly in the open-source roundup as part of the week's Qwen ecosystem push.
New ModelsOpen weights
Qwen 3
Alibaba open-weights the full Qwen 3 family under Apache 2.0
Alibaba released the entire Qwen 3 stack: two MoE models (235B total/22B active and 30B/3B active) plus six dense siblings from 32B down to 0.6B, all Apache 2.0 with day-one support in LM Studio, Ollama, vLLM, MLX and llama.cpp. The headline feature is a runtime hybrid 'thinking' toggle (/think and /no_think) that trades latency for reasoning depth. Trained on ~36T tokens with 128K context and 119-language coverage, the 235B MoE rivals DeepSeek-R1, o1, o3-mini and Gemini 2.5 Pro on coding and math.
235 B Flagship MoE total parameters (22B active)30 B Qwen3-30B-A3B hit 57 tok/s on a Mac with speculative decoding36 Trillions of pre-training tokens (2x Qwen 2.5)
New ModelsOpen weights
HiDream E1
HiDream E1: open-weights image model with standout Ghibli style
HiDream released E1, an open-weights image editing/generation model (Apache 2.0-style licensing) noted for beautiful Ghibli-style outputs. It ranks #4 on the Artificial Analysis image arena leaderboard, sitting among top contenders like Google Imagen and ReCraft.
New ModelsOpen weights
Mellum-4b-base
JetBrains open-sources Mellum-4b, its code completion focal model
JetBrains published Mellum-4b-base on Hugging Face, a 4B-parameter model specialized for code completion that powers its IDE AI features. Listed in the episode's open-source links roundup.
New ModelsOpen weights
Helium-1
Kyutai releases Helium-1, a 2B European-language model plus dactory pipeline
Kyutai released Helium-1, a 2B-parameter model distilled from Gemma-2-9B and purpose-built for Europe's 24 official languages, under CC-BY 4.0. It sets a new state of the art for its size class on MMLU-EU, ARC-EU and FLORES translation while fitting in under 2GB VRAM for edge and phone deployment. They also open-sourced 'dactory' (MIT), their full Common Crawl data-processing pipeline that scores, dedups and tags webpages.
New ModelsOpen weights
Llama Guard 4
Meta ships Llama protection suite: Llama Guard 4, Firewall, Prompt Guard 2
Meta's LlamaCon security drop included Llama Guard 4 (text + image protection), Llama Firewall (stops prompt hacks and risky code), Prompt Guard 2 (faster jailbreak defense), CyberSecEval 4, and a new Defender Program for security researchers.
New ModelsOpen weights
Phi-4-reasoning
Microsoft ships Phi-4-reasoning and Phi-4-reasoning-plus (14B, MIT)
Microsoft fine-tuned the 14B Phi-4 on 1.4M curated chain-of-thought traces (SFT) and added a small RL stage (Plus variant) to create two MIT-licensed reasoning models. They punch far above their weight: Phi-4-reasoning-plus outperforms DeepSeek-R1-Distill-70B on AIME 25 (78% vs 51%) and sits within a few points of the full 671B DeepSeek-R1, while running on a single GPU with explicit <think> scaffolding.
New ModelsOpen weights
ART·E
OpenPipe's ART·E: RL-trained open email agent that beats o3
OpenPipe released ART·E, an Apache 2.0 email research agent built on a 14B Qwen 2.5 backbone, trained on 500K Enron emails plus synthetic Q&A and refined with reinforcement learning. It tops o3 on accuracy (96% vs 90%) while running 5x faster (1.1s median) and 64x cheaper ($0.85 per 1,000 queries), using a simple three-tool loop.
New ModelsOpen weights
MiMo-7B
Xiaomi enters open weights with MiMo-7B, MIT-licensed reasoning family
Xiaomi's first open-weights release is a 7B dense family (Base, SFT, RL, RL-Zero) trained from scratch on 25T tokens with a multi-token-prediction objective and rule-verifiable reinforcement learning. The RL variant matches OpenAI o1-mini on benchmark suites despite being far smaller, scoring 55.4% on AIME 2025 and 49.3% on LiveCodeBench v6, all under an MIT license with vLLM-ready weights.