Qwen3-Omni ships open-weights any-to-any audio, vision, and text
Alongside Qwen3-VL, Alibaba released Qwen3-Omni, an end-to-end omni-modal open-weights model that takes text, image, audio, and video input and can respond with streaming speech. The show treated it as direct evidence of how fast open multimodal systems are improving, with weights on Hugging Face, a GitHub repo, demos, and availability in Qwen Chat and the Model Studio API.
Qwen3-TTS-Flash multilingual text-to-speech lands via Alibaba's API
Part of the same Qwen release streak, Qwen3-TTS-Flash is a low-latency multilingual text-to-speech model with multiple voices and dialect support, offered through Alibaba Cloud Model Studio's API rather than as open weights. It fed into the episode's closing audio-demo pileup, where voice launches were treated as product proof points.
Qwen3-VL arrives as Alibaba's flagship open-weights vision-language family
Alibaba's Qwen team shipped Qwen3-VL, its new flagship open-weights vision-language family, headlining the episode's 'Qwen-mas' barrage. The panel discussed it as a practical workflow tool for visual understanding and agentic GUI tasks, not just another model card, with weights, a blog post, and a Hugging Face demo all available at launch.
Wan Animate brings open-weights character animation and replacement
Alibaba's Wan team released Wan 2.2 Animate, an open-weights model that animates a character image from a performance video, replicating motion and expressions, or swaps a character into existing footage. It landed in the episode's closing run of video releases showing multimodal product quality climbing across the board.
DeepSeek V3.1 Terminus refines agents and bilingual output
DeepSeek released V3.1 Terminus, an update to V3.1 with cleaner bilingual output, stronger agentic tool use, and cheaper long-context handling. The open weights are available on Hugging Face, continuing DeepSeek's cadence of iterative open releases.
IBM releases Granite Docling 258M compact document-parsing VLM
IBM published Granite Docling 258M, an ultra-compact open-source vision-language model for document understanding that converts documents into structured output. At just 258M parameters it reinforced the show's point that tiny specialized models are becoming genuinely useful workflow tools.
Kling 2.5 Turbo upgrades AI video generation quality and cost
Kuaishou's Kling AI shipped Kling 2.5 Turbo, an update to its video generation model with better motion, prompt adherence, and cinematic quality at a lower price. Together with Wan Animate it was cited on the show as proof that video model quality is improving quickly while prices fall.
Liquid AI ships Liquid Nanos, tiny task-specific on-device models
Liquid AI released Liquid Nanos, a family of very small task-specific models built for jobs like extraction, translation, RAG, and tool calling that can run on-device. The collection landed on Hugging Face, fitting the episode's theme of small-but-capable models powering real products.
Meta releases 32B Code World Model for agentic code reasoning
Meta released CWM, a 32B open-weights research model trained to internally model code execution, aimed at agentic code reasoning rather than plain code completion. The weights are on Hugging Face under facebook/cwm, giving the open-source community a new approach to code world modeling.
Moondream 3 preview punches above its weight in the tiny-VLM race
Moondream released a preview of Moondream 3, a small open vision-language model that punches well above its size class. CTO and co-founder Vik Korrapati joined the show to explain why small, capable vision models matter for real product building, framing Moondream 3 as a practical tool rather than a benchmark flex.
Suno rolls out v5, its newest flagship music generation model
Suno rolled out v5, its newest flagship music generation model, with cleaner audio quality and more natural vocals. The live audio demos in the show's closing segment were treated as product proof points for how fast AI music quality is climbing.
xAI ships Grok 4 Fast with 2M context at a fraction of the cost
xAI released Grok 4 Fast, a cost-efficient model with a 2M token context window that unifies reasoning and non-reasoning behavior in one set of weights and prices far below Grok 4. The panel treated it as part of the larger competitive pressure cycle on price and speed among frontier labs.
Tongyi DeepResearch: open-source A3B web agent rivals OpenAI Deep Research
Alibaba's Tongyi Lab open-sourced Tongyi DeepResearch, a 30B mixture-of-experts web research agent with only 3B active parameters. The lab claims parity with OpenAI's Deep Research on agentic search and report-writing tasks, and the weights are available on Hugging Face.
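The "A3B" label is easier to read with the arithmetic written out. The sketch below is illustrative only: it takes the parameter counts as reported in this digest (30B total and 3B active for Tongyi DeepResearch, 9B total and 2B active for Moondream 3 Preview) and computes the share of weights used per token. The function name `active_fraction` is a hypothetical helper, not part of any library.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Share of a mixture-of-experts model's parameters used per token."""
    if total_params_b <= 0 or not 0 <= active_params_b <= total_params_b:
        raise ValueError("expected total > 0 and 0 <= active <= total")
    return active_params_b / total_params_b


# Parameter counts in billions, as reported in this digest.
models = {
    "Tongyi DeepResearch": (30, 3),
    "Moondream 3 Preview": (9, 2),
}

for name, (total, active) in models.items():
    share = active_fraction(total, active)
    print(f"{name}: {share:.0%} of weights active per token")
```

Per-token compute scales roughly with the active count, but all of the weights still have to be stored and loaded, so the memory footprint follows the total parameter count.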
HuMo: human-centric multimodal video generation from ByteDance/Tsinghua
ByteDance research and Tsinghua released HuMo, a human-centric video generation model that conditions on multimodal inputs (text, image, and audio) to produce videos of people. The weights are available on Hugging Face.
Luma's Ray3: a 'reasoning' video model with native HDR
Luma AI launched Ray3, a video generation model it bills as a 'reasoning' video model, with native HDR output, a fast Draft Mode, and Hi-Fi mastering. It is available in Luma's Dream Machine and fed the episode's closing theme of a next wave of video models.
Mistral updates its open reasoning model with Magistral-Small-2509
Mistral published Magistral-Small-2509, an updated checkpoint of its small open-weights reasoning model. The refresh keeps Mistral's open reasoning line current as the open-model competitive baseline moves quickly.
Moondream 3 Preview: 9B MoE VLM with 2B active parameters
Moondream released a preview of Moondream 3, a 9B mixture-of-experts vision-language model with only 2B active parameters. It targets frontier-level visual reasoning at small-model cost, continuing Moondream's run of efficient open vision models.
OpenAI ships GPT-5-Codex, an agentic coding upgrade for Codex
OpenAI released GPT-5-Codex, a version of GPT-5 fine-tuned for agentic coding inside the Codex product family. It anchored the episode's coding discussion, with the panel focusing on how coding models are becoming trustworthy enough for longer, productized agent workflows rather than just one-shot completions.
Perceptron AI introduces Isaac 0.1, a 2B perceptive-language model
Perceptron AI released Isaac 0.1, a 2B parameter perceptive-language model with open weights on Hugging Face. Despite its small size, the show notes highlight that it 'points better than GPT', excelling at visual grounding and pointing tasks relative to much larger models.
Reka Speech: high-throughput multilingual ASR and speech translation
Reka AI announced Reka Speech, a high-throughput multilingual speech recognition and speech translation model with timestamps, aimed at batch-scale transcription pipelines. It positions Reka in the production ASR market against incumbent transcription APIs.
Tencent launches Hunyuan 3D 3.0 with a hosted 3D studio
Tencent released Hunyuan 3D 3.0, the next version of its 3D asset generation model, available to try through a hosted 3D studio. It continues Tencent's rapid cadence of generative 3D releases.
Alibaba's Tongyi Lab open-sources WebWatcher vision-language research agent
Alibaba's Tongyi Lab open-sourced WebWatcher, a vision-language deep research agent that sets new state-of-the-art results on agentic browsing and research tasks. The 32B model combines visual understanding with web research capabilities and is available on Hugging Face.
Apple's FastVLM-7B lands with a speed-first vision encoder, 85x faster TTFT
Apple released FastVLM-7B, a vision-language model built around a speed-first vision encoder that delivers up to 85x faster time-to-first-token than peer VLMs. Quantized variants (7B-int4, 1.5B-int8) on Hugging Face make it practical for on-device and real-time vision use, anchoring the show's fast-VLM discussion.
Google releases EmbeddingGemma, a 300M-param SOTA embedding model for RAG
Google released EmbeddingGemma, a 300M-parameter open embedding model that achieves state-of-the-art results for its size, aimed at RAG and on-device semantic search. It dropped as breaking news during the show, with browser-based demos like Semantic Galaxy showing it running fully client-side.
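For readers new to embedding-based retrieval, the following is a minimal sketch of the ranking step that an embedding model feeds in a RAG or semantic-search pipeline. It does not call EmbeddingGemma or any real library: the three-dimensional vectors are hand-written stand-ins for model output, and `cosine` and `top_k` are hypothetical helper names chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the ids of the k documents most similar to the query."""
    ranked = sorted(
        doc_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:k]]


# Toy vectors standing in for real embeddings (assumption: a real model
# would produce hundreds of dimensions, not three).
docs = {
    "pricing": [0.9, 0.1, 0.0],
    "install": [0.1, 0.9, 0.2],
    "license": [0.0, 0.2, 0.9],
}
query = [0.2, 0.8, 0.1]

print(top_k(query, docs, k=2))
```

In a real pipeline the vectors would come from the embedding model and the ranking would usually run in a vector index rather than a Python sort, but the similarity comparison is the same idea.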
Nous Research releases Hermes 4 14B compact hybrid reasoning model
Nous Research launched Hermes 4 at 14B, a compact hybrid reasoning model with tool calling designed for both local and cloud use. It extends the Hermes 4 family down to a size practical for local deployment while keeping reasoning and tool-use capabilities, with a full tech report published on arXiv.
OpenAI ships gpt-realtime and takes the Realtime API to GA
OpenAI shipped the gpt-realtime speech-to-speech model and moved the Realtime API to general availability. The GA release adds remote MCP tool support, image input, and SIP phone calling, making it a full production stack for voice agents and tying into the episode's voice-agents discussion with Kwindla Kramer.
Switzerland launches Apertus-8B and 70B, fully open multilingual LLMs
The Swiss AI Initiative launched Apertus-8B and Apertus-70B, fully open multilingual LLMs trained on 15T tokens covering more than 1,800 languages. The release stands out for full openness (weights, data recipe, and training transparency) and unusually broad language coverage from a national effort.
Tencent open-sources Hunyuan-MT-7B translation model after sweeping WMT2025
Tencent open-sourced Hunyuan-MT-7B, a 7B-parameter machine translation model, after it swept the WMT2025 translation competition. It gives the open-weights community a small, focused translation model that punches well above its size class.
Grok Code Fast 1 takes ~50% of coding traffic on OpenRouter
xAI's new Grok Code Fast 1 coding model rocketed to roughly 50% of all coding traffic on OpenRouter shortly after launch, helped by a free promotional period and fast, cheap inference. The panel discussed it as evidence that the coding-model market is highly price- and speed-sensitive.
Huxe personal audio briefing app opens to everyone
Huxe, the personal audio app from former Google NotebookLM team members, opened to the public. It generates proactive, personalized audio briefings, and it came up alongside ChatGPT Pulse as another take on proactive, ambient AI products.
Meta Connect: new AI glasses with a display and neural control interface
At Meta Connect, Meta unveiled new AI glasses featuring a built-in display, a neural wristband control interface, and a new AI mode. The panel treated the glasses as an interface milestone, arguing the product surface for AI is shifting from apps to display-equipped wearables.
Reve launches 4-in-1 AI visual platform taking on Nano Banana and Seedream
Reve launched a 4-in-1 AI visual creation platform combining image generation, editing, and related visual workflows in one app. The panel spent real time on it as a serious challenger to Nano Banana and Seedream in the AI image tooling race.
Fei-Fei Li's World Labs presents Marble, a generative world model
World Labs, Fei-Fei Li's spatial intelligence startup, presented Marble, a generative world model that creates explorable 3D environments. The demo was treated on the show as evidence that world models are getting meaningfully closer to usable products.
OpenAI introduces ChatGPT Pulse, proactive personalized daily briefings
OpenAI introduced ChatGPT Pulse, a preview feature that proactively researches overnight and delivers personalized daily briefing cards based on your chats, memory, and connected apps, initially for Pro users on mobile. On the show it was discussed as part of OpenAI's push to build a durable product moat as raw model access commoditizes.
Google puts Gemini in Chrome with cross-tab AI assistance
Google shipped Gemini directly into Chrome, adding an AI assistant that works across tabs, a smarter omnibox, and safer-browsing features. It moves the browser itself into the AI interface race, putting an assistant in front of Chrome's massive user base.
OpenAI adds thinking budgets to the ChatGPT app
OpenAI rolled out thinking budgets in the ChatGPT app, letting users control how much reasoning effort the model spends on a request. It is a small but notable product lever for tuning the cost-versus-quality tradeoff of reasoning models.
W&B brings Weave traces into Models workspaces for RL runs
Weights & Biases shipped Weave inside W&B Models workspaces, so reinforcement learning runs can now be logged and inspected with Weave trace tooling alongside training metrics. The show framed it as giving RL training 'x-ray vision' into what the model is actually doing.
Mistral adds 20+ MCP-powered connectors and Memories to Le Chat
Mistral upgraded Le Chat with more than 20 MCP-powered connectors and controllable Memories targeted at enterprise workflows. The update positions Le Chat as a serious enterprise assistant by wiring it into existing tools via the Model Context Protocol while giving users explicit control over what the assistant remembers.
OpenAI opens ChatGPT Projects to free users with bigger uploads
OpenAI rolled out Projects to free ChatGPT users, adding larger file uploads and project-only memory controls. The change brings organized, memory-scoped workspaces to the free tier rather than keeping them behind paid plans.
Jeremy Berman and Eric Pang set new ARC-AGI SOTA using Grok-4
Independent researchers Jeremy Berman and Eric Pang published a new state-of-the-art result on ARC-AGI, built on Grok-4 with heavy test-time compute and iterative program synthesis. Berman joined the show to walk through the method, its limitations, and why iteration matters more than leaderboard narratives; the approach is documented in a detailed write-up.
NBER & OpenAI publish 'How People Use ChatGPT' usage study
OpenAI and NBER published a working paper analyzing ChatGPT usage growth, demographics, and scale. The study gives the first rigorous public look at how the consumer ChatGPT user base actually behaves, feeding the episode's closing discussion of usage stats and momentum.
Hunyuan SRPO: preference optimization that supercharges diffusion models
Tencent Hunyuan published SRPO (Semantic Relative Preference Optimization), a post-training technique that significantly improves the output quality of diffusion image models. The team released weights on Hugging Face along with a project page and striking before/after comparisons.
Gaia2 agent benchmark and Agents Research Environments released
Meta and Hugging Face released Gaia2, a follow-up agent benchmark, together with ARE (Agents Research Environments) for testing agents in dynamic, asynchronous settings. It fed the episode's recurring concern that evaluation has to keep up whenever agent product claims get ambitious.
OpenAI launches GDPval to measure models on real economic work
OpenAI introduced GDPval, an evaluation that measures model performance on real-world, economically valuable tasks drawn from a range of occupations and GDP sectors. On the show it anchored the discussion about agents moving from chat quality toward action and reliability in real environments.
Scale AI debuts SWE-bench Pro, a harder contamination-resistant eval
Scale AI released SWE-bench Pro, a tougher, contamination-resistant successor to SWE-bench for evaluating coding agents on realistic software engineering tasks. It ships with a public dataset on Hugging Face plus separate public and commercial leaderboards, and frontier models score far lower than on the original SWE-bench.
Nous launches Husky Hold'em Bench, an open-source pokerbot eval for LLMs
Nous Research released Husky Hold'em Bench, an open-source poker benchmark that evaluates LLM strategic play in a richer agentic environment than standard leaderboards. Guests Roger Jin and Bhavesh Kumar joined the show to explain how it measures agent behavior and decision-making under uncertainty rather than chasing another leaderboard point.
Nvidia commits up to $100B to OpenAI for 10GW of compute
Nvidia and OpenAI announced a letter of intent under which Nvidia would invest up to $100 billion in OpenAI as the two deploy at least 10 gigawatts of Nvidia systems for OpenAI's next-generation infrastructure. The episode's big-company segment centered on this deal as evidence that money and infrastructure, not just models, now drive the AI race.
Anthropic raises $13B Series F at a $183B post-money valuation
Anthropic closed a $13B Series F round at a $183B post-money valuation, one of the largest private AI raises to date. The panel treated the round as part of a wider story where capital and capability are accelerating together at the frontier labs.
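The round's headline numbers imply a pre-money valuation and a new-investor stake that are easy to work out. The sketch below assumes the standard definition (post-money equals pre-money plus the new primary capital) and ignores any secondary sales or option-pool effects; `round_math` is a hypothetical helper name, and the inputs are the figures quoted above.

```python
def round_math(raised_b: float, post_money_b: float) -> tuple[float, float]:
    """Return (pre-money valuation in $B, share of the company sold in the round)."""
    if not 0 < raised_b < post_money_b:
        raise ValueError("expected 0 < raised < post-money")
    pre_money_b = post_money_b - raised_b
    new_investor_share = raised_b / post_money_b
    return pre_money_b, new_investor_share


# Figures as reported above: $13B raised at a $183B post-money valuation.
pre_money, share = round_math(13, 183)
print(f"Implied pre-money valuation: ${pre_money:.0f}B")
print(f"Implied stake sold in the round: {share:.1%}")
```

Under those assumptions the round implies a pre-money valuation of $170B, with roughly 7% of the company going to the new money.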
OpenAI fundraises $10B at ~$500B valuation with employee buyback
OpenAI was reported to be raising around $10B at a roughly $500B valuation, structured in part as a share buyback for employees. Together with Anthropic's Series F, it underscored the episode's theme that frontier-lab funding has reached unprecedented scale.
CoreWeave acquires OpenPipe to expand its AI training stack
CoreWeave acquired OpenPipe, the fine-tuning and reinforcement-learning platform behind the ART trainer. Covered in the This Week's Buzz segment, the deal brings OpenPipe's model-customization tooling under the same roof as CoreWeave's GPU cloud and Weights & Biases.
OpenAI acquires Statsig and Alex for $1.1B+ to bolster applications team
OpenAI acquired experimentation platform Statsig and the Alex coding tool for a combined $1.1B+, a move aimed at strengthening its applications team. Statsig's founder reportedly takes on a senior product role as OpenAI invests in shipping consumer and developer products faster.