Episode Summary
ThursdAI's first December episode was a full firehose: DeepSeek V3.2 dropped with gold-medal-level reasoning results, Mistral returned to Apache 2.0 with new large and edge models, and Arcee joined to talk about building US-trained MoEs from scratch. The panel unpacked what these releases mean for open-source momentum, inference cost, and real enterprise adoption constraints. On the closed-model side, OpenAI reportedly declared a "code red" in response to Gemini 3 pressure, while Amazon rolled out Nova 2 across text, speech, and multimodal stacks. The show closed with rapid updates across eval tooling, video generation, realtime voice, and low-cost image diffusion.
In This Episode
🦙 Open Source LLMs
The panel went deep on DeepSeek V3.2, Mistral 3, Arcee Trinity, and Hermes 4.3 as proof that open models are moving fast on both reasoning and coding utility. They discussed benchmark context, licensing shifts back to Apache 2.0, and why MoE architecture plus efficient post-training is changing the economics of open AI.
- DeepSeek V3.2-Speciale posted gold-level olympiad and AIME results with MIT license
- Mistral Large 3 and Ministral 3 relaunched under Apache 2.0 with strong open-model coding positioning
- Arcee Trinity introduced US-trained open MoEs and previewed Trinity-Large for January 2026
- Hermes 4.3 highlighted decentralized training and RefusalBench performance
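The economics point above comes down to active vs. total parameters: a sparse MoE routes each token through only a few experts, so inference compute scales with the active count rather than the full model size. A minimal sketch of that arithmetic, using illustrative numbers that are not specific to DeepSeek, Mistral, or Arcee:

```python
def moe_active_params(num_experts: int, active_experts: int,
                      expert_params: float, shared_params: float) -> float:
    """Parameters actually exercised per token in a sparse MoE stack."""
    return shared_params + active_experts * expert_params

# Hypothetical 16-expert model activating 2 experts per token.
total = 16 * 40e9 + 30e9                        # 670B total parameters
active = moe_active_params(16, 2, 40e9, 30e9)   # 110B active per token
print(f"active fraction: {active / total:.0%}")  # active fraction: 16%
```

Under these assumed shapes, per-token compute tracks the ~16% active fraction, which is why MoE models can keep serving costs closer to a mid-size dense model while holding frontier-scale total capacity.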
🏢 Big CO LLMs + APIs
Coverage shifted to the frontier API race: OpenAI's reported internal "code red," Amazon's Nova 2 suite, Gemini 3 Deep Think, and Cursor's temporary free access to GPT-5.1-Codex-Max. The discussion emphasized that product integration and latency matter as much as raw benchmark IQ.
- OpenAI reportedly paused side projects to focus on intelligence and speed
- Amazon Nova 2 announced Lite, Pro, Sonic, and Omni with major benchmark jumps
- Gemini 3 Deep Think introduced high-cost parallel reasoning with ARC-AGI-2 gains
- Cursor offered GPT-5.1-Codex-Max free access through Dec 11
⚡ This Week's Buzz
Weights & Biases launched LLM Evaluation Jobs to run evaluations against OpenAI-compatible APIs during training cycles, not just at the end. The segment framed this as a practical workflow upgrade for teams trying to move faster without blindly burning compute.
- W&B launched LLM Evaluation Jobs
- Supports evaluating OpenAI-compatible endpoints
- Focus on earlier model quality signals during development
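The workflow described above, scoring checkpoints during training rather than only at release, reduces to a small eval loop. This is a hypothetical harness, not the actual W&B Evaluation Jobs API; `query_model` is a stand-in for a call to any OpenAI-compatible endpoint:

```python
from typing import Callable

def run_eval(query_model: Callable[[str], str],
             dataset: list[tuple[str, str]]) -> float:
    """Score a model on (prompt, expected) pairs; returns exact-match accuracy."""
    hits = sum(query_model(prompt).strip() == expected
               for prompt, expected in dataset)
    return hits / len(dataset)

# Stub model for illustration; in practice this function would POST to an
# OpenAI-compatible chat completions endpoint for the checkpoint under test.
def stub_model(prompt: str) -> str:
    return "4" if "2+2" in prompt else "unknown"

dataset = [("What is 2+2?", "4"), ("Capital of France?", "Paris")]
print(run_eval(stub_model, dataset))  # 0.5
```

Running a loop like this on a schedule against each saved checkpoint is what surfaces the "earlier quality signals" the segment described, instead of discovering a regression after the full training run has burned its compute budget.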
🎥 Vision & Video
Video model updates included Runway Gen-4.5 leaderboard gains and two Kling releases spanning native audio video and image generation. The updates continued the theme that video quality and multimodal consistency are improving week-over-week.
- Runway Gen-4.5 reached top text-to-video leaderboard position
- Kling VIDEO 2.6 introduced native audio generation
- Kling O1 Image expanded image generation capabilities
🎙️ Voice & Audio
The show highlighted Microsoft VibeVoice-Realtime-0.5B and its low-latency realtime TTS profile. The segment focused on how sub-second audio response is becoming table stakes for production voice agents.
- Microsoft VibeVoice-Realtime-0.5B shared with ~300ms latency claims
- Voice model availability on Hugging Face
- Realtime speech UX increasingly central to agent products
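Sub-second response is a budget across the whole voice pipeline, not just TTS. A back-of-envelope check makes the point; the stage numbers below are assumptions for illustration, and only the ~300ms TTS figure comes from the show:

```python
def turn_latency_ms(stages: dict[str, float]) -> float:
    """User-perceived latency for one voice-agent turn: sum of stage latencies."""
    return sum(stages.values())

# Assumed illustrative stage latencies (ms) for speech-in, speech-out.
pipeline = {"asr": 200.0, "llm_first_token": 400.0, "tts_first_audio": 300.0}
print(turn_latency_ms(pipeline))  # 900.0
```

Under these assumptions the turn lands just under a second only because TTS holds its ~300ms share; a conventional multi-second TTS stage would blow the budget regardless of how fast the ASR and LLM stages are.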
🎨 AI Art & Diffusion
Image-generation updates centered on speed and cost efficiency, with Pruna P-Image claiming sub-second generation at very low per-image pricing and SeeDream 4.5 adding stronger text rendering and multi-reference fusion.
- Pruna P-Image promoted sub-second image generation at low cost
- SeeDream 4.5 emphasized multi-reference fusion
- Text rendering quality remained a key differentiator
Hosts and Guests
Alex Volkov - AI Evangelist at Weights & Biases (@altryne)
Co-hosts - @WolframRvnwlf, @yampeleg, @nisten, @ldjconfirmed
Guest - Lucas Atkins (@latkins) - CTO Arcee AI
Open Source LLMs
DeepSeek V3.2 and V3.2-Speciale - Gold medal olympiad wins, MIT license (X, HF V3.2, HF Speciale, Announcement)
Mistral 3 family - Large 3 and Ministral 3, Apache 2.0 (X, Blog, HF Large, HF Ministral)
Arcee Trinity - US-trained MoE family (X, HF Mini, HF Nano, Blog)
Hermes 4.3 - Decentralized training, SOTA RefusalBench (X, HF)
Big CO LLMs + APIs
OpenAI Code Red - ChatGPT 3rd birthday, Garlic model in development (The Information)
This Week's Buzz
WandB LLM Evaluation Jobs - Evaluate any OpenAI-compatible API (X, Announcement)
Vision & Video
Runway Gen-4.5 - #1 on text-to-video leaderboard, 1,247 Elo (X)
Kling VIDEO 2.6 - First native audio generation (X)
Kling O1 Image - Image generation (X)
Voice & Audio
Microsoft VibeVoice-Realtime-0.5B - realtime TTS with ~300ms latency claims (HF)
AI Art & Diffusion
Pruna P-Image - sub-second, low-cost image generation
SeeDream 4.5 - text rendering and multi-reference fusion