Episode Summary
Two open-source labs sent representatives to the show in the same episode: Lou from Z.AI debuted GLM-5 (744B params, open-weights coding crown) and Olive Song from MiniMax revealed M-2.5 (80.2% SWE-Bench Verified with only 10B active params at 1/20th the cost of Opus). Then Google dropped Gemini 3 Deep Think with an 84% ARC-AGI 2 score, the biggest single-week jump ever on that benchmark, and OpenAI answered with GPT 5.3 Codex Spark on Cerebras for real-time coding speeds. Oh, and ByteDance's Seedance 2 shattered video generation reality with 15-second multi-shot clips that feel like stepping into the future.
In This Episode
- Intro & Highlights of the Week
- TLDR - This Week's AI News Rundown
- Interview: Lou from Z.AI on GLM-5
- Panel Discussion: GLM-5 Reactions
- BREAKING: MiniMax M-2.5 Drops Live
- Interview: Olive Song from MiniMax on M-2.5
- Panel Discussion: MiniMax & Open Source Momentum
- This Week's Buzz - W&B Inference
- XAI Restructuring & SpaceX Acquisition
- Matt Shumer's Viral AI Article & The Acceleration
- BREAKING: Gemini 3 Deep Think - 84% on ARC-AGI-2
- BREAKING: GPT 5.3 Codex Spark on Cerebras
- Seedance 2 - ByteDance's Mind-Bending Video Model
- Agent Psychosis & The Sleep Problem
- ByteDance Seedance 2.0 - Shattering Reality
- Wrap-Up & Goodbye
Intro & Highlights of the Week
Alex opens with the biggest open-source week in memory: GLM-5 and MiniMax 2.5 both dropped, with representatives joining live. The panel shares their highlights: Wolfram picks GLM-5, Alex picks Seedance 2, and Yam is funding Anthropic's snack budget.
- Both Z.AI and MiniMax sent reps to the show for live interviews
- Open source competing directly with Opus 4.6 on benchmarks
- Seedance 2 from ByteDance breaking everyone's brains
TLDR - This Week's AI News Rundown
Alex runs through all the week's releases: GLM-5 and MiniMax 2.5 competing with Opus, XAI restructuring after SpaceX acquisition, Anthropic's sabotage risk report, OpenAI's deep research upgrade, and ByteDance's Seedance 2 shattering video generation.
- GLM-5: 744B params, open-weights coding crown
- MiniMax 2.5: 80.2% SWE-Bench with 10B active
- Seedance 2: 15-second multi-shot video with sound
Interview: Lou from Z.AI on GLM-5
Lou from Z.AI joins at 1 AM Shanghai time to discuss GLM-5's architecture, the new SLIM reinforcement learning framework, and adoption of DeepSeek's sparse attention mechanism. She summarizes the model in four words: bigger, faster, better, and cheaper.
- SLIM: new asynchronous RL framework for post-training
- DeepSeek sparse attention for reduced deployment cost
- GLM-5 trained on Huawei chips, not NVIDIA
Panel Discussion: GLM-5 Reactions
The panel reacts to GLM-5: Nisten notes it uses DeepSeek architecture, Ryan highlights the dream of running open-source models locally for OpenClaw, and Yam emphasizes it's a model that can run general computer use at close to free.
- Trained on Huawei chips, restricted GPU serving capacity
- 50% on Humanity's Last Exam, beating Opus 4.5 and Gemini 3 Pro
- 34% hallucination rate, the lowest on the AAA benchmark
BREAKING: MiniMax M-2.5 Drops Live
Breaking news during the show: MiniMax releases M-2.5 just 30 minutes before airtime. Alex brings Olive Song from MiniMax to announce the model live.
- 80.2% SWE-Bench Verified
- 10B active parameters, 200B total
- Dropped live during the show
Interview: Olive Song from MiniMax on M-2.5
Olive Song discusses their Forge RL framework, how they trained efficiency into the model (fewer tool calls, fewer thinking tokens), and reveals the model is actually still training: they cut a checkpoint to release because developers were asking.
- Forge: decoupled RL framework training diverse tasks without interference
- Model optimized for end-to-end task time, not just benchmark scores
- Still training; cut a checkpoint for early release
Panel Discussion: MiniMax & Open Source Momentum
The panel discusses the jaw-dropping pace of open-source progress. Nisten notes benchmarking concerns but acknowledges the model's real utility for multi-agent orchestration. LDJ highlights the cost-per-intelligence advantage.
- MiniMax 2.5 beats Gemini 3 Pro on SWE-Bench
- Can run on a Mac Studio M3 Ultra at 80+ tps
- Open source now one week behind frontier on benchmarks
This Week's Buzz - W&B Inference
Alex announces day-zero GLM-5 support on W&B Inference service powered by CoreWeave, with MiniMax 2.5 and Kimi K2.5 coming soon. Free credits available for testing.
- GLM-5 live on W&B Inference day zero
- Free credits for testing via @wandb on X
XAI Restructuring & SpaceX Acquisition
Multiple XAI co-founders departed after SpaceX acquired XAI. The company restructured into four buckets, including LLM/Voice, Coding, and Macro Hard (data centers). Grok 4.2 is nowhere to be found, and they're talking about putting GPUs in space.
- 300,000 GPU Memphis training cluster, largest in the world
- Jimmy Ba (co-author of Adam) left, saying recursive self-improvement is coming this year
- Restructured into 4 divisions including Macro Hard
Matt Shumer's Viral AI Article & The Acceleration
The panel discusses Matt Shumer's viral article (74M views) about the speed of AI progress and the gap between AI-native people and everyone else, and Ryan shares a real-world case study of end-to-end AI engineering.
- 74 million views on Matt Shumer's article
- Feb 5 models made everything before feel like a different era
- Harness Engineering case study on Codex in production
BREAKING: Gemini 3 Deep Think - 84% on ARC-AGI-2
Breaking news mid-show: Google drops Gemini 3 Deep Think with 84% on ARC-AGI 2 (up from Opus 4.6's 68% just one week prior) and 48.4% on Humanity's Last Exam without tools. The biggest single jump in ARC-AGI history.
- 84% ARC-AGI 2, up from 68% (Opus 4.6) one week ago
- 48.4% Humanity's Last Exam without tools
- Biggest single-week jump in benchmark history
BREAKING: GPT 5.3 Codex Spark on Cerebras
More breaking news: OpenAI releases GPT 5.3 Codex Spark, a smaller version of Codex designed for real-time coding, in partnership with Cerebras for insane inference speeds. Available to ChatGPT Pro users.
- First OpenAI model on Cerebras hardware
- Designed for real-time coding at 1000+ tokens/sec
- Available in Codex app, CLI, and IDE extension
Seedance 2 - ByteDance's Mind-Bending Video Model
Alex demos ByteDance's Seedance 2, a video generation model that accepts 9 images + 3 videos + 3 audio clips as reference. The multi-shot consistency, native audio, and physics are at a level that makes the original Sora feel like a different era.
- 15-second high-quality multi-shot with native stereo audio
- 9 images + 3 videos + 3 audio clips as input references
- 45-second internal test mode available
Agent Psychosis & The Sleep Problem
The panel gets real about the mental health impact of running AI agents 24/7. Multiple panelists report sleep disruption, FOMO about underutilizing their agents, and the paradox that tools meant to reduce work are creating more anxiety.
- Ryan wakes up at 2 AM regularly worried about agents
- Wolfram worries about shutting down agents for security
- The primitives for managing agent teams don't exist yet
ByteDance Seedance 2.0 - Shattering Reality
Continued deeper dive into Seedance 2 demos, showing multi-shot character consistency, anime style generation, and native audio with environmental sounds. Available on BytePlus platform.
- Character consistency across multi-shot sequences
- Anime and realistic style modes
- Available on BytePlus platform
Wrap-Up & Goodbye
Alex recaps an insane show: two open-source lab interviews, two breaking news drops (Gemini 3 Deep Think and GPT 5.3 Codex Spark), and Seedance 2 demos. Over 2000 listeners tuned in.
- 2000+ live listeners
- 4 breaking events in one episode
- Coming up on 3 years of ThursdAI
Hosts and Guests
Alex Volkov - AI Evangelist at Weights & Biases (@altryne)
Co-Hosts - @WolframRvnwlf @yampeleg @nisten @ldjconfirmed @ryancarson
Olive Song - Lead RL at MiniMax (@olive_jy_song)
Open Source LLMs
Big CO LLMs + APIs
XAI cofounders quit/let go after X restructuring (X, TechCrunch)
Anthropic releases Claude Opus 4.6 sabotage risk report, preemptively meeting ASL-4 safety standards for autonomous AI R&D (X, Blog)
OpenAI upgrades Deep Research to GPT-5.2 with app integrations, site-specific searches, and real-time collaboration (X, Blog)
Gemini 3 Deep Think SOTA on ARC-AGI 2, HLE (X)
OpenAI releases GPT 5.3 Codex Spark, backed by Cerebras with over 1000 tok/sec (X)
This Week's Buzz
Vision & Video
ByteDance Seedance 2.0 launches with unified multimodal audio-video generation supporting 9 images, 3 videos, 3 audio clips simultaneously (X, Blog, Announcement)
AI Art & Diffusion & 3D
Alibaba launches Qwen-Image-2.0: A 7B parameter image generation model with native 2K resolution and superior text rendering (X, Announcement)
Tools & Links
Entire raises $60M seed to build open-source developer platform for AI agent workflows with first OSS release ‘Checkpoints’ (X, GitHub, Blog)
Chrome 146 introduces WebMCP: A native browser API enabling AI agents to directly interact with web services (X)
RyanCarson AntFarm - Agent Coordination (X)
Steve Yegge’s “The AI Vampire” (X)
Matt Shumer’s “something big is happening” (X)