Episode Summary

ThursdAI’s pre-holiday episode is a speedrun through one of the most stacked AI news weeks of 2025. The crew breaks down Google’s Gemini 3 Flash leap in price-to-performance, OpenAI’s rapid-fire releases (GPT Image 1.5, ChatGPT App Store, and breaking GPT 5.2 Codex), and NVIDIA’s major open-source Nemotron 3 Nano drop. They also cover the rapidly commoditizing voice stack with xAI Grok Voice, Chatterbox Turbo, and Meta’s SAM Audio. With Kwindla joining the panel, the conversation stays practical on what actually matters for builders shipping agents right now.

Hosts & Guests

Alex Volkov
Host · W&B / CoreWeave
@altryne
Kwindla Hultman Kramer
Co-Founder & CEO · Daily.co
@kwindla
Wolfram Ravenwolf
Weekly co-host, AI model evaluator
@WolframRvnwlf
Yam Peleg
AI builder & founder
@Yampeleg
Nisten Tahiraj
AI operator & builder
@nisten
LDJ
Weekly co-host of ThursdAI
@ldjconfirmed
Ryan Carson
AI educator & founder
@ryancarson

By The Numbers

  • Gemini 3 Flash input pricing: $0.50 per 1M tokens. Google’s frontier-tier model pricing that resets the cost/performance baseline
  • SWE-bench Verified: 78%. Gemini 3 Flash coding benchmark score, highlighted as beating larger models in some agentic tasks
  • SWE-Bench Pro: 56.4%. GPT 5.2 Codex score on a specialized coding evaluation
  • Terminal-Bench 2.0: 64%. GPT 5.2 Codex terminal workflow benchmark
  • Grok Voice Agent API: $0.05/min. Flat-rate voice API pricing from xAI
  • Nemotron 3 Nano: 30B (3B active). NVIDIA hybrid Mamba-MoE architecture emphasizing efficient active parameters

🔥 Breaking During The Show

OpenAI drops GPT 5.2 Codex during ThursdAI
Near the end of the episode, OpenAI released GPT 5.2 Codex live during the recording, prompting an immediate benchmark and capability discussion by the panel.

🔓 Open Source LLMs

The panel highlights NVIDIA Nemotron 3 Nano as the most consequential open release of the week, not only for its performance but also because NVIDIA published the training data details and recipes alongside the weights. They also cover Allen AI’s byte-level BOLMO and its multimodal Molmo 2 video models, plus Mistral OCR 3’s aggressive pricing and document performance gains.

  • NVIDIA Nemotron 3 Nano: 30B params, 3B active, hybrid Mamba-MoE
  • NVIDIA released weights, reports, recipes, and 25T-token data details
  • BOLMO from Allen AI: first byte-level model to reach parity with regular tokenization
  • Molmo 2 from Allen AI: multimodal models with video input (4B/7B/8B)
  • Mistral OCR 3 claims 74% win-rate over OCR v2

🏒 Big CO LLMs + APIs

Google and OpenAI trade major launches in the same week. Gemini 3 Flash stands out for frontier capability at flash-tier price, while OpenAI pushes GPT Image 1.5 and then drops GPT 5.2 Codex during the show as breaking news.

  • Gemini 3 Flash: 78% SWE-bench Verified at flash pricing
  • Google tool-calling scale: up to 100 simultaneous function calls
  • OpenAI GPT Image 1.5: 4x faster, 20% cheaper
  • GPT 5.2 Codex: 400K context with context compaction
  • ChatGPT App Store submissions opened via MCP app model
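
To make the flash-tier pricing concrete, here is a minimal sketch of the input-cost arithmetic at the $0.50 per 1M input tokens quoted in this episode. The function name and the example token count are illustrative assumptions; output-token pricing is not covered in these notes and is deliberately left out.

```python
# Input price quoted in the episode notes: $0.50 per 1M input tokens.
GEMINI_3_FLASH_INPUT_USD_PER_MTOK = 0.50


def input_cost_usd(
    input_tokens: int,
    usd_per_mtok: float = GEMINI_3_FLASH_INPUT_USD_PER_MTOK,
) -> float:
    """Return the input-side cost in USD for a given token count."""
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    return input_tokens / 1_000_000 * usd_per_mtok


# Hypothetical workload: an agent run that reads 2.5M input tokens.
print(f"${input_cost_usd(2_500_000):.2f}")  # prints "$1.25"
```

Swapping in a different per-million price gives a rough way to compare models on the same workload, input side only.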

⚡ This Week’s Buzz

A community moment: Alex announces that Wolfram is joining Weights & Biases/CoreWeave as an AI Evangelist and ‘AIvaluator.’ The segment frames 2026 as a more benchmark-driven era for the show and the broader AI community.

  • Wolfram Ravenwolf announced as joining W&B/CoreWeave
  • Focus on deeper public evals and model benchmarking
  • Weave highlighted for practical AI evaluations

🔊 Voice & Audio

Voice AI competition tightens with lower prices and more capable real-time stacks. xAI ships Grok Voice Agent API with Tesla integration and strong audio benchmark positioning, while open-source Chatterbox Turbo and Meta SAM Audio push accessible audio generation and separation.

  • Grok Voice Agent API: $0.05/min pricing and Tesla integration
  • Big Bench Audio leadership claim for Grok voice stack
  • Resemble Chatterbox Turbo: MIT-licensed 350M open TTS
  • Meta SAM Audio: source separation with multimodal prompting
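
As a rough sense of scale for the $0.05/min flat rate, the sketch below estimates the cost of a batch of voice calls. It assumes billing is prorated linearly by the second, which these notes do not confirm; the call volume and duration are hypothetical.

```python
# Flat rate quoted in the episode notes: $0.05 per minute.
GROK_VOICE_USD_PER_MIN = 0.05


def voice_cost_usd(
    call_seconds: float,
    usd_per_min: float = GROK_VOICE_USD_PER_MIN,
) -> float:
    """Return the cost of one call, assuming linear per-second proration."""
    if call_seconds < 0:
        raise ValueError("call_seconds must be non-negative")
    return call_seconds / 60 * usd_per_min


# Hypothetical workload: 1,000 calls averaging 3 minutes each.
total = 1_000 * voice_cost_usd(180)
print(f"${total:.2f}")  # prints "$150.00"
```

If the provider rounds each call up to a whole minute, the real figure would be somewhat higher for short calls.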

🛠️ FunctionGemma & Edge Agents

Google’s tiny FunctionGemma release gets a dedicated discussion for what it signals about on-device agents. The model is small enough for constrained hardware and points toward privacy-first local function-calling assistants.

  • FunctionGemma at 270M parameters
  • ~500MB RAM footprint for edge usage
  • Strong improvement after fine-tuning for mobile actions
  • On-device tool use for private assistant workflows
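
The on-device pattern described above can be sketched as a small dispatcher that maps a model's structured function call onto local code. This is not FunctionGemma's actual output format or API: the JSON shape (`name` plus `arguments`), the tool names, and the `dispatch` helper are all assumptions made for illustration.

```python
import json
from typing import Callable, Dict


# Hypothetical local tools a phone assistant might expose.
def set_alarm(hour: int, minute: int) -> str:
    return f"Alarm set for {hour:02d}:{minute:02d}"


def toggle_flashlight(on: bool) -> str:
    return "Flashlight on" if on else "Flashlight off"


# Registry of tools the model is allowed to call.
TOOLS: Dict[str, Callable[..., str]] = {
    "set_alarm": set_alarm,
    "toggle_flashlight": toggle_flashlight,
}


def dispatch(model_output: str) -> str:
    """Parse an assumed JSON function call and run the matching local tool."""
    call = json.loads(model_output)
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))


# Assumed model output for the request "wake me at 7:30".
result = dispatch('{"name": "set_alarm", "arguments": {"hour": 7, "minute": 30}}')
print(result)  # prints "Alarm set for 07:30"
```

Because both the model and the tools run locally in this setup, the user's request never has to leave the device, which is the privacy argument made in the segment.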

📰 Year-End Coverage Preview

The episode closes by setting up the full 2025 recap planned for the next show. This installment is framed as week-of-news triage, with the year-in-review coming as a separate deep retrospective.

  • Dec 18 episode intentionally focused on weekly drops
  • Full 2025 month-by-month recap queued for next week
  • Team emphasizes pace and acceleration of releases

TL;DR and Show Notes

Open Source LLMs

  • NVIDIA Nemotron 3 Nano - 30B-A3B (30B total, 3B active) hybrid Mamba-MoE model (X, HF, HF FP8)

  • FunctionGemma - 270M parameter function calling model (X, Blog, Docs)

  • Mistral OCR 3 - Document intelligence model with 74% win rate over v2 (X, Blog, Console)

  • BOLMO from Allen AI - First byte-level model reaching parity with regular tokenization (X)

  • Molmo 2 from Allen AI - Multimodal with video input (4B, 7B, 8B sizes) (X)

Big CO LLMs + APIs

  • Google Gemini 3 Flash - Frontier intelligence at $0.50/1M input tokens, 78% SWE-bench Verified (X, Announcement)

  • OpenAI GPT Image 1.5 - 4x faster, 20% cheaper, #1 on LMSYS Image Arena (X)

  • OpenAI GPT 5.2 Codex - 56.4% SWE-Bench Pro, 64% Terminal-Bench 2.0, 400K context (X, Blog)

  • ChatGPT App Store - MCP-powered apps submission now open (X)

This Week’s Buzz

  • 🐝 Wolfram joins Weights & Biases / CoreWeave as AI Evangelist and AIvaluator!

  • Try Weave for AI evaluations

Voice & Audio

  • xAI Grok Voice Agent API - #1 Big Bench Audio (92.3%), $0.05/min flat rate, powers Tesla vehicles (X)

  • Resemble AI Chatterbox Turbo - MIT-licensed 350M TTS, beats ElevenLabs in blind tests (X, HF, GitHub, Blog)

  • Meta SAM Audio - Audio source separation with text/visual/temporal prompts (X, HF, GitHub)

Show Links

  • Full 2025 Yearly Recap - Coming next week!