Episode Summary

NVIDIA dominated CES 2026 with the Vera Rubin platform — delivering 5x inference over Blackwell and 75% fewer GPUs for trillion-parameter training — while xAI raised $20B at a $230B valuation amid Grok's bikini-gate scandal. Ryan Carson broke down the Ralph Wiggum autonomous coding technique (1.2M views on X) that lets agents ship features while you sleep, marking the death of "vibe coding." The panel also covered Upstage's Solar Open 100B, Liquid AI's on-device LFM 2.5, NVIDIA's Nemotron Speech ASR with 24ms latency (demoed by Kwindla Hultman Kramer of Daily.co), and OpenAI's GPT Health launch alongside the first US pilot for AI-prescribed medication.

Hosts & Guests

Alex Volkov
Host · W&B / CoreWeave
@altryne
Kwindla Hultman Kramer
Daily.co — Co-Founder & CEO
@kwindla
Ryan Carson
AI educator & founder
@ryancarson
Wolfram Ravenwolf
Weekly co-host, AI model evaluator
@WolframRvnwlf
Nisten Tahiraj
AI operator & builder
@nisten
LDJ
Nous Research
@ldjconfirmed

By The Numbers

Vera Rubin vs Blackwell
5x
NVIDIA's next-gen platform delivers 5x inference performance over Blackwell, announced at CES 2026
Fewer GPUs needed
75%
Vera Rubin requires 75% fewer GPUs for 10 trillion parameter MoE training
xAI Series E
$20B
xAI raises $20B at $230B valuation with NVIDIA and Cisco as strategic investors
Solar Open params
102B
Upstage's Solar Open 100B — 102B total parameters, only 12B active per token, trained on 19.7T tokens
Ralph article views
1.2M
Ryan Carson's Ralph Wiggum article on X — autonomous coding technique using atomic user stories
Nemotron Speech latency
24ms
NVIDIA Nemotron Speech ASR — 600M parameter streaming model, 900 concurrent streams on single H100

🔥 Breaking During The Show

Google Gmail Enters the Gemini Era
Breaking during the TLDR segment: Google integrates Gemini 3 into Gmail for 3 billion users with AI Overviews, smart replies, and natural language inbox search.

📰 TL;DR - This Week's AI News Rundown

Alex runs through the week's biggest stories: NVIDIA's Vera Rubin at CES delivering 5x over Blackwell, xAI's $20B raise amid Grok controversy, Solar Open 100B and other open source releases, OpenAI's GPT Health waitlist, Google bringing Gmail into the Gemini era, and the first US pilot for AI-prescribed medication renewals.

  • NVIDIA Vera Rubin: 5x inference over Blackwell at CES 2026
  • xAI raises $20B at $230B valuation
  • Google Gmail enters the Gemini era for 3B users (breaking news)
  • Doctronic: first US pilot for AI prescription renewals

🔓 Open Source: Solar Open 100B

Upstage releases Solar Open 100B, a 102B parameter MoE model with only 12B active parameters, trained on 19.7 trillion tokens with an innovative data factory approach. LDJ highlights the SNAP PO reinforcement learning technique with a 50% training speedup, and the panel discusses how this model outperforms GLM 4.5 Air on many benchmarks with strong Korean language optimization; a toy sketch of the top-8 expert routing follows the bullets below.

  • 102B params, 12B active, 129 experts with top-8 activation
  • 19.7T training tokens with 4.5T synthetic data
  • SNAP PO: 50% RL training speedup
  • Best-in-class Korean language performance
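For readers who want the shape of that MoE math, here is a minimal sketch of top-8 routing over 128 routed experts plus one shared expert, as described above. This is an illustrative toy, not Upstage's code; the dimensions and weights are made up.

```python
# Hypothetical sketch of the MoE routing pattern described above
# (129 experts: 128 routed + 1 shared, top-8 activation). NOT Upstage's
# actual code; shapes and initializations are illustrative only.
import numpy as np

N_ROUTED, TOP_K, D = 128, 8, 64  # expert counts per Solar Open; D is made up

def moe_forward(x, router_w, experts, shared_expert):
    """Route one token vector x through top-8 of 128 experts plus 1 shared."""
    logits = router_w @ x                      # (128,) router scores
    top = np.argsort(logits)[-TOP_K:]          # indices of the 8 best experts
    gates = np.exp(logits[top])
    gates /= gates.sum()                       # softmax over selected experts
    # Only 8 routed experts (plus the shared one) run per token: that is why
    # a 102B-parameter model has only ~12B active parameters.
    out = sum(g * experts[i](x) for g, i in zip(gates, top))
    return out + shared_expert(x)

# Toy usage: each expert is a small random linear map.
rng = np.random.default_rng(0)
make_expert = lambda w: (lambda v: w @ v)
experts = [make_expert(rng.standard_normal((D, D)) * 0.02) for _ in range(N_ROUTED)]
shared = make_expert(rng.standard_normal((D, D)) * 0.02)
router_w = rng.standard_normal((N_ROUTED, D)) * 0.02
y = moe_forward(rng.standard_normal(D), router_w, experts, shared)
print(y.shape)  # (64,)
```

Because only eight routed experts plus the shared one fire per token, the active parameter count stays near 12B even though the full model holds 102B.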
Nisten Tahiraj
"These days, datasets for video or text end up being like in the 20 to 40 terabytes range. There is something to be said about what is synthetic and what is not. This gets very tricky because all the data does have a human source at the end of the day."

🔓 Miro Thinker 1.5

MiroMind AI releases MiroThinker 1.5, a 30B parameter open source search agent achieving 56.1% on BrowseComp — outperforming trillion-parameter models through 'interactive scaling.' The panel debates the growing importance of agent harnesses in 2026, with Ryan noting that domain-specific harnesses are the bleeding edge and Nisten emphasizing how hard they are to build well; a toy sketch of the interactive-scaling loop follows the bullets below.

  • 30B model beating trillion-parameter models on search benchmarks
  • Interactive scaling: third dimension of scaling beyond params and context
  • 56.1% BrowseComp, 66.8% BrowseComp Chinese
  • Fine-tune of Qwen 3 Thinking with 147K open training samples
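The "interactive scaling" idea lends itself to a loop sketch. The version below is a heavily hedged paraphrase of the episode's description (hypothesize, search for evidence, detect conflicts, revise), not MiroMind's code; `search` and `llm` are placeholder callables.

```python
# Hedged sketch of interactive scaling: spend more agent-environment turns,
# rather than more parameters or context, on each question. Placeholder
# callables stand in for a real search tool and a real model.
def interactive_answer(question, search, llm, max_turns=8):
    hypothesis = llm(f"Initial hypothesis for: {question}")
    evidence = []
    for _ in range(max_turns):
        evidence += search(f"{question} | checking: {hypothesis}")
        revised = llm(
            f"Question: {question}\nHypothesis: {hypothesis}\n"
            f"Evidence so far: {evidence}\n"
            "If the evidence conflicts, return a revised hypothesis; "
            "otherwise return the hypothesis unchanged."
        )
        if revised == hypothesis:   # converged: stop spending turns
            break
        hypothesis = revised        # revise and keep searching
    return hypothesis

# Toy stand-ins so the sketch runs end to end.
print(interactive_answer(
    "capital of Australia",
    search=lambda q: ["Canberra is the capital of Australia"],
    llm=lambda prompt: "Canberra",
))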
Ryan Carson
"The models are so good now, that people might think, oh, you could just open Chat GPT or open Claude and chat, and you can't, we're still a long way from each model being specifically useful for a specific task."
Nisten Tahiraj
"I just wanted to say it's very hard to make a good harness. It seems easy at first, but it's just like making a tool or a drill or something. It has to be basically perfect."

🔓 Liquid AI LFM 2.5

Liquid AI releases LFM 2.5, a family of tiny ~1B parameter on-device models with text, vision, and audio support, announced at CES alongside AMD's Lisa Su. The models achieve 239 tokens/sec on AMD CPU and 100 tokens/sec on iPhone 16 Pro Max. LDJ highlights the revolutionary end-to-end audio model that skips the traditional ASR-LLM-TTS pipeline entirely.

  • 1.2B params running at 239 tps on AMD CPU, 100 tps on iPhone
  • End-to-end audio model: no separate ASR or TTS needed
  • 14% on AIME 2025 — impressive for a 1B model
  • Announced with AMD on stage at CES
Nisten Tahiraj
"That does make it for them, like the best sub two B model right now, the best on device model."
LDJ
"What's really impressive about this too is since it's only 1.5 billion parameters, that means you can run it while having very little ram on your device. Most people have eight gigabytes of ram and it'll be able to run on just that amount."

🔓 Zhipu AI IPO & NousCoder

Zhipu AI (makers of GLM) becomes the world's first major LLM company to IPO on the Hong Kong Stock Exchange, raising $558M. Nous Research releases NousCoder 14B, an open source competitive programming model that achieved a 7% jump in LiveCodeBench accuracy in just four days of RL training on 48 NVIDIA B200 GPUs; a sketch of the verifiable-reward idea behind that RL run follows the bullets below.

  • Zhipu AI IPO: $558M raised, first major LLM company to go public
  • NousCoder 14B: 7% LiveCodeBench jump in 4 days of RL
  • 24,000 verifiable problems used for RL training
  • Full Apache 2 license with training code and benchmark harness
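To make the "verifiable problems" idea concrete, here is a minimal sketch of a verifiable reward for competitive coding: run the candidate program against test cases and score the pass rate. This is not Nous Research's Atropos code; the test format and reward scheme are assumptions.

```python
# Minimal sketch of RL-with-verifiable-rewards for coding, in the spirit of
# the NousCoder setup described above. NOT Nous Research's actual code.
import os
import subprocess
import sys
import tempfile
import textwrap

def verify(solution_code: str, tests: list[tuple[str, str]]) -> float:
    """Reward = fraction of test cases the candidate program passes."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(solution_code)
        path = f.name
    passed = 0
    try:
        for stdin, expected in tests:
            try:
                out = subprocess.run(
                    [sys.executable, path], input=stdin, text=True,
                    capture_output=True, timeout=5,  # kill runaway programs
                ).stdout.strip()
                passed += out == expected.strip()
            except subprocess.TimeoutExpired:
                pass  # a timeout simply counts as a failed test
    finally:
        os.unlink(path)
    return passed / len(tests)

# Toy verifiable problem: read two ints, print their sum.
candidate = textwrap.dedent("""
    a, b = map(int, input().split())
    print(a + b)
""")
print(verify(candidate, [("1 2", "3"), ("10 -4", "6")]))  # 1.0
```

In an RL loop, this score is exactly the reward signal: responses that pass more tests get reinforced, with no human judge in the loop.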

🏢 NVIDIA CES & Vera Rubin Platform

Jensen Huang unveils the Vera Rubin platform at CES 2026 — NVIDIA's next-gen AI computer delivering 5x inference over Blackwell with only marginally more power draw. LDJ walks through the specs: over 3x the PFLOPS of Blackwell at 1800W, 13 TB/s bandwidth, and 75% fewer GPUs needed for 10T parameter MoE training. Ryan calls it truly astonishing and Nisten marvels at the power efficiency; a quick perf-per-watt calculation follows the bullets below.

  • Vera Rubin: 50 PFLOPS inference, 5x over Blackwell
  • 3x+ PFLOPS gain while only adding ~200W power
  • 75% fewer GPUs for 10T parameter MoE training
  • 72 GPUs per rack, 20.7 TB memory, 100% liquid cooled
  • Announced in full production just 4 months after B300
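To put the efficiency claim in perspective, a quick perf-per-watt calculation using the figures above; treating Blackwell as roughly 1600W is an assumption implied by "only adding ~200W".

```python
# Quick perf-per-watt math from the bullets above. The 1800W Rubin figure
# comes from the episode; the Blackwell baseline wattage is an assumption.
rubin_pflops, rubin_watts = 50, 1800
blackwell_pflops, blackwell_watts = rubin_pflops / 5, rubin_watts - 200

rubin_eff = rubin_pflops / (rubin_watts / 1000)              # ~27.8 PFLOPS/kW
blackwell_eff = blackwell_pflops / (blackwell_watts / 1000)  # ~6.3 PFLOPS/kW
print(f"Rubin {rubin_eff:.1f} vs Blackwell {blackwell_eff:.1f} PFLOPS/kW, "
      f"ratio {rubin_eff / blackwell_eff:.1f}x")             # ~4.4x per watt
```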
Nisten Tahiraj
"It's three times faster while only adding another 200 watts."
Ryan Carson
"I just wanna say, as someone that spent a bit of time at Intel and had a good time there, just how mind blowing this stuff is, what Nvidia is doing, is truly astonishing."
LDJ
"Keep in mind the B 300 was only announced to be in full production just four months ago and January 6th at CES Jensen announced that Vera Rubin is now in full production."

💰 NVIDIA Groq Acquisition

NVIDIA enters an exclusive licensing deal and acquires most of Groq's team for approximately $20B. Alex explains how Groq's inference-optimized chips, created by former Google TPU lead Jonathan Ross, complement NVIDIA's training dominance — reinforcing the panel's view that there's no AI bubble given insatiable demand for inference.

  • NVIDIA acquires Groq team and technology for ~$20B
  • Groq founder Jonathan Ross was instrumental in creating Google TPUs
  • Inference demand growing exponentially across all AI use cases

🔊 Nemotron Speech ASR

NVIDIA releases Nemotron Speech ASR, a 600M parameter open source streaming speech model with 24ms median latency and support for 900 concurrent streams on a single H100. Alex plays a demo featuring Kwindla Hultman Kramer of Daily.co showing sub-500ms voice-to-voice latency with a three-model pipeline of Nemotron ASR, Nemotron Nano LLM, and Magpie TTS; a sketch of that pipeline follows the bullets below.

  • 600M params — runs on a toaster
  • 24ms median latency, 900 concurrent streams per H100
  • Sub-500ms total voice-to-voice latency
  • Demoed by Kwindla Hultman Kramer of Daily.co / PipeCat
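For orientation, here is a hedged sketch of that three-model pipeline (ASR, then LLM, then TTS) with per-stage latency logging. The stage callables are placeholders for whatever serving stack you use; in production a framework like Pipecat wires these stages together with streaming, and none of this is NVIDIA's code.

```python
# Hedged sketch of the three-model voice pipeline described above.
# The asr/llm/tts callables are placeholders, not real model clients.
import time

def voice_turn(audio_chunk: bytes, asr, llm, tts) -> bytes:
    """One user turn: speech in, speech out, with per-stage latency logged."""
    t0 = time.perf_counter()
    text_in = asr(audio_chunk)       # streaming ASR, e.g. Nemotron Speech
    t1 = time.perf_counter()
    text_out = llm(text_in)          # small fast LLM, e.g. Nemotron Nano
    t2 = time.perf_counter()
    audio_out = tts(text_out)        # TTS, e.g. Magpie
    t3 = time.perf_counter()
    print(f"asr={1000*(t1-t0):.0f}ms llm={1000*(t2-t1):.0f}ms "
          f"tts={1000*(t3-t2):.0f}ms total={1000*(t3-t0):.0f}ms")
    return audio_out

# Toy stand-ins so the sketch runs; real stages would call model servers.
demo = voice_turn(
    b"\x00" * 320,
    asr=lambda audio: "hello there",
    llm=lambda text: f"You said: {text}",
    tts=lambda text: text.encode(),
)
```

The sub-500ms voice-to-voice number is the `total` line here: the budget the three stages have to share per turn.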
Alex Volkov
"Kwindla Kramer from Daily and PipeCat is the guy who Nemotron showed off on stage, shout out to Kwindla, a friend of the pod, basically the expert in everything voice AI."

🤖 Alpamayo Self-Driving

LDJ highlights NVIDIA's Alpamayo, a family of open source reasoning-based self-driving AI models announced at CES. The model performs end-to-end autonomous driving with explicit reasoning steps like identifying jaywalkers. Alex jokes about whether you want reasoning in a model that needs to make split-second driving decisions.

  • Open source self-driving model with reasoning steps
  • End-to-end autonomous drive demo in Mercedes-Benz
  • Real-time reasoning: identifies jaywalkers, stops accordingly
Alex Volkov
"I don't know if we want the reasoning in my model that drives, decisions to be made fast."

🏢 Grok & xAI: $20B Raise Amid Bikini-Gate

xAI raises $20B at a $230B valuation with NVIDIA as a strategic investor, while Grok faces major backlash over its image model's lack of NSFW guardrails. The panel debates the responsibility of AI products vs tools — Nisten notes guardrails are trivially easy to implement, Wolfram argues for going after bad actors not tools, and Alex draws a sharp line between open-source tools and consumer products embedded in social media.

  • xAI Series E: $20B raised at $230B valuation
  • Grok bikini-gate: no guardrails on image model in replies
  • xAI claimed 600M active users by counting all X users
  • Panel debates tool vs product responsibility for AI safety
Nisten Tahiraj
"It's not even that hard to put the guardrails, like you just put like a two B VL model and say, hey, is there a minor in this picture?"
Alex Volkov
"There's an absolutely incredible difference. One is a tool, the other one is a product and basically an amplification product that shows this to many people. So there's a big difference and guardrails are important on that product."

🛠️ Alexa+ on the Web

Alex demos Alexa+, Amazon's smart Alexa experience, now available as a web chat interface (free with Prime, $20/month otherwise). The upgraded assistant supports free-flowing conversations without repeating the wake word, integrates with smart home devices, and can continue conversations across devices. LDJ notes Amazon's earlier Claude partnership and their own Nova model line.

  • Web-based chat interface for Alexa Plus
  • Smart home integration with natural language commands
  • Free with Prime, $20/month otherwise; text chat only — voice coming later
  • Continue conversations across devices

🏢 GPT Health & AI Medicine

OpenAI launches a GPT Health waitlist for privacy-first health conversations with connected health records and fitness apps. Nisten explains why LLMs are so good at medicine — only ~2,000 diseases and drugs to master. Ryan asks about Epic/MyChart integration, and the panel discusses Doctronic's first US pilot in Utah where AI can autonomously renew prescriptions at just $4 per renewal.

  • GPT Health: integrates Apple Health, Function Health, MyFitnessPal, Peloton
  • LLMs only need to handle ~2,000 diseases and ~2,000 drugs
  • Doctronic: first US AI prescription renewal pilot in Utah
  • $4 per renewal, 190 routine medications, excludes controlled substances
Nisten Tahiraj
"There's only about 2000 something prescription drugs. There's only about 2000 or so total diseases. That's nothing for an LLM. This is the most common misconception that people have."
Ryan Carson
"My wife just had an MRI. This is insane. We have to call the hospital and say, can you cut a CD of the images so that we can see it? And we don't even have a CD ROM."

🤖 Ralph Wiggum: The Autonomous Coding Loop

Ryan Carson gives a masterclass on Ralph Wiggum, the autonomous coding technique created by Geoff Huntley that hit 1.2M views on X. The method: write a PRD, break it into atomic user stories with acceptance criteria in JSON, then run a bash loop that tells your CLI agent (Amp, Claude Code, etc.) to pick the next incomplete story, code it, commit, update progress, and loop — shipping features while you sleep; a minimal sketch of the loop follows the bullets below. Nisten reveals Ralph's origin story from a San Francisco meetup and how it won a YC hackathon.

  • Write PRD → atomic user stories in JSON → bash loop agent
  • Compound learning: agent writes lessons to agents.md each loop
  • Ryan shipped 5 features in 2 days using Ralph
  • Won YC hackathon by letting Ralph run overnight on Sonnet 4.5
  • Works with any CLI agent: Amp, Claude Code, Cursor CLI, Gemini CLI
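Here is a hedged sketch of the loop itself. The canonical version is a few lines of bash; this Python equivalent shows the same shape. File names, the story schema, and the CLI invocation are illustrative assumptions, not Ryan's or Geoff Huntley's exact setup.

```python
# Hedged sketch of the Ralph Wiggum loop described above. prd.json,
# agents.md, and the story schema are illustrative assumptions.
import json
import subprocess

PROMPT = """Read prd.json. Pick the highest-priority user story whose
"done" field is false. Implement just that story, run the tests, commit,
set the story's "done" to true, and append any lessons learned to
agents.md. Then stop."""

def ralph_loop(max_iters: int = 50) -> None:
    for _ in range(max_iters):
        with open("prd.json") as f:
            stories = json.load(f)
        if all(s["done"] for s in stories):  # every atomic story shipped
            print("PRD complete")
            break
        # Hand the same dumb prompt to a CLI agent each iteration and let the
        # repo state carry the memory (e.g. Claude Code's print mode here;
        # swap in amp, cursor, or the gemini CLI as you prefer).
        subprocess.run(["claude", "-p", PROMPT], check=False)

if __name__ == "__main__":
    ralph_loop()
```

The compound-learning bullet above is the agents.md step: each pass writes down what tripped it up, so later iterations start a little smarter.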
Ryan Carson
"You would love your agent to build stuff for you while you sleep. Well, how do you actually do that? Models now, especially with Opus four or five agents, are basically able to accomplish a lot of what a junior engineer, even a mid-level engineer could do, with basically no input."
Nisten Tahiraj
"The dumber you make it, the better the results are. All you do with the bash script is you just grab the initial instructions, which are just really simple and stupid. Usually they're just four lines."
Ryan Carson
"This is how real work happens. We don't ever say the word one shot. No real work is done one shot. All work is done through user stories. I think the whole vibe coding term is starting to die, which I think is important."

📰 Wrap Up & Goodbye

Alex wraps the first show of 2026 with over 1,700 live viewers. The episode spanned NVIDIA CES announcements, Ralph Wiggum autonomous coding, GPT Health and AI medicine, and a strong week of open source releases. Wolfram has officially joined Weights & Biases as an AI evangelist focused on evals, and the team teases agentic skills coverage for next week.

  • 1,700+ live viewers for the first show of 2026
  • Wolfram Ravenwolf officially joins Weights & Biases
  • Agentic skills and MCP coverage teased for next episode
TL;DR links:
  • Hosts & Guests

  • Open Source LLMs

    • Solar Open 100B - Upstage’s 102B MoE model. Trained on 19.7T tokens with a heavy focus on “data factory” synthetic data and high-performance Korean reasoning (X, HF, Tech Report).

    • MiroThinker 1.5 - A 30B parameter search agent that uses “Interactive Scaling” to beat trillion-parameter models on search benchmarks like BrowseComp (X, HF, GitHub).

    • Liquid AI LFM 2.5 - A family of 1B models designed for edge devices. Features a revolutionary end-to-end audio model that skips the ASR-LLM-TTS pipeline (X, HF).

    • NousCoder-14B - competitive coding model from Nous Research that saw a 7% LiveCodeBench accuracy jump in just 4 days of RL (X, WandB Dashboard).

    • Zhipu AI IPO - The makers of GLM became the first major LLM firm to go public on the HKEX, raising $558M (Announcement).

  • Big Co LLMs & APIs

    • NVIDIA Vera Rubin - Jensen Huang’s CES reveal of the next-gen platform. Delivers 5x Blackwell inference performance and 75% fewer GPUs needed for MoE training (Blog).

    • OpenAI ChatGPT Health - A privacy-first vertical for EHR and fitness data integration (Waitlist).

    • Google Gmail Era - Gemini 3 integration into Gmail for 3 billion users, featuring AI Overviews and natural language inbox search (Blog).

    • xAI $20B Raise - Elon’s xAI raises Series E at a $230B valuation, even as Grok faces heat over bikini-gate and safety guardrails (CNN Report).

    • Doctronic - The first US pilot in Utah for autonomous AI prescription renewals without a physician in the loop (Web).

    • Alexa+ Web - Amazon brings the “Smart Alexa” experience to browser-based chat (Announcement).

  • Autonomous Coding & Tools

    • Ralph Wiggum - The agentic loop technique for autonomous coding using small, atomic user stories. Ryan Carson’s breakdown of why this is the death of “vibe coding” (Viral X Article).

    • Catnip by W&B - Chris Van Pelt’s open-source iOS app to run Claude Code anywhere via GitHub Codespaces (App Store, GitHub).

  • Vision & Video

    • LTX-2 - Lightricks open-sources the first truly open audio-video generation model with synchronized output and full training code (GitHub, Replicate Demo).

    • Avatar Forcing - KAIST’s framework for real-time interactive talking heads with ~500ms latency (Arxiv).

    • Qwen Edit 2512 - Optimized by PrunaAI to generate high-res realistic images in under 7 seconds (Replicate).

  • Voice & Audio

    • Nemotron Speech ASR - NVIDIA’s 600M parameter streaming model with sub-100ms stable latency for massive-scale voice agents (HF).

Alex Volkov 0:30
What's going on everyone?
0:32
Welcome to ThursdAI for January 8th. This is Alex Volkov, putting the headphones on and saying welcome to the new year. This is the first live stream since we paused live streaming last year; we did have some episodes on the podcast, so if you're listening to the podcast, you haven't missed a beat. I'm very excited to be back with my co-hosts. Welcome Ryan, welcome Wolfram. How are you guys doing? Good to be here. Happy New Year. Happy New Year. Happy New Year. Everybody's here. Super pumped. Yeah, it's gonna be a great show. It's gonna be a great show. Thank you guys for joining, and thank you everybody who tuned in everywhere; we're streaming live on X, on YouTube, on LinkedIn. This has been, as always, a great week in AI news. And as always, what recently happens to me is that I start following the news, and coming up to Wednesday I'm like, ah, I'm not sure if we have a full show of news. And then as I start to recap, I am again faced with the fact that there's just so much happening in the world of AI, so much that even if we want to dive deep into a rabbit hole and talk about something specific (like I really want to talk about Claude's agentic skills, which are not only in Claude anymore), we may just have a bunch of news to talk about. Now, with us: one of us became very famous during the last week, and so we absolutely must tell you that we're going to talk about Ralph. Ralph Wiggum is the new hot thing on the timelines this week, maybe last week; apparently it started around July, but nobody noticed. Super quick: if you have any idea what I'm talking about, that's great; you know that Ryan blew up this week with his article. If you have no idea what Ralph Wiggum is, stay tuned with us. Ryan, I'm gonna treat you as a host for the first segment of the show and then as a guest for the second segment. Yeah. We also have another guest with us. I will point to this other guest right here. Boom. I received my Reachy Mini. For those of you who have no idea what this robot is, this is the Hugging Face / LeRobot Reachy Mini, and this little guy has been on stage at CES with Jensen and NVIDIA; Hugging Face are doing a collaboration, and you can install a bunch of AI stuff on it. Hey Reachy, what's up? Say hi to the folks. Reachy's waking up and saying hi. You can order either the connected version or the wireless version, and you can program it and you can build it. You get a kit, you don't get an assembled robot, and you connect it, and it's adorable and super cute. And you can do all kinds of software: you can make it dance, you can connect AI to it, you can connect it to AI agents; I think Wolfram, you did this as well. So folks, say hi to Reachy. Reachy, say hi back to the folks for the new year. Reachy is gonna be a permanent guest on the top right, on my shoulder, kind of like a pirate and his parrot. I also have the Weights & Biases bee here. So, small upgrades for the new year, as we say. Also, hello. All right, Reachy is gonna be in the background; you'll probably see him kind of dance. There's only place for one yapper in this household. But we wanna welcome LDJ as well to the show. Folks, happy New Year. How's it going? Ryan, let's start with you, because you have excitement happening for the new year. I would love to hear how you're feeling being so famous on the internet.
Ryan Carson 3:49
So yeah, I posted an article on X, and the algorithm was kind to me.
3:53
I think I was just building stuff and using a tool that everybody was talking about and I was like, Hey, I'm gonna write an article about that. And I guess everybody liked it. So 1.1 million views later. Wow.
Alex Volkov 4:04
incredible.
4:05
And the article was
Ryan Carson 4:06
my friend Geoff Huntley created Ralph and we'll talk more about
4:09
it, but he created Ralph a while ago, and it kind of blew up again. I thought I would try it and, like any good content creator, I was like, oh, I'll create a little tutorial on how to do this. And again, the algorithm gods were kind to me. So here I am. And now I'm literally shipping; I shipped three features today with Ralph, like, concurrently.
Alex Volkov 4:29
Is Ralph running in the background?
4:31
Yeah, like right now. That's awesome. Folks who, again, are not familiar with Ralph: this is kind of a technique to do coding better. We'll talk about this. I do have an idea that at some point the algorithm decided that Ralph is the hottest thing in the world, and so everybody was just talking about Ralph throughout the holidays. I think Claude Code also got really, really hot. Like, there were days when most of my timeline, not the following timeline, the For You timeline that the algorithm completely controls, was a hundred percent Claude Code. It was quite ridiculous; people were finding it out during the holidays. We have a few new folks here sometimes tuning in: Claude Code is the side project at Anthropic that is a command line interface to use Claude for coding, and that side project became a $1 billion business on its own, I believe, like, at least 1 billion. During the holidays, Boris Cherny also joined both Twitter and Threads. You guys remember Threads exists, the Twitter of Meta. And on both of them he went super viral saying, hey, I'm the creator of Claude Code, ask me anything, whatever. So that was cool. So we're gonna mention some Claude Code stuff. Wolfram, the holidays, the break and everything, how were the holidays for you? How are you doing? What's in your world?
Wolfram Ravenwolf 5:51
Well, the holidays were great, and now I'm excited.
5:54
This new year is a special one, as some people may have noticed already. If you look down here, you are not the only AI evangelist at Weights & Biases anymore. I joined the team, as we announced in the last show in December. And yeah, I'm still going through onboarding and getting ready to rock, but my first impression is excellent and I'm really looking forward to what we can achieve together.
Alex Volkov 6:17
So this was not a loaded question.
6:19
I did literally ask Wolfram about the holidays and stuff and the things he built. But yes, Wolfram is officially part of Weights & Biases, part of the evangelism team here at Weights & Biases. He'll be focusing on evals, so we're gonna bring you a bunch of that goodness here on the show as well. So I think this is gonna be super, super exciting. And we have LDJ also over here. LDJ, how are you doing? What's new? What's super exciting for you in the AI world? How'd you celebrate New Year's? Did you build anything cool? Welcome to the show.
LDJ 6:47
Thank you for having me.
6:48
I'm doing great. Great holidays, great last few months of last year. And what I'm really excited about and thinking a lot about recently is the new Vera Rubin GPUs, which we knew were coming a while back, but they were expected to not really be announced to be in production until later, and they seem to pretty much be here early. It was literally just four months ago that the B300 was announced to be in full production; now Vera Rubin is announced in full production. Obviously we'll go more into details later, but that's what I'm excited about.
Alex Volkov 7:20
Vera Rubin is the new line of,
7:23
complete computers from NVIDIA. They were announced at CES. Server GPUs. Yep, server GPUs, and I think there's CPUs in there as well; there's a bunch of stuff. We're gonna cover all of the CES NVIDIA announcements very soon. I think this is one of the biggest news items. Apparently some folks are already running on this: Runway 4.5 is already running on the new NVIDIA hotness. And definitely we've seen some changes in the market based on updates like this. So we're gonna cover this; thanks for propping this up. And I think it's time for a quick TLDR. The quick TLDR is where we basically run through everything that we have as far as news for this week. We were maybe expecting a Chinese Christmas surprise, but I think besides GLM 4.7, nothing major was released as far as I saw. Qwen Edit was released; we're gonna mention this as well. But generally, fairly quiet holidays, which is great for us 'cause we wanted to be on holiday. Just a quick reminder though, we did release the end-of-year recap where we went month by month through every AI release from this year. So if you haven't had a chance to check that out, that's on the podcast. And also, on January 1st, I posted an interview with Will Brown, ML researcher with Prime Intellect. That episode is great; I recorded it during AI Engineer. Will is a great dude, it's really, really fun to talk to him in person. So I posted that, and I really hope you have some time to check it out, because I think it's a great interview. And with this, I think it's time for the TLDR. And as always, a reminder: we're very much a breaking news show. So if something happens, yes, breaking news during the show, please tell us in the comments. We love to see breaking news; we love to discuss breaking news as it goes live. This is how the show started nearly three years ago. Can you guys believe it? In March. So if you have any breaking news, definitely send it to us, and we are going into the TLDR. Let's go.
9:28
Welcome to the TLDR for ThursdAI. This is the segment where we talk about everything that happened during this week, super quick, so that you'll be up to date. And if you decide to stay and dive in with us, you are more than welcome for the rest of the show. This week there's a bunch of stuff happening. In open source, we have Solar Open. It's a hundred billion parameter MoE from Upstage, a Korean lab, focusing on data training and strong benchmarks. We also have a newcomer to open source, MiroThinker 1.5. It's a 30 billion parameter open source search agent that beats trillion parameter models on BrowseComp via interactive scaling. Our friends from Liquid AI released LFM 2.5, Liquid Foundation Models 2.5, a family of five tiny, around 1 billion parameter, on-device foundation models with text, vision, audio, and Japanese support. They announced it at CES, I believe together with AMD's Lisa Su on stage. Liquid is great, and shout out to Liquid, they're great friends of ours. Ah, another set of friends, great friends; we've been tracking their work since the beginning of ThursdAI pretty much: Nous Research released NousCoder. It's a 14 billion parameter open source competitive programming model that achieved a 7% jump in LiveCodeBench accuracy in four days of RL training. RL stands for reinforcement learning, and for the new year I will remind folks about some terms that we use often, to be welcoming to new folks on the show. Z.ai, FKA Zhipu AI, became the world's first major LLM company to IPO, raising $558 million. They're the folks who make GLM, and they made GLM 4.7, which is probably the top coding AI in the world of open source right now. That's everything I currently have for open source. Let's skip to the next big category of news here on ThursdAI, which is big companies and their LLMs and APIs. The one thing I would love to talk about on the show is that Grok is in trouble again, because everybody started adding bikinis to everything, which sounds funny when you do it to Maduro getting captured, or when Elon Musk does it to himself. But when people do this without consent to women, or even to pictures of minors, this is really, really bad, especially as a product. This happened, and Grok was in trouble again. We talked about Grok being in trouble before. Meanwhile xAI announces the biggest raise I think they've had so far, Series E: they raised another $20 billion at a $230 billion valuation, which is bonkers. And there's a lot of GPUs coming to Grok. Amazon meanwhile launches Alexa+ on the web, expanding its AI assistant beyond the Echo devices. You can go to alexa.amazon.com and actually use the smart Alexa on the web. I think it's free for Prime members as well, and if you're not a Prime member, you can pay for it if you want to; I don't know why you would. TLDR going back on track: NVIDIA announces the Vera Rubin platform at CES 2026, and generally announced a bunch of other stuff. Vera Rubin: six new chips delivering 5x inference performance over Blackwell. Absolutely bonkers. LDJ, I would love to chat with you about this. Meanwhile, OpenAI was fairly quiet this week. The main thing OpenAI launched was GPT Health, a privacy-first space for personalized health conversations with connected electronic health records and fitness apps, including Apple Fitness. That is a waitlist, so get excited, but not too excited. OpenAI launched a waitlist, and we'll see how fast this waitlist opens up.
While on the health news, this is a new thing: Doctronic launches their first US pilot allowing AI to autonomously prescribe medication without physician oversight. We've been waiting for this. It's a pilot, it's in one state somewhere, but it's coming, folks. It's coming. AI doctors are almost here. And I will just mention that, speaking of AI doctors, a person who worked on, I think, maybe the first AI doctor in existence is also with us. Nisten, welcome to the show. Happy New Year. We're in the middle of a TLDR, but wanted you to say hi to the folks.
While on the health news, this is a new thing. Doc launches their first US pilot allowing AI to autonomously prescribe medication without physician oversight. We've been waiting for this. it's a pilot. It's somewhere in one state somewhere, but, it's coming folks. It's coming. AI doctors are almost here. And, I will just mention that. Speaking of AI doctors, a person who worked on, I think maybe the first AI doctor in in existence is also with us. Niton, welcome to the show. Happy New Year. We're in the middle of ak, but wanted you to say hi to the folks.
Nisten Tahiraj 13:37
Yeah, happy New Year, everybody.
13:39
I have a cold, so I could use that AI doctor right now.
Ryan Carson 13:45
I thought that was just sexiness, Nisten.
Alex Volkov 13:47
I thought this is just your new mic that you
13:48
got yourself for Christmas. No,
Nisten Tahiraj 13:50
I'm just sick.
Alex Volkov 13:51
I was told though, when I'm sick, I sound better on the podcast, man.
13:54
So hopefully get some tea. Stay with us; you have some work to do today, my friend, because of the new GPUs. You and LDJ could potentially go deep, if you have the energy. Alright, going back to the TLDR super quick. So, Doctronic, first AI pilot. And Google, this is new from today, folks. Breaking news. I really wanted to hit that button for the first time this year. AI breaking news, coming at you only on ThursdAI. The breaking news from today in the TLDR segment, folks, is that Google brings Gmail into the Gemini era with AI Overviews, smart replies, and AI inbox search for 3 billion Gmail users. That's right, Gemini 3 is powering a bunch of features inside Gmail for the first time. We all use Gmail; it hasn't updated since forever, and I'm very, very excited to test that out. Probably not live on the show, 'cause I don't want you reading my emails, but we'll definitely tell you about this as much as possible. All right. The next segment of the show, after we discuss all these things, is gonna be vision and video. In video there's two things, and we may discuss just one of them. LTX-2 is finally open source. We told you about LTX-2 back when it was released in October; they announced it was gonna be open source, and it finally is. You can run this. It's the first truly open audio and video generation model, with full training code, for consumer GPUs. You can fine-tune LTX-2 on your stuff, and it's almost Sora level; it can generate people who talk. And I think for open source this is the first and maybe the only video model that also generates audio and does it well. So that's great; we'll mention this if we have time. Then there's Avatar Forcing from KAIST, the Korea Advanced Institute of Science and Technology. It's a framework for real-time interactive talking-head avatars with 500 millisecond latency and a 6.8x speedup over previous versions. Wolfram and I are very excited about those technologies specifically, and this one is real time. They haven't released the code yet, so we haven't been able to test it, but the videos look really, really cool. The thing that I wanted to touch on briefly, if we have time, is NVIDIA launching Nemotron Speech ASR; ASR stands for automatic speech recognition. Nemotron Speech is kind of a fork of Parakeet, or built on top of Parakeet. It's a 600 million parameter open source streaming model with 24 millisecond median latency and up to 900 concurrent streams on an H100. This is sick. The sickest thing about it is that when they presented it, I heard a very familiar voice, a voice that was featured on ThursdAI: the voice of Kwindla Kramer, our friend. Daily is the company he runs, and one of the top people in real-time AI voice agents was collaborating with NVIDIA on this. So it was great to hear a friend on stage behind Jensen announcing this special new technology. Our guest for the interview is right here, Ryan Carson, who blew up with a new coding technique, or a new approach to autonomous coding assistant orchestration, let's call it that, which is called Ralph Wiggum, after Ralph Wiggum in The Simpsons. And Ryan, who works on the Amp team, a great coding tool, is going to be with us talking about Ralph Wiggum.
Ryan Carson 17:04
And I wanna be clear, Geoff Huntley deserves
17:06
all the credit for creating it. I just wrote an article about it and used it.
Alex Volkov 17:10
shout out to Geoff Huntley, whose work is now
17:12
featured, and all everybody can talk about is Ralph. All right folks, this was the TLDR; hope I didn't miss anything. I'm excited about Gmail AI, but I think we can get started with open source. Nisten, you look like you have something to say.
Nisten Tahiraj 17:27
Is my mic still fine? Okay.
17:29
Just, just checking.
Alex Volkov 17:30
yeah, you're coming through.
17:31
Sounds good. Coming through loud and clear. That's awesome. Alright folks, super quick: any big news that I have missed in the TLDR? If not, we will move on. Also asking the audience, folks who are tuning in and saying what they're looking forward to. Somebody said Claude Code 2.1 as well; Milos, thank you for that. I think I saw Claude Code 2.1, and I also saw Claude Code stop working for a second. I will say this super quick, and I know this is maybe not my place, and we're not a political show at all, but what's happening in Iran is heartbreaking from one point of view and inspiring from another. So shout out to the folks in Iran for what they're going through. Very, very brave, and I'm really praying for everybody there to stay safe and hoping for some news. I'm saying this as the internet was shut off in Iran; hoping that everybody there is okay. I basically just had to say this. And let's move on. Open source, folks, open source, our favorite corner of ThursdAI. Let's do it.
18:36
Open source AI, let's get it started. Alrighty, folks: open source AI. I think we need to choose the top two things to talk about. I definitely wanna talk about Solar first, and then we can chat about some other stuff as well. Solar from Upstage is the first release we're gonna mention. Let me just pull up the notes here for you so that we can show you what everything is about. We talked about Upstage a while ago, and this is Solar Open, but Solar, I think, has been on the radar for a while now. Let's get the screen on, and then, yeah. So we have Upstage releasing Solar. It's a 102 billion parameter MoE with big data training and strong benchmarks. It's very interesting to talk about AI models in the open right now, because I remember we used to chase benchmarks a lot, and we were like, okay, this new model beats that previous model. This still happens, but it started to happen less frequently; tell me if you guys agree or not, I would love a discussion here as well. Many open source models release, and I start to notice a trend: new models that get released are specifically better at one thing or two things. Rarely do we see a model in open source that comes and beats everything out there; that's usually now in the realm of big companies. And in that area I would like to mention Solar Open. It's not a state-of-the-art model across everything, definitely not, but there are a few very interesting things about it. So, 102 billion parameters with only 12 billion parameters active per token, 129 experts, 128 routed, one shared, and top-8 expert activation. MoE seems to be the hot thing across open source. The cool thing about this is the amount of training they did: this model was trained on almost 20 trillion tokens, 19.7 trillion, with 4.5 trillion of them synthetic. As a brief reminder, I would love to invite LDJ to briefly give us an overview of why that matters and why the size of the training dataset matters. LDJ, can you offer a brief refresher for the new year on what the difference between tokens and parameters is, and why the size of the thing matters? And maybe Nisten can help us with this. It looks like LDJ is having some technical issues. Nisten, 19.7 trillion tokens is absolutely insane.
Nisten Tahiraj 21:12
I mean, that sounds about right.
21:16
These days, datasets for video or text end up being in the 20 to 40 terabyte range. The way they count the tokens is, let's say for example the vocabulary is 128,000; usually the vocabulary is like 120,000 or 150,000 tokens. And then you just do whatever that is to the power of 16, and then you can actually get the exact amount of data before it is compressed. But I would say these days that is becoming pretty standard. There is something to be said about what is synthetic and what is not. This gets very tricky, because all the data does have a human source at the end of the day. And if you use an LLM, everyone uses an LLM to filter data. But if you just use an LLM to filter the data that's already there, can you say that's synthetic? Or maybe you just changed one or two things. Or what if, for example, you just took the agent data of a very skilled developer or a DevOps person, and then you built that up into an entire agentic dataset? Now technically that is synthetic, but you are also using a lot of human data in there as well. So it is very hard to actually quantify if something is fully synthetic or not; it's all mixed now. The main thing you wanna keep in mind is that you have to have a good distribution of data. If you make it all LLM-generated, you'll just end up with a lot of patterns which hurt the model. You'll instill very, very hard patterns, and then the model just keeps doing a lot of em dashes and saying "absolutely right" the whole time.
Alex Volkov 22:59
I think that's incredible.
23:00
Thank you, Nisten, for the breakdown. LDJ, I think you're back with us; anything you wanna add, feel free. I'll add, specifically about Upstage: the biggest contribution here is their innovative data factory approach and difficulty-aware training curriculum, which has a dynamic curriculum of difficulty from 10% to 64.5% synthetic ratio across phases. So they play around with how much of the data they're training on at any moment is synthetic. And like Nisten said, synthetic data is usually data generated by other LLMs, but it's really hard to quantify whether cleaned data counts as synthetic data as well. The other highlight here is a particularly notable Korean language optimization, addressing a critical gap where global open models underperform in non-English languages. That's also a big, big area where open source is kind of doing some frontier work. And so on the Korean leaderboard this model looks like it beats previous models; the Korean eval is 80% plus on this model. So, Upstage, Korean lab, shout out to them. We'll give you some more transparency about them if we can.
LDJ 24:12
Yeah.
24:12
To just add on a bit to what Nisten was saying: I think ultimately we could just call it synthetic data if it comes out of an LLM at the end at all. 'Cause even just when you're chatting with ChatGPT, that information ultimately came from humans somewhere along the way, from when it was originally trained. So there's always some type of real-world data that eventually got into the system to result in that. And this class of synthetic data techniques really is just so many different ways of how you could use LLMs in combination with real-world information like that. What's interesting too, though, which I feel like hasn't been mentioned, or maybe it was mentioned when my headset went out, is SNAP PO, which is their reinforcement learning technique here. In short, like you said in your little summary that you had up, the image, the Gemini Flash image, I think you mentioned a 50% training speedup. That's going to be interesting, and those types of reinforcement learning frameworks are going to be helpful, because the more efficiently you can learn from information, the more you can improve the performance of the model. If you actually look at the benchmarks on the Hugging Face compared to GLM 4.5 Air, I believe it is, it's significantly above that on a lot of the benchmarks. And I think for this size of model GLM was what people were using, and this seems to maybe be the new best one.
Alex Volkov 25:39
Yep.
25:40
And I think one of the things we get excited about with open source contributions is that when these models release, we often appreciate some of the transparency, right? Sometimes we only get the weights, but in this case we also got a roadmap of how they used the data, and how they used different reasoning traces to train this model better. They released a bunch of that stuff, including a full report. That's great. I wanna shout out Elie Bakouch from Hugging Face, who was a guest on the show; he did a full technical deep dive. They have a technical report, a deep, deep technical report; we'll link it in the show notes if you're interested. But generally, the highlight from these reports and releases is that sometimes these labs release their RL techniques, reinforcement learning techniques, the verifiers. They release a roadmap for how other labs in open source can build, and then we all benefit. So I think that's great. And it's commercially viable as well, so you can use this model, especially if you need better Korean. Alrighty, folks, we're gonna move on. I wanted to chat about Miro, as that came across my desk. Wolfram, I think you also saw MiroThinker as well, so let's bring it to the show super quick. I think it's a newcomer; I don't remember us talking about Miro at all. Wolfram, would you like to chat about Miro?
Wolfram Ravenwolf 27:02
Yeah, so MiroMind AI has released MiroThinker 1.5,
27:06
which is an open source search agent.
Alex Volkov 27:08
Yeah.
Wolfram Ravenwolf 27:08
exactly.
27:09
It has a new "agent density" paradigm, basically, that is more important than the parameter count. So it's a 30B model, but it is achieving frontier performance on agentic search benchmarks like we just saw above, where it got 56.1% on BrowseComp and 66.8% on BrowseComp Chinese. It was outperforming trillion parameter models like Kimi K2, which only scored 60%. And the secret sauce is interactive scaling, which is a third dimension of scaling, where agents form hypotheses, verify evidence via search tools, and detect conflicts to iteratively revise in real time, using a time-sensitive sandbox to prevent hindsight bias. I'm reading from the notes because, yeah, it's cheaper than the bigger models, and it's a new paradigm for search. So that is one of the instances where a model fine-tuned for a specific case can do a lot better than a big generic model, if it's tuned for its particular situation.
Alex Volkov 28:13
Yeah, this is great.
28:14
Thank you, Wolfram, for the coverage. Folks, any comments on MiroThinker? The interactive scaling, for me, highlights the importance of harnesses and agent harnesses in 2026. We definitely know that raw intelligence from the models is very important, evals, et cetera. But Ryan, agentic harnesses seem to be more and more important. This could be the year of agentic harnesses, because we saw that even with the same raw intelligence, you can improve models significantly by doing some very clever tricks. If you have any comments on the harness part, as one of the experts, feel free to join us.
Ryan Carson 28:53
I mean, the models are so good now, that people might think,
28:56
oh, you could just open ChatGPT or open, you know, Claude and chat, and you can't; we're still a long way from each model being specifically useful for a specific task. So, it just happens that coding is a pretty key task in the world, and so the coding harnesses like Amp or Claude Code or Cursor, Devin, all of that stuff, are very important. The coding harnesses are the bleeding edge, right? And then there's a lot of us building businesses, like what I'm building, where I'm building a harness for a different industry. It's not coding, but I'm using everything that I've learned from the coding harnesses. So, watch all these spaces closely.
Alex Volkov 29:33
I think the highlight of harness use and why we would use them
29:36
is that a smaller model with a great harness can beat significantly larger models, and is thus cheaper. And I think we're all looking at how we're gonna employ these models: the smaller ones, the open source ones, the MIT-licensed ones, and I think harnesses are the answer. A few more notable things in MiroMind: MiroVerse, an open source training dataset with 147K samples supporting research agent training, which is great because many of us use these models for research. And this is a fine-tune of Qwen 3 Thinking, which is great. Comments, folks, or we're gonna
Wolfram Ravenwolf 30:09
move on to the next one.
30:10
Speaking of that dimension, I think it's interesting that it can handle up to 402 tool calls per task. So you have the size of the model, the parameter count, you have the context it can handle, you have thinking or non-thinking, and now there's also this dimension of how well it can use the harness. And I think it goes both ways: a model trained for a specific harness will probably do much better in it, and at the same time, the better the harness, the more it can get out of the model. So we have two ways to improve the models and the agent that way.
Nisten Tahiraj 30:42
I just wanted to say it's very hard to make a good harness.
30:46
It seems easy at first, but it's just like making a tool or a drill or something; it has to be basically perfect. You can't have many things to worry about in the tool that you're gonna use the most. It does seem kind of haphazard how these tools are put together as command line interfaces, but to actually make one good is pretty hard. And things will get more interesting as we get to continual learning, and the models start to self-train and self-adapt later. So there's quite a ways to go, but yeah, I just wanted to say it is hard. It's very hard to make a good harness, that's all.
Alex Volkov 31:24
Harnesses we've talked about. Super quick, let's run
31:26
through the rest of the updates. You guys, let me know if there's anything interesting we wanna mention there; we have a lot of show to cover, and some of these deserve a deeper dive. We have other open source LLM announcements here; let me just add them super quick. So, Liquid AI LFM 2.5, Liquid Foundation Models 2.5, a family of five tiny on-device foundation models with text, vision, and audio support, announced at CES. Let's look at their performance super quick: very impressive against other smaller models. This is a 1.2 billion parameter LFM 2.5 compared to Llama 3.2, a model from a long time ago, to Gemma 3, and to Granite 1B; basically all of the 1-billion-ish parameter models, on IFBench and IFEval, instruction following bench and instruction following eval. This small model gets a significant boost. On AIME 2025 this model gets 14%, which for a 1 billion parameter model is very impressive, and 38% on GPQA. So, we know that Liquid does incredible stuff. The most important thing here is, I think, speed. Liquid always releases super cool stuff because of their infrastructure and specific architecture. They get 239 tokens per second on an AMD CPU, and these Liquid foundation models were announced together with Lisa Su and AMD on stage; 82 tokens per second on the Snapdragon Gen 4 neural processing unit. So these models are basically built to run on devices, on toasters, on iPhones; the iPhone 16 Pro Max got a hundred tokens per second. Now again, with these models you will likely be very surprised, if not a little bit disappointed, if you are comparing them to the latest thinking GPT 5.2 Pro Max, right? So do not expect the same level of performance. But for many, many use cases, such as quick translation, summarization (potentially summarization is a great use case for smaller models), autocorrect, different things, smaller models absolutely rule. And to run them on device, it makes no sense to go and burn tokens on a GPU somewhere for a big model, because you can do so much with smaller models. What else can we say about these models? The vision-language one: you would be surprised how well small vision models perform for smaller tasks, for saying, identify this. I'll give you a straight-up example of why small models are needed. The Reachy Mini right here, the little robot that I have, that we presented at the beginning of the show, has limited capacity; it has a Raspberry Pi in there with limited capacity. Bigger models, 30 billion parameter models, cannot run on this. So something like a Liquid Foundation Model 2.5 can potentially run on device. And the vision model means that, connected with the eyes it has, it can detect and run some stuff on a very small footprint, which is very important. And LFM 2.5 is very memory efficient. So actually, now talking about this, I'm getting very excited to try and shove, you know, an LFM model into this little guy and see if I can put a little brain in the thing.
Nisten Tahiraj 34:35
That does make it, for them, like the best sub-2B model
34:40
right now, the best on-device model. Yeah, I was hoping we'd see more specialization for smaller models for on-device stuff, but we're not seeing it quite yet. I think with the new hardware and stuff that comes out, we're gonna see more specialization for the small ones, because this is the one thing where it would make sense to continue pre-training and/or fine-tuning it for a task.
Alex Volkov 35:04
And I think fine-tuning, let me mention super quick, again, for the new year,
35:07
we're gonna explain some of these terms. Fine-tuning is the practice of taking a base model that a company releases and aligning it to your task based on your data. So fine-tuning is something we've been following on ThursdAI for a while. Fine-tuning is also an area that Weights & Biases is great at, 'cause everything that you fine-tune you can track with Weights & Biases Models, so that's great. But besides this, fine-tuning is absolutely crucial for small models. This was the hype of NeurIPS last year, where base open source models were fine-tuned for specific purposes. Sometimes folks who are building harnesses are getting the results of those harnesses fine-tuned back into a model. So we saw Cursor with their Composer model, where they basically took open source models and fine-tuned them for their purposes based on the harness results, on top models as well. LDJ, you have a comment on LFM as well?
LDJ 35:56
Yeah, the LFM 2.5 audio, the 1.5B one.
36:00
So I think that's really impressive, 'cause there's a lot of new unified audio models, I guess you can call them, that are able to take in audio as a native input without needing to first process it through a dedicated audio transcription model, and are also able to output audio themselves without needing to send it to a dedicated text-to-speech model. Mm-hmm. And LFM 2.5 audio is able to do both of those things, for input and output. And if you'd like, I just sent a short video that gives a little demo of how high quality it sounds.
Alex Volkov 36:36
Yes, we should absolutely do this.
36:39
And I believe
LDJ 36:40
Maxime Labonne,
Alex Volkov 36:40
Oh, Maxime is a great friend of the pod.
36:43
He works at Liquid and he posted this; let's see if I can bring it up here, and then lemme zoom in and see. And you guys let me know if the sound comes through on your screen.
LDJ 36:57
Yeah, sure.
36:58
Well, while you're doing that: what's really impressive about this too is that since it's only 1.5 billion parameters, which is what the 1.5B stands for, that means you can run it while having very little RAM on your device. Most people have, you know, eight gigabytes of RAM, and even many phones these days have around eight gigabytes, and it'll be able to run on just that amount.
Alex Volkov 37:23
Let's take a look.
AI 37:30
What is this obsession people have with books?
LDJ 37:32
So I think here he's just using it as a TTS model, but you could also use it as
37:36
like a model to have a conversation with. And so here... oh, he just said, "can you hear me?" Mm-hmm. Or "hear", 'cause maybe he's using a different mic there.
AI 37:48
Two plus three equals five.
Alex Volkov 37:51
Oh, so interleaved mode, I think, is very interesting,
37:53
where it's not only audio. So this model can take text and audio as well, and I think this is interleaved mode support. This is a video from Maxime Labonne, and he's bullish on things. This runs locally on a CPU with llama.cpp: real-time text-to-speech and ASR. These are usually way bigger models than this. Real-time TTS and ASR is kind of both sides of what you need in an agentic voice conversation, right? You have speech-to-text that turns speech into text, then in the middle you have an LLM, and then speech generation, text-to-speech, on the other side. All right folks, this has been LFM. Any other comments? No? We're gonna move on. That looks very good.
Nisten Tahiraj 38:31
These small models are used all the time.
38:33
The speech stuff, the speech recognition, gets used all the time. And Kokoro, everything uses it; these get embedded in websites too. Alright,
Alex Volkov 38:43
So I think, the thing about audio specifically
38:45
is, I have it in my notes: eight times faster audio tokens on mobile CPUs, without the ASR-LLM-TTS pipeline that I mentioned. So this model can do all of it, which is great. Although, again, listening to Kwindla, our Kwindla Kramer of Daily and maintainer of Pipecat, the three-model pipeline setup for agentic voice conversations is still the GOAT, because you wanna replace the middle one with high intelligence. So while this is great, to get the best latency folks still use the three-model setup: ASR, automatic speech recognition, an LLM in the middle, and then TTS, text-to-speech, on the other side. All right, let's move on to the last two things. The super quick thing I wanted to shout out: Z.ai has gone public, and shout out to them, because first of all, I think, folks, maybe you agree with me, it's great to see great open source companies go public and receive a good reception. I think they went public on the Hong Kong Stock Exchange with a ticker of 02513. But yeah, Z.ai raised half a billion dollars at 37 million shares, with great revenue growth as well. Shout out to them. Rapid, under-the-radar execution, continuous delivery is the community reaction
Nisten Tahiraj
Nisten Tahiraj 40:00
there.
40:01
This is not financial advice, but if you are on that exchange, do not confuse it with the other ZAI listing, which is a med-tech company in Hong Kong. That's a completely different one.
Alex Volkov
Alex Volkov 40:14
Z.ai, of course.
40:15
Thanks. Z.ai, of course, is the maker of the GLM series of open source models, which we absolutely love. GLM 4.6 was released two weeks ago, 4.7 is crushing it in open source coding, and you can plug it into Claude Code. Yep. All right, so at the end of the open source section we will shout out our friends from Nous Research, releasing NousCoder-14B, an open source competitive programming model that achieved a 7% jump on LiveCodeBench in just four days of RL training. Let me just shout them out: great friends of the pod, Nous Research, doing incredible stuff. "Tested production C++, perfectly debugged code, found race conditions, memory leaks," said F. Meza. In four days they used 48 NVIDIA B200 GPUs on 24,000 verifiable problems.

In RL, reinforcement learning, "verifiable" is the big thing. You train on something, responses get generated from the model, and then you choose the best response. How do you know which is the best response? That's where the verifier comes in. And, as Ryan mentioned before, on problems that are scoped in domain, like coding, like math, you can confirm: you can run the code and see if it compiles, run the linter and see if it passes, or see if the program executes correctly. That's verifiable. So you can run some sort of verifier to make sure your model is correct, and RL does very, very good fine-tuning for models like this (a toy sketch of such a verifier follows below).

So this is an open Apache 2.0 model with full training code and a benchmark harness. And I'll shout out specifically that these folks also released the Weights & Biases link. If you go to their blog for NousCoder-14B, you can see the model links, the code on GitHub for Atropos, and the W&B link. Open that link and you get this beautiful shout out to Weights & Biases, the only sponsor of the show, by the way: a Weights & Biases report dashboard that they built. Reports are kind of like our blogging system where you can publish live dashboards of everything that happens during or after the run. So you can see these beautiful traces, beautiful plots comparing against Qwen, and a bunch of other stuff. Shout out to Nous for this effort and the great coder that they have, a 14B coder. I think it's time for us to move on to bigger companies and APIs, because there's a lot to talk about there as well.
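Here is that toy verifier: a minimal sketch of a verifiable reward for coding RL. The file layout and test format are illustrative assumptions, not Nous's actual harness.

```python
# Run a candidate program against known test cases; reward 1.0 only if
# every case passes. This binary signal is what makes coding "verifiable".
import subprocess

def verify(source_path: str, cases: list[tuple[str, str]]) -> float:
    for stdin_text, expected in cases:
        try:
            result = subprocess.run(
                ["python", source_path], input=stdin_text,
                capture_output=True, text=True, timeout=5,
            )
        except subprocess.TimeoutExpired:
            return 0.0                       # hung programs get no reward
        if result.returncode != 0 or result.stdout.strip() != expected:
            return 0.0                       # wrong output or crash: no reward
    return 1.0                               # all test cases passed

# reward = verify("candidate.py", [("2 3\n", "5"), ("10 -4\n", "6")])
```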
42:47
So for folks who just joined us: welcome to ThursdAI, January 8th, 2026. Can you guys believe 2026? It's insane. We started in 2023 and now it's 2026. Just imagine
Ryan Carson
Ryan Carson 43:00
what the world's gonna be like at the end of this year.
43:03
I just, ah, can't wait to have that show.
Alex Volkov
Alex Volkov 43:06
I have a robot, and I saw it on stage. We're gonna mention CES right now.
43:10
I saw on stage, at NVIDIA's keynote, examples of Reachy Mini, the robot that I have here. It's just the head, encapsulated in a small body, but the body doesn't move. Supposedly there's a humanoid body that's going to connect to this head at some point. It kind of looks ridiculous, this buff body with a small Reachy face, and it's really funny. And obviously Jensen, CEO of NVIDIA, the biggest company in the world by market cap, maybe one of the more important companies in the world by what they do, also came out. I think Jensen is cheating a little bit: he already had these great keynotes, but now he has these beautiful Disney Star Wars robots on stage with him, and they make everything he says super, super cute. They react to everything he says, and he has interactive robots on stage, obviously. NVIDIA leans very heavily into robotics as well. So I think that's where we start. We can start with NVIDIA and CES and Vera Rubin, which is their new platform. Let's share some screen, LDJ. Go ahead.
LDJ
LDJ 44:11
Yeah.
44:12
So Vera Rubin's very exciting. This is the next generation of their AI-focused GPUs, or at this point "GPU" may be a misnomer; maybe it's more appropriate to call them AI processors.
Alex Volkov
Alex Volkov 44:24
they call this a computer.
44:25
I think they're trying to simplify this down to "we're building computers." This is the next computer, but unlike the computers we have at home or at work, these are huge, huge racks of computers standing together. Sorry, go ahead.
LDJ
LDJ 44:39
so Vera Rubin is the next generation, the most recent generation
44:43
being called the Blackwell generation; before that was Hopper. Some people might have heard news about China getting access to some GPUs that are still in the Hopper generation, and this is two generations ahead of that now. And it's exciting because, from what I've seen, in basically every metric it's a bigger jump over Blackwell than Blackwell was over the previous generation. People say improvements are slowing, that energy-efficiency gains are plateauing, but this almost feels like an acceleration, and it's actually more than I thought it would be. Alright, so this image is one I actually made months ago, and I updated it with the most recent info. You can ignore the Rubin Ultra part at the bottom; those are more like estimated values for Rubin Ultra. But for everything else: here we can see petaflops, which is just how many operations per second it can do, at FP4 and FP8, which are different precisions for those operations. If you go from Hopper to the B200, that was about a 5x. And then, and there are multiple configurations here, sorry, it might be a bit confusing, if we just look at FP8: it's about a double to the B200, and then some small improvements when you go to the B300.
Alex Volkov
Alex Volkov 46:13
Explain what that is. It's the precision of a model, basically,
46:17
and we're talking about the number of calculations this GPU, or computer, can run on models at FP8 precision.
LDJ
LDJ 46:26
In a simplified way, you could think of it as eight bits
46:28
of information, but it's how much accuracy of information is within each operation the computer's doing. Agreed.
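To feel what eight bits of accuracy means in practice, here's a minimal sketch, assuming PyTorch 2.1 or newer (which ships FP8 dtypes); the exact rounded values depend on the FP8 format.

```python
# Casting to FP8 (e4m3: 4 exponent bits, 3 mantissa bits) keeps the rough
# magnitude of each number but rounds away fine detail.
import torch

x = torch.tensor([0.1234, 3.1416, 100.7])
x8 = x.to(torch.float8_e4m3fn)        # 8 bits per value instead of 32
print(x8.to(torch.float32))           # values snap to the nearest FP8 grid
                                      # point, roughly 0.125, 3.25, 104
```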
Alex Volkov
Alex Volkov 46:36
FP is floating point, the 8 is the bits, and PF is petaflops.
LDJ
LDJ 46:40
Yes.
46:41
Correct. And so, for the B200 for example, the GB200 config, it can do about five petaflops. Peta is quadrillion, so about five quadrillion operations per second at this precision. The B300 was more of a bandwidth improvement, I think, or some other things; there was not that much gain there. But with Vera Rubin we have this huge, over-3x gain here. Wow. And it's projected to be even larger for the next version. And look at the power being used: it's not that much more. Some estimates say it's closer to 1,800 watts, but even then it's not much more than Blackwell, while having significantly greater bandwidth as well: 13 terabytes per second, which is very important for all the GPUs communicating with each other during the training process, because... so. It's three times
Nisten Tahiraj
Nisten Tahiraj 47:48
faster while only adding another 200 watts.
LDJ
LDJ 47:51
but like three, a little over three times faster.
Nisten Tahiraj
Nisten Tahiraj 47:54
more than three times.
LDJ
LDJ 47:55
Yeah.
47:56
That's incredible. Really exciting. And keep in mind, the B300 was only announced to be in full production four months ago, and on January 6th at CES, Jensen announced that Vera Rubin is now in full production.
Alex Volkov
Alex Volkov 48:09
Vera Rubin is in full production, including
48:11
companies that switched to Vera Rubin, like Runway, I believe.
LDJ
LDJ 48:15
Yeah.
48:15
There's some debate on what exactly he means by "full production." But yeah, those were the exact words: full production.
Alex Volkov
Alex Volkov 48:21
So let's run through the platform, then,
48:23
the Vera Rubin platform: six chips. Super quick, while we have this, look at this beautiful infographic I have here. Rubin is the GPU: 50 petaflops of AI inference, 5x Blackwell. Vera is the CPU that pairs with the GPU, with 88 custom Olympus Arm cores. NVLink is the interconnect, and bandwidth there is very important, because all the GPUs distributed across the data center have to do a lot of high-bandwidth communication: the models are getting bigger and bigger, each GPU works on a smaller chunk of the training process, and they have to communicate continuously. NVLink is a big NVIDIA plus, and NVLink 6 gives 3.6 terabytes per second of interconnect bandwidth. Then they have the ConnectX-9 SuperNIC ethernet chip, and the BlueField DPU for KV-cache reuse; the KV cache is very important for inference, since a lot of the attention context is stored there while you run inference. And then there's Spectrum-6 ethernet, with power efficiency. This is a whole rack-scale system they're launching; this is the platform. They don't ship just one pizza-box computer, they ship the whole rack: 72 GPUs and 36 CPUs, with a total of 20.7 terabytes of memory, 2.5x the training compute, one hundred percent liquid-cooled and cable-free. It's just absolutely amazing, the kind of computers they're shipping. And we will see the economic impact of this: based on this chart, it's 75% fewer GPUs for a 10-trillion-parameter mixture-of-experts training run, and 10x cheaper inference, meaning 1 million to 10 million tokens per second at the same power. Nisten mentioned the power requirements, folks, and this is a big, big thing: most of the world is looking for where to plug in these GPUs, or where the power is going to come from. So while we see improvements in performance, the most important thing is to keep the power draw at the same level, or even reduce it. The fact that they're doing both is beating Moore's law by a significant amount. LDJ, go ahead.
LDJ
LDJ 50:45
I was going to add to that 75% figure you mentioned.
50:48
So when it comes to a real-world use case, to maybe help people better contextualize this: Jensen mentioned that if you were to take the DeepSeek model and scale it to 10 trillion parameters, in other words scale the number of connections in that neural network to 10 trillion, then, compared to the last generation with the same number of GPUs, it ends up training that on about a hundred trillion tokens, about a hundred trillion words of information, in a fourth of the time. So about four times faster. It seems like there's truly a real-world gain of about 4x, which is pretty big.
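To make that scale concrete, here's a back-of-envelope sketch; every number below is an illustrative assumption, not an NVIDIA figure.

```python
# Rough training-time estimate using the standard FLOPs ~= 6 * N * D rule.
active_params = 1e12            # assume ~1T *active* params in a 10T-total MoE
tokens = 1e14                   # "about a hundred trillion tokens"
flops_needed = 6 * active_params * tokens        # 6e26 FLOPs total

peak_flops = 50e15              # 50 PFLOPS per Rubin GPU at low precision
utilization = 0.4               # assume ~40% real-world utilization
cluster = 10_000                # assumed GPU count

days = flops_needed / (peak_flops * utilization * cluster) / 86_400
print(f"~{days:.0f} days of training under these assumptions")   # ~35 days
```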
Ryan Carson
Ryan Carson 51:28
I just wanna say, as someone that spent a bit of time at
51:30
Intel and had a good time there: what NVIDIA is doing is truly astonishing, just how mind-blowing this stuff is. There's actually a really good MKBHD video that came out recently where he "shrinks himself down" to show how amazing silicon really is. And it is mind-blowing, y'all. What is happening here? The fact that they're still pushing is amazing. The second thing I'll say is I think it's hilarious when people say there's an AI bubble. It's hilarious because Jensen, over and over again, says the need for inference is unbelievable and it's only going to go up exponentially. And we all see this; I can't get enough inference. I wanna spin up 10 Ralphs now; we need more inference. So it's exciting to see what they're doing here. And they bought Groq.
Alex Volkov
Alex Volkov 52:14
yes.
52:15
So, in addition to all this: NVIDIA didn't buy Groq. They entered into a non-exclusive licensing deal, and they're also taking most of the people from Groq (with a q), for, I believe, $20 billion. So NVIDIA is definitely investing on the inference side as well. Groq chips, we've talked about this: Groq's founder and CEO, Jonathan Ross, was instrumental, if not the lead person, in creating the TPU at Google. Google has their own chips; they don't use NVIDIA chips. TPUs are really efficient at scale, and Jonathan Ross was instrumental, if not the top guy, in creating them. Then he founded Groq, and Groq built chips specifically for fast inference. We talked about Groq: they're not that good for training, or at least not better than NVIDIA's chips, but they're really, really fast for inference. And inference is a huge thing: all the harnesses we talk about, all the AI use cases we talk about, all of this is inference. This is why we never thought AI is a bubble; we just don't see where it could pop, given the incredible releases from NVIDIA but also the need for inference in every possible thing in the world.

They also released a bunch of model updates, Nemotron as well. I want to super quickly mention this, because I think it's very important. Besides the Vera Rubin GPU and the designs, NVIDIA also launched Nemotron Speech ASR. NVIDIA released a bunch of Nemotron stuff; we're going to mention this one and then move on. Nemotron Speech ASR is a 600-million-parameter open source streaming model set to redefine real-time voice AI: 24 milliseconds median time-to-final latency, because it's streaming, and it can support up to 900 concurrent streams on a single H100 GPU. 600 million parameters is nothing, folks. We're talking about billions, sometimes trillions, of parameters elsewhere; 600 million, as we love to say, runs on a toaster. Hopefully it runs on a Reachy Mini. Under 500 milliseconds total voice-to-voice latency: Nemotron Speech ASR for the ears, a Nemotron Nano model for the brain, and the Magpie TTS preview for the voice. To show this off, I really want to open a video from a friend of ours. Let me go find the tweet, because I listened to this video and thought: oh, this is a voice that I know. So we're going to play this, and you see a little hint here; a little cat icon should give you a hint of what this is about.
AI
AI 54:44
How can I help?
Kwindla
Kwindla 54:46
I was thinking I'd ask you some math questions.
AI
AI 54:49
Sure thing.
54:50
I'd like to know.
Kwindla
Kwindla 54:52
Hmm.
54:52
What's two plus two?
AI
AI 54:54
Four
Kwindla
Kwindla 54:55
and four plus two
AI
AI 54:57
six.
Kwindla
Kwindla 54:58
That's pretty good.
54:59
You're really fast.
AI
AI 55:01
Thank you.
55:02
I'm glad to help. What would you like to ask next?
Kwindla
Kwindla 55:05
Well, how about we switch it up and you tell me a story.
55:08
I have an idea for a topic.
AI
AI 55:10
Sure thing.
55:11
What should the story be about?
Kwindla
Kwindla 55:13
I'd like to hear a story about a tiny seed that wants
55:16
to grow up to be a big tree.
AI
AI 55:18
Once there was a tiny seed named pip, pip dreamed of becoming
55:22
a tall, strong tree one day.
Alex Volkov
Alex Volkov 55:24
All right, so isn't this fucking mind-blowing, the speed?
55:28
What the heck? Pretty good. I remember there was a company that later became another company, Kyutai, that had Moshi, and Moshi was almost responding before you could think. Moshi was a model with a voice built in. You remember this? It was uncanny: you'd speak to it and it would reply before you'd finished your sentence. Moshi was one model, and the pitch for one model was always: the speed of one model will always beat the three-model pipeline, meaning an ASR model that understands what you said and turns it into text, then an LLM, and then speech generation on the other side. But this demo is three models: Nemotron Speech ASR, a Nemotron Nano LLM, and the Magpie TTS preview as a super quick text-to-speech model, and it has less than 500 milliseconds overall latency. As a reminder, folks: anything under roughly 200 to 250 milliseconds is essentially instant for us humans, the way we perceive the world. And so, okay, this is the big reveal: Kwindla Kramer from Daily and Pipecat is the guy whose demo NVIDIA showed off on stage. Shout out to Kwindla, a friend of the pod and basically the expert in everything voice AI; I was really, really happy to see Pipecat and Daily involved in this. When Kwindla talks to this model and asks "what's two plus two?", it says "four" immediately. What needed to happen for that: his voice was transcribed, the text was sent to a Nemotron Nano model running somewhere on NVIDIA hardware, it created a response, and the response text was turned into speech. So shout out to Nemotron Speech ASR. They did this with a very specific architecture: a cache-aware FastConformer RNN-T architecture with 24 encoder layers. Cache-aware is the big thing here, which is absolutely incredible. So voice agents are going to come everywhere; this is an incredible speech model (a minimal transcription sketch follows below). All right, so this is basically NVIDIA. There's also a bunch of open source models as well. Anything else we want to mention about CES? Go ahead, LDJ.
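If you want to poke at models in this family yourself, NVIDIA's NeMo toolkit is the usual entry point. A minimal offline transcription sketch, assuming NeMo is installed; the checkpoint name below is a placeholder, not the confirmed name of the Nemotron Speech ASR weights.

```python
# Minimal offline transcription with NVIDIA NeMo (pip install "nemo_toolkit[asr]").
import nemo.collections.asr as nemo_asr

asr_model = nemo_asr.models.ASRModel.from_pretrained(
    model_name="nvidia/nemotron-speech-asr"   # hypothetical checkpoint name
)
transcripts = asr_model.transcribe(["question.wav"])  # 16 kHz mono WAV
print(transcripts[0])
```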
LDJ
LDJ 57:34
Yes, for Nvidia, there is something that they have already
57:37
released, that they announced at CES, which is Alpamayo, their reasoning self-driving model. Hmm. I don't know if you heard about this. (No, I haven't. Tell us.) NVIDIA announced the Alpamayo family of open source AI models and tools to accelerate safe, reasoning-based autonomous vehicle development. They even showed a video at CES of it doing a full end-to-end autonomous drive, integrated, I think, in a Mercedes-Benz. Yeah, this open source self-driving model does these reasoning steps, like: "I have identified a jaywalker, I must stop right now." It's really interesting.
Alex Volkov
Alex Volkov 58:19
I don't know if we want reasoning in the model that drives; I want
58:22
decisions to be made fast. But,
LDJ
LDJ 58:24
true.
Alex Volkov
Alex Volkov 58:24
it's an interesting concept.
58:25
Some ethical decision making that I'm not sure I'm ready to hand over. Though I should mention most of my driving happens autonomously in the Tesla, and that's great, so seeing advances elsewhere is definitely worthwhile. They also have a bunch of announcements with robotics: integration with the Hugging Face robot arm, bringing it into the GR00T framework and library. So a huge CES for NVIDIA, for sure. And I think we'll move on in the big companies, because we need to get to Ryan showing us Ralph at some point. We're an hour and fifteen into the show, and there are a few more things to cover; then we'll get to Ryan. I know I keep teasing, and trust me, it's going to be awesome.

One thing I wanted to chat about with all of you for the next five minutes: the Grok and XAI fundraise. We've talked about XAI, and the speed of their GPU and rack buildouts is incredible. XAI raised another $20 billion, with NVIDIA as one of the investors joining the round, alongside StepStone Group, Fidelity Management, Qatar Investment Authority, MGX, and Baron Capital. The strategic investors in the round are NVIDIA and Cisco, who continue to support XAI in rapidly scaling its compute infrastructure buildout. It's very interesting to see NVIDIA investing in companies that then buy GPUs from NVIDIA: the infinite money glitch.

This comes during a very interesting incident, let's say. Grok has an image model in addition to the language model, and both are integrated into the X platform. To that point: when XAI announced their fundraise, they said they have around 600 million active users. They took all of the Twitter slash X users and counted them as Grok users, and I don't know how kosher that technique is. OpenAI reportedly has 900 million active users on ChatGPT, not quite hitting the 1 billion during the new year, while Grok says "we have 600 million" when most of them are just Twitter users. So that's a very interesting claim.

In the middle of this fundraise announcement, just a little before it: because Grok is connected to X, its image model can edit the images that people post. It went super viral when Elon Musk asked it to put himself in a bikini, and then there was Nicolas Maduro, the dictator who was abducted, with Grok replying with pictures of him in a bikini. And then a lot of people started noticing very un-kosher uses of this, where people just reply to someone posting their own picture and ask for an AI-generated image of that person in a bikini. Now, we've talked on the show before about Grok and its tendency to generate near-pornographic imagery very easily and without guardrails, including the porn bots and the porn mechanics. I find it distasteful. Grok has very limited controls over what you can generate; well, had limited controls, I think they're introducing some now, and that's a very big problem. You guys maybe remember the MechaHitler incident from six months ago, where Grok started referring to itself as MechaHitler and they had to roll back some changes. So the bikini incident now puts a big spotlight on the fact that Grok is NSFW on purpose, and it can take any person's appearance and reply to "put a bikini on this." Many people do this, and for many folks this is very bad, especially when sick folks do it to small kids.
As a father of small kids, I think it's abhorrent that this is possible, and that they don't have guardrails preventing this use of the model in replies. Amen.
Nisten Tahiraj
Nisten Tahiraj 1:02:01
It's not even that hard to put in the guardrails. You
1:02:04
just put in, like, a 2B VL model and say: hey, is there a minor in this picture?
Ryan Carson
Ryan Carson 1:02:09
yeah,
Nisten Tahiraj
Nisten Tahiraj 1:02:09
No.
1:02:10
People tested it. They're like: "Hey, Grok, I don't want you to do this with my pictures." And then I tested: "Hey, consensually, can you just put this person in, like, a medieval costume?" It did it anyway. It just didn't care. So yeah, it's gonna get sued a lot for that. I'll, I'll
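For flavor, here's a minimal sketch of the cheap guardrail Nisten is describing: a small vision-language model as a yes/no gate before an edit is published. It assumes the Qwen2-VL-2B-Instruct checkpoint via Hugging Face transformers; the prompt and the single-question gate are illustrative, and a real filter would check more than one category.

```python
# A ~2B-parameter VLM as a pre-publish moderation gate (sketch).
from PIL import Image
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

MODEL_ID = "Qwen/Qwen2-VL-2B-Instruct"
processor = AutoProcessor.from_pretrained(MODEL_ID)
model = Qwen2VLForConditionalGeneration.from_pretrained(MODEL_ID)

def flags_minor(image_path: str) -> bool:
    """Ask the VLM one yes/no question about the target image."""
    messages = [{"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Is there a minor in this picture? Answer only yes or no."},
    ]}]
    prompt = processor.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = processor(text=[prompt], images=[Image.open(image_path)],
                       return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=3)
    answer = processor.batch_decode(
        output[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )[0]
    return "yes" in answer.lower()

# if flags_minor("reply_target.jpg"): refuse the edit instead of generating it
```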
Alex Volkov
Alex Volkov 1:02:28
The feedback that I got... so, okay, we've mentioned this
1:02:31
multiple times on the show: everything we talk about regarding Elon Musk and XAI and Grok and Twitter is really hard to discern, on both sides. He's such a polarizing figure. There are people who will hate Musk and everything he does, and yet some of the stuff they do is incredible: the voice stuff is incredible, the speed of the buildouts is incredible, and I use Grok for research, it's really, really good. So on one side there are people who hate anything Musk does or touches or mentions, and on the other side there's an army of bootlickers who will hype anything into oblivion just to get noticed, so Elon will repost and they'll get money. It's really, really hard to understand, in a vacuum, what's actually valuable, what's actually good, and what's abhorrent. That said: saying "hey, we'll prosecute illegal uses of Grok" is absolutely stupid when there's no moderation built into the product. And the army of bootlickers replied to my post about this (that's not great) by saying "hey, Photoshop could always do this." I just want to address those comments. Yes, you could technically have spent hours upon hours learning tutorials, done this in Photoshop yourself, and posted it on your own platform where nobody follows you and it gets zero views. There's an absolute difference between that and doing it within a second, as the first comment on a celebrity's picture of herself, which then gets blown up by the Twitter algorithm to everyone. One is a tool; the other is a product, basically an amplification product that shows this to many people. So there's a big difference, and guardrails are important on that product.
Wolfram Ravenwolf
Wolfram Ravenwolf 1:04:11
I have something maybe a bit controversial in that case.
1:04:14
Go ahead. So, what has been done, putting minors or women or men, it doesn't matter, in bikinis, or anything that hurts people: that is something that has to be stopped. And that is also somewhere we should go after the people, not just the product. On the other hand: I have uploaded pictures of myself and wanted to put myself in some specific situation, and Google, in Germany, did not allow it, because, yeah, we are the land of "verboten"; a lot of stuff is not allowed here and there are harsh penalties. Later it was possible, and I enjoyed it: I created fantasy images with my kids, for the kids, and showed them, and they loved it, stuff like that. If it were not allowed at all, that would be troublesome. So we have watermarks and we have laws, and if something illegal is being done, I think we should go after the people doing it. It is much more difficult with a powerful image-editing model that also happens to be open source, where you can't even establish such guardrails. Should it be sanitized completely, so you can't use, for example, the put-yourself-in-clothes feature for virtual shopping? If a woman can't upload a picture of herself and have it put her in a bikini when she wants that, then it's a bad product. But when it's being abused, that is where the moderation and the filtering have to happen, and we have to find the people. Like you said: if there's a viral post where somebody posts a horrific image, we know who the person is, and we can go after them, block the account, do anything; people get banned for much less egregious stuff. I think there needs to be responsibility for the people, and we should put the blame on the people who do this, because AI is a tool. We wouldn't want Photoshop to have filters that say "you can't do that" even when you have valid reasons to do so; if we had had this mindset earlier, we would probably have many more filters in our tools. But I think we should stop the people who do the bad stuff, and not go after the tools and make the tools even less capable.
Alex Volkov
Alex Volkov 1:06:20
I think that's a valid opinion.
1:06:22
The only thing I would add is that I like how fal.ai, also friends of the show, approach this. fal hosts and serves a bunch of open source models for video and image generation, and fal is specifically an API product: you reach it programmatically, so you can build whatever you want on top of the models they serve. If you want NSFW stuff as a person building on the platform, AI as a tool, like you said, then you can absolutely do so. But when you go to fal.ai itself, they have a little playground so you can test different models without writing code, and that has an NSFW filter on prompts and on responses, a categorizer on the product itself, because they don't want their platform to be considered a porn platform. And that's absolutely valid, because many of the models they serve can generate nudity, can generate NSFW content. So they make a very distinct choice: if you are on our website, we are not letting you disable the NSFW filter; if you want that, do it via the API, based on their legal terms. This, to me, is the difference between a tool and a product. A tool, absolutely, I'm with you, should be open; people can build whatever they want, it's been possible before, AI can help creatively, and people can build things like try-on products. But when it's a product that many people see, without the control of the person affected, that's where I have a very big issue. And X's Grok is absolutely a product: it's getting shoved into people's faces, there's now a button in the middle of the X app, the middle button is Grok, and under every post it's "hey Grok, do this or that." So that's where I have a problem: there are no guardrails on that product. Alright folks, we're moving on. This was Grok, and trouble at the $20 billion fundraise.

Super quick: Amazon's Alexa Plus is the smarter Alexa experience. I don't know if you guys got it enabled; I got it, and it's meh. The main improvement in Alexa Plus that we waited so long for is, hey, Alexa is now smarter. The cool thing is that if you tell Alexa something, she replies, and you still have a faint blue light on the device, so you can keep talking to her without saying "Alexa" again, and she replies again. That's the benefit. It's integrated with the smart home, which I think is a big boon; me and Wolfram are big smart home aficionados. I signed into my account; it's integrated into the smart home, and this is the demo I want to show you right now. So we're going to go to Alexa, this is my user, and I'll show you a demo of Alexa Plus on the web, because I'm one of the lucky few who have it. You can do stuff like plan and learn and create and shop with Alexa Plus: free-flowing, easy conversations, get it done with Alexa Plus. Alexa can summarize docs, schedule services, shop for products, and more, and you can start a conversation on one device and continue on another, which I think is a cool thing. Alexa now looks like ChatGPT, basically, and you can say "turn on office lights" and ta-da, the office lights are on. So this is me chatting with Alexa.
Alexa is obviously connected to my smart home, and it has lights. I don't have an actual physical device in my office; it's in the kitchen, to set timers, because that's usually all Alexa is good for. But I found this really cool from a web interface: "turn them off." Oh, they turned off again. You don't even have to mention "office lights"; this is natural-language conversation, and it naturally handles that.
LDJ
LDJ 1:09:49
Which model?
Alex Volkov
Alex Volkov 1:09:51
no.
1:09:51
I don't know which model they use.
LDJ
LDJ 1:09:52
At one point they did say that they were partnering with Anthropic and going to
1:09:56
have Anthropic's models be used to help you better interact with Alexa devices. But then Amazon later came out with their own line of frontier, or allegedly frontier, models. So I think that's going to be interesting.
Alex Volkov
Alex Volkov 1:10:09
Yep.
1:10:10
So yeah, they have Nova and a bunch of other stuff. We don't know which model, and they're not exposing this. I just thought the smart home thing is very cool. All right, so that's Alexa Plus on the web. They also announced a bunch of other stuff at CES, different devices, breaking free from just the physical device: you can run Alexa on the web for 20 bucks a month, text chat interface only, voice coming later on.
Ryan Carson
Ryan Carson 1:10:32
I think, speaking of devices, Reachy is just not awake.
1:10:36
You need to wake him up. I'm very sad, Reachy.
Alex Volkov
Alex Volkov 1:10:38
Come on, Reachy.
Ryan Carson
Ryan Carson 1:10:39
Hello guy,
Alex Volkov
Alex Volkov 1:10:41
Yeah.
1:10:41
Reachy goes to sleep at some point. Let me try to get him; Reachy is the robot. Let me wake him up.
Nisten Tahiraj
Nisten Tahiraj 1:10:47
Reachy had a rough night out.
1:10:49
Reachy tied one on.
Alex Volkov
Alex Volkov 1:10:51
I will say, just as we were joking about Reachy: I had
1:10:54
Claude Code connect to Reachy, and Reachy basically has an SDK in Python, so you can program pretty much everything you want. I tried to add a BPM detector so Reachy would just randomly dance when I have music playing in my office. It's harder than it looks, folks. It's harder than it looks. So I'll wait for Reachy to wake up, because Reachy does not want to wake up; I do often need to restart it. Let's move on. Somebody asked if Reachy can be talked to via Alexa; I haven't connected those yet. I probably should.
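The beat-detection half of that idea is the easy part. Here's a minimal sketch, assuming librosa for the audio analysis; the robot call at the end is a hypothetical placeholder, not the real Reachy SDK.

```python
# Estimate tempo and beat times from a recording of the room's music.
import librosa

y, sr = librosa.load("office_music.wav")              # samples + sample rate
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
beat_times = librosa.frames_to_time(beat_frames, sr=sr)

print(f"Estimated tempo: ~{float(tempo):.0f} BPM")
# for t in beat_times:
#     robot.nod(at=t)   # hypothetical call: schedule a dance move per beat
```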
Wolfram Ravenwolf
Wolfram Ravenwolf 1:11:23
Yes.
1:11:23
Have you seen mine already? I modified mine. Wow. So it's my assistant in a Reachy form.
Alex Volkov
Alex Volkov 1:11:28
That's hilarious.
Wolfram Ravenwolf
Wolfram Ravenwolf 1:11:29
It looks like an accident on Sesame Street.
Alex Volkov
Alex Volkov 1:11:32
this is so cool.
1:11:32
We need to get them to talk to each other. All right, folks. Nisten, I would love to talk to you about GPT Health super quick, and the Doctronic stuff, and then we'll get to Ryan and talk about Ralph, because I'm very, very excited about that. The GPT Health launch was very interesting, because many of us turn to ChatGPT to ask about our health issues, and they highlighted this as a feature back at the GPT-5 launch: they brought some folks, cancer survivors, who talked about how GPT-5 absolutely helped them with diagnostics, second opinions, et cetera. But many people say "do not trust AI for medical stuff, because it hallucinates." So: where are we in the world of health, and how does the GPT Health launch affect that?
Nisten Tahiraj
Nisten Tahiraj 1:12:10
They have been fantastic for the last two years.
1:12:14
The main issues have just been people not selecting the right model, not using the best model; they'd sometimes use the mini, and that would hallucinate. But if you just use the best Pro version, basically since GPT-4, it's been very accurate with health stuff. And what most people don't realize: they think of medical data as a lot, but for the model it's actually not a lot at all. There are only about 2,000-something prescription drugs, and only about 2,000 or so total diseases; if you count the rare diseases and such, it goes up to 10,000, and even the total number of internationally classified codes is only about 20,000. That's nothing for an LLM. There's not actually that much data for it to worry about, and it's very easy for an LLM to pick up patterns when you only need about 2,000 or so diseases. This is the most common misconception people have, because they're used to thinking about insurance companies, all this medical data, all this junk; but once you filter it and classify it, it's actually not a lot at all. This is why the models are very, very good at this. A lot of this happened during COVID: they were able to get all of this data together, all the nurses, all the scribes, all of it, because there was a health emergency, without too many people getting in the way and creating a lot of fake work for themselves. That's what actually allowed the models to get very good. And if you ask most doctors, they love the models; they love having them. Their complaints are mainly that sometimes it just takes too long for them to respond or do something.
Ryan Carson
Ryan Carson 1:14:08
Nisten, I have a question for you.
1:14:10
When do you think MyChart slash Epic is going to play ball here? Because that's obviously the big question; everyone's using them.
Nisten Tahiraj
Nisten Tahiraj 1:14:17
Almost every doc just uses dictation.
1:14:20
They've been using GPT-4. Norway rolled out Epic with GPT-4 Vision for doing all the x-rays and stuff. So in Norway, for two years now, when you get an x-ray or a CT scan, the first response you get is actually from a bot, from an LLM.
Ryan Carson
Ryan Carson 1:14:37
I'm more talking about getting their data into
1:14:40
my ChatGPT instance, right? Because right now I have MyChart, I have to log in, it's totally locked away, I can't get the data. It's a total freaking nightmare.
Nisten Tahiraj
Nisten Tahiraj 1:14:48
it's become a lot easier because you can just take
1:14:50
pictures with your phone, and then you just dump them into ChatGPT. This is how people use the models for health advice all the time.
Alex Volkov
Alex Volkov 1:14:58
People use the models for health advice all the time.
1:15:00
And ChatGPT... OpenAI is absolutely encouraging that use. But now, this announcement is a very interesting update on that:
Nisten Tahiraj
Nisten Tahiraj 1:15:09
super supportive,
Alex Volkov
Alex Volkov 1:15:10
OpenAI decides to add a privacy-first space for personalized
1:15:14
health conversations, with connected health records and fitness apps. So I think this is the thing: many people would like to have their records analyzed, instead of just pasting screenshots, and OpenAI is approaching it differently. For example, Apple Health: I'm very much looking forward to an Apple Health connection in the ChatGPT interface, where my sleep patterns and my HRV and heart rate all get synced to ChatGPT to help me. Function Health is great; I believe Function Health is the lab that takes tons of biometric and blood tests and gives you an incredible panel, and there's the integration of Function Health, Peloton, and MyFitnessPal for folks who are dieting specifically. And then electronic health records, which I'm not sure exactly what that covers, but Ryan, is that what you're looking for? Electronic health records that are stored, and that you can then access very securely?
Ryan Carson
Ryan Carson 1:16:05
Yeah, like very simple example.
1:16:07
My wife just had an MRI, and, this is insane, we have to call the hospital and say: can you cut a CD of the images so that we can see them? And we don't even have a CD-ROM. I don't even know how to...
Alex Volkov
Alex Volkov 1:16:18
Wait, Ryan, you don't have a CD-ROM?
1:16:20
I think half of the people listening to us right now, half of the 1,500 or so, don't even know what a CD is, or what "cutting" means. What does cutting a CD mean? I think we're there, and the hospitals are still operating with that technology. A hundred percent. So... Nisten has a CD to show us. Look at that.
Nisten Tahiraj
Nisten Tahiraj 1:16:43
I'm this old, guys.
1:16:45
This? This is what a CD looks like. "You're not old." No, I am this old; I have the original Ubuntu ones. "Oh, wow." Yeah. Wow, folks.
Ryan Carson
Ryan Carson 1:16:54
No.
1:16:54
Installing real software meant using, you know, three-and-a-half-inch floppy disks, swapping disc after disc.
Alex Volkov
Alex Volkov 1:17:00
Go ahead.
LDJ
LDJ 1:17:02
Yeah.
1:17:02
On ChatGPT Health: I think it will be especially huge. I'm looking forward to putting in the blood work and lab results and things like that; they said they'll support uploading those documents. I don't know if it'll be a special upload button or something, but theoretically there's so much data in your blood work and lab results, and such a multitude of diseases or conditions it could diagnose. There are so many different sub-areas of biology and medicine that the average doctor, even some of the top doctors, couldn't keep it all in their mind and really cross-reference all the latest literature regarding all those biomarkers.
Alex Volkov
Alex Volkov 1:17:46
I think specifically Function Health is a great example of this.
1:17:49
They have 160 lab tests. I'm not paid by Function Health; I haven't even used it yet, though I really want to. But the thing I read about Function specifically is that you get 160 lab tests, more than many doctors would even order for you. Function doesn't actually do the analysis for you: they give you the results and biomarkers and say, hey, you're in this range. But if you want a full-scale analysis, a doctor needs to interpret them. And the reason I didn't jump on Function is that many doctors will look at you weird and not want to interpret these, because they're not blood tests that they ordered for you. Usually it works the other way around: the doctor sees you, diagnoses you with a problem, and then orders tests based on that problem. Function works differently: Function just tests everything, and if there's an outlier, Function shows you. So the connection between Function Health and ChatGPT is very interesting, because ChatGPT would be able to tell you: hey, this is possibly going on, you should look at this.

I think we're almost done with the medical stuff; the other item is Doctronic. Nisten, again, we'll tap you as our resident health expert. By the way, folks, for context: Nisten worked with a bunch of doctors and was part of the first AI doctor efforts when GPT-3 was just around; there was a "Dr. Gupta," if I remember correctly. So yeah,
Nisten Tahiraj
Nisten Tahiraj 1:19:05
I also wrote a paper with the University of
1:19:08
Washington and Dr. Johnson Thomas there, on on-device medical AI. So yeah, I pushed for a lot on both the closed source and open source side, and made some of the more popular datasets, because no one else was making them. When you think about how much each government spends on healthcare and education, it's up to two-thirds of their budget. And the sad part is that they focus on all of these doomer scenarios and not on what doctors and nurses are actually saying. Doctors and nurses are saying they're overworked, and governments are just saying a whole bunch of stuff about safety, which isn't even about the work doctors actually do; most of the safety benchmarks just don't work at all. It would be better if governments put a serious part of their healthcare budget toward actually buying GPUs. Mm-hmm. Otherwise they will be dependent on sending all their data to San Francisco; that's just how it is. And again: there's no shortage of work for nurses and doctors. If anything, they're overworked, so they should push for as much automation as they possibly can. Even the open source models are now excellent at medical stuff, as we've tested quite a few of them, so you can run all of this on your own GPUs; it would be a very marginal added cost for any hospital. There's no excuse not to do this now, and I would even make the argument that it is ethically wrong to delay the process and create, again, fake administrative work for themselves, while ER wait times keep increasing and doctors and nurses keep getting more and more overworked.
Alex Volkov
Alex Volkov 1:21:03
In light of that conversation, the topic of Doctronic: this is the
1:21:06
first state-approved medical prescription pilot wherein an AI can review your history. A patient requests a renewal; renewal means "I have been on this medication for a while and I need a new prescription." Usually you have to see a doctor, and the doctor is busy and tired. This runs 24/7 and is significantly cheaper than a doctor; a renewal can be done by a nurse practitioner as well, and correct me if I'm wrong, but it's still basically hard to get one. The AI reviews your history, asks you clinical questions, whether you've had adverse reactions, whatever the safety flags are, and then sends it to a pharmacy: hey, this person can continue on this. They excluded pain management drugs, ADHD medication, and injectables, stuff people can abuse. But generally, if people need their asthma medication renewed, it makes absolute sense for an AI to say: yeah, you can continue this medication. And the first-ever AI malpractice insurance is built into this. 60 million Americans are affected by physician shortages, so it's kind of absurd that to renew the drugs you depend on every week you have to talk to a person. Hopefully this will just expand. Nisten, I absolutely agree with you: it's ethically and morally important to lean into this technology, because of the shortages, and we need doctors to diagnose, not do stupid administrative work like renewals. So we're all for this. Shout out to Doctronic for the Utah program where AI can autonomously renew prescriptions for you: 190 routine medications at just $4 per renewal. This is incredible, and hopefully we'll hear more medical stuff coming through.

We've been teasing long enough, folks. What the heck is Ralph? I basically took a little break over the holidays, even though ThursdAI still went out and you got a great interview with Will Brown from Prime Intellect. I took a break from Twitter; I proposed to my girlfriend, I'm now engaged, and I was like, hey, I don't want to look at Twitter for a few days. And when I looked at Twitter after a few days, all I saw was Ralph Wiggum, going like this with his finger to his nose. Ralph Wiggum, for those who don't follow The Simpsons, which is still on air somehow, is a character known for his very blunt stupidity, but achieving results through repetition, or some such thing. And when I looked up Ralph, I saw Ryan Carson blowing up with an article about what Ralph is, and I thought: who better to talk about Ralph on the show? So the direct question to you is: Ryan Carson, what the heck is Ralph Wiggum?
Ryan Carson
Ryan Carson 1:23:40
What the heck is Ralph Wiggum?
1:23:42
All right, so I'm going to walk everybody through what this thing is and why you should care, and how it's going to help you a lot. I've shared my screen, so if we want to pop it up, I'll walk you through a couple of things. Okay. So, a friend of mine, Geoff Huntley, who used to work at Amp, created this idea of Ralph a while ago; I'm thinking at least six months ago, maybe more. And the idea is very simple: it's basically an agent that runs in a bash loop. Well, what does that mean, and why does that matter? I thought I should try it, because people were talking about it. So I tried it, set it up in Amp, got it working, and at the end I was like, I should probably write an article about this. It's at 1.2 million views now. Wow. So I was like, okay, we hit a nerve here. So what is it? I'm going to walk everybody through, if you could switch to the slightly larger version
Alex Volkov
Alex Volkov 1:24:29
screen.
Ryan Carson
Ryan Carson 1:24:30
Okay.
1:24:30
So what I did is create a simple repo for this. If you're watching, you can see it; it's open, it's free: just go to github.com/snarktank/ralph. This will help you boot it up; just tell your agent to go look at it to help you set up Ralph. I've done that for you. If you're listening: once again, it's github.com/snarktank/ralph.
Alex Volkov
Alex Volkov 1:24:53
Snarktank.
Ryan Carson
Ryan Carson 1:24:53
That's me.
1:24:54
I love that. Okay, so now: how does it work? I've created a little visual for you all. Say you're coding with an agent. You want to build a bunch of stuff, but you're busy: you've got kids, you've got a job. You would love your agent to build stuff for you while you sleep. Well, how do you actually do that? Because normally you need a human in the loop, approving things, looking at code, doing stuff. The truth is, models now, especially Opus 4.5, are basically able to accomplish a lot of what a junior engineer, even a mid-level engineer, could do, with basically no input. They're very, very good now. They're not senior engineers yet, but they're pretty good. So you could ship a pretty deep, interesting feature while you sleep, using Ralph. How? It's pretty simple.

You start off by writing what's called a PRD, a product requirement doc. This sounds hard, but it's not. All you do is open your favorite agent (I use Amp, obviously) and say: I want to build a feature. I use Wispr Flow; I just talk for two or three minutes, blab into the text field, and then say: create a product requirement doc for that. And it spits out a pretty good PRD with user stories. User stories are a simple device engineers have been using for decades: "as a user, I want to do X, Y, Z." So: as an admin, I want to log into the admin, click a button, and have it save something. That's a user story. So you write your PRD; just use your favorite agent to do that. That's not rocket science.

This is where it starts to get interesting. We're going to take that PRD and convert it to a list of user stories as JSON. JSON is just a text file; it sounds fancy, but it's just text in a certain format. What you're seeing here is an example of a user story; it's called "add priority field" in the title. Normally there's some text about what the user story is, and then acceptance criteria. Now, this is the part I want to bang on about: Ralph is amazing, agents are amazing, but they're not going to work unless you take the time to specify what it is you want to build. I usually spend 30 to 60 minutes writing the PRD and converting it to user stories; it's really important. What you're doing is breaking the PRD down into very atomic user stories with clear acceptance criteria. What's an acceptance criterion? All it is, is: how does the agent know whether what it built works? That's it. And as you're building this JSON file (again, you're just chatting with your agent about it), you want to ask over and over: is each user story atomic, and can it be done in one simple thread? Is it very, very clear how we're doing acceptance testing here? And you need to think about whether the agent can actually run the acceptance test. If it involves browser testing, which is very important, you need to use a skill; I use one called dev-browser, which is open source. It fires up Chrome in debug mode; the agent can control it, see it, understand it, and verify that it shipped a front-end feature. Okay. So far we've written a PRD (yay) and converted it into a JSON file, which is just the list of user stories with acceptance criteria. Neat. Again, this isn't rocket science.
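For a sense of the shape, here's a single user story in roughly the form Ryan describes; the field names are illustrative assumptions, and his repo may use different keys.

```python
# One atomic user story with acceptance criteria and a passes flag,
# shown as a Python dict (serialize a list of these to prd.json).
story = {
    "id": "US-1",
    "title": "Add priority field to the todo item",
    "story": "As a user, I want to set a priority on each todo item.",
    "acceptance_criteria": [
        "A priority dropdown (low/medium/high) appears in the edit form",
        "Saving persists the priority and it survives a page reload",
    ],
    "passes": False,   # Ralph flips this to True once the criteria pass
}
```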
I think people have been doing this forever. Now, what do we do next? We run Ralph. So what is Ralph? All Ralph is, is a bash script that loops the agent. What do I mean by that? Well, again, your agent, like Amp, can write a bash script for you. All it is, is a file your computer can run, and it basically says: okay, I'm going to do something on my computer, then I'm going to finish, and then I'm going to loop that. So what is it doing, and why is this different than, like, opening up Cursor or opening up
Alex Volkov
Alex Volkov 1:28:53
mm-hmm.
Ryan Carson
Ryan Carson 1:28:54
And doing it.
Alex Volkov
Alex Volkov 1:28:54
Hey Ryan, what, what is the difference between
1:28:56
running Ralph, and running Cursor or Amp directly, typing into it "okay, continue, do the work"?
Ryan Carson
Ryan Carson 1:29:02
Yeah.
1:29:03
So the difference is that, you are not gonna be involved.
Alex Volkov
Alex Volkov 1:29:06
Yeah, that's great.
Ryan Carson
Ryan Carson 1:29:07
So the whole reason to do this is that you're like, I
1:29:10
want to build this big feature, but I want it to happen while I'm AFK. I don't want to be in the loop here; I want the agent to do all of it. And you normally can't do that, because the agent runs out of context or runs into a wall and you have to get involved. Ralph solves that, and I'll explain why. You've done all the work to write a PRD and create all the user stories. You go to your terminal and start the Ralph script; it's just a bash script. Then your agent (I use Amp, or you could use Claude Code, or Cursor on the CLI, or Gemini on the CLI, any CLI agent) gets told by the bash script to pick the first user story. It says: I'm going to go to this JSON file and look for the first user story that doesn't have "passes" set to true. So it looks for "passes" equals false. Okay?
Alex Volkov
Alex Volkov 1:30:02
Mm-hmm.
Ryan Carson
Ryan Carson 1:30:03
Neat.
1:30:04
All right. Well, then it just starts to write the code. Nothing rocket-sciencey there, but it's happening behind the scenes; you're not watching all of it happen, it's happening in an Amp instance. And then it commits the change. Basically, it goes through its acceptance criteria and says: oh, this works, cool, I'm going to commit it. And obviously you want your agent to commit this stuff, because bad things can happen and you might need to go backwards. So it commits changes using Git, obviously. Then it updates the PRD: okay, I'm going to mark "passes" as true. Yay.

And then this is the most important part. Everybody is talking about compounding engineering and compound learning, and what does that mean? All it means is that you take what you learned and write it down in a text file. In the bash script, it says: if you learned anything during this loop, write it down in the AGENTS.md file. Look at any files you edited, any dead ends you hit, any rabbit holes you went down, and update the right AGENTS.md file so we don't do this again. That's huge. And the other thing is a simple text file called progress.txt. All that says is: here's what we did during this iteration, here's the Amp thread we used in case you need to refer back, and here are a couple of things we learned that we don't need to remember forever, but probably want to remember while building out this feature. So it writes that file, and then, guess what, it loops. It looks back at the JSON file and asks: is there a next user story? If yes, it picks it and loops.

So this is the Ralph loop, and it sounds dumb because it's so simple. Why is everybody talking about this? Because this really hasn't been done outside of hyper-advanced Silicon Valley circles until now, and it's transformative. You're taking a highly capable model, like Opus 4.5, giving it clear atomic user stories, and letting it loop until it's done. Then you finally get to the end: is there another story? No, and it's done. You wake up, and the terminal says "complete." Then you do a bit more user testing yourself: you fire up a browser, test some things, find a couple of bugs that got missed, talk to Amp or Claude Code and say "let's fix it," and then you ship the feature.

In the past, you were an engineer on a team and you'd have a sprint, right? In that sprint you'd have a bunch of user stories on a kanban board, and every day you'd stand up, look at the kanban board, and grab your first user story: "I can work on that, I know about that, I can do it without touching other code." You pull it off, you do it, you finish it, you walk back to the board, mark an X through it, and pick your next user story. This works. And guess what: it also works with a bunch of agents, because they don't need the whole context of the whole repo and all the knowledge ever. It's really exciting. I shipped two features yesterday using this; I shipped three this morning. It's very, very exciting, and it's fun to see
It kind of hit the main, the mainstream.
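To make that concrete: below is a minimal sketch of a Ralph-style loop, assuming a prd.json of user stories with a passes flag, a progress.txt log, and an AGENTS.md lessons file, and using Claude Code's headless claude -p mode as the per-iteration agent. The file names, schema, and prompt wording are illustrative, not the actual Ralph script:

```bash
#!/usr/bin/env bash
# Minimal sketch of a Ralph-style loop (illustrative, not the real script).
# Assumes prd.json holds user stories with a boolean "passes" flag, and that
# the agent maintains progress.txt and AGENTS.md as described above.
set -euo pipefail

while true; do
  # Stop once every user story in the PRD is marked as passing.
  remaining=$(jq '[.stories[] | select(.passes == false)] | length' prd.json)
  if [ "$remaining" -eq 0 ]; then
    echo "complete"
    break
  fi

  # One fresh single-shot session per story: pick the top unfinished story,
  # implement it, check its acceptance criteria, commit with git, flip
  # passes to true, log the iteration, and record lessons for future loops.
  claude -p "Read prd.json and pick the top user story where passes is false.
Implement it and check it against its acceptance criteria.
Commit the change with git, then set passes to true for that story in prd.json.
Append what you did and the thread URL to progress.txt.
If you hit dead ends or learned anything reusable, update AGENTS.md."
done
```

Looping a fresh session per story is what keeps each iteration's context small; the state lives in the files and in git, not in the chat.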
Alex Volkov 1:33:27
Yes.
Ryan Carson 1:33:28
I wanna give, you know, Geoff deserves the credit for thinking this up. And he is kind of brash and bold, and I love that about him. He's been saying this for a while, and I think finally we're all paying attention. So, exciting times.
Alex Volkov 1:33:40
Thank you for this very detailed explanation of what the heck Ralph is. Let me try to recap, to see if I understood this correctly. The difference between me going to an agent, asking for a feature, and babysitting it is that this can run autonomously. And the reason it can run autonomously is that you did the work up front: instead of continuously telling it, hey, now I want this, and now I want that, you specified it all yourself. Instead of being a software engineer, you were a product manager. In your head, together with an LLM, you said, hey, I want this. Then you broke down the "this" into smaller and smaller and smaller achievable units of things that need to be done, maybe with dependencies: maybe this needs to be done before that, et cetera. And then you used the LLM to help you define what it means for this to be complete. And I think the crucial thing in this loop is that it updates the progress itself. So the next time it starts looping, it already has both the lessons from the previous iterations and the status of what's complete: here's what I need to know. And the working context is rebuilt from scratch each time. What we know from AI agents is that the longer you sit in a chat, the stupider the thing gets, even Claude Opus with Claude Code, right? If you're getting to a point where it needs to compact the history, where, yeah, you've iterated on a bunch of features, it becomes stupid at that point. It's already forgetting a bunch of stuff. So if you're not committing continuously, like, "here's what you need to learn" into that AGENTS.md, and you're not updating the status, this will not work. But the autonomous part comes from you: you did the pre-work, you kind of extracted ahead of time the work people usually do mid-stream when they vibe code. This is how software has been built for a while, and it just works for agents in this Ralph thing. Is that pretty much the summary?
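A hypothetical prd.json in the shape that recap implies; the feature and schema here are made up for illustration, and only the idea of small atomic stories with acceptance criteria and a completion flag comes from the conversation:

```json
{
  "feature": "CSV export for the reports page",
  "stories": [
    {
      "id": 1,
      "story": "As a user, I can click an Export button on the reports page",
      "acceptance": ["Button renders on the page", "Clicking it triggers a download"],
      "passes": false
    },
    {
      "id": 2,
      "story": "As a user, the downloaded CSV matches the visible table",
      "acceptance": ["Header row matches the columns", "One row per record"],
      "passes": false
    }
  ]
}
```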
Ryan Carson 1:35:17
You summarized it pretty well; you did it better than me. And I think people think, well, why do you have to specify the user stories? It's like: because that's how you build stuff. Now, this will get abstracted, and there'll be, you know, a CPO agent that does this for you at some point, yes. But that's not good enough right now, and this really, really works. And people complain about short context windows. Y'all, the context window of Opus 4.5 is plenty to ship a decent user story, and if you're trying to stretch it out, you're doing too much, period. I see Nisten nodding his head. Yeah.
Nisten Tahiraj 1:35:52
The dumber you make it, the better the results are. So basically with Ralph, you just say: hey, read the PRD, which you wrote, and you want to keep that somewhat simple. And then you just say: pick the top item from the to-dos and just do that, and then continue until all tests have passed. But what really enabled Ralph to actually exist were mainly the Amp commands and claude -p. It was the inline bash commands, because if you do claude -p and you put a prompt in there, it will just run Claude for one session to do that task. So all you do with the bash script is grab the initial instructions, which are really simple and stupid. Usually they're just four lines. They're like: yeah, just read the PRD, do the top item from the to-do list, make sure to commit everything nicely, and continue working until all tests are done. And then it starts doing this over and over and over again. And if you're feeling adventurous, you can say: hey, just spin up to 50 subagents every time you do this. This was Geoff Huntley's favorite thing. The reason it works so well is because when you do it as a single-line command, it just does one of the tasks and then it stops itself and then starts over again. So it does one of the tasks, does one commit, and then it stops, and it just keeps repeating that 30, 50, a hundred times until everything is done. The biggest mistake people will make with Ralph is making the requirements and the instructions very, very long. You're gonna get bad results with that. It works best the dumber you make it and the more repetitions you have. So basically, Ralph Wiggum: I'm just gonna try and do the same task over and over, and eventually I'm just gonna get it right, and then I'm gonna move on to the next task. It does all of this on its own.
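In script form, what Nisten describes is roughly the following; each claude -p invocation is a fresh single-shot session that does one task, commits, and exits, and the prompt wording and repetition count are assumptions for illustration:

```bash
# Each `claude -p` run is one headless Claude Code session: it does one
# task, makes one commit, and exits, so the context stays small and simple.
# Repeating it 30, 50, 100 times is the whole trick (count is illustrative).
for i in $(seq 1 50); do
  claude -p "Read the PRD, do the top item from the to-do list, commit your work nicely, then stop."
done
```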
Ryan Carson 1:37:49
And I think people think this is gonna be some runaway token thing, and it's not. If you specify a user story that's small and doable, it's not gonna run away with tokens. The second thing I'll say is that during that progress.txt write, it's very important to include your thread URL. Amp does this cool thing where you can actually read previous threads, and I'm sure Claude Code can do this as well. By putting the thread ID in there, future versions of Ralph can go back and actually read previous threads if they're like: how did it do that? Or: what happened there?
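An illustrative progress.txt entry in the spirit Ryan describes; the format and the thread URL are hypothetical, and the point is just that each iteration records what it did plus a link back to its thread:

```bash
# Append an iteration log entry; the format and URL below are hypothetical.
cat >> progress.txt <<'EOF'
## Iteration 7, story 1: add Export button
Thread: https://ampcode.com/threads/T-example-123 (hypothetical)
Did: added ExportButton component and wired it to the reports page.
Learned: the reports table lives in ReportsTable.tsx; reuse its column defs.
EOF
```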
Alex Volkov 1:38:20
What choices it made. Yeah. So I'll say, when Codex came out, the original Codex from OpenAI, I remember Greg Brockman talking about software engineering in a nutshell. What it is, is we are solving problems by taking a big problem, breaking it down into smaller problems, breaking those down into user stories, and writing code that solves a smaller and smaller chunk of the problem. That's basically software engineering in a nutshell, and this seems to be very much aligned with that. And the innovation here in Ralph specifically is the autonomous part, where you break it down so much, and you do the work ahead of time of thinking through what needs to be done and what the final state is, that at the end you can hand it off and it will run without your intervention, better than if you just one-shotted it and said, hey, I want this feature, because then that work happens during the coding. When that work happens outside of coding, like with kanban and agile software development, the more work you put in ahead of time, in the sprint planning and in breaking things down into user stories, the higher quality software you get.
Ryan Carson 1:39:26
Yes. This is how real work happens. We don't ever say the word one-shot. No real work is done one-shot; all work is done through user stories. So I think this is what's so exciting. It seems like the whole vibe coding term is starting to die, which I think is important. I'm not vibe coding.
Alex Volkov 1:39:43
When Karpathy defined this, it was like: you speak to a model, accept whatever it sends, and speak again to whatever it sends back. This is you doing a lot of pre-work from an actual software development cycle, and just handing off the implementation work to agents that you know can succeed. Ryan, this was great. Thank you so much. I'm very, very happy that we had you on the show to talk about Ralph, to break down the article that you posted and how it's implemented with Amp. You were gracious enough to say you don't have to use Amp, but you guys should use Amp because it works very, very well. Amp was used in the original Huntley article as well, right? Oh, yeah. So Amp is definitely,
Nisten Tahiraj 1:40:18
Remember, there was a time when Anthropic was complaining that there was a small subset of users that were just using Opus 4 24/7? Yeah, that's what was happening. And we weren't telling people. But there was a meetup with Geoff Huntley when he just showed up in San Francisco, and there was a meetup for just a bunch of engineers; it was just a Twitter group chat. And we were all doing show and tell of, like, doing orchestration and all this stuff and firing up subagents. And then Geoff just drops the bomb with this and is like, I can fire up 500 subagents. So then we learned about Ralph. And then there was a Y Combinator hackathon with Dex Horthy, Simon (I forgot his last name), and Wilson, no, no, it was another Simon, and also Lyndon Leon. And that was the first time where they just won the hackathon by using Ralph with Sonnet 4.5. They won the hackathon by letting it run overnight. So they went to sleep while everybody else was still coding, and then they had like three or four of the apps done. And I think that night Geoff got very drunk and started crashing out at them on Hacker News, which was pretty funny at the time. But yeah, this was the first time, when Ralph escaped the lab and won the hackathon for them. I would recommend people look at Dex Horthy's stuff from HumanLayer if they like building subagent things, because they're kind of tool-agnostic. You can also run Ralph with a 2B model; that was very funny for me when I posted the meme of the African dude saying "why are you running?" with a little Ralph on the table. But yeah, it's pretty surprising that it caught on now, and I'm just glad the GPU rate limits are a lot higher. I'm glad people are having fun with it.
Alex Volkov 1:42:11
Yeah.
Nisten Tahiraj 1:42:11
Yeah.
Alex Volkov 1:42:13
This very much fits within how harnesses and techniques can take existing tools, existing intelligence, and make them significantly, significantly better, the more you think about them ahead of time. So, Ryan, thank you so much for the breakdown. Nisten, thank you for the backstory. Folks, even if you're not coding, I think this is very important stuff for you to understand where we're going, and to understand why we say there's no AI bubble. Because even with the existing techniques, there's so much still to be discovered. Even if all of the labs stop creating new models, and Jensen stops delivering 7x performance for the same price and power range, even if all of that stops, we have so much to discover and so much more to build. Software is required everywhere. Everywhere you look around, there's something that can be solved with more software, and a bunch of other stuff. The existing world needs more software, and more software can be built even with everything stopping. But nothing is stopping. These models will keep releasing. We're gonna see GPT-6 this year, probably. We're gonna see Opus 5, maybe; I don't even know what I expect from Opus 5 or 4.5. We're gonna see a lot of innovation this year, and the progress is not stopping, while we can also take the existing progress and expect a lot more from it. This is what we mean when we say there's no bubble. So with that said, I think this is a great point to end the show, after two-and-something hours here. The time is passing already.
Wolfram Ravenwolf 1:43:34
Quick: the video model and the avatar forcing, right?
Alex Volkov 1:43:38
Avatar forcing, unfortunately, we're not gonna get to today. The video model is LTX; we kind of mentioned that before. Now, the most important news is that the video model is open source, and it does audio and lip-syncing as well. So LTX is now the top open source model for video generation, and it's fine-tunable as well, which means you can create different things with it. The open source community is very excited about this. But yes, it's time to end the show. With the first show of this year, I think we did a great job. Ryan Carson, a host and a guest; I think for the first time we had a host-and-guest combo. We dove into techniques. I really wanna talk about agentic skills, because you mentioned skills, and I've been using skills for a while, and like with MCP, we would love to bring you the new and updated skills. Also, you don't necessarily have to use code for skills; you can just create skills, you can ask Claude, et cetera. So we'll cover agentic skills, probably in the next episode. With that, thank you guys so much for joining for the first episode of the year. Over 1,700 of you joined and watched us talk about everything from NVIDIA's CES announcements, to the new coding phenomenon that lets agents run asynchronously and build features for you by doing the work ahead of time with Ralph, to OpenAI's GPT Health and the importance of health in AI, and to a bunch of open source as well. Always so much fun to do these live shows. I gotta admit, after two and a half weeks off, it's really, really good to be back with y'all, on January 8th of this year. I believe it's gonna be an incredible year, and we're gonna have so much to cover live. For now, we're gonna end the show here. Thank you so much for tuning in. If you missed any part of the show, you can find us everywhere you get your podcasts. We have a 4.9 rating on Apple; please give us five-star reviews. We all now have microphones, which is amazing. We've missed our friend Yam over here, but generally, everybody's here. Ryan Carson, builder in residence at Amp, an incredible builder and explainer, so please follow him for breakdowns like this. Wolfram Ravenwolf, now an AI evangelist with Weights & Biases like me, focusing on evals; we're gonna bring you a bunch of evaluations on the show. Nisten, AI engineer at Bagel, and also our resident medical AI expertise person. You wanted to mention something super quick?
Nisten Tahiraj 1:45:47
Well, there was a Dr. Ralph that posted already today. It's amazing.
Alex Volkov 1:45:51
LDJ, our resident data scientist and machine learning engineer, the person who can explain how the models are built, and who built a bunch of datasets that are still used by open source models. And obviously we're missing another part of us, which is Yam Peleg as well. All of us are here to discuss this week's news, to bring you this week's news, and to learn ourselves as well; basically, to make sure that nothing in this crazy world of AI updates gets missed. Hopefully we did our job well. If you enjoyed any part of the show, please give us five stars, share it with a friend, subscribe, and we'll see you here next week. Thank you so much, everyone. Bye-bye. We'll see you next week.