ThursdAI · November 27, 2025

🦃 ThursdAI - Thanksgiving special '25 - Claude 4.5, Flux 2 & Z-image vs 🍌, MCP gets Apps + New DeepSeek!?

From Weights & Biases, celebrating AI Thanksgiving with Opus 4.5, Flux 2 and Z-image, an interview with Ido and Liad, who lead the MCP Apps standard, plus INTELLECT-3, Hunyuan OCR and Video, and much more AI news!

By Alex Volkov

81 min


What happened in AI the week of November 27, 2025?

Thanksgiving always lands on a Thursday, and ThursdAI's third annual Thanksgiving special delivered a feast of AI releases to be thankful for. Anthropic released Claude Opus 4.5, which reclaims the coding crown with 80.9% on SWE-bench Verified at a third of the old Opus price. Open source had its own feast: Prime Intellect's INTELLECT-3 (106B MoE), DeepSeek Math V2, Microsoft's Fara-7B, and BFL's FLUX.2 all dropped in one week. Plus, Ido Salomon and Liad Yosef returned to discuss MCP-UI becoming the official 'MCP Apps' standard adopted by both Anthropic and OpenAI, the foundation of what Alex calls 'the agentic web.'

Episode Summary

In This Episode

🔓 Open Source LLMs
⚡ This Week's Buzz — W&B Serverless LoRA
🤖 Interview: MCP Apps & the Agentic Web
🏢 Big CO LLMs — Claude Opus 4.5
🎥 Vision & Video — HunyuanOCR + LTX Retake

Hosts & Guests

Alex Volkov

Host · W&B / CoreWeave

@altryne

Ido Salomon

Monday.com (GitMCP) — AI Lead / Co-creator

@idosal1

Yam Peleg

Weekly co-host of ThursdAI

@Yampeleg

Wolfram Ravenwolf

Weekly co-host, AI model evaluator

@WolframRvnwlf

Nisten Tahiraj

Weekly co-host of ThursdAI

@nisten

By The Numbers

SWE-bench Verified

80.9%

Claude Opus 4.5 — reclaims #1 coding LLM at 1/3 the cost of old Opus

Input tokens (Opus 4.5)

$5/M

Down from $15/M on the previous Opus: one-third of the old input price, a big value upgrade for agentic workflows

INTELLECT-3 params

106B

Prime Intellect's MoE model (12B active), 90% on AIME 2024/2025

HunyuanOCR params

1B

Tencent's tiny OCR model beats 72B models with 860 on OCRBench

WebVoyager (Fara-7B)

73.5%

Microsoft's 7B on-device computer use agent beats OpenAI preview

DeepSeek Math V2 params

685B

Open-weights, Apache 2.0, IMO gold-level math reasoning

🔥 Breaking During The Show

Claude Opus 4.5 — Coding Crown Reclaimed

Anthropic dropped Opus 4.5 this week: 80.9% SWE-bench Verified, new Effort parameter, Tool Search, and Programmatic Tool Calling — at 1/3 the old Opus price.

MCP-UI Standardized as MCP Apps by Anthropic + OpenAI

The MCP-UI open standard is now officially 'MCP Apps,' jointly adopted by Anthropic and OpenAI — agents can now render interactive HTML UIs inside chat.

🔓 Open Source LLMs

A Thanksgiving feast of open-source drops: Prime Intellect's INTELLECT-3 (106B MoE) shows a small lab can train frontier-scale models, DeepSeek surfaces a 685B math model with IMO gold performance, and Microsoft's Fara-7B brings on-device computer use to 7B parameters. Z-Image Turbo from Tongyi makes image generation sub-second, and FLUX.2 from BFL enables multi-reference image editing at 32B scale.

INTELLECT-3: 106B MoE, 90% AIME 2024/2025, fully open-sourced training stack
DeepSeek Math V2: 685B Apache-2.0, IMO gold-level — first open-weights math champion
Fara-7B: Microsoft's 7B on-device computer use agent, 73.5% WebVoyager
Z-Image Turbo: sub-second image generation from Tongyi/Alibaba
FLUX.2: 32B multi-reference image editing from Black Forest Labs

Yam Peleg

"It's an incredibly powerful model, open source — large, expensive, open source, heavily, powerful."

Wolfram Ravenwolf

"This amazing actually with the variables you can use, because I've been doing a lot of image editing and you prompt it."

⚡ This Week's Buzz — W&B Serverless LoRA

Alex previews the brand-new Serverless LoRA Inference launch from Weights & Biases on CoreWeave: upload a LoRA adapter to W&B Artifacts, serve it instantly on top of any base model with no cold starts and no dedicated GPU. Alex demos a 'Mocking SpongeBob' LoRA he trained in 25 minutes.

W&B + CoreWeave: upload LoRA adapters, serve instantly via API
No cold starts, no dedicated GPU instances needed
Demo: SaRcAsTiC SpongeBob LoRA on Qwen 2.5 base
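
For readers who want the intuition behind serving an adapter on top of a shared base model, here is a minimal sketch of the standard LoRA forward pass, y = Wx + (alpha/r)·B(Ax). It is generic LoRA math, not a description of how W&B or CoreWeave implement their service; the function names and toy matrices are made up for illustration.

```python
def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha=1.0):
    """Base output plus the scaled low-rank update: Wx + (alpha/r) * B(Ax).

    W is d_out x d_in (frozen base weight), A is r x d_in, B is d_out x r.
    """
    r = len(A)                        # adapter rank
    base = matvec(W, x)               # shared base-model path
    delta = matvec(B, matvec(A, x))   # adapter-specific path
    return [b + (alpha / r) * d for b, d in zip(base, delta)]


def adapter_params(d_out, d_in, r):
    """Number of weights in one LoRA adapter for a d_out x d_in layer."""
    return r * (d_in + d_out)


# Toy example: 2x2 identity base weight with a rank-1 adapter.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]
B = [[2.0], [0.0]]
y = lora_forward(W, A, B, [1.0, 2.0])

# Size comparison for a hypothetical 4096 x 4096 layer at rank 16.
ratio = (4096 * 4096) // adapter_params(4096, 4096, 16)
print(y, ratio)
```

Because the adapter only adds the small A and B matrices, many adapters can in principle share one copy of the base weights, which is the general idea that makes swapping LoRAs per request cheap.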

Alex Volkov

"Hey folks, welcome to this week's Buzz — the news from this week from Weights & Biases!"

🤖 Interview: MCP Apps & the Agentic Web

Ido Salomon (and Liad Yosef off-camera) return to the show to discuss MCP-UI's transformation into 'MCP Apps' — now an official standard jointly adopted by Anthropic and OpenAI. The pair explain how agents can now render full interactive HTML UIs directly inside chat, ending the era of tool outputs being just plain text.

MCP-UI → MCP Apps: jointly standardized by Anthropic and OpenAI
Agents can now render full interactive HTML UIs in-chat
Avoids 'iOS vs Android' fragmentation: one open standard
mcpui.dev already has demos running with Qwen and Claude
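
To make "tool outputs that are more than plain text" concrete, here is a rough sketch of a tool result that carries an HTML snippet next to a text fallback. The `ui://` URI scheme, the `text/html` MIME type, and the field names are assumptions loosely modeled on MCP-UI's embedded-resource convention; the final MCP Apps schema may differ, and `make_ui_resource` is a hypothetical helper, not part of any SDK.

```python
import json


def make_ui_resource(uri: str, html: str) -> dict:
    """Wrap an HTML snippet as an embedded-resource content block.

    Field names are an assumption for illustration, not the official
    MCP Apps schema.
    """
    if not uri.startswith("ui://"):
        raise ValueError("UI resources are expected to use the ui:// scheme")
    return {
        "type": "resource",
        "resource": {"uri": uri, "mimeType": "text/html", "text": html},
    }


# Hypothetical tool result: a plain-text fallback plus an interactive widget.
tool_result = {
    "content": [
        {"type": "text", "text": "Pick a delivery slot."},
        make_ui_resource(
            "ui://demo/slot-picker",  # made-up URI for illustration
            "<button>10:00</button><button>14:00</button>",
        ),
    ]
}
print(json.dumps(tool_result, indent=2))
```

A host that understands the UI block can render it in a sandboxed frame, while a host that does not can still show the text entry, which is one way a single open standard avoids per-platform forks.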

Ido Salomon

"MCP Apps, which is the standard that was just released in the weekend, it's actually unification of MCP-UI and what OpenAI was calling Operator Plugins."

Alex Volkov

"LLM chatbots stop being just a chat window, and start becoming an operating system for the web."

🏢 Big CO LLMs — Claude Opus 4.5

Anthropic's Opus 4.5 is finally here and it's reclaiming the coding throne: 80.9% SWE-bench Verified, a new 'Effort' parameter for compute control, Tool Search to cut agent overhead, and Programmatic Tool Calling for code-loop data management — all at one-third the old Opus price. Yam and Wolfram both stress-tested it; Yam was blown away by the depth of detail it holds for complex stacks.

Opus 4.5: 80.9% SWE-bench Verified, tops GPT-5.1 (77.9%) and Gemini 3 Pro (76.2%)
New 'Effort' parameter: control thinking depth like o1 reasoning tokens
Tool Search: massively cuts token overhead for agents with many tools
Programmatic Tool Calling: Opus writes and executes code loops
$5/M input, $25/M output — 3x cheaper than old Opus
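
As a quick worked example of what the pricing means per request, the sketch below computes cost from the per-million-token prices quoted above. The token counts are a made-up agentic run, and the old prices are simply the quoted prices times three, as implied by "3x cheaper."

```python
# Prices per million tokens, taken from the episode notes.
OPUS_45_INPUT_PER_M = 5.00
OPUS_45_OUTPUT_PER_M = 25.00


def request_cost(input_tokens: int, output_tokens: int,
                 input_per_m: float = OPUS_45_INPUT_PER_M,
                 output_per_m: float = OPUS_45_OUTPUT_PER_M) -> float:
    """Return the dollar cost of one request at per-million-token prices."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000


# Hypothetical agentic run: 200k tokens read, 20k tokens written.
new_cost = request_cost(200_000, 20_000)

# "3x cheaper" implies old prices of three times the new ones.
old_cost = request_cost(200_000, 20_000, input_per_m=15.00, output_per_m=75.00)

print(f"Opus 4.5: ${new_cost:.2f}, old Opus: ${old_cost:.2f}")
```

For this hypothetical run the cost drops from $4.50 to $1.50, and the saving scales linearly with token volume.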

Yam Peleg

"Opus knows a lot of tiny details about the stack that you didn't even know you wanted. It feels like it can go forever."

Wolfram Ravenwolf

"I chatted with it for a couple of hours actually — it was a monster. Absolutely impressive for reasoning tasks."

🎥 Vision & Video — HunyuanOCR + LTX Retake

Tencent's HunyuanOCR (1B) scores 860 on OCRBench, beating 72B models — a stunning example of task-specialized small models. HunyuanVideo 1.5 brings lightweight open video generation. LTX Studio's Retake enables Photoshop-style editing of specific objects within video frames, and a mysterious 'Whisper Thunder' tops the video arena leaderboard.

HunyuanOCR 1B: 860 OCRBench, beats Qwen3-VL-72B
HunyuanVideo 1.5: lightweight open-source video generation
LTX Retake: video inpainting/object editing — Photoshop for video
Whisper Thunder: mystery model at #1 on video arena

Wolfram Ravenwolf

"What we are seeing is the image editing moment for video. You can take this — Photoshop for video — and change it the way you want."

TL;DR and Show Notes

Hosts and Guests
- Alex Volkov - AI Evangelist at Weights & Biases (@altryne)
- Co-Hosts - @WolframRvnwlf @yampeleg @nisten @ldjconfirmed
- Guests: @idosal1 @liadyosef - MCP-UI/MCP Apps
Big CO LLMs + APIs
- Anthropic launches Claude Opus 4.5 - world’s top model for coding, agents, and tool use (X, Announcement, Blog)
- OpenAI Integrates ChatGPT Voice Mode Directly into Chats (X)
Open Source LLMs
- Prime Intellect - INTELLECT-3 106B MoE (X, HF, Blog, Try It)
- Tencent - HunyuanOCR 1B SOTA OCR model (X, HF, Github, Blog)
- Microsoft - Fara-7B on-device computer-use agent (X, Blog, HF, Github)
- DeepSeek - Math-V2 IMO-gold math LLM (HF)
Interview: MCP Apps
- MCP-UI standardized as MCP Apps by Anthropic and OpenAI (X, Blog, Announcement)
Vision & Video
- Tencent - HunyuanVideo 1.5 lightweight DiT open video model (X, GitHub, HF)
- LTX Studio - Retake AI video editing tool (X, Try It)
- Whisper Thunder - mystery #1 ranked video model on arena
AI Art & Diffusion
- Black Forest Labs - FLUX.2 32B multi-reference image model (X, HF, Blog)
- Tongyi - Z-Image Turbo sub-second 6B image gen (GitHub, HF)
This Week’s Buzz
- W&B launches Serverless LoRA Inference on CoreWeave (X, Blog, Notebook)

Alex Volkov 0:00

everybody to ThursdAI, Thanksgiving special.

0:03

November