ON AIRThursdAI · July 2, 2026Moscone West · San Francisco

Live fromthe floorof the Fair

Two and a half hours from the middle of the Moscone expo — 7,200 engineers, every lab a sponsor, aisles with street names. Fable is back, GPT‑5.6 lands, local.ai launches, and a ThursdAI-record nine guests pull up a chair.

Meet the lineup Run of show

“The whole point of broadcasting from the middle of the expo floor is that you feel like you're sitting at the table with us.”
Alex Volkov

Floor notes

Aisles with actual street namesDaily printed newspaperPuppy cornerFlash mob on the floorToken-billionaire loungeSub-5% talk acceptance⚽ USA beat Bosnia that night — top-five day of all timeFirst AIE: 500 people. This one: 7,200Next stop: AIE TokyoAisles with actual street namesDaily printed newspaperPuppy cornerFlash mob on the floorToken-billionaire loungeSub-5% talk acceptance⚽ USA beat Bosnia that night — top-five day of all timeFirst AIE: 500 people. This one: 7,200Next stop: AIE Tokyo

YouTube Spotify Apple Podcasts Substack

What happened in AI the week of July 2, 2026?

ThursdAI broadcast live for two and a half hours from the middle of the AI Engineer World's Fair expo floor in San Francisco — 7,200 engineers, every major lab a sponsor, aisles with actual street names. The headline: Fable 5 is back after the ban saga (and Sonnet 5 landed 'meh'). Then a ThursdAI-record nine guests, back to back: Exo Labs' Alex Cheema and Sero launching local.ai (with a surprise NVIDIA crash by Nader Khalil), OpenAI's Dominik Kundel on GPT-5.6 (Sol/Terra/Luna) and Codex, W&B's Zubin Aysola on the Aria auto-research agent (This Week's Buzz), Sakana's Stefania Druga on Fugu, Google DeepMind's Philipp Schmid on OmniFlash and NanoBanana 2 Lite, Darya Volkov's on-air debut running her agency on eight agents, and Swyx closing on what AI Engineer has become.

LIVE from AI Engineer World's Fair
Fable is back (and Sonnet 5 is… meh)
Open source: LongCat-2.0 unmasked (Meituan's Owl Alpha)
The Etched ASIC debate
Exo Labs launches local.ai (+ a surprise NVIDIA crash)
GPT-5.6 with Dominik Kundel (OpenAI)

Credential check

Fourteen lanyards, one table.

Broadcasting from the middle of the floor means guests get grabbed hallway-track style — some scheduled, one crashing the set on purpose. Every badge below opens a profile.

Alex Volkov Weights & Biases / CoreWeave Host · AI Evangelist @altryne Host

Wolfram Ravenwolf CoreWeave Co-host · WolfBench @WolframRvnwlf Co-host

Nisten Tahiraj AI builder & operator Co-host · AI operator & builder @nisten Co-host

LDJ Nous Research Co-host @ldjconfirmed Co-host

Peter Gostev Arena Co-host · model evals @petergostev Co-host

Alex Cheema Exo Labs Co-founder · local.ai launch @alexocheema Launch day

Sero Exo Labs / local AI REAP pruning @0xSero Launch day

Nader Khalil NVIDIA Surprise set-crasher · ex-Brev.dev @NaderLikeLadder Set crasher

Dominik Kundel OpenAI GPT-5.6 & GPT-OSS · DevRel @dkundel Guest

Zubin Aysola Weights & Biases / CoreWeave Aria auto-research agent @zubinaysola Guest

Stefania Druga Sakana AI Fugu router model @Stefania_druga Guest

Philipp Schmid Google DeepMind OmniFlash & NanoBanana 2 · DevRel @_philschmid Guest

Darya Volkov Geeks360 Token-billionaire debut · 8-agent marketing agency @geeks360 Token billionaire

Swyx AI Engineer / Latent Space Founder · closed the show @swyx Built the city

Run of show

Eleven segments, zero dead air.

Fable 5 prepped this run sheet — and shuffled the guest order for no reason. Each segment links to the full chapter notes below.

SEG 01
🎪 LIVE from AI Engineer World's Fair
Broadcasting live from the floor so it feels like you're at the table — guests get grabbed hallway-track style.
SEG 02
🏢 Fable is back (and Sonnet 5 is… meh)
Restored globally July 1 after export controls lifted (June 12 pause) — with new cybersecurity classifiers; it prepped the show's entire run sheet.
SEG 03
🔓 Open source: LongCat-2.0 unmasked (Meituan's Owl Alpha)
LongCat-2.0: 1.6T MoE trained entirely on Chinese ASICs (no NVIDIA), 59.5 SWE-bench Pro, $0.038/M tokens with free cache hits.
SEG 04
🔩 The Etched ASIC debate
Etched announced its LLM ASICs — model weights physically on the chip for major speed and power gains.
SEG 05
🧩 Exo Labs launches local.ai (+ a surprise NVIDIA crash)
local.ai tracks best-model-for-your-hardware, cloud trade-offs, and cost vs. API tokens — early access + signup codes now.
SEG 06
🌞 GPT-5.6 with Dominik Kundel (OpenAI)
GPT-5.6 = Sol (frontier), Terra (~5.5 intelligence at half cost), Luna (small & fast) + new Ultra 'Max' reasoning mode.
SEG 07
💛 This Week's Buzz: W&B Aria goes GA
Aria went GA on Monday — an auto-research agent inside the W&B UI ('Just Ask Aria').
SEG 08
🐡 Sakana Fugu with Stefania Druga
Fugu is recursive, not a dumb dispatcher — it rewrites prompts and verifies outputs before picking a model.
SEG 09
✨ Google DeepMind: OmniFlash + NanoBanana 2 Lite
OmniFlash: first of the any-to-any Omni family — conversational multi-turn video editing (editing Elo 1087, $0.10/second, up to 10s) via the Interactions API.
SEG 10
💙 Darya Volkov's token-billionaire debut
Runs eight agents (each with sub-agents) operating her marketing agency, Geeks360 — two more added live on air.
SEG 11
🫶 Swyx closes: what this whole thing is
First AI Engineer: 500 at Hotel Nikko. This one: 7,200, sold out, sub-5% talk acceptance.

Run sheet — and this page — prepped by Fable 5. It's good to be back. Read the transcript →

Episode Summary

In This Episode

🎪 LIVE from AI Engineer World's Fair
🏢 Fable is back (and Sonnet 5 is… meh)
🔓 Open source: LongCat-2.0 unmasked (Meituan's Owl Alpha)
🔩 The Etched ASIC debate
🧩 Exo Labs launches local.ai (+ a surprise NVIDIA crash)
🌞 GPT-5.6 with Dominik Kundel (OpenAI)
💛 This Week's Buzz: W&B Aria goes GA
🐡 Sakana Fugu with Stefania Druga
✨ Google DeepMind: OmniFlash + NanoBanana 2 Lite
💙 Darya Volkov's token-billionaire debut
🫶 Swyx closes: what this whole thing is

Hosts & Guests

Alex Volkov

Host · AI Evangelist

@altryne

Alex Cheema

Co-founder · local.ai launch

@alexocheema

Sero

REAP pruning

@0xSero

Nader Khalil

Surprise set-crasher · ex-Brev.dev

@NaderLikeLadder

Dominik Kundel

GPT-5.6 & GPT-OSS · DevRel

@dkundel

Zubin Aysola

Aria auto-research agent

@zubinaysola

Stefania Druga

Fugu router model

@Stefania_druga

Philipp Schmid

OmniFlash & NanoBanana 2 · DevRel

@_philschmid

Darya Volkov

Token-billionaire debut · 8-agent marketing agency

@geeks360

Swyx

Founder · closed the show

@swyx

Wolfram Ravenwolf

Co-host · WolfBench

@WolframRvnwlf

Nisten Tahiraj

Co-host · AI operator & builder

@nisten

LDJ

Co-host

@ldjconfirmed

Peter Gostev

Co-host · model evals

@petergostev

By The Numbers

engineers at AIE

7,200

AI Engineer World's Fair 2026 sold out Moscone West — up from 500 at the first one.

guests (a record)

The most guests ThursdAI has ever had on a single show, back to back to back.

Nemotron-3 Ultra on 4 Sparks

550B

Sero showed Nemotron-3 Ultra running local on just four NVIDIA Sparks.

per 1K NanoBanana 2 Lite gens

3¢

Google's NanoBanana 2 Lite: sub-4-second generations starting at three cents per 1,000 images, above original NanoBanana quality.

🔥 Breaking During The Show

local.ai launches at AIE

Exo Labs (Alex Cheema, Sero) announced local.ai on the show — best-model-for-your-hardware tracking, cloud trade-offs, and cost vs. API tokens, with early-access codes for signups.

🎪 LIVE from AI Engineer World's Fair

ThursdAI broadcasts for 2.5 hours from the middle of the Moscone expo floor — right next to the OpenAI booth, with a six-person crew. 7,200 engineers, every major lab a sponsor, aisles with actual street names. The vibe versus London ~85 days earlier: all systems go — agents, token factories, software factories, everyone chasing RSI.

Broadcasting live from the floor so it feels like you're at the table — guests get grabbed hallway-track style.
Contrast with AI Engineer London (~85 days prior): American crowd feels the acceleration, less conceptual.
Alex: top-five day of all time — the show, his talk, Darya there, and Team USA beating Bosnia that night.

Alex Volkov

"The whole point of broadcasting from the middle of the expo floor is that you feel like you're sitting at the table with us."

🏢 Fable is back (and Sonnet 5 is… meh)

The biggest story of the week: Fable-5 is back, roughly 82 days after Mythos was announced in London, and less restricted than feared. Fable prepped the entire run of show (and shuffled the guest order for no reason). Meanwhile Sonnet 5 dropped and underwhelmed — LDJ found it less token-efficient than Opus, Wolfram's early WolfBench read put it slightly under Opus 4.6 at higher cost, and Nisten thought it was fine for unimportant stuff.

Restored globally July 1 after export controls lifted (June 12 pause) — with new cybersecurity classifiers; it prepped the show's entire run sheet.
Peter burned ~100 Fable generations before anyone at Arena woke up.
Sonnet 5: 'most agentic Sonnet yet' at intro $2/$10 pricing through Aug 31 — but the new tokenizer can burn up to 35% more tokens; WolfBench's one-run read is under Opus 4.6.

Alex Volkov

"Fable is back — and I celebrated the way any reasonable person would, by having it prep the entire run of show."

🔓 Open source: LongCat-2.0 unmasked (Meituan's Owl Alpha)

The open-source segment had one big reveal: Meituan disclosed LongCat-2.0, a 1.6-trillion-parameter MoE trained entirely on Chinese ASICs — no NVIDIA hardware — hitting 59.5 on SWE-bench Pro at $0.038 per million tokens with free cache hits. It turned out to be the model that had been running anonymously as 'Owl Alpha' (which Wolfram had already been enjoying), and it ranks among OpenRouter's top models by volume. The panel also touched ZAI's new ZCode, a GLM-5.2-based agentic coding environment. The bigger trend: Chinese open-weight models are now ~30% of global usage on major platforms, up from 1.2% eleven months ago.

LongCat-2.0: 1.6T MoE trained entirely on Chinese ASICs (no NVIDIA), 59.5 SWE-bench Pro, $0.038/M tokens with free cache hits.
It was 'Owl Alpha' all along — already a top OpenRouter model by volume before anyone knew whose it was.
Chinese open-weight models are now ~30% of global usage, up from 1.2% eleven months ago; ZAI's ZCode also dropped.

Wolfram Ravenwolf

"Did you try it when it was Owl Alpha? I liked it a lot. It was great."

🔩 The Etched ASIC debate

Etched — the 'weights etched into the silicon' ASIC company — finally announced its LLM chips, and it was the talk of the expo floor. The panel was split between excitement about what weights-on-chip means for speed and power draw, and hard-earned skepticism: Nisten pointed out that Taalas has actually shipped a working product while Etched's famous demo ran on eight NVIDIAs, and until real silicon shows up he isn't buying it.

Etched announced its LLM ASICs — model weights physically on the chip for major speed and power gains.
Nisten's counterpoint: Taalas has shipped working silicon; Etched's earlier demo ran on eight NVIDIAs.
Floor consensus: huge if true, but this crowd wants chips in hands before belief.

Nisten Tahiraj

"Taalas has shipped a working product. Has Etched shipped an actual chip? I don't think so."

🧩 Exo Labs launches local.ai (+ a surprise NVIDIA crash)

Alex Cheema and Sero (0xSero) came on fresh off announcing local.ai — a site that tracks the local-AI frontier: best model for your hardware, the performance trade vs. the cloud, whether it's cheaper than API tokens. Early access is live with codes for everyone who signs up; the Exo CLI ('vLLM for consumer devices, configs figured out for you') follows in weeks. Sero walked through REAP pruning — a GLM 5.2 prune hitting 71% on Terminal Bench 2.1 — and Nemotron-3 Ultra (550B) running on four Sparks. Then Nader Khalil from NVIDIA crashed the set to pull together an impromptu Local AI Summit.

local.ai tracks best-model-for-your-hardware, cloud trade-offs, and cost vs. API tokens — early access + signup codes now.
Exo CLI = 'vLLM for consumer devices, with the configs figured out for you' — shipping in a few weeks.
Sero's REAP pruning: a GLM 5.2 prune hits 71% on Terminal Bench 2.1; Nemotron-3 Ultra (550B) runs on four Sparks.
Nader Khalil (NVIDIA, ex-Brev.dev) crashed the set to organize a mid-conference Local AI Summit.

Alex Volkov

"We talk about why open weights matter every week; this crew is doing something about it. Freedom of intelligence, folks."

🌞 GPT-5.6 with Dominik Kundel (OpenAI)

Smoothest transition ever — local AI to OpenAI via the person behind GPT-OSS. Dominik broke down GPT-5.6 as three models: Sol (frontier), Terra (~5.5-level intelligence at half the cost), and Luna (small & fast), plus a new Ultra mode with a Max reasoning level and heavier sub-agent use. Headline: 5.6 Sol is coming to Cerebras at absurd speed — the same weights as the API model, not a distill. Also: the Codex app is five months old, 100% of OpenAI engineers use it, and a human still reviews every PR. The token-bank feature came from community feedback, and yes — there's a literal physical reset button behind the booth.

GPT-5.6 = Sol (frontier), Terra (~5.5 intelligence at half cost), Luna (small & fast) + new Ultra 'Max' reasoning mode.
5.6 Sol coming to Cerebras at extreme speed — same weights as the API model, not a distill or 'Spark situation'.
Codex app is 5 months old, 100% of OpenAI engineers use it — and a human still reviews every PR that lands.

Dominik Kundel

"You can't do the retro and say Codex did it, or God did it."

💛 This Week's Buzz: W&B Aria goes GA

The sponsor corner — Weights & Biases from CoreWeave — and this week it was a real launch. Zubin Aysola came by with Aria, the auto-research agent that went GA on Monday. It lives in the W&B UI ('Just Ask Aria', top-right), reads your traces, and debugs your loss curves. In Zubin's talk, Aria read its own production traces and updated its own prompts — RSI shipping on shelves.

Aria went GA on Monday — an auto-research agent inside the W&B UI ('Just Ask Aria').
Reads your traces and debugs your loss curves in-product.
On stage, Aria read its own production traces and rewrote its own prompts.

Alex Volkov

"The RSI dream, shipping on shelves. Proud of this one."

🐡 Sakana Fugu with Stefania Druga

We covered Fugu last week without realizing we had a friend inside the lab — so we fixed that. Stefania Druga went deep on the two ICLR papers behind it (Trinity + the conductor), why it's recursive rather than a dumb dispatcher — it rewrites prompts and verifies outputs before picking a model — and announced on the pod that Fugu now works in Codex and OpenCode. Plus routing between numerical models and fuzzy reasoning for typhoon prediction, a SHEEFs teaser, and a riff on Socratic AI for kids: answer machines make lazy kids; question machines make curious ones.

Fugu is recursive, not a dumb dispatcher — it rewrites prompts and verifies outputs before picking a model.
Announced on air: Fugu now works in Codex and OpenCode.
Socratic AI for kids: answer machines make lazy kids; question machines make curious ones.

Stefania Druga

"Answer machines make lazy kids; question machines make curious ones."

✨ Google DeepMind: OmniFlash + NanoBanana 2 Lite

A first for the show — Alex took his first-ever mid-stream bio break and Wolfram ran the interview solo. Philipp Schmid covered OmniFlash (the first of the Omni any-to-any family: 10-second video generation with precise conversational editing — 'make it daytime' and it redoes light, sky, and shadows) and NanoBanana 2 Lite (under 4 seconds per generation, starting at three cents per 1,000 images, quality above the original NanoBanana). The Interactions API also hit GA. Google is shipping.

OmniFlash: first of the any-to-any Omni family — conversational multi-turn video editing (editing Elo 1087, $0.10/second, up to 10s) via the Interactions API.
NanoBanana 2 Lite: sub-4-second generations starting at ~3¢ per 1,000 images, above original NanoBanana quality.
Interactions API hit GA; Wolfram ran the whole interview solo.

Alex Volkov

"Three and a half years of live streams, and I took my first-ever mid-show break during this segment. That's how much I trust Wolfram."

💙 Darya Volkov's token-billionaire debut

After years of Alex mentioning her — girlfriend, then fiancée, then wife — listeners finally met Darya Volkov. She came to AI Engineer in her own right, walking the floor with the media crew, and earned her own token-billionaire badge: she runs eight agents (each with sub-agents; she installed two more that Alex found out about live on air) that operate her actual marketing agency, Geeks360 — client platforms, billing systems, built practically overnight. Her wishlist: agents that learn progressively so you can grow trust, and one unified brain instead of a new model to chase every week.

Runs eight agents (each with sub-agents) operating her marketing agency, Geeks360 — two more added live on air.
Earned her own token-billionaire badge on the floor.
Wishlist: progressively-learning agents you can grow trust in, and one unified brain instead of chasing a new model weekly.

Alex Volkov

"She runs eight agents that operate her actual marketing agency — client platforms, billing systems, built practically overnight."

🫶 Swyx closes: what this whole thing is

We closed with the man who built the city. Some wild numbers: the first AI Engineer was 500 people at Hotel Nikko; this one was 7,200, sold out, sub-5% talk acceptance, a daily printed newspaper, a puppy corner, a flash mob, and a token-billionaire lounge. A month out only 3,000 tickets were sold. Swyx calls the conference 'the highest loop — the one that creates all the other loops,' and the expansion is real, with AIE Tokyo next. On the record: ThursdAI got its official start because Swyx was the first person to believe in Alex.

First AI Engineer: 500 at Hotel Nikko. This one: 7,200, sold out, sub-5% talk acceptance.
Daily printed newspaper, puppy corner, flash mob, token-billionaire lounge — and AIE Tokyo is next.
Swyx was the first person to believe in Alex — 'the highest loop, the one that creates all the other loops.'

Swyx

"It's the highest loop — the one that creates all the other loops."

ThursdAI broadcast 2.5 hours live from the AI Engineer World's Fair expo floor — 7,200 engineers, every lab a sponsor, and a ThursdAI-record nine guests.

🏢 Big CO LLMs + APIs

Fable 5 is back — Anthropic restored the model globally on July 1 after US export controls were lifted, with new cybersecurity classifiers as safeguards. The June 12 pause had affected both Fable 5 and Mythos 5; access resumed without ID verification, though new content filters may temporarily block some routine coding tasks.
Claude Sonnet 5 — 'our most agentic Sonnet yet', near-Opus 4.8 performance at introductory $2/$10 pricing through August 31. Reception split: power users saw near-Opus costs for slightly inferior output at high effort, casual users liked the value. The new tokenizer may consume up to 35% more tokens.
Steganographic fingerprinting disclosure — Anthropic acknowledged a March 2026 experiment embedding hidden signals in Claude Code's system prompt targeting Chinese proxy users. Dormant for custom-endpoint users only, no separate exfiltration channel — but the obfuscated approach raised trust questions about agent access.

🔓 Open Source LLMs

LongCat-2.0 revealed — Meituan's 1.6T-parameter MoE, trained entirely on Chinese ASICs without NVIDIA hardware: 59.5 SWE-bench Pro at $0.038/M tokens with free cache hits. It had been running anonymously as 'Owl Alpha' and ranks among OpenRouter's top models by volume.
Chinese open-weight models are now ~30% of global usage on major platforms, up from 1.2% eleven months ago.
local.ai launched live on the show — Exo Labs' tracker for the local-AI frontier: best model for your hardware, cloud trade-offs, and cost vs. API tokens. Early access codes for signups; Exo CLI ships in weeks.

💛 This Week's Buzz

W&B Aria went GA — the auto-research agent in the W&B UI that reads your traces and debugs your loss curves. In Zubin's AIE talk it read its own production traces and updated its own prompts.
Alex's AIE talk: 'Should AI Engineers still read code in 2026?' — decomposition, verification, and engineering principles.

🤖 AI Coding & Agents

ZCode — Z.ai's agentic coding environment on GLM-5.2: 1M-token context, a /goal verification protocol with independent success checkers, 173 tok/s output and 1.4s time-to-first-token.
Base 1 — Base44 (Wix, $150M ARR) launched a proprietary LLM trained on tens of millions of real app-building interactions — the first vibe-coding platform with an internal model, already auto-routing tasks.
GPT-5.6 — Sol (frontier), Terra (~5.5-level at half cost), Luna (small & fast), plus an Ultra mode with Max reasoning. 5.6 Sol is coming to Cerebras at extreme speed — same weights, not a distill.

🎵🎬 Voice & Vision

NanoBanana 2 Lite — image generations in under 4 seconds starting at ~3¢ per 1,000 images, above original NanoBanana quality.
Gemini OmniFlash — conversational multi-turn video editing via the Interactions API (now GA): editing Elo 1087, $0.10/second for videos up to 10 seconds.

Alex Volkov 0:00

Good morning. Good morning, and welcome to Thursd AI, folks. Today is July 2nd. This is Alex Volkov and Wolf from RavenWolf coming to you live from the AI Engineer World's Fair in San Francisco for the year 2026. Uh, if you're a follower of the podcast, you know that we have been covering Thur- uh, like, AI engineer for, I think, three years. Uh, we started-

Nisten 0:26

From the beginning.

Alex Volkov 0:26

We've b- we were there from the beginning, from the very first one. I recorded it from my hotel room. Uh, we're a little bit, we're a little bit in a different spot right now.

Nisten 0:34

Little upgrade from

Alex Volkov 0:35

the first. L- lit- little upgrade. Uh, there's, uh, one, two, three, four, five, six crew members over there, uh, helping us out. Uh, incredible folks. Guys, thank you so much for this. Uh, set us up in just a very quick second. Uh, if you're tuning in, uh, we would love to know where you're tuning in from. Hopefully... Uh, well, it's early enough, so you can, even if you're attending the show, you can tune in with us. But if you're tuning in from somewhere else, please send a comment. Let us know where you're tuning in from. We have an incredible show for you, incredibly packed. We have I think six g- uh, we're, we're gonna try to beat last time, Wolfram

Wolfram Ravenwolf 1:13

Mm-hmm.

Alex Volkov 1:13

Yeah.

Wolfram Ravenwolf 1:14

Uh- This one is bigger than London, so we have to have more guests.

Alex Volkov 1:17

Yes. I think, uh, let's, let's, let's make sure that folks can hear us well. Yeah, you closer mic. Um, and we have, uh, a little bit of time to get started with our friends on here. So let me add Peter to the show. Peter, what's up? Can you hear us well?

Peter Gostev 1:37

Yeah, yeah. You're looking good, guys. Uh, it's, uh- Thank you, man ... yeah. Um- We're s- You know, c- I can't be there with you, but I think, you know, the way I'd like to think about it is, you know how with the president and the vice president, they can't fly them together in the same plane?

Alex Volkov 1:55

Oh,

Peter Gostev 1:55

th- this office

Alex Volkov 1:56

is- So, you

Peter Gostev 1:57

know, there's... Yeah, so I'm creating contingency. If you guys have a big earthquake or something, you know, I'll still be here.

Alex Volkov 2:03

Yeah, there is a bus back to the one. The, the thing is, uh, last live stream, my laptop shut down, and this was, like, very great that, like, folks, uh, held it together. Um, so we will talk about AI Engineer a lot. AI Engineer is the biggest conference for folks like us. Everyone's here. Folks, I, I wish I could, like, take the camera. Maybe, maybe I can actually... You have the camera, right?

Nisten 2:25

Yeah.

Alex Volkov 2:25

You're tuning in to the live stream? I turn it on. Yeah.

Nisten 2:27

Yes.

Alex Volkov 2:28

Uh, enable your camera, please. Okay. And then we will do, we will do this, and we will do this, and, uh, you can show very slowly if you- Yeah, the great thing ... if you w- very slowly. Very slowly. Okay. And like that. Horizontal. Yes. And it should switch for us. Yeah, there we go. Okay, folks. Oh, somebody's tuning in from Miami Beach. Uh, everyone's here. You should, you should realize the size of this expo hall. We're seeing Microsoft. This is the OpenAI booth that we recorded a video together with Romain Ret from OpenAI, and, um, we have just, like, an incredible team in front. Can you show the guys that are helping us do this, Wolfram?

Wolfram Ravenwolf 3:13

Oh.

Alex Volkov 3:13

Hey, guys, can you wave?

Wolfram Ravenwolf 3:15

Hello.

Alex Volkov 3:15

Thank you so much for, for, for doing this. Uh, we have folks tuning in from Canda- Canada, everywhere. Uh, maybe at some point I'll send Wolfram doing this to the booth. Uh, we'll see. So, uh, all right, I think we're, we're, we're good there. I'm gonna take you off. Uh, I wanted to say that, again, uh, we've been covering AI Engineer for, for the longest time. Not only is, like, we're getting super excited about the scale of this, AI Engineer this time is 7,000 people. We'll talk about AI Engineer, folks. The thing that we must talk about the most Fable is back.

Wolfram Ravenwolf 3:54

Yes.

Alex Volkov 3:55

Fable is back.

Wolfram Ravenwolf 3:56

And it's not as restricted even-

Alex Volkov 3:57

Yeah ... as we had feared. Uh, so Peter, I would love to hear from you, uh, how did you celebrate, uh, Fable's back date?

Peter Gostev 4:06

By burning all of my credits and, uh- Are you done? Are you, are

Alex Volkov 4:10

you maxing out on your credits?

Peter Gostev 4:12

You know, well, actually, what I was doing is, uh, 'cause through Arena, like, I can test models. So before they find out and, uh, see how much money I spent, uh, y- uh, before everyone wakes up, I just did about, like, 100 3G generations. So I had so much fun doing this. This is honestly... I know there's, like, not everything is good about it. There's, like, cost, rate limits, blah, blah, blah. But it's just so cool to just be able to go do stuff, and it works so well, and it's just-- it's so nice. So yeah, it's a, it's a good day.

Alex Volkov 4:50

Fable is back. It was an incredible day. Uh, I don't know, uh, mo- most of our listeners probably follow me on X. Uh, X may not show you this, but I had one of the most memorable days of my life yesterday. Specifically, Fable was a very small part of it, but otherwise, I'll tell you about this. This is the shirt I'm wearing and everything. Um, and Wolfram was showing me this silently. Wolfram, you wanna show yours? Uh, folks, if you, if you tune in, uh, this-- You can see the gold card? Okay. AI Engineering printed gold cards for token billionaires, okay? Uh, we'll have Swyx on here, and he'll tell you why and how. But basically, uh, if you remember, we talked from London. There was two talks there, which was b- basically also the thesis for my, um, my, my talk on stage here. Ryan Lopopolo from OpenAI came up with the concept token billionaire, and he went on stage and said, "Hey, I'm a token billionaire. I want one of you to be also." So they turned this into a whole concept with, with, uh, Cerebras and, um, the other sponsor, I don't remember. They have a lounge in here that if you are a token billionaire, and Peter, I think that you are, uh, and you can prove to them that you're spending at least a billion tokens a week, then you get access to this, like, Amex Gold line. There's a masseuse, there is some champagne, there is food for you. It's very chill, and everyone who's in there is basically a person who uses AI to the max, okay? So obviously, we're here. Uh, this is-- uh, Peter, this is for you as well. Um, and I wish you guys were here to get this. Um, Fable is back. Also Sonnet Meh. W- w- LDJ, uh, say hi to the folks as well and, um, have you seen Sonnet 5 release and what's your thoughts on this?

LDJ 6:37

Yeah. Hello, everybody. Uh, I have seen Sonnet 5's release, and it is, um... it's, it seems that the overall sentiment right now, including myself, is a bit underwhelming. It seems, like, not as token efficient as Opus in a lot of things, and for a lot of things it's literally just cheaper for Opus to do it because just the cost per task overall ends up lower many times with Opus.

Alex Volkov 6:59

Yep. Yep, yep. Uh, I, I asked a bunch of people, including Fable with research, uh, and, and the, the vibes are not great, folks. The vibes are not great. Uh, Anthropic basically released something un- unclear. Wolfram, you tested this.

Wolfram Ravenwolf 7:15

I tested it. It was about the performance of, uh, Opus 4.6. Now, this would have been excited before we had Fable, but we expect a lot more now and, um, yeah, that is why, from my perspective from benchmarking, it wasn't that amazing that I would be super excited about it.

Alex Volkov 7:31

No. The thing is, and I think we have this in the notes, there's a new tokenizer, and it's, like, uses 35% more tokens. Nisten, welcome to the show, by the way. How are you doing, sir? Did you see the tokenizer stuff? Like, they use- Uh- They use more tokens for Sonnet, so it, like, it costs almost as Opus. Like, what's the point of that?

Nisten 7:48

I think that happened with 4.7 as well.

Alex Volkov 7:51

Yeah.

Nisten 7:51

Uh, I, I don't know. I actually found it all right. I thought it was pretty- Really? ... pretty okay. I ran the usual visual tests and I just kept it on medium, on default, and it was okay. It's fine. I think I might even default to it for most of the stuff that's not that important. Uh, yeah. I, I know for benchmarks and long-running tasks, yeah, it can get stuck in loops and stuff, but, uh, yeah

Wolfram Ravenwolf 8:20

So I did look at the cost as well, and, um, the performance is a little less than Opus 4.6, but- You wanna show?

Alex Volkov 8:26

We have- ... cost was much higher ... we

Wolfram Ravenwolf 8:27

have time.

Alex Volkov 8:27

If you wanna present and you're still logged in from there, or send me- Uh,

Wolfram Ravenwolf 8:30

I wasn't linked

Alex Volkov 8:31

send me the link in, in Moon- Okay ... in Slack. Uh, we will show you guys Wolfbench results, um, for this thing. By the way, uh, can I give a shout-out to Wolfram, uh, real quick? Folks, uh, this was your first time doing a workshop. Yeah. Like, in the big one-

Wolfram Ravenwolf 8:47

Big one ...

Alex Volkov 8:47

big place. Big workshop, not just a talk I think, like, 250, 300 people went and sat in the room to check out how to build a evaluation benchmark. So dude- Yeah ... you, you killed it. Thank you. And I'm so, so proud that we're representing inside the engineer from, like, all, all points.

Wolfram Ravenwolf 9:05

And- It's a great thing. It's the talks afterwards when the people come to you and have ideas. And someone used the traces I uploaded to Weave to analyze them on, in his own way.

Alex Volkov 9:14

Yeah.

Wolfram Ravenwolf 9:15

And it's amazing. I got some new ideas, uh, how to improve the whole thing. The time dimension is at a new, suggested by a guest.

Alex Volkov 9:22

Yeah.

Wolfram Ravenwolf 9:23

So great. Uh, it's on wolfbench.ai.

Alex Volkov 9:24

Wolfbench.ai. Just

Wolfram Ravenwolf 9:26

updated the page.

Alex Volkov 9:27

Yes, of course. Let's, let's add Wolfram, wolfbench.ai. Uh, and we'll talk about Sonnet f- uh, actually, we, we should talk about Fable meanwhile, but, like, I'll pull up the Sonnet stuff. Uh, we will go here. Yes. And then we'll go to wolfbench.ai. Uh, and then you guys can tell me if you're seeing Wolfbench.

Wolfram Ravenwolf 9:53

Uh, let's filter. Let's filter to Terminus 2 and, uh, use Opus 4.6 compared to, uh,

Alex Volkov 9:58

Sonnet

Wolfram Ravenwolf 9:59

Alex Volkov 9:59

4.6 and Sonnet 5 and Sonnet 4 just for good measure.

Wolfram Ravenwolf 10:05

Mm-hmm. Okay. And then turn on the cost over there. Click it twice. Once more. Okay, now we see that. All right.

Alex Volkov 10:13

So Wolfram, walk us through this. What are we seeing?

Wolfram Ravenwolf 10:16

Yeah, so what I, I'm showing here is the token consumption, that is a, a colored bar, and the gray shadow, that is the cost and it's also on the bar, so you can see what it is.

Alex Volkov 10:26

Mm-hmm.

Wolfram Ravenwolf 10:26

Maybe zoom in a little bit. But what we see, that token consumption has not been much more, but the price compared to Opus, what it's been doing over there- Yeah ... um, it's, it's, it's more expensive.

Alex Volkov 10:39

Read out the, the prices. So we see Sonnet 5 cost us to run 870... Am I, uh, uh that's okay. Uh, we have budget for Wolfbench, and the reason why we have budget, because folks, uh, Wolfbench, in addition to just being our way of telling you how models perform, uh, wolfbench.ai is also an incredible way for us to show these model makers how their new models perform in real time. So we met Anthropic folks, we met OpenAI folks. Wolfram is pushing Wolfbench to all of them. All of them know about Wolfbench. It's kind of incredible. So you guys should also know. Uh, in the comments, please, please let us know if you have used token, uh, Sonnet 5. I see Milo saying that it's a token guzzler. We actually don't see this based on this, right? Like, it does- it doesn't seem like it's using more tokens than 4.6.

Wolfram Ravenwolf 11:28

Yeah. I just uploaded it so we can take a look at it. Yeah. I still have to do a full analysis. Yeah. So take this with a grain of salt for now.

Alex Volkov 11:35

Yeah.

Wolfram Ravenwolf 11:35

I also only have one run for Claude Code and one run for, I think it was OpenClaw today.

Alex Volkov 11:40

Yeah. B- All right. So- More to come ... more to come. Um- So Sonnet, unclear release. Uh, what else big in the ag- fuck, Fable. Fable is it, is this. Fable is back. We've been missing this. Uh, we're gonna talk about what happens in the world of AI engineering with a bunch of folks here. Um, as a reminder, Mythos was announced. Peter, you remember this? Mythos was announced back when we were in London. It was like 82 days ago. Mm. And since then, there was like a whole thing. Uh, the, like, uh, uh, Anthropic announced Mythos-5. Then they said about Fable-5. Then, uh, the US government decided that, "Hey, somebody from AWS showed us an exploit, a jailbreak, and this thing can start being, running wild and, and, and, you know, breaking shit on the internet." So the US government actually used the, uh, the, the Citizenship Act or something like that. Anybody remember the exact quote? LDJ, you probably remember exactly what they used, um, to prevent- I'm not actually

LDJ 12:38

sure what

Alex Volkov 12:39

the quote was ... to prevent exposure of, of Fable to foreign nationals, making Anthropic taking it off for everyone because they didn't have any mechanism to protect against, you know, who's a US-based citizen, who's not. Even, uh, like, uh, uh, folks even talk about Karpathy not being able to access it internally because l- he is not fully, like, a citizen. And, uh, it looks like after a long debate, discussion, and after... I don't know if you guys saw this. After they replaced Dario with somebody else th- that was easier for the government to talk to, with, uh, then Fable is finally back, and hopefully GPT-5.6 is not far behind. Although we asked OpenAI folks, and GPT-5.6 is probably not gonna come very soon. They're all taking a break after this. Uh, OpenAI is taking a reset week and taking a break.

Wolfram Ravenwolf 13:27

Are they res- resetting the rate limits as well?

Alex Volkov 13:29

They did reset the rate limits.

Wolfram Ravenwolf 13:31

Yeah, they did it, yes.

Alex Volkov 13:31

Uh. Uh-huh. Can you put the camera on? Can I send you on a mission? Mm-hmm. Okay. Turn on the camera. Folks, you have to see this. Uh, do, do you know where the reset button is? I... It's behind. It's, it's just literally behind the booth right here. Um, so the OpenAI booth, you guys have to see this. For real? Yes, please. Uh, just turn on the camera- Guys,

Nisten 13:51

this-

Alex Volkov 13:51

and walk slowly ...

Nisten 13:52

this, this free week of Fable is basically just Anthropic going out and being able to hire anyone they want. They'll just be like, "Hey, you want unlimited Mythos tokens?" And they, they could just poach literally anybody-

Alex Volkov 14:07

That's true ...

Nisten 14:07

at this

Alex Volkov 14:08

point. Um, all right, Wolfram is, uh, showing the back of the OpenAI booth. Folks, do you know, uh, when you reach the end of your, uh, Codex limits and then Tibco- Is this AI

Nisten 14:20

generated?

Alex Volkov 14:21

No, sir. This is a real button that they hid yesterday in the middle of the booth, and they reset everybody's tokens. Wolfram, don't break the glass. Do not break the glass, sir This is, this is ridiculously cool. Uh, a- actually, the, they reset, they reset the limits yesterday, so you should all have your rate limits reset, uh, today.

Peter Gostev 14:42

Are you guys, uh, storing them up, or are you using the resets temporarily? Can we talk about this,

Alex Volkov 14:47

Peter? Can you talk about the bank thing? I think it's a genius thing that they're doing.

Peter Gostev 14:51

Yeah.

Alex Volkov 14:51

And I, I would love to- for you to cover this a little bit.

Peter Gostev 14:54

The, the kind of the silly thing about this was that when they were resetting the limits, if you were trying to save your tokens This is really annoying that you're trying to save your tokens and then they reset the limits. Yeah. And you're like, "Ah, damn it," like, "What the, what was I doing?" So they kinda used this, uh, as an opportunity to create a feature so you can actually bank them. So I, I think by default they actually don't get your, um, your limits don't get reset straight away, but they kind of just, you get this bank. So you can click a button and, and your limits re- reset when you want it. So this kind of removed a lot of a- anxiety I think from people who are just, like, wondering. And this was me, like, I was wondering, do I need to, like... You're kind of looking out for issues on Twitter and it's like, oh, is someone, like, is someone saying, oh, there's, like, some issues? Are they gonna reset? So you're trying to use it up and then they don't reset, then you look like an idiot. So- So now it's a feature. Uh, yeah, now I've got... Like, the thing is I, I've got like three accounts and then I've got, like, uh, I think I used one- Mm ... so I've got like eight, uh, resets left. Um, so but now I've got new anxiety of, like, when, when they're gonna expire, so do I need to like try and keep track of it? Not for days. And they didn't. Yeah. Yeah.

Alex Volkov 16:12

Uh, we, I, I actually interviewed Romain Huet, who's the head of DevRel at OpenAI, at the OpenAI booth yesterday, and we're gonna post this video very soon, um, likely on the Thursd AI that's just before my birthday or one after that. Uh, speaking of, in two weeks I think, Wolfram, you're gonna run the show. Uh, I'm taking a little bit of a time off. Uh, and then we chatted with Romain Huet, who's like head of DevRel, and they're in charge of Codex and everything. Uh, I asked him about this bank thing. He said that this came from the community. They gave them feedback about exactly the tension that you're talking about, Peter, where like, "Hey, I'm saving my tokens just for the right task just before I kick off," and then they reset but I was already full, so it, like it, it, it, it got over me. Uh, so I think it's like a great feature. The bank is a great feature, absolutely. Uh, all right, folks, let's talk about open source. We're gonna have two folks. Uh, Nisten, you may know them. ExoLab, the folks who connect MacBooks to MacBooks to MacBooks to like run local inference, and maybe we'll have a surprise guest, maybe he'll come or not, from NVIDIA, and they're gonna announce something live today during the conference and we have a, like a, like an express preview for you guys. That's gonna happen in 10 minutes. We're gonna have Alex Chima and OX Sero, if you guys know him from Twitter. O- O Sero, I don't know how to pronounce it. Was it Sero? Oh,

Nisten 17:29

yeah. I have no idea what he looks like. That'll be good to see.

Alex Volkov 17:33

Yes. So Sero's gonna come- His one

Nisten 17:34

of the anime

Alex Volkov 17:35

accounts. Uh, Sero's gonna come and also, uh, another Khalil, I think his last name, uh, from NVIDIA, and, uh, basically arranged this, like, local AI summit in the middle of Ai.engineer just on the fly. We love open source. We love OpenAI. I wish I had my buttons to, like, applaud- From the beginning ... so we can, we can applaud like, like this. Uh, so they're gonna show up in, in just 10 minutes. Uh, and then after them, we have a bunch of guest folks. We have Dominic Kundel from OpenAI, repeat guest. Dom was on the show, I believe, a few times already. We have Philippe Schmidt, previously from Hugging Face, and now he is with Google DeepMind, to talk to us about their new releases. It's gonna be super exciting. Then we have our friend of the pod, longtime friend of the pod, Stefania Druga. She was a research scientist at Google, and now she works at Sakana AI. And do you guys remember we covered Fugu, like, what? A week ago? Two weeks ago?

Peter Gostev 18:26

Yeah.

Alex Volkov 18:27

Only here it's connected that we have a friend that worked on Fugu, and we're like, "We, we should have brought her on." So she's gonna come on. Uh, so we're gonna cover Fugu despite the fact that it didn't release this week. We're gonna come on. Um, we're good on audio? Yeah? Okay. Um, and we have two more special guests. One of them is a CEO of a marketing, uh, media company that's, uh, also walking around this conference to help us cover the conference. Her name is Daria. You may know her as my wife. Uh, she's here also. She's walking around independently, and she hears different things than I, what I hear. Uh, so I talked about her multiple times since she was my girlfriend, then my fiancée, so you guys will see Daria on stage as well. It'll be super fun, I think. And then lastly, to close out the show, we'll have the man himself, Swyx, the organizer of Ai.engineer, the guy who made... who built this city, basically, help us close out for the sixth time, I believe, in a row. So it's gonna be a banger show. A banger show, and I'm super, super excited about this. Um, before the local folks come, there was a bunch of open source news. Meituan and Lonecat, and do you guys see all this? Also ZAI released ZCode. Uh, do you guys wanna pick any of these topics up? We have, like, like eight minutes to talk about until the next, the next folks come. I would say we can talk about, uh, Meituan and ZCode. Anybody wants to pick this up? Let's chat about this.

Nisten 19:57

I don't think any of us- Well- ... tried this. One is their harness or... Oh, 'cause you- Did- Did you try any of them? I-

Wolfram Ravenwolf 20:05

So the Mesh-1 model, yeah, um- I mean, long before it, uh, did you try it when it was, uh, all Alpha?

Alex Volkov 20:09

Yeah, so Owl Alpha- I, I did

Wolfram Ravenwolf 20:11

and I, I liked it a lot. Yeah,

Alex Volkov 20:12

yeah.

Wolfram Ravenwolf 20:12

It was great.

Alex Volkov 20:13

Let's, let's, uh, i- introduce Owl Alpha and, like, give the whole segment topic You have it?

Wolfram Ravenwolf 20:20

Yeah. So, um, it was known as All Alpha when it was, uh, when it appeared on, um, where, where- Open

Alex Volkov 20:27

Router ...

Wolfram Ravenwolf 20:28

on Open Router we tested it.

Alex Volkov 20:29

I believe Open Router is.

Wolfram Ravenwolf 20:31

Yeah, it was Open Router. Mystery model on Open Router and it's, uh, now we know what it is. It's called, from, from Meituan, it's Long- LongCat 2.0 and it's a 1.6 tera- uh, um, trillion mixture of experts with 48 billion active parameters. It has one million context, which is great, and, um, MIT license, fully open source. Always applauding this.

Alex Volkov 20:55

Yeah.

Wolfram Ravenwolf 20:56

Open source must win. Open, uh, uh, free AI, local AI. Uh, the weights are on Hugging Face, of course. Uh, it was two months that it was in a stealth period and, uh, it appeared on number three on Open Router by tokens, so it was a really well-used model.

AI 21:12

Mm-hmm.

Wolfram Ravenwolf 21:12

Number one on Hermes Agent even and yeah, it, when I tested it, it was, it felt like it had Opus quality from my perspective. That was great and it's, the special thing, it was trained entirely on fi- over 50,000, uh, Chinese ASICs and zero Nvidia, um, for 35, uh, trillion tokens, even more, and no rollbacks. The benchmarks, 59.5 on Sweet Bench Pro and what do we have? 70.8 on Terminal Bench 2.1, which is a, a very well, very good score and, um, LongCat sparse attention. It has an NCRAM embedding module and from... There's another information about Open Router because by volume, Chinese open weights moved from 1.2% to 30% of global usage within a year.

Alex Volkov 22:05

I think, I think the thing, the highlight of Meituan, Nisten, this is literally, dude, I, I sen- I send this. Uh- This is

Nisten 22:11

the grocery store, right?

Alex Volkov 22:12

Yeah. Uh, Nisten, Fable sent this. Fable said, okay, Fable helped us prep for the show. He said, "Hand to Nisten/LDJ," 'cause Fable knows what your guys' expertise is. Uh, "What did Meituan actually train on? Uh, folks saying that they trained on fully, uh, uh, zero Nvidia. They trained it on Chinese ASICs." Do you see this at all? I think it's, like-

LDJ 22:36

Yeah ...

Alex Volkov 22:36

crazy to see this level of, of model trained fully on, like, Chinese ASICs with no NVIDIA, right?

LDJ 22:42

Yeah, but they were pretty vague about it, I think, right? Like, I guess... I, I guess it depends on what we would say is their, their integrity and, and things like that. But I, I feel like it could technically be, like, TPUs and they're just calling TPUs ASICs and-

Nisten 22:58

No, they're, they're all ASICs. All, all the GPUs, uh, like, A100s are technically ASICs, so-

LDJ 23:02

Yeah, I mean, like, even that, exactly. Yeah-

Nisten 23:04

It's probably just the- ... like, just that mole

LDJ 23:05

ASIC. Yeah.

Nisten 23:06

Yeah. It's probably just the Huawei, uh, Ascend Su- Super Powder or something like that. Yeah. That'll-

Alex Volkov 23:12

I think... But, but, like, 1.6 trillion MOE is also one of the biggest open source we've seen, right? I think, uh, not only that, it's having to train this on a, on, like, a different subset that are not, not NVIDIA. This feels to me like newsworthy, at least to mention, that this is coming, folks. NVIDIA doesn't have, uh, uh... It has a beautiful mode, but, uh, we should also mention... Nisten, this- You mentioned this last time when it came out. Etched finally announced their, like, ASICs to run LLMs. I, I don't know if we have this in notes, but I don't know if you guys saw this news, but Etched is kind of the ASIC, um, hardware company. This is also the talk of, of the, the floor here, where, um, basically when we covered Etched, this was custom silicon that, that has... That's why it's called Etched, right? Like, the, the weights of the model is on the, on the chip, and this is why they're running significantly faster and with significantly more, like, reduced power. And they finally... And we thought maybe it's, like, a VC hype. Nisten, you remember that conversation we had about Etched?

Nisten 24:15

Uh, yeah. Well, Talas has shipped a working product. Yeah. Has Etched shipped an actual chip? I don't think so.

Alex Volkov 24:24

I, I saw an announcement and, uh- 'Cause- Let's go take a look. Let's go take a look ...

Nisten 24:28

they've done that a lot, and they did the demo, which was supposed to be on their chips, but that was just on eight NVIDIAs, which was, like, the live thing. So I don't personally trust whatever the heck Etched said. Sorry. That's just my opinion.

Alex Volkov 24:40

Etched-

Nisten 24:41

I'm not saying it's-

Alex Volkov 24:41

Okay, let's take a look ... business. Yeah No, no, dude, this is, this is... N- Nisten, uh, if anything, if I, if I can, uh, relay some feedback of the folks who meet us are fans of the show, they, they love this. They love that, like, we're not sure, we're, like, pushing against each other, we're, like, figuring out stuff live. So folks, uh, feel free to leave comments what you think about Etched and, like, that whole business. Uh, but let's take a look at their, um, their announcement and then see. 'Cause we're talking about ASICs, we're talking about NVIDIA's mode, right? And, uh, uh, Etched is saying, uh... Can you guys see my X with Etched? Not yet. Uh, I should probably add this to the stage now. Okay. Uh, Etched is saying we're coming out of stealth with eight hundred million dollar raised, and we built our first racks after a successful A zero tape out. They have one billion in customer contracts already. This is kinda what it looks like. And, uh, they have four hundred engineers, NVIDIA, TPM, Broadcom, et cetera. Jane Street is backing them. I don't care about the backers. What I wanna see is speed and stuff. Okay. They're saying our inference system are built to push the entire Pareto curve of frontier models, including many trillion parameter MoEs, which is exactly what Mateuan LongCat is, a one point five trillion parameter MoE, and we're only gonna go up. And we know that the frontier models are big as well. So they're saying low voltage inference for high throughput workloads. Uh, da, da, da, da. What else is interesting here? And cluster scale memory for low latency workloads

Nisten 26:11

Yeah, it's all like what abouts. It's not an actual thing. Like, with Talas and Tensor and they're like, "Here's the card. Here's how it runs. Try it." Uh, I don't see anything here saying, "Okay, they taped it out." Bro, how about, how about

Alex Volkov 26:23

Karpathy saying, um, "I was impressed to learn about some of the engineering wizardry that goes into token what maxing the state-of-the-art LLMs at interactive token sec user." I think, I think Karpathy is, like, an investor, uh, or at least an advisor. Are we taking, um... I know, uh, since last week our S- S- Senpai has gotten some heat on Facebook. LDJ, please go ahead. Just jump in there because I didn't see your hand.

LDJ 26:51

Yeah. I, I was gonna say, they did actually mention, I think it was on their, their blog or somewhere in the announcement information that they have a two megawatt, uh, data center already of, of this compute at the, at their, like, office, offices or headquarters. So I'm, I'm not sure, like, exactly how they're linking- How much did you say?

Alex Volkov 27:08

Two, two megawatt of compute?

Nisten 27:10

They

Alex Volkov 27:10

have-

LDJ 27:10

Two, two megawatts, which, yeah, like, that isn't that much. That's like, I don't know, that's like, um- Yeah, but they have a lot of H100s ...

Nisten 27:16

that's

LDJ 27:16

like 20. Th- that's like, I think 20 Blackwell, uh, racks basically worth of compute or, or worth of energy, rather.

Alex Volkov 27:28

So-

Nisten 27:28

Yeah, but is that all H100s which they're actually running stuff on? Uh, it's like I still don't see an actual product here. Like, okay, you taped out the chip, but what's the card? What does it run? Uh, what's the, the speed like? Uh-

Zubin Aysola 27:43

Yeah, I think

Nisten 27:44

we still wait a while ... everyone else, Groq has a working chip. SambaNova has a working chip. You can try it. You, you can, uh, Talas has a working one, chatjimmy.ai, and Tensorin has, is shipping actual cards and they're having issues. This one I think is just still up in the air trying to get more investment in. Sorry, I don't see an actual chip. There's no chip here. You can't run it. I don't know what people are saying. But I don't see a chip, guys.

Alex Volkov 28:10

Nisten, do you see the one billion plus in commitments from big companies or that also you don't see?

Nisten 28:15

Uh, commitments? Yeah. So that means they haven't paid? Sorry, I am very skeptical of this. I don't trust any of this right now.

LDJ 28:24

I guess we'll find out in summer, 'cause they say they're shipping out the first units in summer, and but, and when they're shipping it, I'm pretty sure they're pro- probably gonna end up having to, like, e- either they'll publicly announce the flops and memory bandwidth and those details or it'll probably get leaked if they don't. So I guess we'll just wait a few more months.

Peter Gostev 28:43

I would say the, the bigger point here is that whether this is, like, real or not is that this is what bubbles are good for. Like, some of it will be kind of semi-fake or whatever. Uh, no, no comment on them. I've no idea. Um, but the fact that new companies are getting started and new ideas, it is being tried, um, and I'm gonna assume it's not, like, complete scam, right? The people are actually trying new, new things. Um, and, uh, this just wouldn't be possible. You know, no one's gonna give them $800 million like five years ago, right?

Alex Volkov 29:20

Yeah.

Peter Gostev 29:20

So whether it works out or not, hopefully, you know, if they do well, maybe N- NVIDIA will acquire them, and they will just get folded in, and everyone gets the benefit. So, like, that's a, that's a good thing. So hopefully they'll manage to get their tech out there in some way.

Alex Volkov 29:36

Yeah. So definitely a, a- On the complete opposite spectrum of what we're about to talk about, which is local AI, which is AI that you can run locally without the big labs, I think the breakthroughs in speed and performance are just beginning. There was a talk, uh, w-we don't cover the, the stock market too much. We also, just as disclosure, both me and Wolfram work at CoreWeave, but you guys know our stance. I think everybody who's listening to ThursdAI knows our stance. Whatever's happening in the stock market with big companies and their prices, like, you know, CoreWeave obviously, Nebulous, like all, all the-- They have no idea what the fuck is going on. They have no idea that all these people are running agent loops twenty-four/seven and spending billions of tokens per person, and they have no idea that this is coming to every person in the world, either via locally or via cloud. In order to be able to send those tokens, we need innovations not only from Nvidia, we need innovations to push Nvidia from both, like from all sides, from ASICs sides, from local AI side, from everywhere, because this is coming to everyone. And, uh, we'll definitely chat more about this with the local AI folks. Nisten,

Nisten 30:45

go ahead. Wolfram, you might have to go get OX0 in person. I don't know why I'm the one coordinating. They're not letting them in.

Wolfram Ravenwolf 30:52

Oh.

Nisten 30:52

You m- you might have to go smuggle him in or something, but...

Alex Volkov 30:55

Yeah. You have the media badge?

Wolfram Ravenwolf 30:57

Yeah. Uh, no, I don't have the media badge, but, uh, Natalie, I will talk to her.

Alex Volkov 31:00

Nader Khalil 31:00

got you, man.

Wolfram Ravenwolf 31:01

Oh, you got one?

Nisten 31:03

There you go.

Alex Volkov 31:04

Take my badge, tell them we're live- Okay ... and they need to come in right now. Uh, yeah, folks, this is the this is the benefits of the live show. The expo hall is opening a little bit later than our live show is opening, and so there is a chance that our guests are waiting at the door to get in. Uh, meanwhile, why don't I tell you about the main sponsor of the show called Weights & Biases CoreWeave? We have a big-ass booth here, and we announced a new product this week as well called CoreWeave Aria, which is a agent that helps you run automatic research loops on top of Weights & Biases and Weave. So if you are a user of Weights & Biases models, or if you are a user of Weights & Biases Weave, we now have a built-in agent kinda like chat thing, but it's not just a chat bot that answers from the documentation. No, no, no, no, no, no, no. We're in the agentic era, and so we may have, uh, a also a sur- uh, I, I think I, I need to stop counting how many guests I invited to the show and hopefully that all work out. But, like, we have a surprise guest that built, uh, Aria, and they will come to talk about this very, very soon as well. If you are using Weights & Biases models, if you are using Weights & Biases Weave, you should definitely, definitely check it out. It's, uh, it stands for Author Research Iteration Agent, and it helps you build your models faster. And we'll take a look at that Later. Meanwhile, let's check the comments. Folks are saying that Karpathy is a backer. He's on the list. Uh, but, uh, DefDog does not like Karpathy's post about this. All right. Uh, there's folks posting on Link- Oh, there's more than one person on LinkedIn. Hey, LinkedIn folks. I'm so happy that you're here. We don't usually... We don- we see one person on LinkedIn. Should I go live on Instagram, folks? Yeah. Meanwhile-

Nisten 32:52

No, no, no. Don't go there. Don't go there. We're gonna get death threats. Just...

Alex Volkov 32:57

Oh, yeah. Nisten, you want to talk about the fact that, like, you're sending me Instagram, uh, reels of people who hate us every time. You want, you wanna just, like, talk about the- Uh- ... the vibes on Instagram? It's,

Nisten 33:07

it's not just voluptuous Instagram models now, it's also Alex Jones. Uh, people are saying... They're going back to, like, the whole 5G, 6G talk. Uh, they're saying entire lakes are drying up. It's gonna dry up the entire lake in Ontario. Uh, by the way, the entire city here, all the AC is liquid cooled from the lake, and so is the entire nuclear thermal mass of the nuclear reactors. Like 70% of this place, which is, like, almost twice bigger than Texas, is cooled by the lake, by the nuclear reactors, the entire thermal mass. So, that 400 megawatt data center just makes no difference. Anyway, people think it's gonna just dry up the lake, and they're posting AI generated images of just lakes drying up. Uh, yeah, they've, they've just gone completely, uh, that insane. Uh- Dude- I, I... Look, there are some takes to that. Like, I-

Alex Volkov 34:04

The amount of data center hate that we're going to see very soon is going to explode. I'm... I keep thinking about the next kinda, like, election cycle in the US, and this will be the hot topic. We already saw Alisa, uh, uh, AOC holding, like, a glass of Blackwater jar and saying that data centers did this, which is some bullshit. Uh, and so now I think once the, the election comes, it's gonna be, like, even worse. Uh, meanwhile, folks, I wanna shout out, uh, before our guests come, and they're, like, setting up and sitting down. I can see them over there. all righty, folks, I think it's time for us to actually go to our next segment, and I'm very excited. Very excited. Yeah, uh, somebody said my screen isn't sharing 'cause I wasn't able to find this, but if you go to dev.two and you can... y- you literally can see us on, on the stream. We're simulcasted to the Pragmatic Dev community right now, in addition to all our sources as well. Uh, so I think it's, like, very cool, and thank you so much for folks for featuring us. All righty. Let's go to the Whitechat and let's introduce our guests. Sero?

Sero 35:06

Hello. Nice to meet you.

Alex Volkov 35:07

That's how you're going by, Sero?

Sero 35:08

Yeah.

Alex Volkov 35:08

And your real name is?

Sero 35:09

My name is Sharif.

Alex Volkov 35:09

You need to be, uh, very close to the mic. Yep. And Sharif- Yes ... nice to meet you, man. Alex- Nice to meet you

Sero 35:14

as well.

Alex Volkov 35:14

And Alex. Nice to meet you, man. Uh, Alex Shima- Thanks for having me

Sero 35:17

on ...

Alex Volkov 35:18

uh, from XLabs. Both of you

Alex Cheema 35:20

XLabs. Yes. Yeah.

Alex Volkov 35:21

And, uh, we haven't... We, we, so we talked about you guys on the show I think a couple of times. I think there was a few things. Ray Fernando called you out a little bit. I know you know Ray. Um, you guys are doing, like, very cool things that enable running models by connecting Macs together. That's how I found about you guys. So maybe Alex, the CEO of XLabs, can you tell us, like, what, what do you guys do besides connecting Macs together?

Alex Cheema 35:44

Yeah. So we're most well known probably on the internet for doing these, like, crazy demos with connecting different devices like Macs, building, like, stacks of Mac Minis to run really large models. Uh, but broadly speaking, like, we started Exo about two years ago. Around that time, um, export restrictions were starting to come in on GPUs and, um, you know, we've kind of realized that the Freedom of intelligence is not guaranteed. So, you know, this amazing technology that we thought was, you know, super impactful and was going to be this, you know, next big technology wave was at risk because, you know, of, um, certain political forces at play or certain sentiment around AI. There were certain narratives being, uh, portrayed around AI safety being used, you know, by certain companies to tell a story that AI is dangerous.

Alex Volkov 36:39

Yeah.

Alex Cheema 36:39

And, you know, this is very damaging, kind of, this can be a very damaging story, and we've seen that play out now. And so, like, we kind of looked at, like, okay, you know, what can we do to guarantee that, you know, people in the future will be able to run, you know, really capable models? Yes. And we saw a gap in terms of, you know, um, being able to run models yourself on your own hardware. So like, there was always, like, whenever we looked at, like, how you would run a really large model, like, at the time it was, like, Llama 4o 5B. Yeah. It's like you have to go to, you know, you have to go to the cloud, or you have to go through this person- Yeah ... or you have to go through this API.

Alex Volkov 37:13

Yeah,

Alex Cheema 37:14

yeah. And it's like, you know, if you want true sovereignty, if you want really the guarantee that, you know, you will be able to run these models- Yeah then, you know, you need, you need to be able to run them locally.

Alex Volkov 37:23

Okay.

Alex Cheema 37:23

Alex Volkov 37:23

that's what we said. Uh, let me, let me move a question to Sara if you don't mind. Why does it matter? Like, obviously... Okay, folks, I don't know how familiar you are with ThursdAI. We've been on the, on the air for, like, three and a half years. We've covered every open source model. Like, we literally, I have a... Should, should we, should we actually do the transition for open source AI? Let's get it going. No, it's fine. Uh, we're already in open source AI. We've covered every open source model since I don't know, the very early ones, okay? Yeah. The early Qwen ones. We were on DeepSeek before, like DeepSeek blew the stock market and everybody's like, "Oh, DeepSeek will change the world," and nothing happened. It was fine. Um, so I'll, I'll turn my question to Zero. Our audience probably knows this, but like what's your guys' take on why it's important, uh, to be able to run this locally? Like why?

Sero 38:07

Yeah. So right now, if you wanted to get any work done in the digital space- Yeah ... you are most likely using AI, whether that y- your company is asking you to build features very quickly, whether you need it to like manage the increasing demand on your time, uh, everybody is using AI everywhere.

Alex Volkov 38:25

Yeah.

Sero 38:25

So now we have a few providers that are making it possible for all of this to happen, but we've seen, again, time and time again, that these providers, I mean, they are private companies. They have the right to do whatever they want-

Alex Volkov 38:37

Yeah ...

Sero 38:37

with their own company. So they can shut you out, and if they shut you out, it's going to be very economically difficult for you to, to recover from that. Um-

Alex Volkov 38:43

They can also, uh, silently change the weights on you like they do with Fable. You guys remember this?

Sero 38:48

Yeah.

Alex Volkov 38:49

The fuck was that? Like honestly. Hey, uh, our model will sometimes tell you that it doesn't wanna do what you want, but sometimes when you're doing very critical, important stuff, we'll not tell you anything. We'll just make it dumber, so we'll like actually, uh, stick, like sticks in your, in your wheels. Yes. Like what the fuck was that, man?

Sero 39:07

Yes. I mean, you can see them doing this, like they're fear-mongering Chinese providers, they're fear-mongering open-weight providers. Yeah. They wanna regulate. Uh, and so we think you should run it on your device because that's going to give you a freedom that you currently do not have- Yeah that is very important for you. And the technology's there

Alex Volkov 39:25

Technology is there. Um-

Wolfram Ravenwolf 39:27

I fully agree. I joined, uh, our local Llama two weeks after it, uh, was created. I was already posting evaluations there. So Open AI, um, Open with the space AI, the local AI, that has been to my heart from the beginning. That's why I did the evaluations, to see how they work on my own hardware and not just how the providers benchmark it. So I'm fully on this, and I think this will be ever more important to have your own AI. It may cost a lot, but you can put it in your basement like a central heating, and it will be AI for the whole family. This is a, a better future than if everything is rented from

Alex Volkov 40:02

the cloud. And you guys who listen to Thursd AI, you know what I'm looking for. Me and Nisten came up with this idea, like, a long time ago. We all need a personal cognitive firewall running locally on our, like, AI at home. I want my mom and my grandmas and whatever, I want AI to see what they see, read what they read, and in real time flag me if they're trying to get manipulated by, like, other models, bigger models. I cannot do this in the cloud. I have to do this locally and store things privately. All right, folks, so the interest here, we're all excited about open AI, like open source, open weights, open models, local running AI. Um, and so I... Alex, when I met you, you shook my hand and was like, "Yo, we're about to announce something." You, you wanna give us a little sneak preview? 'Cause I think it's like a, like, like a whole track of today Yeah. Tell us.

Alex Cheema 40:48

Yeah, yeah. So just to give some background, so, um, like you said, like that's kind of the why behind this whole thing, but like how do we actually make that practical? And there's really like two axes that we look at for how this technology becomes, um, mass adopted, right? Which is the goal here. The goal isn't for like, you know, these hobbyists to be running Mac clusters, these like few people who can afford that, but it's like for everyone to be able to run local AI as the default. And for that, there's, there's two dimensions. So there's the sort of pushing the frontier, and we're doing a lot of research. Um, we are coming out with a website called Local.ai, which basically tracks the frontier. So it tells you, you know, if you have a MacBook, if you have a DGX Spark, what is the best model that you can run? Yeah. What kind of performance am I gonna get compared to the cloud? Am I making a trade-off there? Is it going to be cheaper than paying for tokens through an API? Uh, and then the second aspect is, is usability and the user experience around it. So, you know, right now it's actually-- I don't know if anyone has tried running stuff locally, but it is quite painful. And you have certain things like vLLM, which are great tools, but they're designed for the data center. So, you know, this is-- these tools are designed for a team to go in- Mm-hmm and configure, um, all the settings to be optimized for the hardware that you're running on. Yeah. They're not made for consumers to use out of the box. Yeah. So we are releasing also quite soon, not today, uh, so Local.ai will be today, uh, in early access, and then we have the Exo CLI, which is coming in a few weeks, which makes that really easy. So it's essentially, uh, an easy way of running vLLM on your consumer devices, so DGX Spark, MacBook. With

Sero 42:26

the optimal configs for whatever you want to

Alex Volkov 42:28

run. Cursor, Cursor, please,

Alex Cheema 42:28

Cursor. Yeah. Without- So it pulls data-

Sero 42:29

Yeah, yeah, yeah. Go ahead. So it's just with the, with the optimal configurations- Yeah ... for your device, so you don't have to go and look for the best configs.

Alex Volkov 42:36

Yeah.

Sero 42:37

And it will give you like different models, so like fast, smart, medium, balanced. Sima, let

Alex Volkov 42:41

me get you as like a rockstar.

Sero 42:42

Yeah, yeah,

Alex Volkov 42:43

yeah. You can just hold this and like you

Sero 42:44

can-- Yeah. Yeah, yeah. I-- Well, so- Tell

Alex Volkov 42:46

us like what's gonna-- Are, are you involved in like the re-release of this? Yes. Yes. We're working on

Sero 42:49

it.

Alex Volkov 42:49

Like what, what would folks use this for, yeah?

Sero 42:51

Yeah. So for example, a lot of people might be, uh-- They might know LM Studio. So LM Studio uses-

Alex Volkov 42:56

Oh, shout out to Yags the GOAT, man.

Sero 42:58

Yes.

Alex Volkov 42:58

We love

Sero 42:59

LM Studio. It's, it's a beautiful product, right? Yeah. It's really good. It's very- Friend of the pod,

Alex Volkov 43:01

Yegor, with LM Studio. Uh, yeah, go ahead.

Sero 43:05

So, uh, LM Studio uses something called Llama CPP- Yeah ... and it's really designed for single user. So-

Alex Volkov 43:10

Shout out to another GOAT, Georgi Gerganov. Yeah. Man, they changed the world with Llama CPP. It

Sero 43:14

works on everything.

Alex Volkov 43:15

Just incredible.

Sero 43:15

Yes. Yes

Alex Volkov 43:17

Georgi, we love you, man

Sero 43:18

Well, now- nowadays we can actually, like, get much, much better performance using something like vLLM and SGLang if we wanted to serve an entire household- Yeah or an entire company- Yeah ... from a single computer. So we're trying to push that forward and we're trying to make the user experience for that a little bit better, and hopefully work as an option along the side of, like, something like LM Studio-

Alex Volkov 43:36

Yeah ...

Sero 43:37

and, uh, make it possible for people to run really performant AI just by clicking a button.

Alex Volkov 43:42

I love that. Okay, folks, so my next question for you is, it's not a just Exo. There's, like, a consortium of friends. Uh, Nader from NVIDIA told me yesterday that he, like, he, he, he got all of you in the room together. Let's talk about, like- The initiative and the panels, like, and how big this is becoming here, especially in the middle of Ai.engineer. We're sitting right by the OpenAI booth. Microsoft is over there. Anthropic's, like, somewhere around as well. Uh, but this time we also have Z.ai with GLM that's exploding. Everybody loves GLI, uh, GLM, and we, uh... Wolfram, this is like the best open source model that we've ever tested. Yeah, the best

Wolfram Ravenwolf 44:22

ever.

Alex Volkov 44:22

Uh, Minimax folks were on stage doing keynotes as well. So like this... I feel like this conference is, like, very balanced. And then out of nowhere there's, like, a whole local AI track. So talk to me through, like, who's in there, who's involved, and why folks should pay attention. Yeah.

Alex Cheema 44:36

Yeah, so if, if I told people listening right now that there is a real risk of open source models being banned, I think a lot of people would be surprised. I think most people would be surprised by that statement. And so, you know, right now this is still quite a niche topic. It's still quite on the fringe, right? But the risk is real, so-

Alex Volkov 44:57

The... Fable was just banned.

Alex Cheema 44:58

Fable was just

Alex Volkov 44:59

banned. It's happening. I, I don't think it's that far away. It's happening. They're looking at GLM point two and they're like, "Okay, in six months, GLM is gonna be Fable level," let's say, like close-ish to. But every company, like we work at CoreWeave, we literally like serve, uh, GLM from our inference service, and if the government can say, "Hey, you can download the weights, whatever," but, uh, "Hey, CoreWeave, you cannot serve this to anyone. Hey, uh, you know, Nebulous, hey," uh, Together is also here. Like, all of the folks who are, like, serving, like, uh, inference. "You literally cannot serve this model because it's national secret." What can we do? We'll say yes. Like, it's not even my decision. I, I would try to say no, but no.

Alex Cheema 45:36

Yeah, it's, it's, you know, it's the natural progression- Yeah ... here is for open source to be the next thing, GLM 5.2 being banned. Yes,

Alex Volkov 45:43

sir.

Alex Cheema 45:43

Uh, Kimi being banned, whatever.

Alex Volkov 45:45

So who's involved?

Alex Cheema 45:46

So, yeah, so right now it's like a small community of people that are talking about this. I would say, like, even a lot of people that are thinking about this are not talking about it, right? Yeah. Because maybe they're like, you know, maybe they're scared to talk about it. Maybe it's just, like, not really something they're gonna get a lot of pushback on. Um, in general, like, AI has been wrapped together, like even, even the people in the open source AI community are wrapped together in this, you know, AI narrative. Yeah. And there's a really bad public sentiment right now around AI, data center buildouts. Um, so, you know, it's something that not a lot of people are talking about. Uh, but in SF where most of these things start- Yeah ... there is a community forming, um, of people who are talking about this and trying to make change. There was just an article in the Wall Street Journal, uh, that came out, um, where a bunch of journalists covered, um, a, like one of the early meetings of Freedom of Intelligence, uh, which is something that was just set up, um, to address this. And, you know, there's starting to be more talk about it, and I think, you know, this is, this is how things start, right? And, you know, I don't want to name people because I don't know exactly, you know, if, if they've talked about this publicly yet. Who

Alex Volkov 46:51

was in the room with you with Nader? That was public

Alex Cheema 46:55

Who was in the room?

Sero 46:56

I can share.

Alex Volkov 46:56

Yeah.

Sero 46:57

Yeah. So, uh, the, uh, CEO of AMP, uh, I'm not sure if you're familiar with AMP, it's, uh, it's one of the more popular agent, um, harnesses. Uh, there were people-

Alex Volkov 47:07

Oh, this guy.

Sero 47:08

Yeah.

Alex Volkov 47:08

Come, come around, come around. I want you here. I want- Oh.

Nader Khalil 47:11

Hi. How you guys doing? What's up? I saw you guys on Twitter. I

Alex Volkov 47:13

had to come say hi. We need you on camera. Please sit, please sit here. Please sit here. Yeah,

Nader Khalil 47:15

let's go.

Alex Volkov 47:16

Uh- And

Nader Khalil 47:17

maybe Nader can share

Alex Volkov 47:17

more. He's- So, so at least one person- He's really

Nader Khalil 47:19

set.

Alex Volkov 47:19

I'm waiting I, I told you guys we're gonna have a surprise guest. At least one person was- You guys

Nisten 47:23

all look related. Uh, there

Alex Volkov 47:25

was a group of cousins a little bit involved is, is here. Nader Kolio.

Nader Khalil 47:28

Just a, just a little bit,

Alex Volkov 47:29

yeah. Uh, just a little bit. Just a

Nader Khalil 47:30

little bit.

Alex Volkov 47:30

The mustache man. Yeah. Uh, from-

Nader Khalil 47:33

Really excited for today. We have the Local AI Summit, so I hope you guys are here. If you're not, shoot us a DM. We gotta get you in this room. It's gonna be super exciting. Um, local models, local harnesses- Yes ... we've hit an inflection point, right? Yes. These things have got wildly good this year, and not just that, the things that we do with them has changed, right? We've been giving them access to tools. Enterprises are giving them IP, and you're seeing very colorful conversations about that. Yeah. Users are doing the same thing. I wanna give it my medical records. I wanna give it my healthcare data. Well, the- The more that you do, the better it gets ...

Alex Volkov 48:02

just before you jumped on- Yeah ... I shared, like, a use case that we want for local AI, me and Nisten, you know, from the show as well. Um, I want a cognitive firewall- Mm that looks at everything that I'm looking at, listens to everything I'm listening on the internet, not for me, for my mom, for my grandma-

Nader Khalil 48:16

Yeah ...

Alex Volkov 48:16

to see if they're trying to get one-shotted by, like, a different better AI. And I want-- This cannot run everywhere else. It has to run locally on, like, my- Totally ... by my modem. Yeah. So I want

Nader Khalil 48:25

you to do this for me, Nader. Come today.

Alex Volkov 48:26

All

Nader Khalil 48:26

right, cool. We actually, we have GLM 5.2 running on a station. We have an amazing demo. I don't know if you guys wanna, wanna talk about it a little bit.

Alex Volkov 48:32

You wanna talk about that?

Alex Cheema 48:33

Yeah. We have, we have, like, GLM 5.2 running on a station. Did some work on, like, optimizing that. Um, it's, it's surprisingly fast. Yeah. Uh, we have also Nemotron-3 Ultra running on four sparks.

Alex Volkov 48:44

Yes.

Alex Cheema 48:45

So 550 billion parameter model The first time I think- Yeah. Yeah, it's 550 billion parameter model It's bigger than the

Sero 48:49

biggest Llama. It's huge.

Alex Volkov 48:50

Um,

Alex Cheema 48:50

Alex Volkov 48:51

wanna say, so, so far, this guy CEO, very high level. Bro, Nisten on here is asking you, how long does it take to do a REAP prune for various models you have done? Yeah. Can you talk about the-- We have a technical audience. Like, can we- Yeah ... go down to brass tacks for just a second?

Sero 49:06

So I, I'm not sure. Is the one that's running on the station the REAP?

Dominik Kundel 49:09

Yes.

Sero 49:09

Yeah. So, uh, in order to, like, do a-- So the model has to be of, of a certain size. If it's larger than 200 billion parameters- Yeah uh, you can get, like, a very good performance out of it. It takes about, like, a day maybe to just do, uh, a basic prune of it. Uh, but you're going to have issues, like the m-last token's gonna be repeating over and over. Um, you're gonna see it going into thought loops. It might not know anything about history. So you can spend extra compute to make it better and better. So right now I've, I have the GLM 5.2 with, uh, three checkpoints. The first one was two days, the second one was a week, and this third one is still cooking right now. So I'm, uh, basically distilling it back into itself with the larger GLM 5.2. Uh, and I'm getting really good performance. We got 71% on Terminal Bench 2.1-

Alex Volkov 49:56

Let's go ...

Sero 49:57

which is one of the highest scores. Um, it's, uh, yeah, not that far away from its l-larger, you know, version. Yeah. Uh, so it takes about a week if you wanna do it properly. The longer you do it, the better it gets.

Alex Volkov 50:10

This is, uh, Nisten, were you satisfied? Let me, uh, you guys won't hear him, but, like, is this a- Is this what you wanted from, from Sero?

Nisten 50:17

Yeah, yeah. So Sero is basically giving us, uh, fair trade, organic witchcraft-

Alex Volkov 50:24

Witchcraft

Nisten 50:24

and wizardry ... and trained, and pruned models -

Alex Volkov 50:27

All right, uh- ...

Nisten 50:28

that just work right eventually.

Alex Volkov 50:29

Folks, we know you guys are busy. We know you jumped on, like, super quick, and I really appreciate it. Thanks for having me. Thanks for letting me crash the party. Um, uh, I just wanna say, uh, are you guys followers of soccer at all? Did you guys see that we won 2.0 against Bosnia? Let's go. Uh, I know it's not related to anything. Alex is like, "What is he talking about?" I

Alex Cheema 50:45

saw, I... No, I saw Anthony,

Alex Volkov 50:46

he mentioned the game.

Alex Cheema 50:47

Are you

Alex Volkov 50:47

Bosnian then?

Alex Cheema 50:47

Oh, okay, cool. How, how

Alex Volkov 50:48

was it? Oh, the, the game was incredible. We're gonna talk about this at the end of the show. But, uh, folks, please check out local.ai. Please check out Exo. Alex is here, and Sero, and we have Nader. And you guys probably know Nader 'cause we talked about, uh, multiple NVIDIA stuff. Uh, check out local.ai. I think there's gonna be a live stream of the whole thing, and you guys are gonna be part of it. So if you-- if not, the website is live right now, I believe.

Alex Cheema 51:10

Uh, it's, it's in early access.

Alex Volkov 51:12

Early access?

Nader Khalil 51:13

Yeah. So we'll share a code, um, for the community. Yeah,

Sero 51:16

everybody that

Nader Khalil 51:16

signs up is

Sero 51:17

gonna get a code.

Nader Khalil 51:17

Yeah. Guys- And one thing I wanna say, we've done some amazing work together. I really appreciate you guys and everything you've done for local AI and, uh, you know, NVIDIA's really enjoyed working with Exo, and so start to something great. Let's just keep ripping.

Alex Volkov 51:27

Let's go. Thank you. And, uh- Appreciate it ... if you're on the pod, you're, uh, now considered the friends of the pod. So when you release big new things-

Alex Cheema 51:33

Finally made it on

Nader Khalil 51:34

We weren't

Alex Cheema 51:34

friends

Nader Khalil 51:35

until today.

Alex Volkov 51:35

Nice meeting you in face to face. Thank you. Alex, thank you so much for joining the call. Nader, thanks so much, man. Appreciate it. Wolfram, let's get you back, Okay, folks, uh, it's time for us to introduce our next guest. You know him, friend of the pod, Dom-

Dominik Kundel 51:47

Hey,

Alex Volkov 51:47

how's it going? ... from OpenAI. Dominic Kundel from OpenAI. How you doing, man? How you doing? Good. Uh, what I need you to do is to remember that the microphone is, like, close and- I'm gonna like lean like in- Yeah, yeah, you're gonna lean the,

Dominik Kundel 51:56

the main, the main issue is that there's like a gap here,

Alex Volkov 51:58

so- I know, sir. I know. I know ... I'm

Dominik Kundel 51:59

just gonna like-

Alex Volkov 52:01

I think, I think we should be all right if you move a little bit to there in terms of camera, yeah? And then you can, you should be able to feel comfortable. Cool. Um-

Dominik Kundel 52:08

And just, like, this, this seems, like, very casual.

Alex Volkov 52:10

I really like this. This should be very casual, bro. The whole thing about Thursd AI, why we love doing this- Yeah ... is just conversation.

Dominik Kundel 52:14

No,

Alex Volkov 52:14

Dominik Kundel 52:15

love it.

Alex Volkov 52:15

Me and you, we had the conversation before.

Dominik Kundel 52:16

I know.

Alex Volkov 52:17

Um-

Dominik Kundel 52:17

The only hilarious thing is I have no idea what you all are talking about.

Alex Volkov 52:20

Dude, so I'll tell... This, there's a- There's a gap here there's a gap here, so we can't put the microphone on here.

Dominik Kundel 52:25

Okay? All right, so now you want me to lean like this. All right.

Alex Volkov 52:27

As long as it's, it's comfortable for you- All right ... you can even hold this. I'll,

Dominik Kundel 52:29

I'll, I'll be fine.

Alex Volkov 52:30

Dominic, you're the perfect person to transition us from local to kind of like big bang AI.

Dominik Kundel 52:36

All right.

Alex Volkov 52:36

The reason why you're perfe- perfect person with this is that, uh, last year at DevDay-

Dominik Kundel 52:40

Yeah ...

Alex Volkov 52:41

um, this was just after you guys released GPT OSS.

Dominik Kundel 52:44

Right.

Alex Volkov 52:44

And I was walking out of DevDay, and somebody, I don't remember who, told me I have to meet you.

Dominik Kundel 52:49

Yeah.

Alex Volkov 52:49

And like, here's the guy who's, like, pushing for OSS stuff. Yeah,

Dominik Kundel 52:51

yeah.

Alex Volkov 52:51

And so we just had Nader Khalili from NVIDIA, which obviously you guys in OpenAI were tight friends with, obviously, uh, also we in CoreWeave. Uh, and the folks from Exo Labs are running, like, models. They're probably gonna, like, try, uh, GPT OSS models as well. I want you to maybe open with why at all OpenAI has released GPT OSS, although it has been a year.

Dominik Kundel 53:13

Yeah.

Alex Volkov 53:14

I'm not gonna press you towards any, uh, future releases. I know how the thing works. I

Dominik Kundel 53:18

was, I was g- I was gonna say, you know what the answer is gonna be. Yeah,

Alex Volkov 53:20

yeah, yeah. Uh, but yeah, let's talk about, like-

Dominik Kundel 53:23

Yeah ...

Alex Volkov 53:23

uh, OpenAI and, and open source AI.

Dominik Kundel 53:26

Right.

Alex Volkov 53:27

Yeah.

Dominik Kundel 53:27

Um, I mean, overall, like, we believe in, like, a, a big open ecosystem, and, like, one of the things with, like, specifically the open models is, um, like, we know that there are situations where you just, like, can't rely on, like, a hosted model for a multitude of reasons, right? Like, it might be a latency situation, it might be a, like, data privacy situation- Mm-hmm ... where, like, you have some use cases where you just can't rely on, like, a hosted frontier model. And so our goal with GPT OSS was really, like, we spend a lot of time talking to the community. We actually did, like, a roadshow back then, like, um, with, like, folks from around the world, talking to them with, like, Sam and Greg and everyone, trying to understand what were the things that they were looking for with, like, open models. And one of the things we did with GPT OSS was we wanted them to be able to, like, easily switch between, like, an open model and a, and a hosted model from OpenAI without having to feel like they're, like, two different worlds, right? Like, there's a, there were already a lot of customers who are using- Uh, back then, like Llama and, and like open model, uh, and, and like OpenAI models. And so we wanted to, like, have a good combination of these models.

Nisten 54:33

Yeah.

Dominik Kundel 54:33

And so GPT-oSS always kind of felt like it had a lot of the same capabilities as like an o4 mini deep research had, right? Yeah. So like it had built-in web search, it had, um, reasoning, uh, it had all of the sort of like capabilities that you were familiar with from that. Mm-hmm. So it was very easy for you to, like, swap over. The other part that, um, we did with it is, like advancing, um, uh, advancing research. So, like the models were actually, um... They're, they have a, uh... We didn't put any pressure during the training on the chain of thought, and so that means that, um, like researchers can use it for like interpretability research- Yeah for example, by looking at the actual chain of thought.

Alex Volkov 55:14

And we were actually very excited about this. Uh, and we're looking forward to more. I will just say this-

Dominik Kundel 55:20

I'll, I'll, I'll forward it to the team.

Alex Volkov 55:21

Forward this, please. All right. So, uh, from local models, uh, from OpenAI, let's go to actual OpenAI. Cool. Um, Fable was released to us.

Philipp Schmid 55:29

Yeah.

Alex Volkov 55:30

And GPT 5.6 was announced. Yes. And, uh, we're not gonna talk about when and how. I know you guys are on a reset break, like, uh, I think it was publicly stated. But talk to me about GPT 5.6 and kinda how it feels to use. Yeah. I would love to know, you guys talked about this publicly- Yeah ... so it's cool with the PR of OpenAI. We checked. Uh, talk to me how it feels to use like this, like, like, uh, is it, is it a point release or does it feel like more? 'Cause Swyx said that he had the early access and does not feel like .1 release.

Dominik Kundel 55:56

So, um- I think like, so one for context, 5.6, we actually released three models. Yes. So Sol, Terra, and

Alex Volkov 56:04

Luna. Congrats on the naming, by the way. Thank you. Whoever may... I, I really hope that the model named itself, because like the naming is finally good. Daybreak is great. Sol, Terra, and Luna is also awesome. Love it. So like finally OpenAI is doing some great naming for us.

Dominik Kundel 56:16

Stellar naming. I'm- I see what you did there. Yeah, no, uh, so for, for context, So- uh, Sol is our frontier model. It's the largest model.

Alex Volkov 56:24

Yeah.

Dominik Kundel 56:25

Terra should give you like an intelligence that feels similar to, um, like a 5.5, but at like, you know, half the cost. And then, um, Luna is sort of our smaller model for like, um, especially like more basic tasks basically. But all of them are incredibly capable. Um, honestly, like Sol, like the biggest thing for me with like going from 5.5 to like 5.6 Sol is like, it feels like it gets you even more. Mm. Like, it just gets the job done. Like you give it a problem and like, especially if you use it with like computer use or if you use the new ultra mode that we're introducing, like, um-

Alex Volkov 57:01

That's the Cerebras, uh, exposure?

Dominik Kundel 57:03

No. So, um, we will-- We're, we're going to bring 5.6 Sol to Cerebras. Yeah. We can talk about that. Yeah. But like, um, the ultra mode is basically, one, we introduce a new reasoning level called Max.

Alex Volkov 57:14

Oh, let's go.

Dominik Kundel 57:14

Um, and then Ultra also leverages like, uh, sub-agents more, uh, more extensively.

Alex Volkov 57:20

Oh.

Dominik Kundel 57:20

And so it, uh, you can see it in our benchmarks that we released in the preview.

Alex Volkov 57:24

Yeah.

Dominik Kundel 57:24

Like, it, it pushes it even more. I feel,

Alex Volkov 57:26

I feel like a competitor of yours has like a, like a ultra think mode as well that I use often. Yeah. Uh, and I think that's very dope that like Codex is also getting this. Um, can I just say that I love Codex hovering up all the best things. At the beginning of the show, Peter Goste from Arena- Yeah this, uh, also a co-host of ours, he talked about the banking of the resets. Yeah. Which, bro, just, just fucking thank you.

Dominik Kundel 57:50

I, I'm glad you love it.

Alex Volkov 57:51

Thank you. 'Cause like folks are like saving their tokens and then h-boom, Thibault comes with the reset button. Yeah, yeah. It's like, why was I saving this? Like, all of this is, is, is on zero.

Dominik Kundel 57:59

Well, and, and like not all of us are like chronically online like we are. And like there-- we've had reactions from some folks in the community who are like, "Who is Thibault and what is a reset button?" You know? Like-

Alex Volkov 58:10

Come on,

Dominik Kundel 58:10

man ... um, because they're like more on Reddit or other things, and so like- Yes ... this also makes it much easier for them who like didn't know that a reset might be coming up.

Alex Volkov 58:18

Well, I say this for Thursd AI, um, unlike other live shows that stream only on Twitter, we're live on LinkedIn. Hey, LinkedIn folks. So if you don't know who Thibault is and what the reset button is, now you know, 'cause we talked about this, uh, earlier before. Um, so, uh, Sol, Terra, and Luna-

Alex Cheema 58:33

Yep

Alex Volkov 58:34

And, and Codex is in the mix. Codex is your harness. Yeah. You guys are pushing hard on Codex, seeing insane internal adoption as well. Just recently released a, um, kinda essay of how it's being adopted internally. Yeah. Talk about this, dude. Like, uh, people don't understand how, like, legal folks are adopting this. Like- Yeah ... just talk

Dominik Kundel 58:53

about that. I mean, I mean, Codex, like, I think sort of there's, like, a natural reaction then to people being like, "Oh, Codex is our coding agent." But, like, we've, like, especially with the app and, like, over the last couple of months, really sort of made it, um, significantly more approachable for non-developers or, you know, realistically, like, the role that we fulfill as, like, developers has changed as well, right? Yeah. Like, there's, like, so much more work that you do that is not just cranking out code.

Alex Volkov 59:17

Yeah.

Dominik Kundel 59:18

And so, um, especially in the Codex app, which is sort of what, like, the majority of the company uses, um, it's just become sort of your daily agent, right? Like, your, like, CoWork equivalent- Yeah ... essentially. So, like, we have an in-app browser, we have computer use. We brought in a bunch of plugins, including, um, actually for those roles, like, um, additional, like, role-specific plugins. So, like, we have a, like, data analyst, uh, plugin that you can install if you work in data science. We have, like, a designer plugin that you can install and, like, all of those sort of plugins that install, like, the most common plugins like a sa- like a sales, um, uh, you know, plugin that will ins- set up all of the tools that you commonly use- Yeah when you work in sales. Yeah. So, yeah.

Alex Volkov 59:59

And the legal folks, what do they use? Is there, like, a specific, uh, legal thing intern- No. I'm specifically calling out legal because, um, I, I wish I could g- show you guys, but, um, um, I don't have this, uh, to pull up super quick. The adoption curve... So first of all, 100% of engineers use Codex. Yeah. Like, so something, like, crazy, like, there's no one in OpenAI that's, like, typing code anymore. Which by the way, the thesis of my talk at Ai.engineer and the backing of this, like,, uh, token billionaire concept called by your coworker Ryan Lupopolo- Yeah ... AKA Yolopopolo AKA Lupopolo. He's going through multiple iterations. Haven't heard that

Dominik Kundel 1:00:34

one yet.

Alex Volkov 1:00:34

This is new. This is new- All right. All right ... from, from this conference.

Dominik Kundel 1:00:36

Nailed it.

Alex Volkov 1:00:36

Uh, and, uh, like, uh, obviously engineers are picking up Codex, but, like, other folks as well. Yeah. And the super app was also announced that, like, moving, I don't know if away from ChatGPT, but like t- like, mm, tell me how Codex looks differently for my mom if I want to bring her on- Yeah to AI versus, like, engineer. Yeah. Tell me about, like, that difference.

Dominik Kundel 1:01:00

One, yeah, we, like, we're bringing together more, like, ChatGPT and Codex, sort of like- Yeah ... Tibco has been saying the sneaky thing is we're doing it in public. Yeah. Uh

Alex Volkov 1:01:08

Very sneaky. We, we didn't notice at all. Yeah

Dominik Kundel 1:01:10

But, um, yeah, one of the, one of the big things we're, um, doing is, like, we're just, like, slightly adapting the UI. Yeah. Like, a lot of it is already, like, we've designed the UI not as, like, a, as a hardcore engineering thing. Like, it has to be something that, like, developers love, and that's very important, and we spend a lot of time thinking about it. But there's, like, with a lot of, like, smaller tweaks, it's actually very friendly for non-developers as well. So, like, realistically, the main thing is, like, if you, if Codex writes files, you know, like, we're not gonna show you the diff view by default- Yes. Yeah ... or, like, the review pane. And similarly, like, you know, it might, like, you know, if it runs a command, like, in Codex, you'd normally see, like, what shell command it ran. Like, we abstract some of that away in the UI for someone, like, who has no idea what Bash or Z Shell is- Yes ... or, like, why it ran rip crap, you know? Um, so we're, like, simplifying some of the language there, but it's still the same capable agent. It's the same Codex behind the scenes, the same Codex agent that is actually open source.

Alex Volkov 1:02:08

Yeah.

Dominik Kundel 1:02:08

Um, so it's the same, it's the same capabilities.

Alex Volkov 1:02:11

I love that. Wolfram-

Wolfram Ravenwolf 1:02:12

I like that you can switch between the modes. Yeah. So you can choose which one you want to use, and, uh-

Dominik Kundel 1:02:16

Yeah ... choice is always a good thing You just go into settings and switch around, but honestly, like, um, I don't switch a lot because it's literally the same capabilities. So, like, there's no need for you to, like, really switch.

Wolfram Ravenwolf 1:02:27

Yeah.

Alex Volkov 1:02:27

Well done. Um, Dominic, I hold in my hand, as I said before, the Token Billionaire-

Dominik Kundel 1:02:34

Yeah ...

Alex Volkov 1:02:34

card, which is, like, the, the gold. Wolfram has one.

Dominik Kundel 1:02:37

I know. I have one at home. I left it there. I don't wanna scratch it.

Alex Volkov 1:02:40

Uh, yeah, that's true. I'm like, "I'm using it," and I'm like, "I'll regret this in a decade." Um, the concept was coined by Ryan Lopopolo on stage at the Engineer of Europe, and then after him, Mario Zechner, the creator of- Yeah ... PHI came on and said, "Hey, slow the fuck down. Everything's broken. You guys are pushing code." Not you- Yeah, yeah ... openly AI, but, like, people just like, hey- Yeah ... YOLO into production. And so I coined this, uh, I don't know if you saw my essay called The Zeal Continuum. Yeah, yeah, I saw that. Uh, Zechner on the one side and Lopopolo on the other side, and then I started asking people. Basically, my talk here at the Engineer was that, um, the, the continuum is not about people, it's about tasks.

Nisten 1:03:13

Mm-hmm.

Alex Volkov 1:03:15

Should you tell me, especially in the context of you saying Codex, uh, the simplified mode doesn't even show you the diff anymore What tasks are you still reading code for, and whether or not are you still reading code in 2026? And I will, uh, uh, preface this with the fact that most people who are tuning into this maybe have a- early access to Fable 'cause it was finally released yesterday. Um, but, like, Fable and 5.6, let me put them on the same pedestal, let's say, uh, uh, capability, um, are different from what folks are used to use. Yes. Right? So you guys are already, like, looking to the future, and I consider Ro Popolo and you and Anthropic, Torika and Anthropic, like all of you are like lighthouses pointing where we're all gonna go in like six months. So tell me about, like, do you still read code? Yeah. For which task, if you do, you will read code, and does, did it change from 5.5

Dominik Kundel 1:04:06

to 5.6? Yeah. So yeah, I mean, like I, I think you're right in terms of that, like, I think it's a spectrum of tasks, right? So, like, I'm doing a talk later at, like, 1:30, um, and it's the most overengineered tas- uh, talk I've ever done. Let's go. And it was, like, built by 5.6. I haven't read any code of it. Um- Let's go ... but, like, that's like-

Alex Volkov 1:04:22

Is this, uh, on the main stage?

Dominik Kundel 1:04:23

Uh, no, it's on the a- Agent Engineering track.

Alex Volkov 1:04:25

Okay, cool.

Dominik Kundel 1:04:25

Um, but, uh, for that one, for example, I haven't read any code and, like, that's fine because, like, I've been testing it. I, like, I know Codex has been actually testing it pretty extensively. Um, like you've seen it sort of like use a bunch of computer use and browser use tasks to, like, verify things, and that part is fine, um, because it's for my, myself. But, like, when we're talking about things like, you know, contributing to the Codex app, for example, we still have strong, one, strong code ownership rules, right? So, like, every PR goes through review, um, and it has to be reviewed by someone on the team. So, like, even if you're like PRing it in from across- Yeah ... the company, which people are welcome to, like, there's, like, specific folks that are responsible for reviewing.

Alex Volkov 1:05:05

Yes.

Dominik Kundel 1:05:06

And they will still read the code, but we're also gonna have first Codex review it. Yes. Right? So, like, only if there's zero reviews, then we're actually gonna review it.

Alex Volkov 1:05:13

So am I hearing correctly, then in July of 2026-

Dominik Kundel 1:05:17

Yeah ...

Alex Volkov 1:05:18

there's still a human reviewing all code that's committed into OpenAI codebase?

Dominik Kundel 1:05:23

Yeah.

Alex Volkov 1:05:24

Wow. Okay. Uh, do you expect- I mean, we,

Dominik Kundel 1:05:26

we keep, we keep improving the things to make it faster

Alex Volkov 1:05:27

to review- Of course. Yeah, yeah. That's what Brian was talking about, the, the, the machine that, like, builds the machine. Exactly. Um, do you, Dominic Kundel, personally expect this to still be the case in one year when we're in, uh, AI Engineer World Fair 2027?

Dominik Kundel 1:05:41

I th- I think ultimately it's about responsibility, right? Yeah. Like, it's, like the human should still be responsible for, like, the code that gets shipped, right? Yes. You can't go afterwards and be like, "You know, the retro is Codex did it," or you know, like, "God did it" or something.

Alex Volkov 1:05:53

I had, I had, I had one of those for my notes today, folks. I'll just say this, Fable did an amazing job doing notes today but, uh, for some reason, uh, Dominic is here and you should be just before. Like, it literally just, like, screwed up the order even though it put the timings correctly. So-

Dominik Kundel 1:06:10

And so, yeah, I think that's the important part, right? Like, it's like the responsibility is there. It's hard to tell, like, what do the tools then look like, right? Yeah. Like, so for example now, like- A lot, like for example, in order to speed things up, like we have like skills to finalize the skill, or we use like videos and other ways to review functionality as well, and then like make sure that like everything passed through Codex code review, and that catches a tremendous amount of stuff, right? But like it's ultimately someone who's responsible for what was created, and that is still important.

Alex Volkov 1:06:41

Yep. Um, Dominic, I think that's most of what I have for you. Cool. The, the thing I also wanted to make sure that we're covering AI engineer is kind of the vibes.

Dominik Kundel 1:06:49

Yeah.

Alex Volkov 1:06:50

You are talking to other folks, they're coming up to you. AI engineer, and this was part of my talk as well, it's changed dramatically since the first one. Just dramatically. Like I-- this is, uh, not, not even the conference and the size of it- Yeah ... our job has changed. We babysit models. You guys are now giving us tools to just like, and reset buttons, et cetera. What are you hearing from folks who are not like part of the big labs? What are you hearing from folks who come up to you? What are the type of concerns they raise? What are the type of things you hear in the hallway track? This is like the most important thing I, I think for me, for the conference.

Dominik Kundel 1:07:21

Yeah, I mean also just like very quickly too, like level set on how quickly things change.

Alex Volkov 1:07:25

Yeah.

Dominik Kundel 1:07:26

Let me remind you, the Codex app is five months old. Five months. Like today.

Alex Volkov 1:07:29

The app itself.

Dominik Kundel 1:07:29

Um, the app is five months old today.

Alex Volkov 1:07:31

Incredible job.

Dominik Kundel 1:07:32

Um, so like things are changing so like ridiculously fast. Um, I think one of the biggest things that I keep hearing from conversations is sort of like things are moving so fast that people just don't know what is the right way to use things, right? And there, there al- there's almost this level of like FOMO, um, of people like feeling like, oh, they're not doing it right. Um, and I think like that has been the biggest thing from my conversations where like there's been a big appetite from people joining our talks, trying to understand like, "How do I use these tools," right? And then like the reality is like we can show you how we use it, but like we also constantly learn from the community, and we bring that back in. Um, and so like I think that is the biggest-- like the, the fun part of this conference is like talk to people- Yeah ... and like get an understanding of like how do, how do they use the tools? Like I've learned ways that other people do things and, and sort of, um, like just like their level of willingness to like rely on the agent, just like go off and do things- Yeah ... that might be different to mine. Um-

Alex Volkov 1:08:34

So one last thing before we let you go. We know you're a busy dude. You have a talk soon. Uh, good luck, by the way. Thank you. Break a leg, man. I, I'm sure you're gonna kill it Um, Roma said on stage yesterday, head of DevRel, uh, interviewed by me yesterday at the OpenAI booth, by the way, the video is coming, folks. I'm super excited to show you this. Uh, the Codex, uh, uh, uh, sessions can talk to other sessions, like spawn other- Yeah can you ta-tell me about this, like, insane ability- ... and how would I use this? And tell people also, it's your camera. It,

Dominik Kundel 1:09:01

it's, it's actually, like, a fascinating, uh, feature that, like, is, is relatively simple in its functionality, but, like, super powerful once you understand it. Um, so in the Codex app, any agent, uh, like, any conversation that you have, you can ask Codex to spin up new threads, and it can also inspect and, um, move around other existing threads. And that means that we've seen a lot of people on the team sort of move to a chief of threads model, where they have one thread, uh, that they then ask to, like, delegate additional tasks. And the interesting thing is, like, compared to, um, like, sub-agents, it's a different relationship model, where, like, with sub-agents, you still have your, like, your main thread that is in charge of, like, the life cycle of the others.

Alex Volkov 1:09:46

Yeah.

Dominik Kundel 1:09:46

But with the, um, with Codex just spinning up a new thread, it's able to, um... That one is, like, the moment it's been spun up- Yeah ... it's independent, right? So you can just, like, keep going. So it's

Alex Volkov 1:09:57

not locked into the UI of, like, a sub-agent anymore. Exactly. I can, like, pick it up and just-

Dominik Kundel 1:10:00

Yeah ...

Alex Volkov 1:10:01

go with it.

Dominik Kundel 1:10:01

And, and so, like, I, for example, I have one main thread for AI Engineer World's Fair, and based on that- Yes ... I spun out one for my breakout talk- Nice ... one for my expo talk.

Alex Volkov 1:10:09

Nice.

Dominik Kundel 1:10:09

But, like, they're, like, I spin them up when I need them to, and then, like, get rid of them again. Um, and so, like, a lot of people started doing this for PR reviews, for example. Yeah. Like, you go through your ent- you have one thread that goes through your entire PR backlog and then spins up separate, uh, threads for every single PR review.

Alex Volkov 1:10:25

I need that. Um, Dom, I keep saying this is the last question, and I keep lying. That's fine. So I have another one from Peter Gosta from Arena AI. Cool. Hi, Peter. A cohost of ours. Peter asks if the one, the, the model that's gonna run on Cerebras-

Dominik Kundel 1:10:40

Yeah ...

Alex Volkov 1:10:40

which, uh, let me just say, 'cause we fully didn't say this, uh, one of the models, is that Sol?

Dominik Kundel 1:10:44

Yeah, it's 5.6 Sol.

Alex Volkov 1:10:45

5.6 Sol is going to run with collaboration of, on Cerebras. Cerebras is going to y- like, stream this in an insane, like, 750 tokens per second speed. Yeah. Is it gonna be the same weights as the Sol I'm getting through OpenAI's API? Do you know? Yeah, it's the same model. It's the same exact model?

Dominik Kundel 1:11:01

Yep.

Alex Volkov 1:11:04

Folks who understand-

Dominik Kundel 1:11:05

It's not a Spark situation. It's like five, six whole

Alex Volkov 1:11:07

Yeah, this is incredible. Peter, were you satisfied with this?

Peter Gostev 1:11:11

Yeah. This is insane, because, uh, I was trying to use the Spark model when it came out, and it just like, it's, it's not very good- Yeah unfortunately. So, and now to see the actual real model, like, I think this didn't get enough hype. Like, it- This

Alex Volkov 1:11:26

did not get enough hype. I agree with

Peter Gostev 1:11:27

you ... I think the...

Alex Volkov 1:11:28

Yeah. Uh, Peter

Peter Gostev 1:11:28

saying this

Alex Volkov 1:11:29

is insane. This is

Peter Gostev 1:11:29

gonna be amazing.

Alex Volkov 1:11:30

Yeah. Mm. And w- w- we're used to Spark being kind of the lesser model. Ah, will it be multimodal on, on servers as well? Yeah. 'Cause Spark is like- Five,

Dominik Kundel 1:11:37

six, so.

Alex Volkov 1:11:38

It's fi-... Okay. Dam, uh, always a pleasure, dude. I consider you a friend of the pod. Thank you so much for coming. Uh, thank generally to OpenAI as a concept for bringing us the token billionaire as a concept. I love it. I'm assuming that you're also a token billionaire. I

Dominik Kundel 1:11:51

am a token billionaire.

Alex Volkov 1:11:52

We know you're busy. We're gonna have to let you go. Uh, please come back to

Dominik Kundel 1:11:55

us. When will we be token billionaires? When will that be a thing? Next

Alex Volkov 1:11:57

year? Yeah.

Dominik Kundel 1:11:57

Good question. Right. Maybe next year.

Alex Volkov 1:11:59

Uh, before you jump off- ... can we ask, uh, for a picture from, like, over there, from the folks, uh, with just the three of us? Thank you so much. Uh, guys Good

Dominik Kundel 1:12:12

Cool.

Alex Volkov 1:12:13

Thank you. No- Thanks for having me ... thanks. Break a leg, man. Cool. All right, folks.

Dominik Kundel 1:12:16

Cheers.

Alex Volkov 1:12:17

Wolfram.

Nisten 1:12:18

Cheers.

Wolfram Ravenwolf 1:12:18

No, thanks.

Alex Volkov 1:12:19

Um, all righty, folks. As you see, back to back to back to back to back here at The Ai Engineer. And, um, can we get the two-person shot? Thank you. Uh, back to back to back here at The Ai Engineer. Uh, and it's kind of working, right? Uh, how's the show for you folks? If you're listening, if you're tuning in, there's almost, like, 1,000 of you right now. Please comment and tell us, let's say, um, l- l- let's say, tell us which segment you enjoy better, the open AI one or the local AI one. Meanwhile, I will say, and, uh, folks who followed the show know very well, that the show has one sponsor. It's Weights & Biases by CoreWeave. And, uh, we have a corner here we call This Week's Buzz, that talks about everything that happens in the world of Weights & Biases by CoreWeave. So let's go to This Week's Buzz real quick while we set up our next guest. Can we get the, the watch out? Thank you. Folks, with us on stage, a colleague of ours. Yes. A friend from the office. Zubin Aisa, uh, a, let's say, the, the maybe one of the more AGI-built folks in- Oh, most certainly ... in, in Weights & Biases. I want you to the microphone a little bit. Please, yeah.

Zubin Aysola 1:13:48

Happy to. Yeah. I'll move it to myself.

Alex Volkov 1:13:50

Yeah.

Zubin Aysola 1:13:50

Yeah, awesome.

Alex Volkov 1:13:51

Uh, Zubin? Uh- Yeah ... first of all, uh, super quick question. How is Ai.engineer for you?

Zubin Aysola 1:13:54

It's been fantastic. I'm amazed at how technical some of the talks have been. Yeah. I, it's been absolutely sublime. Yes. Also, just meeting everybody is exciting. The US won the World Cup game yesterday. Oh,

Alex Volkov 1:14:03

let's go.

Zubin Aysola 1:14:03

I didn't get to go. Oh, let's go. I heard you did, Alex. But, uh, quite excited. What did I say? What did I say? So it's a great time to be here.

Alex Volkov 1:14:08

Yes, it's an amazing time. Um, Zubin, we launched a thing. Yeah. And we obviously talk about Weights & Biases. We wear the yellow jackets. We always cover.

Zubin Aysola 1:14:18

Yeah.

Alex Volkov 1:14:18

The more fun parts is when colleagues of ours ship and we get to, like, present this on the show. Love having them. So the last time we were here with, um, Adrian- Yeah ... and Sean, we, uh, sorry, with CVP, we talked about Hivemind.

Zubin Aysola 1:14:33

Excellent.

Alex Volkov 1:14:33

And I actually used Hivemind to get this. Folks, I told you about Hivemind before. Hivemind is an agent that sits on your laptop or multiple laptops or SSH, uh, uh, VMs, and basically counts tokens from Cursor, from Cloud Code, from, like, everywhere, and when I had to prove that I'm a token billionaire, I went to Hivemind and looked throughout all of my tokens.

Zubin Aysola 1:14:53

Oh, it's awesome.

Alex Volkov 1:14:53

And we told you-

Zubin Aysola 1:14:53

Plus, I get to talk to Alex with my Hivemind skill, where I can just talk to you.

Alex Volkov 1:14:57

Yeah. Use

Zubin Aysola 1:14:57

your persona.

Alex Volkov 1:14:58

I- It's excellent ... I, I don't know that I'm the best at prompting. I don't know if you should. So this was the last ship, and I actually used this tool. Uh, and Zubin, you have been cooking something that's called Aria?

Zubin Aysola 1:15:09

Aria. Aria's

Alex Volkov 1:15:10

the name. Tell us about Aria, man.

Zubin Aysola 1:15:11

Yeah. Aria is our Weights & Biases agent. It's like, the ultimate goal is to own the entire end-to-end flow of AI research entirely in the Weights & Biases platform. Yeah. I think we've been pretty staunch at a company I joined a year ago, but even before then certainly, Weights & Biases builds the best tools for AI engineers or people training machine learning models, tracing agents, et cetera. Uh, and so we built Aria using our own platform, but the exciting part is that Aria's just a co-pilot assistant in the, you know, to use every possible buzzword, in the Weights & Biases UI, and all it does basically is do auto research for you. Yes. So at my talk at 11:10 today, I'm gonna have Aria-

Alex Volkov 1:15:44

Wait. The, the Karpathy named-

Zubin Aysola 1:15:47

Oh, yeah

Alex Volkov 1:15:48

self-improvement in the, in the home loop auto research?

Zubin Aysola 1:15:51

Oh, yeah. The holy grail of actual AI research and the direction that certainly the labs are moving in, and we just want to give it to everybody else. Yes. You know? So that's the product that we maintain- Yes ... and I lead the science side for, and it's awesome. It's, it's exciting. We're giving it out to everybody.

Alex Volkov 1:16:03

Yeah. It

Zubin Aysola 1:16:04

landed at GA on Monday.

Alex Volkov 1:16:05

How do folks access it, Zubin?

Zubin Aysola 1:16:06

You know, in the Weights & Biases UI, you just click this little blue button at the top right corner. Yes. It might be a different color if you're in light mode, but I'm a dark mode user.

Alex Volkov 1:16:14

Yes.

Zubin Aysola 1:16:15

You click this little top right button, you chat with Aria, and you can have it do anything for you. Yeah. So at my talk today, I'll present Aria, where it reads its own production trace logs, updates its own prompts live, and runs offline evaluations in our RL simulation environment for itself to improve. Yeah. Uh, but that's, you know, maybe a more sophisticated version of it. If you wanna do anything else, like reap your loss curves or figure out where your gradients are, you know, exploding or, you know, imploding, you should just ask Aria

Alex Volkov 1:16:40

Should... Folks, you should just ask Aria. Uh-

Zubin Aysola 1:16:43

I think Just Ask Aria is really gonna be the tagline It's

Alex Volkov 1:16:45

a good tagline, dude.

Zubin Aysola 1:16:46

Yeah, right?

Alex Volkov 1:16:46

Okay. You wanna look at the camera-

Zubin Aysola 1:16:48

Please ...

Alex Volkov 1:16:48

and say?

Zubin Aysola 1:16:49

Just ask Aria.

Alex Volkov 1:16:50

Let's go. All right. Zubin, thank you so much for coming up. Thank you. We know you're a busy dude. We know you, like representing- No, this is excellent Break a leg.

Zubin Aysola 1:16:56

Thank you.

Alex Volkov 1:16:57

Uh, give them hell. Thought I won't. Show them that we're token billionaires here at Weights & Biases. Token billionaires. I gotta get

Zubin Aysola 1:17:01

me one of those. That's

Alex Volkov 1:17:02

a- Dude, you need to... You need HighMon- 'Cause if you're

Zubin Aysola 1:17:03

pumping token billionaires, I'm pumping token

Alex Volkov 1:17:05

billionaires. Like- Oh, I'm, I'm 100% sure that you deserve this, man. Um, and also, uh, thank you for being such an energetic and passionate-

Zubin Aysola 1:17:14

You too ...

Alex Volkov 1:17:14

teammate that brings a lot of, like, AI excitement. We,

Zubin Aysola 1:17:18

you know- There are few better places. I've worked at a couple of jobs in my time. Yeah. I think there's few better places that I could imagine working. It's been sublime.

Alex Volkov 1:17:24

We also have some tokens also. So

Zubin Aysola 1:17:26

that's- We

Alex Volkov 1:17:26

also

Zubin Aysola 1:17:26

have some tokens ...

Alex Volkov 1:17:27

another good... Oh, maybe tell me, te- tell me about kind of the, the... We have, like, one or two more minutes. Please.

Zubin Aysola 1:17:31

Yeah, yeah.

Alex Volkov 1:17:31

Tell me about the backend of Aria. Like, what is it running? Is it running on CW Inference? Is it running, like, on the main models?

Zubin Aysola 1:17:36

Yeah. I think, I think the exciting part about Aria at present is that it's running on some of the main models. But the thing that are cool, that, that is cool about our pipeline is that, you know, with the traces that you generate with Aria itself, if you use it a bunch, we can fine-tune models for exactly your workflows using CoreWeave serverless training- Yeah ... or Weights & Biases serverless training, I suppose. Yeah. Training running on CoreWeave sandbox infrastructure, et cetera. Like, the full pipeline is all CoreWeave homeworld, right? Uh, which is, I think, very exciting. Yeah. We're using a bunch of other tech at the pre- at present, but I think increasingly dogfooding and using our own service- Yeah ... in the same way that I use Weights & Biases to train my own agents- Yes,

Alex Volkov 1:18:07

sir

Zubin Aysola 1:18:08

is the dream Um, what's nice for me is that certainly I le- build Aria without ever leaving the platform. Yes. And so we wanna give that to everybody else.

Alex Volkov 1:18:15

Uh, Zubin, last question for you.

Zubin Aysola 1:18:16

Please.

Alex Volkov 1:18:17

Um, my talk was about the ZL continuum-

Zubin Aysola 1:18:19

Yeah ...

Alex Volkov 1:18:20

between Mario Zechner, who read every fucking line of critical code, to Ryan Lopopo, who coined, uh, a token billionaire- Yeah and said, "Hey, build a system that builds a system, and your attention is, should go towards, like, the repeating bugs," et cetera. Yeah. So

Nisten 1:18:32

the ZL

Alex Volkov 1:18:33

continuum is about, like, tasks and less them so people. Uh, you're doing a lot of agentic coding, obviously.

Zubin Aysola 1:18:38

Yeah.

Alex Volkov 1:18:39

How much of that is human readable at 2026? And do you expect that to change significantly with Fable and 5.6 coming up?

Zubin Aysola 1:18:45

You know, I, I don't know if it'll change significantly, although I haven't written a line of code in eight months. But I think that the benefit of, you know, thinking about- You haven't written

Alex Volkov 1:18:52

one, but have you read-

Zubin Aysola 1:18:52

I haven't written one. Oh, but I've pushed a lot of code- Yes ... most certainly, you know. And even that I've, you know, my agents have pushed a lot of code on my behalf.

Alex Volkov 1:18:57

Yeah,

Zubin Aysola 1:18:57

that's true. Yeah. I think the exciting part is you, we get to move upwards in the stack, right? Like, if you think, if I think about my career as an engineer, it's like, first I was fixing little bugs, and I was writing little tasks and running little tests. And then, you know, I expanded to, like, designing little systems or whatever else, and now it's like, okay, can we lead this team to sort of, like, build this agent as a full team?

Nisten 1:19:14

Yeah.

Zubin Aysola 1:19:15

Everybody gets to move up that stack, right? Yes, sir. We all level ourselves up one level.

Alex Volkov 1:19:18

Yeah.

Zubin Aysola 1:19:19

So, you know, I mean, maybe I should read a little bit more code, but I think that it just gets easier and easier- Yeah ... and faster and faster and better and better. That's ultimately-

Alex Volkov 1:19:26

I, I call this, uh, the capability drift. I have an arrow there- Yeah ... and it says, like, the capability drifts us towards L.

Zubin Aysola 1:19:31

And what I love is that if you set them on autopilot and you just let them run- Mm-hmm ... they still write shitty software. So I am still useful. Just a little bit useful, but with Fable, maybe not, you know?

Alex Volkov 1:19:40

Bro, everybody says Fable is amazing. Hold on.

Zubin Aysola 1:19:42

Please.

Alex Volkov 1:19:43

Uh, and I used Fable, and this is, like, the third time I repeated this, but, like, for every guest, I will tell this, uh, again. Ah, it's good. So Fable have prepped the show. This is, like, the, the run of show for Fable.

Nisten 1:19:52

Yeah.

Alex Volkov 1:19:52

And, uh, I can show you that it placed, like, the, the timings are correct. Nice. But these people should have been here, and, uh, literally Dominic comes at 9:30, and 9:45 is This Week's Possible. You showed up.

Nisten 1:20:06

Here.

Alex Volkov 1:20:07

You should have been after Dominic based on the order. Yes, sir. So despite Fable being back, I'm so happy that it's back, and despite it being the next capability, read the important stuff, folks. It's important. Your attention is not going anywhere. If anything, it's more important now to know where to place your attention. Yeah,

Zubin Aysola 1:20:23

you just wanna do more.

Alex Volkov 1:20:24

Yeah.

Zubin Aysola 1:20:25

But, you know, you still need to be in the loop.

Alex Volkov 1:20:26

Yeah.

Zubin Aysola 1:20:27

It's just a different kind of loop.

Alex Volkov 1:20:28

And you need to look at, uh, CoreWeave Aria- Yeah ... which Zubin launched. Zubin, thank you so much. Thanks so much. Congrats on the ship, bro. Thank you. I'm so happy that the labs are shipping something incredible.

Zubin Aysola 1:20:37

Thank you, guys.

Alex Volkov 1:20:37

Thank you for coming up. Thank you for having me. And, uh, folks, definitely check out Zubin's talk, and check out, uh, uh, CoreWeave Aria, and come by our booth as well. We have a booth here. We have a, like, a humanoid robot walking around in the, one of these yellow things. We

Zubin Aysola 1:20:48

got a pretend humanoid robot walking around.

Alex Volkov 1:20:50

Oh, that's true.

Zubin Aysola 1:20:50

Come say hi.

Alex Volkov 1:20:51

Zubin, I can see you're a friend of the pod.

Zubin Aysola 1:20:53

So good.

Alex Volkov 1:20:53

Anytime.

Zubin Aysola 1:20:54

Exciting.

Alex Volkov 1:20:54

Thank you, man. Thank you for coming up. All right, folks, a short break. Uh, as you may have seen, I know I talk fast.

Zubin Aysola 1:21:04

Say.

Alex Volkov 1:21:04

This motherfucker - But much faster ... is it 200 words per minute or something? Uh, so this is Zubin Isola from the Weights & Biases CoreWeave team, uh, in charge of Aria. Have you used Aria already?

Wolfram Ravenwolf 1:21:17

I have used Aria already.

Alex Volkov 1:21:19

Yeah.

Wolfram Ravenwolf 1:21:19

And I'm excited to use my own agent to connect to Aria to do the stuff on the backend that my agent has no access to directly- Yes because I think that's where we are going. You have your personal agent that knows about what you are doing, and you have all these specialized agents, like your assistant hiring a data scientist to do stuff.

Alex Volkov 1:21:36

Yeah. So folks, definitely check it out. Uh, and our next guest is already here. I'm so excited. Stef. Good morning. Thank you so much for coming. Uh, we do need you to sit closer to the mic. Yeah. So folks will arrange this for you. Uh, and you should feel comfortable.

Stefania Druga 1:21:51

Perfect.

Alex Volkov 1:21:51

Um-

Stefania Druga 1:21:52

Good morning.

Alex Volkov 1:21:53

Good morning. How are you?

Stefania Druga 1:21:54

I'm good. I got a coffee. Uh, and yeah, I'm so happy to meet you in person again. Does it work?

Alex Volkov 1:22:00

Uh, hold on. Yes. Let's, let's do a wide shot. Let's keep it on wide, okay? And then when she talks, let's zoom. Um, Stefania Druga. Yes. Uh, you've been on the pod maybe in audio form. I don't remember if we put you ever on video. No. So welcome to the video edition of ThursdAI. This is a bit different from the last time you were on And we're having a lot of fun. Yes. Uh, and we're also... AI engineering is different. We, we've met at the AI Engineers previously. Uh, you've just been to the recent one in Sing- Singapore as well. And we've had you on the pod, and we've talked to you about, uh, education for children and bots and, and stuff like that. And, uh, since then, uh, you're no longer a research scientist at DeepMind, you are now with Sakana.

AI 1:22:42

I'm with Sakana.

Alex Volkov 1:22:43

And when we covered Sakana AI multiple times, and we talked about David, then we talked about, um, um, uh, Leon? Yep. Yeah, Leon James, the co-transformer.

Dominik Kundel 1:22:52

Mic. It's

Alex Volkov 1:22:52

closer. Yeah.

Stefania Druga 1:22:53

Yeah.

Alex Volkov 1:22:54

You, you can look there when you answer and, and just... Yeah, I, I'm sorry it's not comfortable, but it's okay. It's okay. Live show. Um, so we talked about Sakana. We talked about, uh, multiple models. And last week, or two weeks ago, or last week, remind me. Last week. Last week. Time moves so fast. A lot has happened since last week. Yeah. Um, we talked about Fugu.

Nisten 1:23:11

Yes.

Alex Volkov 1:23:11

And Wolfram was not able to pronounce it, so it sounded like we're doing, uh, superlatives on air, and we talked about Fugu. And then it clicked here that you're working at Sakana. We have a friend in the lab that shipped this, and I was like, "Ah, we should have Steph." But because you're based in Japan, it wasn't super practical, and now you're here, and I'm super excited. Thank you so much for coming. What is Fugu?

Stefania Druga 1:23:31

Thanks for having me. Uh, it's so nice to see you both. Uh, Fugu is a new router model which we released last week. Yeah. Uh, and... But I'm gonna announce new things today just for the pod. Yeah. So, uh, it's a router model that al- Yes ... allows you to basically pick the best model for the job that you are doing. Yeah. So it's an orchestrator, and it has very, very good performance. So, uh, we tested it on all sorts of benchmarks. People are actually building incredible demos with it, like we had auto research running with Fugu and discovering new, uh, optimization for GPU implementation. We had Rubik cube solving. I used it a lot for robots, like, uh, controlling robots and- Um, I'm having a lot of fun. People are having a lot of fun with Fugu. Um, you can use it now in OpenRouter, like we integrate with Vercel, and we actually just announced today, like, that it's available in Codex and also OpenCode. Yeah. So basically anywhere you work, you can try it out. Um, and I wanna talk a little bit about the technology behind it.

Alex Volkov 1:24:35

Yes, please.

Stefania Druga 1:24:35

Um-

Alex Volkov 1:24:36

Remember that our audience is technical. We love a deep dive. We asked Cyril about pruning models and how fast this takes, so yeah.

Stefania Druga 1:24:41

Perfect. Yeah. So there's two papers that Fugu is based on, and these are published at ICLR this year. Yeah. And people can go and check them out. So there's a paper on Trinity, which is, like, a very lightweight model that is, um, learning how to orchestrate between a thinker, a verifier, and, um, an action model, and it's, it's actually an evolutionary model. So if you are into evolutionary models and wanna learn more about the Trinity conductor piece, um, you should read that paper. And then there's, uh, the conductor, the conductor, um, paper from ICLR, which is basically trained on rollouts from, um, how do you actually look at workflows so you know which model you're gonna call for a different part of a workflow.

AI 1:25:30

Yeah.

Stefania Druga 1:25:31

And that one was trained through reinforcement learning. But both of these papers are available. People can read it. They were peer reviewed.

AI 1:25:37

Yeah.

Stefania Druga 1:25:37

And then Fugu combines the two. Um, so it actually combines the orc- like the Trinity part to learn how to orchestrate, and basically it's not just learning who to t- send the task for, but it's recursive. So it can rewrite the prompt, it can verify the outputs before deci- deciding, like, I'm gonna use this model or this other model.

Alex Volkov 1:25:54

Yeah.

Stefania Druga 1:25:55

And the philosophy behind all of this work, like, I don't know if you knew, but Sakana means fish in Japanese.

Alex Volkov 1:26:01

Yes. Yeah,

Stefania Druga 1:26:01

yeah. And the whole vision for the company was very much built on, like, collaboration, open-endedness. Evolution models, yeah. Evolution models. So I think that's quite unique, uh, and that's also the approach for Fugu. The idea is- Yeah ... that there isn't, like, one model to win it all, but that you can actually get much better performance by picking the which model is good for what task- Yeah ... and orchestrating that.

Alex Volkov 1:26:25

So we're excited about Fugu. We're excited about model routers. It's very funny, uh, just the week before Fugu, OpenRouter released a router.

Stefania Druga 1:26:32

Right,

Alex Volkov 1:26:32

it was- Which I found it hilarious.

Stefania Druga 1:26:34

Yeah, the timing, the timing was funny. What about the name?

Wolfram Ravenwolf 1:26:36

Um, uh

Alex Volkov 1:26:38

Carbon something

Wolfram Ravenwolf 1:26:38

Yeah, mixture of agents ...

Alex Volkov 1:26:40

mixture of agents. Okay. And the timing was great. And then they're like, "Oh, there's more routers, let's add you to, to the, to, to the thing." And have you been working with Fugu then in your harnesses? Like, how does that apply in your work, uh, like process? When do you plug in Fugu?

Stefania Druga 1:26:54

Yeah. I, we actually use it for many of the research projects that I'm working on right now. Yeah. So, um, I'm part of their RSI lab at Sakana. Yeah. It's like recursive self-improvement systems. So a lot of that is, like, in the future of auto research and future of research assistance. Yeah. So of course we use Fugu as part of it. Um, there's a project that I'm leading on AI for science, like predicting typhoons and, uh, predicting El Niño

Alex Volkov 1:27:19

phenomena. Ooh,

Stefania Druga 1:27:20

okay. So for physics, like we actually know that the current models, including like existing world models, are not always very good, especially when you have very chaotic events. So we found that Fugu helped a lot in knowing when to distribute the task for a numerical model versus a more like, uh, fuzzy logic and web search. And we had best results with Fugu for, for that work on typhoon prevention, and actually that demo is included in one of the demos on the public repo.

Alex Volkov 1:27:49

Oh, that's super cool. Um, Stef, I have a bunch of questions for you.

Stefania Druga 1:27:52

Yeah.

Alex Volkov 1:27:53

So we talked about Fugu. Yeah. You, you represented, uh, Sakana great. You're working with legends, right? You're working with Leon, you're working with David. Mm-hmm. Uh, you're working in Japan, way outside of the SF bubble. Yeah. And then you're landing in the Ai.engineer and kinda like y- you feel kind of the, the vibes here, e- et cetera. Tell me about, like, how that's going. Like, how is it working over there? How is it working... H- how do people on the street treat AI? Like, it's very interesting to us because, like, we are kind of in a bubble. Denver's not that much of a bubble, but, uh, ju- just give me your insight into like how is it working over there, and how do people treat AI in Japan? I would love to know more details from just Stef the person and not just Stef the scientist.

Stefania Druga 1:28:31

Of course. Yeah. Uh, thanks for asking that question. Of

Alex Volkov 1:28:34

course.

Stefania Druga 1:28:35

So I moved to Japan a year ago, a year and a half, and it's been like... I, I love it. I love-

Alex Volkov 1:28:41

And also it's getting louder in here. If

Stefania Druga 1:28:43

you could try to speak- So I'm gonna have to speak up. Okay ...

Alex Volkov 1:28:45

yes, directly and

Stefania Druga 1:28:46

louder. Yes. Oh, perfect. Yeah. So I moved to Japan a year and a half ago. People in Japan love AI. They love technology. They have a very hopeful and positive, uh, outlook towards technology, and I think that's very nice. And I know David, um, one of the reasons he wanted to build Sakana in Japan is because of this more positive outlook towards technology- Yeah ... and try to build it in a way that it benefits k- community

Alex Volkov 1:29:10

David is David Ha, and his nickname on Twitter is hardmaru.

Stefania Druga 1:29:15

Hardmaru, yes.

Alex Volkov 1:29:15

And, and he's, like, a very known, very, uh, very celeb, like, type of scientist. Like, a- and also hot takes.

Stefania Druga 1:29:22

Yes.

Alex Volkov 1:29:22

I love David's hot takes. Yeah, so please go ahead.

Stefania Druga 1:29:24

Yeah, so, uh, David Ha, um, l- picked Japan for this, and I- it's also, like, a really big focus on sovereign AI. So I love being outside of the circle here, although I used to live here, because I think it's important for different countries and, uh, different communities around the world to have their own solutions- Yes to take their own stab at, like, pushing the frontier research beyond the transformers. And you know, it's funny because Cleon, who's our CTO, Cleon Jones, was part of the attention is all you need.

AI 1:29:55

Yes.

Stefania Druga 1:29:55

But now he's, like, the biggest proponent for, like, we need new architectures, we need new algorithms beyond the transformer.

AI 1:30:01

Yeah.

Stefania Druga 1:30:01

So that's also, like, we have both of those, like, practical solutions like FUGU for- Yeah ... how do we do a better job with current workflows in more sustainable ways, but also forward-looking, pushing the frontier, like, what comes after the transformer? Um- Yeah ... so I'm having a lot of fun. And, you know, it's, uh, Japanese is a hard language, so it's a good challenge for my brain to try to learn it. Um-

Alex Volkov 1:30:22

How, um, how, how well-versed are you in Japanese at this

Stefania Druga 1:30:26

point? Oh .

Alex Volkov 1:30:29

I, I did not understand any of that. Um, Stef, uh, let's talk about Ai.engineer as a conference, okay? Yeah. Uh, we're here. We met, I believe, also at the Ai.engineer, um-

Stefania Druga 1:30:39

At the first edition ...

Alex Volkov 1:30:40

at the first one, which is, like, significantly, significantly a different beast than it is right now.

Stefania Druga 1:30:45

Than it is now, yep.

Alex Volkov 1:30:46

Um, talk to me about the hallway track, the people that you meet here. What do you talk about with them? What is the concern? What is kind of the, the active process of what's going on?

Stefania Druga 1:30:57

Yeah. So I was very happy to see more work on local models. Yeah. My talk yesterday actually was about memory for long horizon tasks on local models. Yeah. But there were several... And, and I know you had people on the pod this morning.

Alex Volkov 1:31:10

Oh, yeah, we had all of the local.ai

Stefania Druga 1:31:13

folks. That was this morning. Were coming, yeah. Nice. Um, so I'm, I'm very excited to see that now we can have DeepSeek V4 Flash that we can run on a Mac M3 Ultra. It seems like the bottleneck is access to hardware, actually.

AI 1:31:24

Yeah.

Stefania Druga 1:31:24

Um, but hopefully that will get solved. Uh, I know there was some discussion about hardware access as well, or just in general hardware.

AI 1:31:30

Yes.

Stefania Druga 1:31:30

Um, but yeah, like, local models were big. Memory seems to be, like, just starting. Like, there's so much to learn because basically this is the insight. Like, the type of tasks we're g- able to do or tackle now with agentic AI are only becoming longer and longer horizon. So then you start to get, like, the context rot and the context drift. So having better memory systems that are open source is gonna be crucial. And, and the memory track for me was, like, super interesting. Um, and I, I, I loved learning from, uh, from that as well. Um, yeah, what else? Many more, like- Non-sequitur, but, uh, I was so happy to see many more women. Um-

Alex Volkov 1:32:09

Yes ...

Stefania Druga 1:32:10

I was surprised because-

Alex Volkov 1:32:11

I'm happy to see more women on the Thursd AI panel, so thank you for coming Thank you, thank you for breaking this pattern for us, Steph. I really appreciate it.

Stefania Druga 1:32:18

Uh, I ha- happy to always come back. Yes. But in general, like, it's, it's been... And also people from outside the field. Like, I met people who are working, uh, in recycling business- Yeah ... and coming to learn how to use AI to optimize, like, their-

Alex Volkov 1:32:31

Incredible ...

Stefania Druga 1:32:32

um, their efforts. So all sorts of, like, yeah, robotics, local models, memory, reconnecting with old friends, like, and making new friends. Yeah. Um, I haven't slept much in the past three days because I'm super jet lagged, and I was running my experiments on my local machine in Japan and, um, but you know, like you, we, like you said, I think, Wolfram, like, we, we live on tokens and, uh, coffee or what was your quote?

Wolfram Ravenwolf 1:32:59

Yeah. Uh, tokens are the new caffeine for

Stefania Druga 1:33:03

us. Tokens are the new caffeine, yeah Smart tokens

Alex Volkov 1:33:06

Um, yesterday- Mm ... was a big day in AI, uh, because Fable's back.

Stefania Druga 1:33:12

Yes. Have you played with it?

Alex Volkov 1:33:13

Oh-

Stefania Druga 1:33:14

Uh, you used it for this

Alex Volkov 1:33:15

one ... I used it for this. I used it for... I'm probably gonna use it for the, the post-show notes. I used it for the research for the show- Mm-hmm in addition to this. Um, are you a big Fable fan? What's your kind of like go-to right now? And I know, like, Fugu's also using, like, production stuff as well. There's a route to Fable already. Yeah. Can you tell, tell us about, like, your experience with Frontier, where, where you are and what do you use?

Stefania Druga 1:33:34

I got to test it a little bit when it was launched, but then, uh, of course, I lost access. Uh Yoinked

Alex Volkov 1:33:39

away by the US government and, and given back by the US government. Thank you.

Stefania Druga 1:33:43

Um, I think for me it really re-emphasized the need of sovereign AI because it feels like so, um, disempowering, right? Like, to get access to a powerful model, wanna try the capabilities, and then, like, all of a sudden it goes away. Or like you talked about, like the weights being modified behind the scenes, and I really do not like that. Um, I, uh, am running some evaluations on it right now. I don't have the results yet. But, um, yeah, in general, like, I'm a, a little bit disappointed by the fact that it blocks, like, AI questions and AI research because that would be my primary goal for using it. Yes. So-

Alex Volkov 1:34:18

At least it blocks now. Yeah. When they released it, they're like, "Oh, we're gonna make the model dumber without telling you about this."

Stefania Druga 1:34:24

Yeah.

Alex Volkov 1:34:25

Like, what the fuck is that? What the fu- No.

Stefania Druga 1:34:26

So, so, so there's a joke that, like, if it answers your question about research, that means your research is bad, so

Wolfram Ravenwolf 1:34:33

Oh. But it sets a bad precedent that- It's not,

Alex Volkov 1:34:36

it's not frontier

Stefania Druga 1:34:37

enough ... something like that It's a terrible precedent. It's not frontier enough. It's a terrible precedent. Um, and, and I actually think the direction is to have router models that you were mentioning, like, you want your, you know, family, I want my family to start using these things in a safe way, and I think the future is to have a router model that can decide, like, what stays on device and what goes to the cloud and which cloud, and being able to control that better. Yeah. Um, so I didn't answer your Fable question, but like, uh-

Alex Volkov 1:35:02

You're using it and you're testing it.

Stefania Druga 1:35:04

I'm testing it.

Alex Volkov 1:35:05

That's what I'm hearing. Yes.

Stefania Druga 1:35:06

Yes.

Alex Volkov 1:35:07

All righty. Um, I think- The cool thing about Sakana is kind of like you, you guys out of the distribution and also trying, like, new, like, algorithms, approaches. Um, anything we should look out for?

Stefania Druga 1:35:22

Yes.

Alex Volkov 1:35:23

Okay.

Stefania Druga 1:35:24

Uh, check- What

Alex Volkov 1:35:25

should we look out for?

Stefania Druga 1:35:26

Check out the work from, uh, my colleague Jeff Seely on SHEEFs. So the insight there

Alex Volkov 1:35:32

is- C- can you s- spell that? SHEEFs?

Stefania Druga 1:35:33

Um, S-H-E-E-F. SHEEFs. SHEEF.

Alex Volkov 1:35:38

Ooh, okay. Um- I never heard that term. Te- Um- Please educate me ...

Stefania Druga 1:35:41

so it's more like thinking about, um, deep neural networks in geometric terms and understanding the geometry.

Alex Volkov 1:35:48

Mm-hmm.

Stefania Druga 1:35:48

And you wanna, like, work with an architecture where you can synchronize, like, the local information and the global information in real time, and that's what SHEEFs allows you to do. And he published a couple of papers on this. There's more coming.

AI 1:36:02

Yeah.

Stefania Druga 1:36:02

But I think that's a very promising direction. Like, the, the insight is, like, how can you create systems where you more readily update your global information from the local updates in a verifiable way?

Alex Volkov 1:36:15

Yeah. I have a follow-up for you that's unrelated to your last question. I just remembered. Uh, last time we talked, we talked about AI in education.

Stefania Druga 1:36:24

Yes.

Alex Volkov 1:36:25

And- I

Stefania Druga 1:36:26

a lot of... Yes ...

Alex Volkov 1:36:26

and we're moving in an insane pace here. Yes. And, uh, I don't know if, if you participated or saw, um, um, Ai.engineer for the first time had the kids thing, just before the conference.

Stefania Druga 1:36:38

Yeah.

Alex Volkov 1:36:39

And obviously, kids are incredibly important to bring into this world of AI. So would love your thoughts on where we are, how kids are using. Should, should they have access to smaller models or bigger? I, I literally have no idea. Besides the fact that my, my kids know that I'm in AI. Yeah. And sometimes in the car we talk with, like, ChatGPT. Um, they don't like Grok, by the way. Yes. They prefer, they prefer ChatGPT. Oh, that's

Stefania Druga 1:37:05

interesting.

Alex Volkov 1:37:05

Uh, it's very interesting. I have no idea why. Um, talk to me about, like, how you see AI in education right now. Like, how... wh- where should we look at? Should there be a track for AI education next year? Um- Yes,

Stefania Druga 1:37:14

please. Yeah. Swyx, if you're listening.

Alex Volkov 1:37:17

Swyx is gonna come and, like, I, I'll... If you want the message, we'll write it down here. Like, "Hey, Swyx." We

Stefania Druga 1:37:21

need a track. Yes. Yes. Uh, I know the AI for kids, uh, AI engineering for kids this year was very successful. I'm so happy- Yeah ... uh, that folks organized that. I was stuck in immigration, I couldn't make it in time. Oh, no way. But, um, but I made it. So, uh, where it's going, I think there's, like, definitely a need for more Socratic AI tools for youngsters. Because as we know, even for experts, like, we talked a lot this year about do you still read the code or not? Who's on the hook for being responsible for what, what gets, gets shipped?

Alex Volkov 1:37:52

Yeah.

Stefania Druga 1:37:52

How do we all fight, like, the, the brain rot, right? Like, because it's so easy to delegate tasks now. What do you choose to still, like, learn and truly understand? That's much harder to do for young people because they don't know what they don't know.

Alex Volkov 1:38:04

They don't have the alternative. They don't

Stefania Druga 1:38:06

have... Yeah. We,

Alex Volkov 1:38:06

we grew up before internet. Right.

Stefania Druga 1:38:07

Right. So-

Alex Volkov 1:38:08

Not to age us. Oh, y-

Stefania Druga 1:38:11

you... Totally. Yes. Uh, so I think it's, like, for young people, having Socratic tools that don't give them... Like, we don't want answer machines. For kids? Yeah We actually want like a powerful, like, buddy that they can learn with and play with. So it needs to be Socratic and that's why I built like a Socratic copilot for Scratch. Yeah. He never does the code for them, he just asks them questions. Yeah. Helps them get unstuck. Uh, and I know, Nisten, I promised you a repo link, so I will definitely share that. I was holding off because I was hoping Scratch itself, like I was part of the Scratch team at MIT, would launch something like this. Yeah. It's not out yet, so I'm just gonna open source what I built.

Alex Volkov 1:38:46

Please do.

Stefania Druga 1:38:46

Um, yeah.

Alex Volkov 1:38:47

I gave my kid, uh, recently Brilliant.org launched like a education platform. Yes. And they have, uh, Kibo or whatever T, but I don't remember, folks, they didn't pay me enough. Mm-hmm. Uh, they didn't pay me anything. Yeah. Uh, nobody pays me anything, uh, besides CoreWeave. Uh, to give my kid, like, to, to try it out and, like, that AI agent does not give the answer. Mm-hmm. So we engineers, I think it's a very important difference. We as engineers and talking billionaires, we want the, like, I don't care, don't give me code. I wanna know, I wanna test, yeah, yeah. Yeah, don't teach me code. Send me a pull request of showing me how this actually works. Yeah. I don't care about your process, whatever. For the kids learning, it's a completely different thing.

Nisten 1:39:21

Exactly.

Alex Volkov 1:39:21

They need to learn, they don't need to be given answers. If they're given answers, they're gonna be lazy as .

Nisten 1:39:25

Mm-hmm.

Alex Volkov 1:39:25

So I think the Socratic method is incredible, and I think, uh, generally the work you do, I'm very happy that we have a friend in Japan, in the labs working and telling us about this. Um, next time I definitely need to reach out and we'll figure out some time, Stef, 'cause y- your voice is very important on this Thursd AI Panel. I would love to have more of you here.

Stefania Druga 1:39:44

I will wake up in the middle of the night from Tokyo.

Alex Volkov 1:39:46

Yes,

Stefania Druga 1:39:46

uh- Um, and hope you come to AI Engineering Tokyo.

Alex Volkov 1:39:49

I hear-

Stefania Druga 1:39:50

Someone, someone told me that that might come soon.

Alex Volkov 1:39:52

Mr. Swyx, who's gonna join here very soon, is going to be on the hook of answering of all of the next AI engineers. Including the one that's proposed to be in Japan, obviously we'll cover that once we get there. Um, Stef, I talked about the gradient between people who still read code- Mm-hmm ... and very important code, and on the other end, talking billionaires that just cannot afford to read all of it, right?

Nisten 1:40:20

Yeah.

Alex Volkov 1:40:20

Um, and then I kind of talked about it's not Wait, no, we have more. Okay, I'll, I'll come talk to you in a second, okay? Um, a-and then we talked about, um, where people are on the spectrum, and then basically I talked in my talk, just like it's a fake. It's not really where people are based on tasks. Mm-hmm. Still though, in the age of AI in 2026, and maybe looking forward, I would love to hear, like, a forward prediction. Mm-hmm. Do you still read outputs in code? Do you still think it's important? And how will this change with, like, Fable and Fugu 2, let's say?

Stefania Druga 1:40:54

Yeah. I, I, I listened to a podcast, so I, I, I was thinking about this question. Yeah. I definitely think, uh, those two are not mutually exclusive, and it definitely depends on the workflow. I still read code for critical... like, mission critical, uh, projects, but there are aspects of my work that I delegate where I don't look at it. Um, and I know Wolfram, we talked yesterday about how to use, like, agent in pick your favorite open source or non-open source, like, messaging app. And, uh, I've been a really big fan of the Codex app since, like, it's open source, which I really appreciate. Yeah. And then also I can hack it and add things on top of it, same with PI. Um, so I think the, the, the part that is interesting is how much the interface that we're using to will change what we delegate and what we don't delegate. But we definitely need better tools for visualizing the differences, because in GitHub we had GitHub diff, right? Yeah. But you cannot apply GitHub diff when you are looking at thousands of files. Um, so what's gonna be the next level of abstraction for verifying the outputs or inspecting the outputs, or having maybe, like, a summary of core concepts or core decisions- Yeah ... and keeping track of that? Like, I haven't seen anything in that space yet, and I think that's gonna be the next step towards having better, like, human in the loop, better verification. Um, and yeah, like, not delegating where n- or reading everything. Like, we need something in between.

Alex Volkov 1:42:21

Yep. Attention is still very much required. Yes. I, I think the attention is all we need is just, like, a very pernicious title- Mm-hmm ... for human-

Stefania Druga 1:42:29

Attention economy? Yes ...

Alex Volkov 1:42:31

for humans going forward. Yes. Yeah. Stef- Thank you Stefania Dugan, thank you so much for coming up. Thank you. It's always a pleasure to have you here in video form and, and going up. Uh, we have our next guest coming, and, uh, you're gonna talk as well. Uh, so good luck. Thank you. Thank you so much

Stefania Druga 1:42:43

for coming. See you.

Alex Volkov 1:42:43

Thank you. See you s- All right. Next up we have Philipp Schmidt, and I'll let Wolfram take this one, okay? Sit here. Uh, let's move you, like, kind of here. Where should I sit? Uh, you're gonna sit here. Let's just make sure that... And then you're gonna talk to this mic very close, okay? Okay. You can move the mic. Wolfram, you're here. And folks, I will be right back after this interview.

Philipp Schmid 1:43:09

Okay

Wolfram Ravenwolf 1:43:15

So taking over for Alex here. Yeah. Hi, Philipp. Nice to meet you again. Nice to

Philipp Schmid 1:43:19

meet you. Always a pleasure. I mean, we can also do it in German, right?

Wolfram Ravenwolf 1:43:21

Yeah, we could. And then we use, uh, the translation from Google to- Yeah ... live translate everything. We, we did a demo of that recently.

Philipp Schmid 1:43:29

Nice.

Wolfram Ravenwolf 1:43:31

So, um, yeah, nice... welcome to the show, and thanks for having you here. So I've seen you have, uh, two releases that you- Yes ... just made. Please, uh, tell us all about them. Which one do you want to go first?

Philipp Schmid 1:43:43

I mean, we, we had multiple releases this week. I guess the most exciting ones- Ha ... were, like, NanoBanana 2 Lite and OmniFlash. Uh, two new models, both available across all of, like, the Google services from Google Search to AI Studio to Google Cloud. And for me personally, I am very excited about OmniFlash. Uh, I mean, it was announced at Google I/O. It was already available, um, back then, but now it's available in the, uh, Gemini API, so you can code and, like, edit your videos, create new videos. And I've been playing with it a little bit since the release. Like, I mean, now we are on the con- conference, but it's, like, so many cool things you can do. And for me, like, the most exciting thing is really, um, that you can edit, um, videos with, uh, such a high precision, uh, which feels like wasn't possible before. I, like, posted something I think, like, a few hours ago on how you can make... uh, change the, the daytime. So there was, like, a nice video of, like, a US city at night where you had cars driving and, like, a m- a par- um, parking spot, and then I just prompted it to make it daytime, and it changed the light, it changed the sky, it adjusted all of, like, the, the shadows, and it, like, just worked with, like, a single prompt.

Wolfram Ravenwolf 1:45:01

And, uh, Omni is a new model family. Yeah. Is it a completely new model? Is Gemini inside or...?

Philipp Schmid 1:45:07

Yeah, so Omni is, like, our first new model of the Omni family with the goal to create an, as the name suggests, any to any model, starting with, uh, video output. Uh, but Omni already... or Gemini Omni, what we call it, can already, uh, accept text inputs and, um, image inputs or video inputs, so you can already provide different modalities, and currently you can, like, create new videos out of it, uh, hopefully with, with much more in the future.

Wolfram Ravenwolf 1:45:35

And, uh, do you think the, to from-- I'm not sure if you are allowed to talk about it. But do you think there will be a consolidation or more, um, expansion of the model families?

Philipp Schmid 1:45:44

I think it's hard to tell or hard to know. Uh, I'm, I'm very, like, honored, and it's very cool to see that, like, Deep's, uh, DeepMind is, like, doing research in all, all areas. And I think no one really knows what the future will bring and what the future will hold. And for me, like, the Omni model family is, like, very exciting, uh, step forward to, like, what might be possible in the, in the future. And hopefully we can, like, even, like, learn something from it and, like, generalize even more across modalities with it.

Wolfram Ravenwolf 1:46:14

Mm-hmm. And OmniFlash, uh, the, the length of the videos is ten seconds, is that

Philipp Schmid 1:46:19

correct? Yes. Currently ten seconds. Yeah. So with VEO before we-- you had, like-- or were able to generate eight-second videos, and now with Omni in, like, a single request, you can generate ten-second videos. But of course, since you have, like, all of the conversational editing to it or, like, generation to it, you can create a new video ten seconds based on, like, the, the first one to keep character consistency, com- to keep the scene, to keep all of, like, the important details. And then you can, like, create longer videos currently with, like, multiple requests.

Wolfram Ravenwolf 1:46:49

Ah, that makes a lot of sense. So you focus on the editing part, and you can then just edit the videos to make longer ones.

Philipp Schmid 1:46:54

Yeah, but like, so Omni is not only for editing. It can do, like, many, many different things. So, uh, you can, like, directly prompt it with text, with image, with video, and then, like, the model kind of figures out what it needs to do. But you can also, like, s- specify the tasks on, like, what you want the model to do, and it can be, like, very simple, uh, text to video generation or reference to video where you provide images or other, like, references or what we call it to a video. And then of course, like, conversational video editing that you, um, generate a video and then you can, like, just follow up with a new text prompt to say like, "Hey," I don't know, uh, "make the house bigger," uh, "change the color from blue to yellow." And you don't need to provide the previous input. You just, like, chat with it as you would do with a normal, like, uh, um, yeah, chatbot basically. And then the model keeps all of the context, knows what it means to change, and returns a new video from it.

Wolfram Ravenwolf 1:47:46

Ah. Yeah, I wish, I wish that would be working in real life when you just say, "Make the house bigger and change the color." Yeah. But maybe when we have more, uh, AR glasses, you can do it like that. Um, so what, uh, can I already use it from the Gemini assistant? Is that integrated that way? Uh,

Philipp Schmid 1:48:01

that's something I don't know. Uh, if it is not possible today, it should hopefully come in the future. Uh, you can try it in AI Studio directly in the playground. So if you go to ai.studio, and then in the playground you can select the OmniFlash model, or you can use it, uh, via the Gemini API Um, yeah.

Wolfram Ravenwolf 1:48:19

Okay. That's great. So I will definitely give my agent the- Yes ... your X tweet and say integrate that in our setup.

Philipp Schmid 1:48:25

I mean, I think like even for like all of the people using coding agents or agents to like generate new media, we ship, uh, a specific Omni Flash skill. So on our, uh, GitHub repository Gemini skills, there's now a new Omni skill. All credits there to Paul or Fofner. Um, he made that skill, and it teaches like basically coding agents how to use Omni, and it's like really cool because you just install the skill and then get started. Uh, you can provide reference images. Like, all of the things you know it works, um, can you do with the, the model.

Wolfram Ravenwolf 1:48:58

I think n- now skills are the new apps, and you just give your agent an app and a skill, and you have- Yes ... a new app, and you get a lot of new capabilities. So, um, the other model that... or other release is, uh, NanoBanana 2 Flash?

Philipp Schmid 1:49:13

Lite.

Wolfram Ravenwolf 1:49:13

Or Lite? Lite.

Philipp Schmid 1:49:14

Yes.

Wolfram Ravenwolf 1:49:14

Different name.

Philipp Schmid 1:49:15

So yeah, NanoBanana 2 Lite is basically Gemini 3.1 Flash Lite image, so it's based on the 3.1, uh, model family, and it's also like known as like the NanoBanana model, and it's our new super fast, super efficient, uh, image generation model. So think about NanoBanana, uh, but faster, uh, and like cheaper to use as the existing versions and better in quality than like the original NanoBanana.

Wolfram Ravenwolf 1:49:42

Yeah. The pricing was about three cent, right?

Philipp Schmid 1:49:45

Yeah. Starts, starts at three cent for like 1,000 to 1,000, uh, image.

Wolfram Ravenwolf 1:49:49

Mm-hmm. Yeah, and I think both, uh, combined then you can create, um, a setup with NanoBanana and send it over to Omni to make a video out of the image you created before that. Yes.

Philipp Schmid 1:50:00

And I think like even what, what excites me about NanoBanana 2, um, Lite is that now you can generate, uh, images below like two or around two seconds. So it's like not like I prompt it, then get like need to wait 10 seconds, and this like enables new use cases where you really need to go large scale, iterate very fast, create like new user experience because you can create new images, but you can also edit those new images very frequently or fast. And then if you need to like upscale or create higher quality images, you can like take all of the learnings you have and then like move to, uh, the NanoBanana Pro model or like go directly into image editing. Mm-hmm.

Alex Volkov 1:50:37

I'm back. Exciting. And I love NanoBanana 2. Uh, and getting it much faster I think is a very, very cool thing. Um, there's, there's a bunch of you folks here at the Google DeepMind booth as well. Um, and I, I want... I don't know if you asked him already about kind of the vibe of the conference or what he talks about.

Philipp Schmid 1:50:56

No? We were just talking the releases now.

Alex Volkov 1:50:57

Okay. Then go ahead. That

Philipp Schmid 1:50:58

was the thing. So because, uh, the... How many AI engineers have you been to? Uh, I think it's my third. Yes. I was in, uh, Europe in London in April, and then obviously here last year-

Wolfram Ravenwolf 1:51:10

Yeah ...

Philipp Schmid 1:51:10

uh, which was a little bit different, I would say.

Wolfram Ravenwolf 1:51:13

Um. How do you compare it to the one in London?

Philipp Schmid 1:51:17

I mean, it's, like, much, much bigger, right? At the expo's bigger, you have, like, 18 different tracks or something now. Europe was much closer. But, um, what I like about AI Engineering Conference is, like, they always try to get, like, to the local community. So, uh, London was very, like, I would say more European centric and, like, had many people coming from Europe. Here it's, like, the, the biggest one, so you have people coming from all over the world. And what I enjoy the most is that all of the talks are recorded, so even if you cannot join them, you can, like, watch them later online, and they are easily accessible. So that's... Yeah, it's great. I love it to be here.

Alex Volkov 1:51:53

There's one talk that you guys absolutely must watch. It's called the Zeal Continuum by ... You know why? Yours truly. Dude, by yours truly. Um, I wasn't on kind of the main stage, and so I invited all these people. Yeah. I did so much work to get people to come to my talk, and then I was in the leadership track. There's a leadership track for, like, uh, 1,500 people, and only folks with, like, a specific wristband or color can come in there. So many folks that I invited worked so hard to, to convince, 'cause we have 36 tracks. We'll ask Swyx about this. It's like a huge conference, 7,000 people. Uh, you need to kind of work for your talk to be, like, attended as well, unless you're a, like, keynote speaker. And so I work, and then, uh, they were turned away at the door because they weren't leadership, so. They are

Philipp Schmid 1:52:35

very

Wolfram Ravenwolf 1:52:35

strict here.

Philipp Schmid 1:52:36

You almost feel like you're in Germany, right?

Alex Volkov 1:52:38

Very strict.

Philipp Schmid 1:52:39

Yeah. I mean, like, my talk later is in the main area, uh, 3:45 or something. Today? Yes. Yeah. So if you are online, it's, like, on the YouTube live stream, and if you're here, then yeah.

Alex Volkov 1:52:50

Oh, so definitely, folks, definitely tune in for Philipp, uh, giving a talk. Do you guys talk about Hugging Face at all?

Philipp Schmid 1:52:56

No.

Alex Volkov 1:52:56

Not yet. Uh, Thomas Wolf, the co-founder of Hugging Face- Yes ... is, like, walking around here, uh, with a very thick French accent. Bro, you were at Hugging Face for, for a while. Yes. And that's how we met. Yes. I don't know if, if we brought you on the pod before or not, but, like, definitely you were, like, in the, in the area. Um, how was the switch for you from, like, an open company to, like, a bigger DeepMind company?

Philipp Schmid 1:53:15

I mean, it's obviously a change. It's, like, was from a startup to a big tech. Yeah. But I would say DeepMind is not, like, your super typical big tech company, especially, uh, inside the Gemini API or AI Studio. We try to stay lean. We try to ship fast. We try to iterate.

Alex Volkov 1:53:30

Yeah.

Philipp Schmid 1:53:30

We have, like, a great leader with Logan who is, like, super active everywhere. Friend from the

Alex Volkov 1:53:33

pod. A great friend.

Philipp Schmid 1:53:35

And- A great dude ... and so far I really like it. Like, you can make a change. Like, so we ship the interactions API- Yeah ... at GA, which is our new way to use Gemini models and agents, and it's, I would say- less typical on how a normal Google API looks. Google is very known for gRPC-based APIs, and with the Interactions API, we try to go to, uh, or to align with, like, what the industry is doing to be more restful, to be more JSON aligned. And we managed to ship it, we got a lot of excitement internally, we see people liking it. So even if you go to a big company, you can, like, still have an impact.

Alex Volkov 1:54:11

Yep. Wolfram, go ahead. That's

Wolfram Ravenwolf 1:54:12

great. Uh, yeah, um, any other releases? D- anything else? Uh, I mean, it's three already, right?

Philipp Schmid 1:54:21

Yeah, so we had the Interactions API, GA earlier, I think this week or last week, I'm not sure. I mean, I lose track of all of the things going on.

Alex Volkov 1:54:29

Time moves too fast, man. We had Steph from Sakana on here. Yeah. And they released Fugu, the model router. Yeah. And I was like, "Steph, did you release model router two weeks ago or one week ago?" 'Cause so much happened.

Philipp Schmid 1:54:37

Yeah. It's either today or it had happened in the past.

Alex Volkov 1:54:40

Yeah.

Philipp Schmid 1:54:40

Um, no, I mean, we continue working on all of the, the APIs, the agents, the models. I mean, you never stop. Um, nothing today. Hopefully more next week- Yeah ... and then we'll see.

Alex Volkov 1:54:51

Phil, I have a question for you about Spark.

Philipp Schmid 1:54:52

Yes.

Alex Volkov 1:54:54

I, I, I went to Google I/O, we met at Google I/O, it was great, and then, uh, uh, Spark was announced on stage.

Philipp Schmid 1:54:58

Yeah.

Alex Volkov 1:54:59

And first of all, it was great to see how strong AI is getting represented on stage because there was Sundar, and then Demis, uh, and then immediately after them, Varun Mohan from, uh, Windsurf Acquisition that's now Antigravity, right? Yes. It was, like, incredible to see that. And then, uh, Spark was announced. It was like a generalized agent with a sandbox environment that can run stuff computer use as well. I started using Spark.

Philipp Schmid 1:55:18

Okay.

Alex Volkov 1:55:18

Uh, I will give a lot of feedback to the team directly, so I'm not gonna put you on the spot for them. Uh, but I think it's one of the best agents that I have access to that has access to Gmail and Google Drive. It does insane research in terms of, like, okay, we're filling out some government forms. It knows about my forms from 10 years ago and pulls up details and was like, "Hey, Alex, actually you fucked up right here, like, your passport number is actually the incorrect passport number." I was like, "Th- this is exactly what I wanted." When I came to Google I/O and complained about, hey, why isn't there an agent that I can use directly from Google. Why do I need to use third-party agents like Hermes or OpenClaw- Yeah ... that via clunky APIs for which people sometimes get fired, we're not gonna talk about that, uh, is, is accessing the Drive. Now there's a direct one from Google. Do you use Spark at all? What's your experience with that? I would love to hear.

Philipp Schmid 1:56:06

I don't use Spark personally. So I'm, I'm based in Europe, and I think the access is, like-

Alex Volkov 1:56:10

Oh,

Philipp Schmid 1:56:11

you didn't get Spark ... either not rolling out or, like, still rolling out. Come on, Germany

Alex Volkov 1:56:13

represent.

Philipp Schmid 1:56:14

Yeah. I, I have, like, obviously tried it, and, like, dogfooded it, and played around with it. Yeah. And what's great about Spark for me is, like, that, like, maybe people have seen the IO keynote, is, like, that we try to unify all of the different agent harnesses behind the scenes-

Swyx 1:56:29

Yeah ...

Philipp Schmid 1:56:29

so that Spark uses the anti-gravity harness, which is represented in the IDE- Right and also now in the Gemini API with managed agents that we, like, can benefit from all of the different services and together, like, hill climb the harness and make the model better. So that's why I'm most excited about Spark. Yeah. And then, like, hopefully I can use it inside my personal Gmail, uh, soon.

Alex Volkov 1:56:51

It's really cool. So sorry, I, I- If, if that hurt that you didn't get Spark because you're in Europe, so sorry. Uh, hopefully we get it- I mean,

Philipp Schmid 1:56:58

it's, it's rolling out. I'm just not sure where it is- Yeah ... or where they are.

Alex Volkov 1:57:01

Um, all right. Well, so you're talking today at 3:00...

Philipp Schmid 1:57:04

45. 3:45,

Alex Volkov 1:57:05

on the main stage?

Philipp Schmid 1:57:06

Yes, main stage, last session before the-

Alex Volkov 1:57:08

Bro

Philipp Schmid 1:57:09

closing

Alex Volkov 1:57:09

keynote. Break a leg. Good luck. We're rooting for you. Thanks. Hopefully we're gonna be there to root in the crowd. Um, folks who tune in to the live stream, after we finish, go directly to the YouTube of Ai.engineer. There's a bunch of great talks, and definitely once the keynotes hit, this is the, this is the crème de la crème. Dude, thank you so much for coming out. Thanks. Thanks, Yam. We have to move along. I know you're a busy dude. Good luck. Break a leg, man. Yeah,

Philipp Schmid 1:57:27

see you later.

Alex Volkov 1:57:28

Thank you. Wait, before you go, can we take a picture of all three of us? Uh, Steph, can you take a picture? Hold on, dude, we, we have to. We always forget this part, and it's the most important part. No, from, like, the other...

Wolfram Ravenwolf 1:57:40

Next. I'm not hearing anything on these. Can you still check the sound?

Alex Volkov 1:57:44

Yeah, we'll check the sound. Yeah. Thank you so much for those.

Wolfram Ravenwolf 1:57:51

Mm-hmm.

Alex Volkov 1:57:53

Awesome. Thanks. Thank you. Phil, thank you so much for coming up, man.

Philipp Schmid 1:57:56

Thanks.

Alex Volkov 1:57:58

All right, folks, we'll be right back w- with our next guest. Uh, actually, don't go anywhere Guests will sit here, and then we'll talk Ja kom You

Sero 1:58:24

sit over there and talk to this microphone

Alex Volkov 1:58:29

All righty. Wolfram, how was, uh... C- can I say something super quick before we announce our next guest? Um, three and a half years, I think this is maybe the second time that I allowed myself a bio break in the middle of this.

Wolfram Ravenwolf 1:58:44

It was the first time that I have been watching the show that I experienced- That

Alex Volkov 1:58:47

you saw me, like, disappear yeah, that I, uh- So thank you so much for holding, for holding. Uh, our next guest is here, and I will introduce. Do you mind if I introduce you?

AI 1:58:54

Yeah, introduce,

Nisten 1:58:55

please.

Alex Volkov 1:58:55

Yeah. Uh, so folks, I've, I've talked about Daria before. Uh, this is Daria Volkov, my lovely wife. Brand new. Uh, thank you so much for joining. Thank you. And thank you, thank you for joining me at AI Engineer.

Darya Volkov 1:59:08

Of course.

Alex Volkov 1:59:08

Um-

Darya Volkov 1:59:09

Pleasure is mine.

Alex Volkov 1:59:11

Can you tell folks what you do for a living? I think this will be, uh, very cool to-

Darya Volkov 1:59:15

Yes ... for them to hear. I own marketing agency, AI-driven marketing agency, Geeks360, and, um, I'm really happy to be in this conference. Thank you for inviting me- Yeah ... because this shows me exactly gaps and how these two worlds are completely different.

Alex Volkov 1:59:33

Yeah.

Darya Volkov 1:59:33

Yeah.

Alex Volkov 1:59:34

Uh, uh, first of all, personally, I'm very happy that you're here and able to see kind of this. So this is super cool. This is not the usual Thursd AI. Obviously, I sit at home. Yeah. Uh, but this is, like, quite exciting. And the reason why I decided to bring you on, not only 'cause, you know, nepotism, but- I

Nisten 1:59:54

hope so.

Alex Volkov 1:59:54

No, yeah, definitely not. But, but also because, uh, you're having a completely different experience than what I'm having. Yes. I find it incredible. The, the, the, the, the size of this is so vast that while we're together kind of in different rooms, I introduce you to some of my friends, speakers, et cetera. But then kind of on your own, you're watching, uh, workshops, and you're seeing talks that I don't actually go to because I don't have time.

AI 2:00:16

Yeah.

Alex Volkov 2:00:17

I would love to hear your experience, and I would love for the audience also to hear kind of like the, your experience because, um, the experience from the floor and from the hallway track, I think is very, very important. It's maybe the best part about this conference. Can you tell us?

Darya Volkov 2:00:29

Yeah, for sure. So I'm really happy that I visit spe-speakers, like, and, and the sessions. Uh, every session I visited was extremely important for me. I'm not an engineer right now, but I found it super interesting and important. There's, like, a few things that overall all this conference talking about. First of all, obviously loops, uh, the main subject of the conference. And, uh, uh, people do not talk anymore about agents, like should they exist, how they should exist, and if they have value. We, in this all conference talking about how to trust them and how to make their memory better and, uh, other things. So for me specifically, I found, um, besides the fact that I see huge gap in between my clients and people who actually implement this technology and all this world around us, um, I found a lot of answers that help me to migrate these two worlds during the sessions and how we can improve, um, our, like, let's say, technology and AI that we use and agents, uh, et cetera. But funny thing that I did find that all e- every engineer here is actually really extremely excited about harness and evaluations, but in my world, it's something that we use for years. Like, in marketing, you have so many layers of- Yeah ... approvals and regulation, et cetera. So for me, it's like native language. I understand exactly how it can be implemented into agents and how it should be structure-wise. But here, like, people just, "Wow, we need to do that." Yeah, guys, you need to do that.

Alex Volkov 2:02:09

For- Yeah ... for the conference, it, it used to be mostly AI engineers, folks who like build with AI. Um- But if you can show your badge super quick, there's a little attachment there. Uh, can you guys maybe zoom in on this a little bit? It's called Token Billionaire, that you also got. Yes. Thank you. Um, so we have the card. Did I leave my card here? Did somebody steal my card, by the way? Oh, there it is. Token Billionaire. Uh, so we're both Token Billionaires. Yes. Uh, definitely combined between us, we're, like, double the Token Billionaire. Uh, y- y- you're an engineer in your past, but you no longer, like, do... But you write a lot of code. H- how, how is it that you're a Token Billionaire? What are you doing in your day-to-day job that you're getting to sending millions and billions of tokens?

Darya Volkov 2:02:51

I have eight agents and a- all of my- Eight? Eight, yes. I installed another

Alex Volkov 2:02:56

one. Are there two new that I don't know about? What?

Darya Volkov 2:02:58

Yes.

Alex Volkov 2:02:58

Exactly.

Darya Volkov 2:02:58

It's been wild. And all my agents have sub-agents. So, um, I'm not a regular user who just ask questions into, you know, ChatGPT. Yeah. I found that, uh, with my knowledge, maybe it's not so important today to have engineering knowledge, but it's, uh, extremely helpful, my background. Yeah. I can engineer so many apps, platforms for my company internally and for clients almost overnight. Yeah. Almost overnight, I can create billing platform that organize all my, like, you know, finance, uh, the integration with clients and et cetera. So yeah, how I use my tokens, I just create platforms and, uh, develop, uh, apps for company.

Alex Volkov 2:03:39

You, and I know this because we live together- Yeah. ... um, are not always, "Hey, everything works." Sometimes things don't work as well. What are your current gripes with AI agent? What would you like to see improved? What are, what are you missing? And I know you're using, like, a bunch of them. And I know that, like, we walk around and you meet people from OpenAI that, like, we introduce, like, "Oh, I have some questions for you." Yeah. What, what are your, like, current, like, shortfalls from agents that are not enough for you, and what would you like them to improve?

Darya Volkov 2:04:08

Well, okay, today morning I woke up and I found invitation for this podcast should be scheduled for 1:00 AM in the morning. And I was like, "Alex, why did you invite me tomorrow?"

Alex Volkov 2:04:18

Oh, the invitation for Thursd- AI, yeah. Yeah.

Darya Volkov 2:04:20

It's like, it's Fable, it's not me. So...

Alex Volkov 2:04:23

Fable also screwed up this. This is the, the show schedule.

Darya Volkov 2:04:27

Yeah.

Alex Volkov 2:04:27

And, uh, it wrote the right names a- and the right times that they're supposed to show up on the podcast, but it switched the order.

Darya Volkov 2:04:35

The order, yeah. So

Alex Volkov 2:04:35

it was like the, the later guest showed up earlier in the order, and this is Fable. This is the famous Fable-

Darya Volkov 2:04:40

Famous Fable, yeah ...

Alex Volkov 2:04:41

that we used, uh, that we should tell people about our, our honeymoon as well.

Darya Volkov 2:04:44

We should te- tell people about it more. Yeah. But yeah, I still see, like... I still babysit, like you said, we still babysit our agents. There is, like, um, um... I say a lot of time, "like," but forgive me for that. Uh, there is a huge, um- trust that can be developed with time- Right ... timeline. Uh, so I would definitely like to see in AI overall two things. First is ability to learn progressively. So I don't wanna see the same mistakes, I don't wanna see stupid things, I wanna just, you know, trust it more and more with more, more important things like my personal life, finance and et cetera, as you said, that needs to be actually checked and- Yeah evaluated. But I do wanna see paths where we can do it. And second, I would like to also have, it's my dream, one place where you can actually have connected to all the AI agents, every platform, and not to be more like, you know, separated by, um, providers. Today we have Anthropic, we have OpenAI, everybody. I wanna have one big brain so you don't have to see in every week jump from new tech- new model, new technology every, every time something coming. I just wanna have one

Alex Volkov 2:05:54

that's always- Yeah, you wanna have, like, a unified-

Darya Volkov 2:05:56

Unified ...

Alex Volkov 2:05:56

memory stack.

Darya Volkov 2:05:57

Yeah. Universal.

Alex Volkov 2:05:59

That's true. Yeah. Wolfram,

Wolfram Ravenwolf 2:06:00

go ahead. And, um, would you like to use local AI for the- I think- ... privacy related stuff? Love it. Love the Uh, are you planning to use more local AI or do you more think it will be cloud-based? So would you like to have more capable local AI, like GLM 5.2?

Darya Volkov 2:06:15

Well-

Wolfram Ravenwolf 2:06:16

Have you tried that?

Darya Volkov 2:06:17

Yes, I did try it, and cloud-based is just much easier because it's accessible from a lot of places, right? So it's like, it's like a general brain. I even tried to set up it yesterday, day, and y- and today morning, like, general cloud-based brain for my company, so everybody can access and see. So yeah, in theory I would say

Alex Volkov 2:06:36

cloud. And also, both me and you, we have Mac Minis that we- Yes ... bought for OpenClient and for Hermes, et cetera. Technically, if that Mac Mini ran the AI itself, it would be better for us, right? Like, if it performed the same level, it was like- If it performs the same level, yes if it was like Fable level or even like half Fable level, I would trust it much more because then my tokens don't go to, like, Anthropic. To Anthropic or somebody-

Darya Volkov 2:06:56

Yeah ... else. Yeah. But, uh, right now it's only theory, right? Yeah. Because you still need to be connected to some main

Alex Volkov 2:07:01

brain. Yes. Let's talk about, let's talk about AI, say, courses. And Fable Maybe you wanna talk about

Darya Volkov 2:07:07

it right now? Yeah, yeah.

Alex Volkov 2:07:09

Let's dirt, let's air all our dirty laundry here live I'm,

Darya Volkov 2:07:12

I'm completely there

Alex Volkov 2:07:13

in this

Darya Volkov 2:07:13

process. Um,

Alex Volkov 2:07:14

so I, I think I posted this picture. We went to our mini honeymoon, and on the way there- Yeah ... we had fun on the plane. We watched... We didn't watch any movies. No. All we did- We just fabled, fabled ... I have a picture of you just sitting like this. We fabled all the way. Why? 'Cause, like, we had access to AI before. We're doing AI for a long time. Why, why that specific model, and what d- what does it, what does it feel like to

Darya Volkov 2:07:35

you? This specific model feels like it was, like, completely different level, completely different level. The capabilities that I discovered first 10 minutes I started using, I was just, like, shocked.

Alex Volkov 2:07:45

Can you give it, like, a practical

Darya Volkov 2:07:47

example maybe? Yes, an example. For example, let's say, uh, my own company website, my own company, uh, programs and things we do. Using previous versions of Anthropic, Opus, we could discover gaps here and there, but we never saw the whole picture and deep analysis into what we do. 10 minutes of Fable running, not 10 minutes, uh, a little bit exaggerated, but a few hours, Fable did the full analysis of what we found and, and he found gap that we've been hacked five years ago. Our website was hacked, and some dirty code still left on, in, in the server. M- I have team of developers. We never s- spotted, we never saw it. S- Like... And it's just- Wow ... top of the iceberg. There is huge discovering that we found that no model did the same task before. No model could, could find it.

AI 2:08:37

Yeah.

Darya Volkov 2:08:37

So yeah, I saw the ability, and obviously the course had started because I saw, like, how many things I can not only investigate, I can build and do With this power

Alex Volkov 2:08:47

And then it was taken away

Darya Volkov 2:08:49

It, and then it was taken away and- And

Alex Volkov 2:08:51

how did that make you feel?

Darya Volkov 2:08:52

I actually ha- Did that feel like shit? ... was happy because it was our mini honeymoon, and-

Alex Volkov 2:08:56

Oh, you were happy with Fable being taken away by

Darya Volkov 2:08:57

the

Alex Volkov 2:08:58

government?

Darya Volkov 2:08:58

Because now you and me would just instead of going to see things, we've been fabling, so. That's true.

Alex Volkov 2:09:02

Uh- No, but we had an agreement, right? We said, uh, we'll, we'll, we'll limit the agent on our honeymoon- To two hours to two hours. Yeah. And then we didn't have

Wolfram Ravenwolf 2:09:13

to. Go ahead. Speaking of agents, so you had six, now you have eight. Yes. And tomorrow it will be 10 or something. Um, do you... How, how are you handling them? What interface do you use? Do you just- How

Darya Volkov 2:09:23

do I handle them?

Wolfram Ravenwolf 2:09:24

Yeah, how do you handle the, all that many agents? And it will be more, I guess, so any-

Darya Volkov 2:09:29

Uh, so- ...

Wolfram Ravenwolf 2:09:30

way to handle it better?

Darya Volkov 2:09:30

You know, I'm not maybe super advanced in that. I know people doing, uh, group chats and, uh, Discord. I have Discord with some of them so they can k- uh, manage themself. I'm still i- in, probably in the lower scale of the level. I still manage each agent separately because each agent in my structure has different roles. So I know they're able to open their own sub agents, right? And do workflows and orchestrate d- difficult tasks, but I still dedicate each agent for different tasks that I use in my company, so I need to manage them all. And my team, even today morning, my team contacted me and said like, "Hermes is, is really doing a lot of mistakes." Yeah. I was like, "Okay, I need to manage this specific agent," and yeah. So manually.

Wolfram Ravenwolf 2:10:17

Human in the loop.

Darya Volkov 2:10:18

Unfortunately.

Alex Volkov 2:10:19

You're the human in the loop. Yeah. Uh, Daria Volkov- Thank you ... thank you so much for coming.

Darya Volkov 2:10:23

Thank you so

Alex Volkov 2:10:23

much. I'm glad- I know you were stressing before this.

Darya Volkov 2:10:25

Yeah.

Alex Volkov 2:10:25

Uh, you feel better now?

Darya Volkov 2:10:26

Yeah, I feel better.

Alex Volkov 2:10:29

That's awesome. Well, I know you guys. Let me get your awesome glasses. Yeah. Uh, folks, we have another special guest. Uh, thank you so much. Thank you. Uh, check out Geeks360 if you need any marketing stuff. Uh, I have to, I have to show something. Uh, can you tell Swyx to come over? He's over there. Uh, how was that?

Nisten 2:10:44

Oh, okay, cool. Yeah, you can go. I'll talk to Swyx.

Alex Volkov 2:10:47

Uh, folks, uh, we're almost ready. You're right here, and this is your mic. Hey. Yes. Uh, are you going? I can sit Swyx here. Okay. Cool, cool. Uh, we're almost ready to land this plane. Woo! But just before we do, we have a special guest. Can you go wide? We have a special guest. Uh, the one and only, also token billionaire, Swyx, the- ... co-founder of The Engineer, the host of Latent Space, the curator of AI news that he keeps sending despite having very busy days. Bro, what are you doing? Why aren't you asleep? It's so boring.

Swyx 2:11:25

Nothing's going on.

Alex Volkov 2:11:26

Nothing is going on. Uh, my friend-

Swyx 2:11:29

Hey ...

Alex Volkov 2:11:30

thank you so much for coming back.

Swyx 2:11:31

Yeah. Yeah, good to see you guys. How has the morning been?

Alex Volkov 2:11:35

It's been hectic.

Swyx 2:11:36

Yeah.

Alex Volkov 2:11:36

And also very, very, very... Dude, do, do you, do you, like, ThursdAI has, has come a little bit forward. Yeah. Thank you so much for the stage that has- There's two tables

Swyx 2:11:44

now. Oh ...

Alex Volkov 2:11:44

that has built all these, like, great folks listening to us- and helping making sure that you guys are having a very great show. Um, and, uh, I wanna open With a thank you.

Swyx 2:11:56

Oh, thank you.

Alex Volkov 2:11:56

Yesterday was... I posted about this, folks, please check out. Yesterday was an unforg- unforgettable day. Not because Fable V was given back to us by the US government, which we was all waiting for, but also because, uh, after my talk at the, uh, leadership track, we all piled onto buses and we drove to a perk. I don't know how else to call this. A perk of, uh, being a speaker at AI or a, or a VA.

Swyx 2:12:23

Not all speakers.

Alex Volkov 2:12:23

Not all speakers.

Swyx 2:12:24

Just the, just the top tier-

Alex Volkov 2:12:26

Just the folks who come back

Swyx 2:12:26

and support, yeah ... uh, closest friends of the company, yeah.

Alex Volkov 2:12:29

And then we went to the, uh, San Jose, San Francisco Bay Stadium to see Team USA beat Bosnia's ass. I'm sorry, Bosnia. You got zero to zero. The energy was insane. And you know what, uh, the most insane part to me to land this plane together? Um, the, that stadium clocked at 69,750-fy-ty something, so almost 70,000 people.

Swyx 2:12:53

Wow.

Alex Volkov 2:12:54

This is 10% of that.

Swyx 2:12:56

Yeah.

Alex Volkov 2:12:57

Does that j- j- j- resonate a, a little bit, like how many people you're now drawing into Moscone West? Can you reflect on that a little bit, on the sheer size of, like, what's going on here?

Swyx 2:13:05

Yeah. I, um, we, at first... So last year was 3,000. At first, when we were trying to set targets for this year, I actually wanted 5,000. Um, my co-founder Ben said six. I was like, "Fine, six. I, I mean, if we fall under, we'll just change the website."

Alex Volkov 2:13:22

Yeah.

Swyx 2:13:22

Um, and a month ago it was 3,000.

Alex Volkov 2:13:26

A month

Swyx 2:13:26

ago? So, yes, like one month ago today.

Alex Volkov 2:13:28

Like, like June.

Swyx 2:13:28

And I was like-

Alex Volkov 2:13:29

Start of June ... "

Swyx 2:13:30

I don't think we're gonna make it, guys." Like, I, I think we need to- Is that, is that, like,

Alex Volkov 2:13:32

a stressful feeling? Is that like-

Swyx 2:13:33

No, of course it is. Yeah ... "Oh,

Alex Volkov 2:13:34

shit, we're gonna

Swyx 2:13:36

lose a lot of money"? So, so in my keynote, I, I talked about the Gini coefficient. Gini coefficient is the measure of inequality over time.

Alex Volkov 2:13:40

Yeah.

Swyx 2:13:40

A perfect equality linear transition from zero, T minus 100 to T mi- T minus zero, linear progression.

Alex Volkov 2:13:47

Yeah.

Swyx 2:13:48

It, that's a Gini coefficient of, uh, of zero. Uh, and g- complete inequality where nobody buys and then everybody buys on the last day, that's a Gini coefficient of one.

Alex Volkov 2:13:57

Yeah.

Swyx 2:13:57

Right? So the, the, the amount of stress that you have as an organizer is a measure of the Gini coefficient. Um, the coefficient has just gone up over time, not down, and that's the wrong direction. So

Alex Volkov 2:14:06

more folks buying closer. Yeah, yeah. It was like- Folks, what are you doing? This is the engineer. Look around. Everybody, everybody's like- Why don't you buy the early bird to save yourself a bunch of money?

Swyx 2:14:14

G- yeah, yeah.

Alex Volkov 2:14:15

We find- Tickets are cheaper. Yeah. Motels are cheaper. Sales- What are you doing?

Swyx 2:14:16

Sales go up the more we raise prices. Uh- Yeah. ... so I don't... Um, so yeah, uh, uh, we didn't raise it that much. It was- Yeah ... it was just like to, to force people to buy. And so, um, I think, um, yeah, it's, it's, it's at 0.7 now. I hope it goes down next year. But, uh, the other thing I'm also thinking about is-

Alex Volkov 2:14:33

And you clocked in at seven

Swyx 2:14:34

7,000

Alex Volkov 2:14:35

it did. 7,000, sold out completely. Every

Swyx 2:14:37

single house. I think the actual number is, like, 7,200.

Alex Volkov 2:14:39

Wow.

Swyx 2:14:39

Yeah, yeah. So, um- It's insane Some of the... Not everyone is attendee, attendee. A lot of our, like, staff and- Yeah ... support people. Uh, there's a flash mob happening later today.

Alex Volkov 2:14:49

There's a flash mob?

Swyx 2:14:50

There's a flash mob happening

Alex Volkov 2:14:51

later today. Wait, hold on. I know about the puppies. You know about the puppies. There's a puppies corner that I haven't yet seen. Yeah. That there's just puppies, you come and sit and chill with them- True ... uh, which is an incredible innovation. Yeah,

Swyx 2:14:59

there's robots running around.

Alex Volkov 2:15:01

There's robots. There's

Swyx 2:15:01

T-Rex, orcs, uh, and- There

Alex Volkov 2:15:03

are a few orcs that, like,

Swyx 2:15:05

they should- You should, you should interview the orcs.

Alex Volkov 2:15:06

Oh, that's true.

Swyx 2:15:07

Interview the orcs. Um, and, uh, there's a music corner that people like.

Alex Volkov 2:15:10

Yeah.

Swyx 2:15:10

Uh, that... So basically I just

Alex Volkov 2:15:12

get- Sponsored by, uh, Featherless AI. Shout out to Featherless. Featherless,

Swyx 2:15:14

yeah. Also another yellow company. Um, I think, uh, basically because it's my conference, so I get to express myself- Yeah ... in, like, here's what I like and here's what I think people should do. I really want to create people, opportunities for people to connect. Anyway, so for, um- Uh, for the, the flash mob later, um, just be in the, be around at 12:30.

Alex Volkov 2:15:35

12:30?

Swyx 2:15:36

Yeah.

Alex Volkov 2:15:36

Folks- ... uh, if you're tuning into this from the actual conference, uh, 12:30 is when you need to come back. Swyx, can you show me what you're holding? Like, what is- Oh,

Swyx 2:15:45

yeah ... what is this? Yeah, yeah. I, yeah, I brought... I was very impressed. I saw it, saw it here. Uh, this is our first conference newspaper every day at, uh, 10:00 AM. They put out a report covering the previous day. Uh, we have articles on, um, context and local AI. Uh, this is yesterday's Fable keynote, uh, with Tariq. Tariq

Alex Volkov 2:16:01

Schipker.

Swyx 2:16:02

Yeah. Um, and, uh, I'm just so amazed that we're at the stage now where we have our own newspaper.

Alex Volkov 2:16:08

There is a daily... Can, can I say something? Uh, I met, uh, Swyx.

Swyx 2:16:12

Yeah.

Alex Volkov 2:16:12

They organize a major hacking league.

Swyx 2:16:14

Yeah.

Alex Volkov 2:16:14

And the, the, the founder or co-founder of Practical Dev, I don't know. By the way, we're now translated. If you're seeing us on the homepage of dev.to, we're now, uh,

Swyx 2:16:23

w- we- Streaming? Yeah.

Alex Volkov 2:16:24

Yeah, we're streaming into the- Okay ... the, the folks who build this, this newsletter. This is the type of stuff that happens, like, on the fly here in the hallway track. Dude, this is insane. We're talking with Swyx, he's like, "Oh, you know dev.to?" I was like, "Yes." And then we somehow got to talking, he's like, "Oh, you have live content. Let's put them on the, on the homepage."

Swyx 2:16:38

Yeah, yeah. They, they're trying to build up their, their... Themselves as, like, a destination.

Alex Volkov 2:16:41

Yeah.

Swyx 2:16:41

So a lot of people don't know this, but, um, I s- got my start as a writer, as a blogger.

Alex Volkov 2:16:46

Yeah.

Swyx 2:16:46

And the, the thing that got me started was dev.to.

Alex Volkov 2:16:49

dev.to. So-

Swyx 2:16:49

I went to a conference, dev.to was sponsoring the conference.

Alex Volkov 2:16:52

And it's- dev.to.

Swyx 2:16:53

Yes. And-

Alex Volkov 2:16:54

And

Swyx 2:16:54

as part of, as part of the sponsorship, they were like, "Hey, um, we're gonna issue a challenge to every attendee to write a short blog post-

Alex Volkov 2:16:59

Yeah ...

Swyx 2:17:00

about a talk that you saw." So I wrote three blog posts, and that got s- got me started, and then I didn't stop. So that blog became Late in Space, Late in Space became AIE. And so it's like 10 years later, I have that-

Alex Volkov 2:17:11

10 years later, there's an actual- ... I was writing a blog post ... you have a conference- Yeah, yeah ... in which there's a daily printed newspaper. Yeah. Can I tell you folks about this? It's quite insane. I thought they're gonna prepare this ahead of time and just, like, print it out, like swag, like, like conference swag. This is getting printed. They, they have a newsroom sitting in a hotel somewhere.

Swyx 2:17:26

Yeah, this is pictures from yesterday.

Alex Volkov 2:17:27

This is pictures from yesterday. And I wrote an op-ed, and I didn't submit it on time. I felt like such a failure because went to the game- Oh, yeah, yeah,

Swyx 2:17:33

yeah. Then, then people can submit classifieds-

Alex Volkov 2:17:35

Even- ...

Swyx 2:17:36

where they, they just have weird, like, if you're looking for love- ... or if you're lost and found.

Alex Volkov 2:17:40

Misconnection.

Swyx 2:17:41

Yeah. Yeah, yeah.

Alex Volkov 2:17:42

So we have puppies, we have newspaper, uh, newsletters, uh, sorry, newspapers. We have, uh, the music corner. There's the World Cup insanity. Can we talk about this yesterday? Dude, this was insane. Our suite was right next to the Google suite.

Swyx 2:17:58

Yeah, Sundar and Sergey. And

Alex Volkov 2:17:59

Sundar Pichai

Swyx 2:18:01

and

Alex Volkov 2:18:01

Sergey

Swyx 2:18:02

Brin. And, uh, the C- CEO of YouTube, uh, I forgot his name now,

Alex Volkov 2:18:04

uh, he was there. And Emmanuel something was there- Yeah and his wife. We all were singing, "Country road, take me home to the land I belong." I have a video of this. I'll post it on YouTube. And S- There was like going like this and I see him. It was surreal, man. What the fuck is happening? So thank you for all of that. Yeah. I really, really appreciate it. Um- I,

Swyx 2:18:20

I, the, the Google part was not me. It was just luck.

Alex Volkov 2:18:24

It was just-- But like this is the opportunity that happens, uh, after, after this happens. Go

Wolfram Ravenwolf 2:18:28

ahead. Yeah, I just, uh, I have to run soon, but I wanted to thank you because I think we are all using agents ever more, more loops running, more tokens being spent, but the human element, uh, which, which- Yeah a conference like this enables, that is becoming ever more important, I think. Yeah. All the talks, workshops, that is super important, but we can send our agent to do that. Yeah. But we won't send our agent to talk to the people, so-

Swyx 2:18:51

So I call this the highest loop. This is my conference, my, my keynote this, this time. This is the highest loop that creates all the other loops because we meet together as loop engineers.

Alex Volkov 2:19:01

You need to run to check out from your hotel. Wolfram, thank you so much for, uh, for hosting. Thank you. Yeah. Swyx- Good to see you ... I, I need you for like, at least like five more minutes. Yeah, yeah. We need to talk about some stuff. Uh, my talk at the leadership track- Yeah, ZL ... was the ZL continuum, and, um, I talked about obviously this, right? The token billionaire. Can you talk about, while showing this to the camera, can you talk about the billionaire club at the AI engineer

Swyx 2:19:23

and look over at the camera? Uh, so I'm actually not a token billionaire. I have to create this program for, for people who are not me. Uh, but this, the idea is that, uh, Ryan Lopopolo in the previous AIE Europe-

Alex Volkov 2:19:34

AKA Yolopopolo. Yolopopolo. AKA... There's a new nickname for him, Loopopolo.

Swyx 2:19:39

Yeah, Loopopolo. Um, so he is the author of the Harness Engineering blog post. They work at OpenAI Frontier. They ship a million lines of code.

Alex Volkov 2:19:46

I think it's like the number six or number seventh most popular YouTube video on the AIE YouTube- Oh, really? ... that you guys checked out. Yeah. I checked both of them. Mario is six. I

Swyx 2:19:53

think Mario is beating him. Yeah.

Alex Volkov 2:19:55

And Mario is six- Yeah ... and Ryan is seventh.

Swyx 2:19:57

Yeah, yeah. So I, you know- I think that the kind of problems that token billionaires have are very different than normal engineers. And I think that we always wanted to create, like, a frequent flyer miles of AIE or, like, some kind of Amex Platinum rewards- Yeah ... lounge type thing for AIE. And so I think the, the name sticks is, is fun to say. The, the card obviously lends itself. This card doesn't do anything in- right now, but I think our future cards will have, uh, chips and RFIDs and I, I, I mean, I'm not above, like, putting some money on this and just, like, you know, do something with it, right?

Alex Volkov 2:20:31

Yeah.

Swyx 2:20:31

The, the irony is that- You know that making- ... if you're a token billionaire, if you're spending 5 to 10 to $20,000 a month on API costs-

Alex Volkov 2:20:38

Yeah ...

Swyx 2:20:38

you actually are rich. Like, you don't need any benefits.

Alex Volkov 2:20:41

You don't need another credit card in your life.

Swyx 2:20:42

So you need exclusivity, you need a group chat, you need discussions, and that's what we provide.

Alex Volkov 2:20:46

There is a certain level of affordance of conversation when you're talking to a person that you know. Like Dex Horsey, for example. Our friend Dex Horsey also gave a keynote talk. And I don't know if you saw my meme, it went like this, like the pink man from the internet. Uh, Dex Horsey is a great dude. We hung out also in the World Cup. Um, there's a certain affordance where you know that the person is kind of same level agentic psychosis as you are and running, like, insane loops as well. Yeah. I think this, this card and that, like, um, place for those people kind of allows that.

Swyx 2:21:15

Yeah. And, uh, yeah, Sero also did, like, masseuse and mimosas and-

Alex Volkov 2:21:19

Sero-

Swyx 2:21:20

Cheng, uh, who's, who's- Did she

Alex Volkov 2:21:21

change her name?

Swyx 2:21:22

Uh, no.

Alex Volkov 2:21:22

We don't know. Okay. So Sero Cheng from Cerebras, who are sponsoring this, uh, Token Billionaires Lounge. Shout out to milksnmacho on Twitter, uh, did a great job there.

Nisten 2:21:30

Yeah.

Alex Volkov 2:21:30

She was also our, uh, chief fun officer in our, on the way to the World Cup, so we had, like, face paint. Yeah. We had, like, a bunch of stuff. Uh, dude, I cannot fathom how are you doing this without having any gray hair. Explain. I have gray hair. Yeah, just a little bit. Um- I wanna talk to you about this not being a SF phenomenon anymore. So, Ai.engineer almost... The, the, the essay itself just celebrated three years exactly, right? We celebrated, uh, July 1st? Or, uh, June 30th. June 30th, yeah. June 30th.

Swyx 2:22:02

Yeah.

Alex Volkov 2:22:03

So you wrote

Swyx 2:22:03

an essay- Which I, which I forgot to say on my e-note.

Alex Volkov 2:22:06

I said it on Twitter. But yeah. Um, so June 30th, you wrote an essay called The Rise of the Engineer. You kinda put a line, uh, in the ground and said, "Okay, it... There's enough here to call ourselves something else than front-end and back-end." That was the, the backdrop for this. Then, obviously, I don't know how quickly after you decided... Maybe, maybe some history here. How quickly after the essay did you decide the need, this needs to be a conference?

Swyx 2:22:29

It was two months before.

Alex Volkov 2:22:30

Oh, okay.

Swyx 2:22:31

Because, uh, I describe AIE as the most intentional thing I've ever done. Yes. Because I not only knew that the essay was correct and a big idea, I knew that probably the right way to make it real is to s- start a conference around it and grow it as a brand so that I will be more right over time. Because if people use this term more, it benefits me, it benefits them.

Alex Volkov 2:22:53

Yeah.

Swyx 2:22:53

And I think, uh, someone just needed to plan in advance. So even before I published the blog post, I did a whole lot of research and user, uh, research as well. I bought the domain name, we booked the hotel, um, and we started to book speakers already.

Alex Volkov 2:23:09

And so Hotel Nikko in San Francisco, the first Ai.engineer, uh, we, we ch-

Swyx 2:23:14

500 people, yeah.

Alex Volkov 2:23:15

500 people. That's it.

Swyx 2:23:17

Yeah.

Alex Volkov 2:23:17

I think, I think you had more speakers at the speakers dinner- No, no, no ... than at the first Ai.engineer.

Swyx 2:23:23

No, about 300. Yeah.

Alex Volkov 2:23:24

Or, or close to.

Swyx 2:23:25

Yeah,

Alex Volkov 2:23:25

yeah. And, uh, we had a chat after that. offline, not streaming, just kinda like we're having now. I remember back then, you were able to go through the talks that were on stage and tell me, "Oh, I actually like this talk. They talked about..." C- c- can you, can you think you can do the same for, for, for this conference? Can you like sum it up, or is it now about a different option?

Swyx 2:23:44

Not, not, not every talk, but- Yeah ... um, uh, definitely a lot of the tracks, a lot of the, the speakers, uh, we spend a lot of time on. Yeah. Um, this year 2,200 people applied for the CFPs. Uh, we accepted less than 5% of them.

Alex Volkov 2:23:57

Less than 5% acceptance? Yeah.

Swyx 2:23:59

So, uh, it, uh, it was, it was very, very competitive, and everything, everyone that we say yes to is no to 20 other people.

Alex Volkov 2:24:05

Yeah.

Swyx 2:24:05

So we have to be very careful about who we invite. Um, some of them are very established, some of them are new, and, you know, you wanna have a mix of people who are going to put butts in the seats and sell tickets, and people who you've never heard of, but hopefully they become a big star in the future.

Alex Volkov 2:24:22

Yeah.

Swyx 2:24:22

And AI is part of their career.

Alex Volkov 2:24:25

Who's, uh, one such person that people never heard of that you would like to maybe...?

Swyx 2:24:31

I mean, um, Will Brown was, was that for us, uh, last year.

Alex Volkov 2:24:34

Yeah. Will CCBB on Twitter. Uh- Uh, also a friend of ours.

Swyx 2:24:38

Yeah, Dex also, I, I would say in some extent.

Alex Volkov 2:24:41

Dexter Horsey from Human Layer?

Swyx 2:24:42

Yeah. Um, I mean, this time around, uh, Abhishek, uh, Bhardwaj, he spoke in last year's online track. So we have an online track where it's basically our, our backup speakers.

Alex Volkov 2:24:55

Yeah.

Swyx 2:24:55

Uh, speakers drop out all the time for family reasons, work reasons, health reasons.

Alex Volkov 2:24:59

Yeah.

Swyx 2:24:59

And so I always need some backup to say like, "Hey, this person dropped out. Can you, can you fill in?" Yeah. It's obviously kind of a secondary, like less prestigious thing to be waitlisted and online, but Abhishek was so kind. He submitted a 40-minute talk on, on sandboxing.

Alex Volkov 2:25:14

Oh, nice.

Swyx 2:25:14

Greg Brockman ha- uh, met him at Worlds Fair last year and hired him onto the OpenAI team, and now this year he has the track keynote for sandboxing.

Alex Volkov 2:25:23

That's incredible.

Swyx 2:25:24

Yeah.

Alex Volkov 2:25:24

Um, so Abhishek, the next thing that I wanna ask you is that, okay, we started the Hotel Nico in San Francisco, obviously the bubble in San Francisco. We're going broad. We're also going international, so we had a few offshoots, one of them in Singapore, and then another one in-

Swyx 2:25:41

Singapore team's here, by the way

Alex Volkov 2:25:42

Yeah, yeah. Shout out to Adeline- It's so crazy ... and, uh, Ro- Ro-

Swyx 2:25:46

Agrim, and Sherry, and-

Alex Volkov 2:25:47

Sherry, yes ...

Swyx 2:25:47

Ivan, and,

Alex Volkov 2:25:49

uh- Shout out to the Singapore team ...

Swyx 2:25:50

JQ.

Alex Volkov 2:25:51

And there was another, uh, event in France that was like-

Swyx 2:25:55

Yeah, Paris ...

Alex Volkov 2:25:55

like an offshoot event. Yeah,

Swyx 2:25:56

yeah.

Alex Volkov 2:25:57

And then an official one, obviously 85 days ago, uh, in London- Ah which was, uh, on the-

Swyx 2:26:01

Was it 85?

Alex Volkov 2:26:02

Yes. This is crazy. Dude, I counted for my talk. I think it was like 84 yesterday for my talk, so it's like 85 days. 84, 85 days, uh, which feels insane, right?

Swyx 2:26:11

Yeah,

Alex Volkov 2:26:12

yeah. Um, and so we're going international. Where next?

Swyx 2:26:16

Yeah, Tokyo.

Alex Volkov 2:26:17

Tokyo.

Swyx 2:26:18

Tokyo.

Alex Volkov 2:26:18

Uh, we just had Steph Druga from- Yeah Sakana AI sitting here. Yeah. And she's like, "Oh, I wish there was one." So folks, you heard it here maybe first, AIE Tokyo is coming. An official one, not an offshoot, right?

Swyx 2:26:29

Yeah.

Alex Volkov 2:26:29

Okay, cool. Um,

Swyx 2:26:31

I- Basically, the... So there's, we're going the JSConf model.

Alex Volkov 2:26:33

Yeah

Swyx 2:26:34

Which is we own, the, the central team owns the continents. So West Coast US is a continent. East Coast US is a continent. Yeah Europe is a continent. Asia is a continent. Yes. We'll do Africa and LATAM. Yeah. Maybe Antarctica one day, just for fun.

Alex Volkov 2:26:47

I would love that.

Swyx 2:26:49

Um-

Alex Volkov 2:26:51

Just imagine us sitting in the igloo- ... just freezing our asses off on the podcast, like, "Ah." Just

Swyx 2:26:54

so we can say we've done- Just so we can- the seven, seven continents. Uh, and then, and then individual countries, like if you wanna do it in Chile, you wanna do it in Canada- Yeah ... you wanna do it in like, I don't know, Bosnia Herzegovina.

Alex Volkov 2:27:06

Yeah

Swyx 2:27:07

Go ahead and do it, but we won't, we won't organize it.

Alex Volkov 2:27:09

Okay. Uh, so speaking of that, maybe the last thing I'll tell you, there is a group chat organization of some friends of yours, including myself, to bring this to Israel. Israel. I feel like... And, uh, folks, you know I lived in Israel for, like, m- most of my adult life, and I speak Hebrew. Itamar from Qodo, a good friend of yours that was on the first... Actually was-

Swyx 2:27:28

Very,

Alex Volkov 2:27:29

very first ... deployed during the first AI Engineer. He was deployed.

Swyx 2:27:31

Because the, the war, the war started, like, right before.

Alex Volkov 2:27:33

Yes. October 8th,

Swyx 2:27:36

and October 7th happened just a day before. And we sent a photo of him, like...

Alex Volkov 2:27:38

There was a photo of Itamar, uh, the founder of Qodo, with his rifle and the laptop, shipping from the, the front lines. It was crazy. Uh, so we started thinking about this, and we wanna bring, uh, this maybe the first announcement, too, we wanna bring AIE to Tel Aviv, AIE TLV. There's a effort starting.

Swyx 2:27:55

Oh, that's,

Alex Volkov 2:27:56

that sounds nice. So AIE TLV-

Swyx 2:27:57

Yeah, yeah ...

Alex Volkov 2:27:58

uh, we really wanna, like, organize this. I would love to, like, participate bigly in that. We should talk. Yeah. I don't know if you've ever been to Tel Aviv.

Swyx 2:28:05

No. Uh, and you know, obviously, I think, uh, I'm generally very apolitical.

Alex Volkov 2:28:10

Yeah.

Swyx 2:28:10

I think that, uh, one, you know, I, uh, you know, I'm, I'm not a citizen here. Two, uh, I have friends on both sides. I think this is one of those things where we will not be that involved.

Alex Volkov 2:28:21

By two sides, you mean Anthropic and OpenAI?

Swyx 2:28:23

Yeah. And Gemini and xAI and all those things. Yeah. I think, like, uh, we will not be that involved, but if you wanna go ahead and host it in Israel, we're not gonna, you know, we're not gonna say no.

Alex Volkov 2:28:31

Yeah.

Swyx 2:28:32

I think the, the interesting thing is, like, well, like, how, how do we scale this more globally so that every community can feel like they're, they can talk about technical things in their country, which matters to me, you know? I, I, I'm from Singapore, and we don't have that good of a tech scene, but AIE Singapore really-

Alex Volkov 2:28:49

Yeah

Swyx 2:28:49

gathered the community in a way that no one has seen in Singapore.

Alex Volkov 2:28:52

You also had, like, a government official that does agents. Can

Swyx 2:28:55

you tell me about that, like, super quick? Uh, the min- the, the minister of foreign affairs, the secretary of state, equivalent.

Alex Volkov 2:28:59

Of the state of

Swyx 2:29:00

Singapore. Basically the, I think number three in the government-

Alex Volkov 2:29:03

Wow

Swyx 2:29:03

uh, runs his Meeting notes on Nanoclaw.

Alex Volkov 2:29:08

Nanoclaw?

Swyx 2:29:09

Yes, air-gapped Nanoclaw.

Alex Volkov 2:29:10

Gabriel Cohen-

Swyx 2:29:11

Yes. Was also- Yeah, Gabriel was also there ...

Alex Volkov 2:29:13

was also in the group chat, by the way.

Swyx 2:29:14

Yeah. Um, of course, he's go- Israel.

Alex Volkov 2:29:16

Yeah.

Swyx 2:29:17

Um, and I think, uh, it... You know, our pri- our former prime minister was a computer science graduate from Oxford. Uh, the country's, like, wants to be better in tech, but it's hard when you're not San Francisco. Yeah. Israel, obviously, different story. Yeah. Israel's very, very, very, very, very, very leading in tech. Um, but I think for the smaller countries like us, like, I definitely feel the pain of, like, if you live, if you don't live in SF, if, you know, and you can't make it to SF for whatever reason, then how can you be a part of the AI movement?

Alex Volkov 2:29:45

Yeah.

Swyx 2:29:45

I think being able to host an AIE in your country, very simple task that we can support.

Alex Volkov 2:29:50

Yeah. So how do, actually, folks who are listening to this on LinkedIn, on YouTube, on Instagram, wherever you guys are listening to, if somebody's really passionate, like we are, we're self-organizing. Obviously, I have the benefit of, like, you know, being closer to you and, like, knowing you and asking you the right questions, but folks who are self-organizing and really, like, passionate agentic, how should they go about extending the AIE mission across the world?

Swyx 2:30:12

Oh, we have a partnerships page. Uh, just go down on the landing page, you'll see it. Yeah. Uh, we have about 40 countries on the wait list. The other, the, the big, big one is India.

Alex Volkov 2:30:20

Yeah.

Swyx 2:30:20

Everyone wants to do India. So we'll just kinda go down the list in terms of, like, population size, I guess. Um, and I try to work with everybody that we can. All

Alex Volkov 2:30:29

righty. Uh, Swyx, I think-

Swyx 2:30:30

We are hiring.

Alex Volkov 2:30:32

Oh.

Swyx 2:30:32

If you wanna help us manage our partnerships, our community, our marketing, uh, even our software engineering, we are, we're hiring for software roles. Uh, work with us, join the team. Um, you know, we hired Flo, who's, who's been volunteering with us for a long time.

Alex Volkov 2:30:46

And- Can we do a little shout-out for Flo, dude? Yeah, shout-out Flo. He has been, uh- Where is he? I don't know where he is. Oh, he's gone. He was, like, running around. He's always busy. Um- Flo Young. Yeah. Uh, I saw him at the first Ai.engineer. Dude showed up in a suit. Yeah. Sharp suit.

Swyx 2:30:58

Yeah, yeah.

Alex Volkov 2:30:58

And then since then, he's just, like, showing up everywhere.

Swyx 2:31:00

Yeah.

Alex Volkov 2:31:00

And- Supporting behind the scenes ...

Swyx 2:31:01

he doesn't have creden- he doesn't have the normal credentials, right? He did, like, um... He's self-taught as a programmer. Yeah. None of his friends, and, you know, from that part of Florida, like, care anything about tech, but he sought us out, and he wanted to join us.

Alex Volkov 2:31:16

Yeah.

Swyx 2:31:16

And I'm so glad that we found a home for him.

Alex Volkov 2:31:18

Yeah.

Swyx 2:31:19

And we're letting him shine. He's writing code for us. He's checking my work when I, when I write Ai bot.

Alex Volkov 2:31:24

Oh, he's, he's reading, uh, PRs for you? Which

Swyx 2:31:26

is... Yes. Uh, um, and, and he's catching, you know, important stuff. And, um, I, I really want AIE to be a venue where people can create their careers as well.

Alex Volkov 2:31:35

Yeah.

Swyx 2:31:36

Either as an employee or as a speaker or as a sponsor, whatever. Uh, I think that's very important.

Alex Volkov 2:31:41

Swyx, that's incredible. The mission of AIE continues. Uh, we're gonna see each other in the next one, and hopefully the next one. New York,

Swyx 2:31:47

yeah.

Alex Volkov 2:31:47

Um, I will just say, my last question for you, maybe as we land this plane, is- What is the best part of AIE experience for Swyx? From the start of, "Hey, folks, start showing up," until, "Okay, no more speakers, no more talk, everybody flew away." During that whole period, if you reflect for the past six times or seven times that you did this-

Swyx 2:32:11

Yeah, yeah ...

Alex Volkov 2:32:12

like, what is the best for, for you? What, what is- It's always

Swyx 2:32:14

the same. It's

Alex Volkov 2:32:14

always- What is the... Let me, let, let me- Yeah ... finish the question, sir- Yeah ... please, because I think it's important for folks to know as well. I know how taxing this is for you. Yeah. I know how difficult this is on Leah as well, Leah McBride, the GM of, of AIE that's now, like, everywhere. Shout out, Leah. Uh, um, you guys are working really, really hard to make this happen. What drives you forward? What keeps you going? What is the best part about this?

Swyx 2:32:35

Yeah, there's a lot of questions in there. The best part is somebody that I've never met before coming to me and saying how we affected their lives in some way, right? They, uh, watched a talk, they presented, uh, they talked about it or forwarded it to their friends, or they were a speaker and they got hired at one of these AIEs and they changed their careers. Yeah. Um, I have many I have many friends now who owe me a, owe me a favor at all the big labs, so I can call in those favors to- Over,

Alex Volkov 2:33:02

over, over

Swyx 2:33:03

hiring? Because, because AIE affected them as well, right? Oh,

Alex Volkov 2:33:05

nice. Okay.

Swyx 2:33:06

Uh, so, um, I, I think this is one of those, like, self-sustaining cycles. I, uh, you know, uh, I would... The, the, the anecdote I always quote is, like, one of our keynote speakers two years ago found his fiancee, uh, aft- in the after party after ke- after speaking, right? So this is, uh, I tell my single speakers like, "Hey, speak at this thing. You might get a girlfriend."

Alex Volkov 2:33:25

You might get a girlfriend or a wife.

Swyx 2:33:26

Um, or, or a boyfriend.

Alex Volkov 2:33:28

Yeah.

Swyx 2:33:28

Uh, I think, like, uh, making real impact on human relationships and their lives, their careers, I think is really helpful. Um, and then to me also, like, very fulfilling. I think what keeps me going is honestly a sense of responsibility. I think that I wanted vaguely this thing to happen, and now that it is happening, it is the top industry event, I, I feel like I have to make it as good as I can because that's the promise I made to my community- Yeah my sponsors, my attendees. Yeah. And so it's, it's like, it's very hard because, like, sometimes I just wanna take a break and rest and give up and- Yeah ... whatever, do something else, but I cannot because it's, this thing is, uh, going in motion and, um, it, it, it wills itself into existence and, and we're all s- we're all contributing, you know. Like, um, this, this conference will not happen with so many volunteers and so many people taking time off work and personal lives and all this to just contribute and make it a thing. So yeah, it's definitely not on me and I, I feel a responsibility to them to create the space and the certainty that this will be a thing-

Alex Volkov 2:34:31

Yeah

Swyx 2:34:31

so that they can continue to trust that this is community will stick around for the long run.

Alex Volkov 2:34:35

I love this, dude. This was passionate and, and, and, and beautiful, and I think from me and everybody else, we, we wish you a good rest after this. Go touch grass somewhere that-

Swyx 2:34:47

Yeah ...

Alex Volkov 2:34:47

your phone is not gonna bother you. We have

Swyx 2:34:48

some fake grass at the-

Alex Volkov 2:34:49

Yeah ... coffee

Swyx 2:34:49

corner.

Alex Volkov 2:34:50

Oh, we have some fake grass to touch. I think go touch real grass. Uh, you know, the, the company's bigger than just the one person right now.

Swyx 2:34:56

Yeah. It's about- And- ... 15 now. We wanna, we wanna hire good people, so.

Alex Volkov 2:34:59

My point is they can keep going for a week and a half without you, even if you disconnect, so I think I wish you that after this conference.

Swyx 2:35:05

Yeah. I, I would say if anyone's doing events, actually the most important thing is to capture- Uh, all the learnings for next year.

Alex Volkov 2:35:11

Yeah.

Swyx 2:35:11

And it's freshest in your mind just after an event, right?

Alex Volkov 2:35:14

Yes.

Swyx 2:35:14

Like, hey, like this thing, like it really upset me, but I didn't tell you about it because we, the show has to go on. That's the

Alex Volkov 2:35:20

download that we did at the end- You have to download, yeah ... of the last one. Yeah,

Swyx 2:35:22

yeah.

Alex Volkov 2:35:22

And it's like freshest in your mind. Uh, so folks, post your learnings, post newsletters- Yeah ... and blogs.

Swyx 2:35:27

We, we watch, uh, Twitter and, and, yeah, blogs and vlogs.

Alex Volkov 2:35:30

Dev.to, the, the practical. Post- Yeah, yeah ... whatever you felt here, even if there's stuff that you didn't like. There was one event I remember. Dude, we can keep going with this. I wanna let you go 'cause I know, like, you're a social person. But there was one event when, like, the f- the talks felt a little bit too commercial, and you got this feedback, and then the next one you changed and, like, it was a very selective- No benefits. Yeah, very selective things. And also, World's Fair is a big, like, commercial thing, and then the, the, the code, uh, in New York or the summits are the more, like, pre-selective and, like, more exclusive stuff. So there's, like, a wide variety of things, but all of this came from feedback directly to Swyx on Twitter, on the Ai.engineer.

Swyx 2:36:03

Yeah, I actually have a feedback session, uh, later this afternoon, so attendees can just come and-

Alex Volkov 2:36:08

Yeah ...

Swyx 2:36:09

uh, tell me what they thought, uh, you know, whether they have suggestions. Uh, we did NEO this year. Uh, AIE NEO. It's New Engineer Orientation.

Alex Volkov 2:36:16

Yeah.

Swyx 2:36:16

It's literally onboarding for the conference 'cause we have to teach you how to do this conference.

Alex Volkov 2:36:20

That's crazy. I should, I should, I should visit that next time, although I, I've been here for a minute. Uh, Swyx, we've been on, on the air for two and a half hours. This is our sixth and the continuous coverage of, of, uh, AIE. Thank you so much- Yeah, thanks for doing this ... to Swyx, Turian, and all these guys. Uh, thank you for coming. Folks are staying online. That is great to hear directly from you. Thanks. So hopefully we'll keep this going. Yeah. Go do your stuff. Folks, go give, uh, Swyx feedback.

Swyx 2:36:44

Uh, watch the YouTube. Uh, subscribe. We're trying to reach a million by end of year.

Alex Volkov 2:36:48

Oh, congrats on half a million, by the way.

Swyx 2:36:50

Yeah, yeah.

Alex Volkov 2:36:51

Uh, the YouTube channel is a very important and big piece of AIE that folks don't really, like, gather. Like, folks gather in physical space here, and then all of the talks, including my talk that I'm hoping at some point maybe I'll ask- You gotta blow it up ... Swyx to prioritize. Yeah, it'll blow up. It's gonna blow up. And, uh, all of the talks are there, uh, for you, specifically because it's really fun to be on the hallway. And so when you come to an event and you, you maybe, oh, there's, like, Tarik Shehata talking from Anthropic. Ah, I can see this on YouTube. I can literally just go and talk to people and learn a bunch from them, and I think that this is, like, a in-meet space, a thing that you're, like, enabling. I think it's very, very good. Swyx, co-founder of Ai.engineer, host of Latent Space Podcast. Thanks, man. Curator of AI News. Thanks, buddy. And a, I would say, fairly casual co-host of ThursdAI at this point. I think this is, like, your- Yeah, yeah ... tenth appearance. Yeah. Thank you so much for coming. I really appreciate it. Amazing. Thank you. Before you jump off, can, uh, we ask for a pic, guys? Can we take a picture of us from here? Oh. And I think after this we'll land this plane. Yeah. Yeah, just like this Swyx,

Swyx 2:37:57

thank you so much Thanks, man. Thanks a lot. Yeah, congrats on everything.

Alex Volkov 2:37:59

This is insane. Yeah. I, I still can't believe it. I

Swyx 2:38:01

mean, you work at a public company now. You should also c- talk about that.

Alex Volkov 2:38:03

Oh, that's true. Folks, I work for CoreWeave. They're the only sponsor for, uh, for ThursdAI, and, uh, I cannot believe my luck for them letting me do this for a living. Yeah. This is like... I'm doing the dream life. With that, I will say thank you so much for tuning in for ThursdAI. If you missed any part of the show, ThursdAI starts as a live stream and ends up being a newsletter and a podcast, and the name is ThursdAI, so you, you can expect this to happen very, very soon. I will try to sum up the show for today, although it's gonna be hard because we've grown a lot. We've had... Obviously we started... Folks, let me, uh, actually put a, put a earpiece here and, like, unmute you guys and, uh, hear from you, uh, Peter and LDJ, who are still here. Um, was that informative? Was that helpful? Was that... How did that feel for you guys over there?

Peter Gostev 2:38:59

Yeah, that was awesome. Good to really feel the atmosphere. Uh- Yeah ... yeah, Sw- Swyx is awesome, so yeah. Swyx is incredible, and, and- We, we definitely managed to get him.

Alex Volkov 2:39:09

Go ahead, Peter.

Peter Gostev 2:39:11

Yeah, very good to you managed to get him, 'cause yeah, I saw the comments people were saying goo- good to hear from him. He's a... Like, it's so important to have something, someone like him organizing this, not, like, a generic, uh, event organizing company that, like, decided that's an opportunity. So yeah, we definitely need someone like him to just keep, keep doing this. And I think, uh,

Alex Volkov 2:39:33

one, one of the reason... I asked him what keeps him going. I can tell you guys what keeps me going, okay? One of the reasons we do ThursdAI is that not everybody is in San Francisco. I'm not in San Francisco. Not everybody is as plugged into AI as us, and so we wanna bring this to you. My purpose in life is bridging, bridging you to the technology, to the updates, and then when I'm here, bridging you to feel this vibe. I'm so happy. The last coverage that I did in this place were from a hotel room, and at some point we brought Logan Kilpatrick on, and that was great, but, like, I'm so happy we're in the middle of all of this. You see people walking around. You see people checking us out. If you didn't make it to Ai.engineer, we've got you. ThursdAI is gonna cover Ai.engineer and hopefully get you the feeling of what's happening in the space. Thank you so much for tuning in. Uh, Peter, GosteV, LDJ, Nisten was here before. Thank you guys so much for holding the space for us in case stuff happens as well. Again, if you missed any part of the show, uh, the two and a half hour live stream is gonna end up as a podcast and a newsletter that I'm gonna write from there. For now, I think it's time for me to, to, to tell you that we'll see you here next week. Bye-bye, everyone. Thank you.

ThursdAI · July 2

0:00 0:00

What happened in AI the week of July 2, 2026?

Fourteen lanyards, one table.

Eleven segments, zero dead air.

Episode Summary

In This Episode

Hosts & Guests

By The Numbers

🔥 Breaking During The Show

🎪 LIVE from AI Engineer World's Fair

🏢 Fable is back (and Sonnet 5 is… meh)

🔓 Open source: LongCat-2.0 unmasked (Meituan's Owl Alpha)

🔩 The Etched ASIC debate

🧩 Exo Labs launches local.ai (+ a surprise NVIDIA crash)

🌞 GPT-5.6 with Dominik Kundel (OpenAI)

💛 This Week's Buzz: W&B Aria goes GA

🐡 Sakana Fugu with Stefania Druga

✨ Google DeepMind: OmniFlash + NanoBanana 2 Lite

💙 Darya Volkov's token-billionaire debut

🫶 Swyx closes: what this whole thing is

🏢 Big CO LLMs + APIs

🔓 Open Source LLMs

💛 This Week's Buzz

🤖 AI Coding & Agents

🎵🎬 Voice & Vision

Liked this episode? There's one every week.