ThursdAI — Jun 25, 2026 — GLM 5.2's DeepSeek moment, Sakana Fugu, OpenAI Jalapeno & more AI news

01

💥 GLM 5.2 has its DeepSeek moment

Headline Z.ai Open source

The whole show orbited one model. GLM 5.2 was released last week, but this week it became the model everyone is actually running, benchmarking and comparing to closed frontier models. The surprise isn’t just coding: it’s web design and UI taste. Peter says Arena data puts 5.2 well above 5.1 and shockingly strong on web dev. Alex showed a ThursdAI page GLM built and called it a genuine first for open source. Wolfram’s caveat: it is still weak in German, so it’s a workhorse, not necessarily your main conversational model.

Arena data puts GLM 5.2 above 5.1, with surprising strength on web/front-end work.
Alex shows a custom GLM-built ThursdAI page and calls GLM 5.2 the first open model genuinely good at design.
Unsloth shipped GGUF quants so you can run a 1M-context GLM locally.
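
To get a feel for what "run it locally" means at this size, here is a back-of-the-envelope sketch of weight-only memory at a few bit widths. The 744B figure comes from the notes; the uniform bit width is a simplifying assumption (real GGUF quants mix precisions), and the KV cache for long contexts is not counted.

```python
# Rough weight-only memory estimate for a 744B-parameter model.
# Assumption: every parameter is stored at the same bit width. Real GGUF
# quants mix precisions and add metadata, and the KV cache is extra.

PARAMS = 744e9  # parameter count quoted in the episode notes


def weight_gigabytes(params: float, bits_per_param: float) -> float:
    """Return approximate weight storage in decimal gigabytes."""
    return params * bits_per_param / 8 / 1e9


for bits in (16, 8, 4, 2):
    print(f"{bits:>2}-bit: ~{weight_gigabytes(PARAMS, bits):,.0f} GB")
```

Even at 4 bits per weight the estimate is in the hundreds of gigabytes, so "locally" here likely means a well-equipped workstation or small server rather than a laptop.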

“This is the first model in open source that is really good at web design and front end design.” Alex Volkov

“It was a GLM week, all right? Everybody is realizing this is a DeepSeek moment.” Nisten Tahiraj

744B parameters (MoE)

1M context window

MIT license, open weights

↗ GLM-5.2 weights ↗ Unsloth GGUF

02

🐝 This Week’s Buzz: GLM on CoreWeave + WolfBench

This Week’s Buzz CoreWeave

GLM 5.2 went live on CoreWeave Serverless Inference — just bring your W&B key (or hit it via OpenRouter). The team biased the deployment toward speed over the full million-token context. Then Wolfram ran WolfBench: GLM 5.2 is the third best model he’s ever benchmarked, exceeding Opus 4.7, with the strongest “solid base” (consistently-solved tasks) in the run — at a fraction of the cost of GPT 5.5 or Opus.

Served at $1.39/M input and $4.40/M output — cheaper than Opus with caching.
WolfBench solid base of 61% on max thinking — GPT-5.5-level reliability.
Under $200 to run the benchmark vs ~$500 for GPT 5.5 and ~$400 for Opus.
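
The per-token prices above translate into per-request cost with simple arithmetic. This is a minimal sketch using only the two quoted rates; the token counts in the example are made up, and any caching discounts or provider-specific billing rules are ignored.

```python
# Per-request cost at the quoted CoreWeave rates for GLM 5.2.
# Assumption: flat per-token pricing with no caching discount.

INPUT_USD_PER_M = 1.39   # dollars per 1M input tokens (from the notes)
OUTPUT_USD_PER_M = 4.40  # dollars per 1M output tokens (from the notes)


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of one request at the quoted rates."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Hypothetical agentic turn: 200k tokens of context in, 8k tokens out.
cost = request_cost_usd(200_000, 8_000)
print(f"${cost:.4f} per turn")
```

Input tokens dominate the bill in long-context agentic loops, which is part of why the deployment's speed-over-context trade-off matters for cost as well as latency.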

“It is the third best model that I have benchmarked here, and it even exceeds Opus 4.7.” Wolfram Ravenwolf

“This is the DeepSeek moment realized.” Alex Volkov

$1.39 / 1M input

$4.40 / 1M output

#3 model on WolfBench

61% solid base (max thinking)

𝕏 CoreWeave announcement ↗ WolfBench

03

🐡 Sakana Fugu and the orchestration layer

Orchestration Sakana AI

Sakana AI launched Fugu: one API endpoint that hides “seven raccoons in a trench coat” — a trained router that dispatches your task to publicly accessible models in Thinker / Worker / Verifier roles, then fuses the result. Nisten clocked the pool as Opus, Codex and Gemini. The panel frames it as the next paradigm after thinking models and MoE: coordinated model teams. (Caveat from chat: people burned through their $20 tier on a single agentic prompt.)

Routes to public frontier models — no private Fable/Mythos in the pool.
Backed by two ICLR papers: Trinity and the Conductor.
Echoes OpenRouter and Arena’s prompt classifiers — routing as a product.
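
The Thinker / Worker / Verifier pattern can be sketched in a few lines. This is a toy illustration and not Sakana's implementation: the keyword-based `route` function stands in for a trained router, the three "models" are local stub functions, and every name here (`Team`, `route`, `run`, `POOL`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Model = Callable[[str], str]  # a "model" here is just prompt in, text out


@dataclass
class Team:
    thinker: Model
    worker: Model
    verifier: Model


def route(task: str, pool: Dict[str, Model]) -> Team:
    """Toy stand-in for a trained router: pick the worker by keyword."""
    worker = pool["coder"] if "code" in task.lower() else pool["generalist"]
    return Team(thinker=pool["generalist"], worker=worker, verifier=pool["checker"])


def run(task: str, pool: Dict[str, Model]) -> str:
    """Plan, execute, verify, then return one fused answer."""
    team = route(task, pool)
    plan = team.thinker(f"Plan: {task}")
    draft = team.worker(f"Execute: {plan}")
    verdict = team.verifier(f"Check: {draft}")
    # Simplified "fuse" step: accept the draft or flag it for review.
    return draft if verdict.startswith("OK") else f"[needs review] {draft}"


# Stub models so the sketch runs without any API keys.
def generalist(prompt: str) -> str:
    return f"generalist({prompt})"


def coder(prompt: str) -> str:
    return f"coder({prompt})"


def checker(prompt: str) -> str:
    # Approves any non-empty draft; a real verifier would judge the content.
    return "OK" if len(prompt) > len("Check: ") else "FAIL"


POOL: Dict[str, Model] = {"generalist": generalist, "coder": coder, "checker": checker}

print(run("Write code to parse a CSV", POOL))
print(run("Summarize this memo", POOL))
```

Each task fans out into at least three model calls, which is one plausible reason a single agentic prompt can consume a usage tier quickly.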

“I think this is one of the new things, to raise the intelligence even higher by combining different models to get one result.” Wolfram Ravenwolf

95.5 GPQA Diamond

93.2 LiveCodeBench

73.7 SWE-Bench Pro

↗ Fugu announcement 𝕏 Sakana launch tweet

04

🤖 Sean Grove on Linzumi and agent fleets

Guest Linzumi

Sean Grove — ex-OpenAI (Model Spec, deliberative alignment), now on his third company — joined to explain Linzumi: a shared chat-and-orchestration environment where humans and fleets of coding agents work in the same threads, continuously compiling the company’s intent into a living specification. His thesis: stop reading every line of generated code; read the failures against the properties you care about, like property-based testing instead of hand-checking unit tests. The target is 10,000 agent hours per person per day.

Linzumi captures ambient chats, calls and coding jobs into a compiled spec.
“Live six months in the future” — build tools for where agents will be, not where they are.
Sean: without agentic company-building, he’d have retired rather than start a fourth.
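
Sean's "read the failures, not the code" idea maps onto property-based testing. The sketch below is an illustration, not anything from Linzumi: it uses only the standard library's `random` module to throw generated inputs at a function and reports the first input that violates the stated properties. The function under test and the helper names are hypothetical; dedicated libraries such as Hypothesis add smarter input generation and shrinking.

```python
import random
from typing import Callable, List, Optional


def dedupe_keep_order(items: List[int]) -> List[int]:
    """Pretend an agent wrote this and nobody read it line by line."""
    seen = set()
    out = []
    for x in items:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out


def find_counterexample(
    fn: Callable[[List[int]], List[int]], trials: int = 500, seed: int = 0
) -> Optional[List[int]]:
    """Return the first random input that breaks a property, else None."""
    rng = random.Random(seed)
    for _ in range(trials):
        data = [rng.randint(-5, 5) for _ in range(rng.randint(0, 12))]
        result = fn(data)
        no_dupes = len(result) == len(set(result))
        same_members = set(result) == set(data)
        # Elements must appear in order of first occurrence in the input.
        first_seen_order = result == sorted(set(data), key=data.index)
        if not (no_dupes and same_members and first_seen_order):
            return data
    return None


# A correct implementation yields no counterexample...
print(find_counterexample(dedupe_keep_order))
# ...while a plausible-looking wrong one (it sorts) gets caught.
print(find_counterexample(lambda xs: sorted(set(xs))))
```

The reviewer only looks at the counterexample, if there is one, which is what lets the approach scale past the point where a human can read every generated line.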

“If you are involved in reading the output or making every micro decision, you will never be able to scale to that stage.” Sean Grove

“You need a ladder of evidence that allows you to build up trust in the system and know when to trust it and when not to.” Sean Grove

10,000 agent hrs / person / day

1.2M+ views, AI.Engineer talk

↗ Linzumi 𝕏 YC launch tweet → Sean’s guest profile

05

⚡ Claude joins Slack, OpenAI builds silicon

Big Co Anthropic OpenAI

The big-company stack kept moving. Claude Tag turns Slack into a persistent surface for an ambient AI teammate — shared channel context, proactive follow-up, coding, analysis, incident support and enterprise governance. Nisten’s take: bigger than it first looks, because you keep the context, personality and safety scaffolding instead of “firing” Claude each session. Meanwhile OpenAI Jalapeno — its first custom inference ASIC with Broadcom — signals a full-stack future, with a claimed nine-month design-to-tape-out (engineers in Nisten’s chat suspect work began years earlier).

Claude Tag: an ambient Slack teammate with shared context and follow-up.
Jalapeno targets ~50% lower inference cost; Nvidia keeps the training market.
Every Broadcom dollar is a dollar not spent on Nvidia — like Google’s TPUs and Meta’s silicon.

“OpenAI builds its own chip. Jalapeno is a custom inference ASIC designed with Broadcom.” Alex Volkov

9 mo design → tape-out (claimed)

~50% inference cost cut goal

65% Anthropic team code via Claude

𝕏 Claude Tag launch ↗ OpenAI Jalapeno

06

⚡ Quick hits: open weights, OCR & tiny models

Rapid fire

A loaded week beyond GLM. Krea 2 open-weights (12B image model, raw + turbo) brings back image diversity the “competent but collapsed” frontier models lost. Baidu Unlimited-OCR (3B, constant KV cache, 40+ pages in one pass, ~93.9% on OmniDoc Bench) and Mistral OCR 4 push document AI. Liquid LFM 2.5 230M is billed as the world’s smallest agentic LLM — smaller than a node_modules folder, fast enough for a Raspberry Pi or a toaster. Plus OpenAI Daybreak (security tooling), the Seedance 2.5 teaser (4K, 30s, IP licensing, early July), and the Aside agentic browser.

Krea 2 — open-weights 12B image model with out-of-distribution artistic range.
Baidu Unlimited-OCR & Mistral OCR 4 — cheap, fast document intelligence.
Liquid LFM 2.5 230M — tiny on-device agentic model for edge & CPU.

↗ Krea 2 ↗ Baidu OCR ↗ Mistral OCR 4 ↗ OpenAI Daybreak ↗ Aside browser