Everything AI Released in June 2026

31 releases covered live on the show — every model, product, paper and tool that mattered, with links and our analysis.

🧠 New Models 14

Moonshot AI
New Models · Open weights

Kimi K2.7 Code

Moonshot AI open-sources Kimi K2.7 Code for agentic coding

Moonshot AI open-sourced Kimi K2.7 Code, a trillion-parameter MoE coding model with benchmark jumps over K2.6 and fewer reasoning tokens. On the show it landed as the second half of the open-source coding wave beside GLM-5.2.

1T MoE parameters · 30% fewer reasoning tokens
xAI
New Models

Grok Imagine Video 1.5

xAI launches Grok Imagine Video 1.5 with faster generation and native audio

xAI launched Grok Imagine Video 1.5 with nearly 2x faster generation, native audio, and a claimed #1 leaderboard position. The episode grouped it with Gemini Omni as part of the week’s video-generation frontier.

~2x faster generation
Z.ai (Zhipu AI)
New Models · Open weights

GLM-5.2

Z.ai releases GLM-5.2, a 753B open MoE with 1M context

Z.ai released GLM-5.2 as a major open-source coding and agentic model: a 753B-parameter MoE, MIT-licensed, with a one-million-token context window. The episode treated it as the open-source model that arrived exactly as Fable access disappeared, with strong coding and agentic performance close to the frontier.

753B parameters · 1M context window · MIT license
Google DeepMind
New Models · Open weights

Gemma 4 12B

Google drops Gemma 4 12B, an encoder-free multimodal local model

Google released Gemma 4 12B, an encoder-free multimodal model under Apache 2.0 that targets 16GB VRAM local setups. Instead of bolting separate vision or audio encoders onto a language model, it uses one unified network, which LDJ and Yam argued makes smaller multimodal models cheaper, cleaner, and easier to run locally.
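
A quick back-of-the-envelope check on the 16GB VRAM target. This is a minimal sketch that assumes "12B" means roughly 12 billion parameters and counts weights only; the KV cache, activations, and runtime overhead are ignored, and the precisions listed are illustrative rather than anything Google has confirmed for this release.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: "12B" is taken to mean 12 billion parameters.
PARAMS = 12e9

for label, bits in [("bf16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{label}: {weight_memory_gb(PARAMS, bits):.1f} GB")
```

Under these assumptions the weights come to 24.0 GB at bf16, 12.0 GB at int8, and 6.0 GB at 4-bit, so fitting in 16GB would imply some form of quantization.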

H Company
New Models · Open weights

Holo 3.1

H Company launches Holo 3.1 local computer-use agent models

H Company released Holo 3.1, a family of local computer-use agent models ranging from 0.8B to 35B parameters with new quantized checkpoints. The lineup targets running screen-driving agents on local hardware rather than in the cloud.

Ideogram
New Models · Open weights

Ideogram 4.0

Ideogram 4.0 becomes the top open-weight text-to-image model

Ideogram released Ideogram 4.0, a 9.3B-parameter text-to-image model with open weights under a non-commercial license. It leads open-weight image models on typography and layout, with bounding-box/layout-style prompting that trades casual generation ease for precise structured control.

9.3B Ideogram 4 parameters
JetBrains
New Models · Open weights

Mellum 2

JetBrains open-sources Mellum 2, a 12B MoE coding model

JetBrains released Mellum 2, a 12B mixture-of-experts coding model with only 2.5B active parameters, trained from scratch by a small team using a three-stage curriculum over 10T tokens. The panel read it as IDE companies converting years of developer-workflow context into model advantage; it is also available on CoreWeave Inference.

Microsoft
New Models

MAI-Code-1-Flash

Microsoft ships MAI-Code-1-Flash into GitHub Copilot

Part of the seven-model MAI launch at Build 2026, MAI-Code-1-Flash is Microsoft AI's fast coding model and ships directly into GitHub Copilot. The panel saw it as a sign Microsoft intends to serve its own models inside its developer surfaces instead of relying solely on OpenAI.

Microsoft
New Models

MAI-Thinking-1

Microsoft launches MAI-Thinking-1, a 1T MoE trained from scratch

Microsoft AI used Build 2026 to launch seven MAI models, headlined by MAI-Thinking-1, a 1T total, 35B active MoE reasoning model trained from scratch on 33T tokens without distillation. The panel read the launch as Microsoft becoming a frontier model lab in its own right rather than only an OpenAI distribution channel.

1T MAI Thinking 1 total parameters · 33T MAI training tokens
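
To put the "total versus active" numbers from this roundup side by side, here is a small sketch that computes what share of each MoE's parameters is active. It uses only the figures quoted in the entries for MAI-Thinking-1, Nemotron 3 Ultra, and Mellum 2, and assumes "active" carries its usual MoE meaning of parameters used per token.

```python
def active_fraction(total_params: float, active_params: float) -> float:
    """Share of a mixture-of-experts model's parameters that are active."""
    return active_params / total_params


# (total, active) parameter counts as quoted in this roundup.
models = {
    "MAI-Thinking-1": (1e12, 35e9),
    "Nemotron 3 Ultra": (550e9, 55e9),
    "Mellum 2": (12e9, 2.5e9),
}

for name, (total, active) in models.items():
    print(f"{name}: {active_fraction(total, active):.1%} active")
```

On the quoted figures, MAI-Thinking-1 is the sparsest of the three at 3.5% active, against 10.0% for Nemotron 3 Ultra and about 20.8% for Mellum 2.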
MiniMax
New Models

MiniMax M3

MiniMax announces M3 coding/agentic model with 1M context

MiniMax announced M3, a natively multimodal coding and agentic model with a claimed one-million-token context window built on sparse attention, and with open weights promised soon. Reported numbers include 59 on SWE-bench Pro, and the panel noted MiniMax already has a following for cheap agentic tool calling even as pure coding quality is debated.

NVIDIA
New Models · Open weights

Nemotron 3.5 ASR

NVIDIA ships Nemotron 3.5 ASR, a 600M streaming speech model

NVIDIA released Nemotron 3.5 ASR, a 600M-parameter open multilingual streaming speech-to-text model aimed at voice agents. It supports 40 languages and reportedly delivers 17x more throughput than Parakeet-style baselines at half the size, pushing the latency/accuracy frontier for open voice-agent infrastructure.

17x Nemotron ASR throughput
NVIDIA
New Models · Open weights

Nemotron 3 Ultra

NVIDIA releases Nemotron 3 Ultra, a 550B open-weight MoE for agents

NVIDIA dropped Nemotron 3 Ultra the day of the show, a 550B-parameter sparse MoE with 55B active parameters built for long-running agentic harnesses like OpenCode, Hermes, and OpenClaw. Chris Alexiuk joined to explain the hybrid Mamba/Transformer architecture and the unusually complete open release: weights, training data, recipes, a GenRM reward model, and an NVFP4 quantized checkpoint.

550B Nemotron 3 Ultra parameters · 55B Active parameters
Reve
New Models

Reve 2.0

Reve 2.0 hits #2 on Text-to-Image Arena with layout-first editing

Reve 2.0 jumped to second place on Text-to-Image Arena (around 1200 Elo) with native 4K output, code-like layout control, and precise editing. Alex's live tests found inconsistent portrait identity, but the layout-first editor is the real differentiator for graphic and image iteration workflows.

xAI
New Models

Grok Imagine Video 1.5 Preview

xAI releases Grok Imagine Video 1.5 Preview with synced audio

xAI released a preview of Grok Imagine Video 1.5, an image-to-video model that generates clips with synchronized audio. It adds xAI to the week's crowded race of media-generation model updates.

🚀 Products & Apps 5

OpenAI
Products & Apps

Jalapeno

OpenAI unveils Jalapeno custom inference chip with Broadcom

OpenAI unveiled Jalapeno, its first custom inference ASIC built with Broadcom, positioning it as part of a full-stack strategy to make ChatGPT, Codex, API, and agent workloads cheaper and faster at scale.

9 months claimed design to tape-out · 50% inference cost reduction claim · 1.3GW planned deployment scale
Midjourney
Products & Apps

Midjourney Medical scanner

Midjourney announces Midjourney Medical, a full-body ultrasonic scanner concept

Midjourney announced Midjourney Medical, a full-body ultrasound scanner concept that the episode described as capturing 806TB per scan in under 60 seconds. The panel treated it as a striking sign that AI-native companies are moving beyond chatbots into hardware, imaging, and healthcare infrastructure.

806TB scan payload · <60s scan time
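
For a sense of scale, the two quoted figures imply a lower bound on sustained data rate. The sketch below takes the episode's 806TB and 60-second numbers at face value, uses decimal units, and assumes the payload is produced evenly over the scan; none of this has been confirmed by Midjourney.

```python
def min_rate_tb_per_s(payload_tb: float, max_seconds: float) -> float:
    """Lower bound on average data rate when the scan finishes within max_seconds."""
    return payload_tb / max_seconds


rate = min_rate_tb_per_s(806, 60)
print(f"at least {rate:.1f} TB/s")          # terabytes per second
print(f"at least {rate * 8:.1f} Tbit/s")    # terabits per second
```

That works out to at least roughly 13.4 TB/s (about 107.5 Tbit/s), and a scan that finishes faster than 60 seconds would need a higher rate still.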
Cognition Labs
Products & Apps

Devin Desktop

Cognition rebrands Windsurf into Devin Desktop multi-agent hub

Cognition rebranded Windsurf into Devin Desktop, a multi-agent command center with Agent Client Protocol (ACP) support. The move consolidates Cognition's IDE acquisition into its Devin agent brand as a desktop control surface for running multiple coding agents.

Nous Research
Products & Apps

Hermes Desktop

Nous Research launches Hermes Desktop agent app for Mac/Win/Linux

Nous Research launched Hermes Desktop, packaging the Hermes Agent harness into a native desktop app for Mac, Windows, and Linux. Karan previewed chat, permissions, tool-call visibility, reasoning traces, and admin controls aimed at small teams, startups, and personal agent fleets.

NVIDIA
Products & Apps

RTX Spark

NVIDIA announces RTX Spark Arm + Blackwell platform for local AI PCs

At Computex, NVIDIA unveiled RTX Spark, an Arm CPU plus Blackwell GPU PC platform with 128GB unified memory targeting local AI agents and 120B-class local inference. A wave of thin laptops with RTX 5070-class GPUs and roughly one petaflop of local AI compute raises the question of what agents should run locally versus in the cloud.

✨ Major Features & Updates 2

Major Features & Updates

WolfBench Token-Usage Visualization

WolfBench adds 3D token-depth bars to show model efficiency

Wolfram Ravenwolf shipped a WolfBench feature that visualizes token usage alongside benchmark score as 3D token-depth bars. Two models can look close on a leaderboard while one burns dramatically more tokens, which changes the real cost and latency story; Gemini 3.5 Flash and GPT-5.5 were compared as examples.
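
The point about close scores hiding very different token bills can be sketched with a few lines of arithmetic. Everything here is hypothetical: the model names, scores, token counts, and prices are made up for illustration and are not WolfBench data, and `RunResult` is not part of any WolfBench API.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    """One model's benchmark run (all values below are illustrative)."""
    name: str
    score: float                   # benchmark score, in percent
    output_tokens: int             # tokens generated across the whole run
    usd_per_million_tokens: float  # assumed output-token price

    @property
    def cost_usd(self) -> float:
        return self.output_tokens / 1e6 * self.usd_per_million_tokens

    @property
    def tokens_per_point(self) -> float:
        return self.output_tokens / self.score


# Hypothetical runs: one point apart on score, 4x apart on tokens.
runs = [
    RunResult("model-a", 71.0, 4_000_000, 10.0),
    RunResult("model-b", 72.0, 16_000_000, 10.0),
]

for r in runs:
    print(f"{r.name}: score={r.score} cost=${r.cost_usd:.2f} "
          f"tokens/point={r.tokens_per_point:,.0f}")
```

In this made-up case the second model scores one point higher but costs four times as much to run, which is the kind of gap a score-only leaderboard would not show.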

🔌 APIs & Platforms 2

OpenRouter
APIs & Platforms

Fusion API

OpenRouter launches Fusion API, a panel of budget models competing with frontier models

OpenRouter launched Fusion API, which routes or ensembles a panel of lower-cost models to reach near-frontier results. The episode notes framed it as beating GPT-5.5 and Opus 4.8 in some comparisons while landing within roughly 1% of Claude Fable 5 at half the price.

~1% from Fable 5 in episode notes
APIs & Platforms

Kimi K2.7 Code on CoreWeave Inference

Kimi K2.7 Code goes live on W&B/CoreWeave Inference

Kimi K2.7 Code became available on W&B/CoreWeave Inference. The episode notes called out Blackwell NVFP4 serving, speculative decoding, and a reported 289 tokens per second, which places it near the top of the Artificial Analysis speed and price-performance charts.

289 tok/s reported throughput

🛠️ Dev Tools 5

Anthropic
Dev Tools

Claude Tag

Anthropic launches Claude Tag as a persistent Slack teammate

Claude Tag brings Claude into Slack as a persistent proactive teammate with shared channel context, ambient follow-up, coding tasks, analysis, incident support, and enterprise governance.

65% Anthropic product-team code from internal version · $25K Enterprise launch credits
HumanLayer
Dev Tools

Agentic IDE

HumanLayer launches an Agentic IDE to fight AI code slop

HumanLayer launched its Agentic IDE, positioned as a human-in-the-loop answer to lights-out coding-agent slop. Dexter Horthy joined the show to argue that the right architecture keeps humans steering high-impact changes instead of letting agents silently trash production codebases.

Weights & Biases
Dev Tools · Open weights

HiveMind

Weights & Biases launches HiveMind for coding-agent observability

Weights & Biases launched HiveMind, a dashboard for tracking AI coding-agent sessions, spend, transcripts, ROI, and reusable organizational learning. Chris Van Pelt and Adrian Swanberg joined the show to explain why teams need observability for their growing fleet of coding agents.

📊 Benchmarks & Evals 1

Arena (LMArena)
Benchmarks & Evals

Agent Arena

Arena launches Agent Arena for real-world agent workflow evals

Arena (LMArena) launched Agent Arena during the episode, moving beyond one-turn chatbot preference battles to evaluate models on real agent workflows with web search, files, terminals, user corrections, and objective recovery signals. Peter Gostev joined live to explain why long-running, harder tasks need a different benchmark.

🤝 Acquisitions 1

Cursor
Acquisitions

Cursor acquisition

SpaceX/xAI reportedly acquires Cursor for $60B

The show covered a reported $60B all-stock acquisition of Anysphere/Cursor by SpaceX/xAI. Alex framed it as coding assistants becoming strategic infrastructure: workflows, agent traces, and developer context are now assets frontier labs want to own.

$60B reported acquisition price

🌀 Also Released 1

Anthropic
Also Released

Claude Fable/Mythos access restriction

Anthropic disables Fable and Mythos access after US government restriction

Anthropic reportedly shut down Fable 5 and Mythos 5 access for foreign nationals, then disabled both models broadly to comply. The episode framed it as the first major direct government intervention in frontier model access, turning model availability into a national-security and sovereign-AI story.