Everything AI Released in July 2026

9 releases covered live on the show — every model, product, paper and tool that mattered, with links and our analysis.

🧠 New Models 7

Base44
New Models

Base 1

Base44 launches Base 1, the first in-house vibe-coding LLM

Base44 (the Wix subsidiary at $150M ARR) launched Base 1, a proprietary LLM trained on tens of millions of real app-building interactions — the first vibe-coding platform to ship its own internal model. Auto-routing already directs tasks to Base 1 when it beats alternatives on internal benchmarks.

$150M Base44 ARR
Google DeepMind
New Models

NanoBanana 2 Lite

NanoBanana 2 Lite: sub-4-second images at ~3¢ per 1,000

Google's NanoBanana 2 Lite generates images in under four seconds starting at $0.034 per 1,000 images, with quality above the original NanoBanana. The Interactions API hit GA the same week.

per 1,000 images<4s generation time
Google DeepMind
New Models

OmniFlash

Google DeepMind debuts OmniFlash, first of the any-to-any Omni family

OmniFlash — first of Google's any-to-any Omni family — generates videos up to 10 seconds with precise conversational multi-turn editing via the Interactions API: say 'make it daytime' and it redoes light, sky and shadows. Editing Elo 1087 at $0.10 per second of output.

1087 editing Elo$0.10 per second of video, up to 10s
Meituan
New ModelsOpen weights

LongCat-2.0

Meituan reveals LongCat-2.0, a 1.6T MoE trained entirely on Chinese ASICs

Meituan disclosed LongCat-2.0, a 1.6-trillion-parameter MoE trained entirely on Chinese ASICs without NVIDIA hardware. It scores 59.5 on SWE-bench Pro and runs at $0.038 per million tokens with free cache hits. The model had been serving anonymously as 'Owl Alpha' and ranks among OpenRouter's top models by volume — part of a surge that puts Chinese open-weight models at ~30% of global usage, up from 1.2% eleven months ago.

1.6T MoE parameters, no NVIDIA in training59.5 SWE-bench Pro$0.038 per 1M tokens, free cache hits
OpenAI
New Models

GPT-5.6

OpenAI ships GPT-5.6 as a three-model family: Sol, Terra and Luna

GPT-5.6 arrives as three models — Sol (frontier), Terra (~5.5-level intelligence at half the cost) and Luna (small and fast) — plus a new Ultra mode with a Max reasoning level and heavier sub-agent use. Dominik Kundel confirmed on ThursdAI that 5.6 Sol is coming to Cerebras at extreme speed running the same weights as the API model, not a distill.

3 models: Sol / Terra / Luna50% Terra cost vs GPT-5.5-level intelligence
Anthropic
New Models

Fable 5

Fable 5 restored globally after the export-control pause

Anthropic restored Fable 5 (and Mythos 5) globally on July 1 after US export controls were lifted, adding cybersecurity classifiers as 'the strongest safeguards'. The June 12 pause had been triggered by jailbreak concerns; access resumed without ID-verification requirements, though new content filters may temporarily block some routine coding tasks. Alex celebrated by having Fable prep the entire ThursdAI run of show.

19 days offline (June 12 pause → July 1 restore)
Anthropic
New Models

Sonnet 5

Claude Sonnet 5: 'our most agentic Sonnet yet' at intro pricing

Anthropic launched Sonnet 5 with near-Opus 4.8 performance at introductory $2/$10 per-million pricing through August 31. Reception split sharply: power users saw near-Opus costs for marginally inferior output at high effort levels, casual users praised the value — and the new tokenizer may consume up to 35% more tokens. On ThursdAI, Wolfram's early WolfBench read put it slightly under Opus 4.6 at higher cost.

$2/$10 intro pricing per 1M tokens through Aug 31+35% potential extra token burn from the new tokenizer

🚀 Products & Apps 1

Exo Labs
Products & Apps

local.ai

Exo Labs launches local.ai to track the local-AI frontier

Announced live on ThursdAI at AI Engineer: local.ai tracks the best model for your hardware, the performance trade versus the cloud, and whether running local beats API-token pricing. Early access is live with signup codes, and the Exo CLI — 'vLLM for consumer devices, with the configs figured out for you' — ships in the coming weeks.

71% Terminal Bench 2.1, REAP-pruned GLM 5.2550B Nemotron-3 Ultra running on 4 NVIDIA Sparks

🛠️ Dev Tools 1

Z.ai
Dev Tools

ZCode

Z.ai launches ZCode, a GLM-5.2 agentic coding environment

ZCode is an agentic coding environment built on GLM-5.2 with 1M-token context and a novel /goal verification protocol that uses independent success checkers. Output reaches 173 tokens/second with 1.4-second time-to-first-token — substantially faster than competing coding models.

173 tokens/second output1M token context