Image Generation

Image generation and editing models and creative visual tools. 59 releases covered on the show.

June 2026

Ideogram
New Models · Open weights

Ideogram 4.0

Ideogram 4.0 becomes the top open-weight text-to-image model

Ideogram released Ideogram 4.0, a 9.3B-parameter text-to-image model with open weights under a non-commercial license. It leads open-weight image models on typography and layout, with bounding-box/layout-style prompting that trades casual generation ease for precise structured control.

9.3B: Ideogram 4 parameters

Reve
New Models

Reve 2.0

Reve 2.0 hits #2 on Text-to-Image Arena with layout-first editing

Reve 2.0 jumped to second place on Text-to-Image Arena (around 1200 Elo) with native 4K output, code-like layout control, and precise editing. Alex's live tests found inconsistent portrait identity, but the layout-first editor is the real differentiator for graphic and image iteration workflows.

May 2026

Microsoft
New Models

MAI-Image-2.5

Microsoft MAI-Image-2.5 jumps to #3 on Arena text-to-image

MAI-Image-2.5 jumped to number two on Arena's image-to-image leaderboard shortly after launch, with notable strength in image cleanup, backgrounds, documents, and diagrams. Hands-on tests on the show were mixed, and it is publicly accessible through playground.microsoft.ai.

PrismML
New Models · Open weights

Bonsai Image 4B

PrismML's 1-bit Bonsai Image 4B runs local image gen under 1GB

PrismML released 1-bit and ternary versions of Bonsai Image 4B, a sub-1GB diffusion transformer for local image generation. The quantized model even runs in-browser via WebGPU and ships with an iOS app and a Hugging Face demo.
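A rough back-of-the-envelope check on the sub-1GB figure. This sketch assumes exactly 4 billion weights and ignores anything kept at higher precision (embeddings, norms, scale factors), which the summary above does not break down. The function name and the parameter count are illustrative, not PrismML's published numbers.

```python
import math


def weight_size_gb(num_params: float, bits_per_weight: float) -> float:
    """Storage for the weights alone, in decimal gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9


PARAMS = 4e9  # assumption: "4B" taken as exactly 4 billion weights

for label, bits in [("fp16", 16.0), ("ternary", math.log2(3)), ("1-bit", 1.0)]:
    # A ternary weight carries log2(3) ~= 1.58 bits of information.
    print(f"{label:>8}: {weight_size_gb(PARAMS, bits):.2f} GB")
```

Under those assumptions, the 1-bit weights come to about 0.5 GB and ideally packed ternary weights to about 0.8 GB, against roughly 8 GB at fp16. That is consistent with the sub-1GB claim, although real packing overhead will shift the exact numbers.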

Google DeepMind
New Models

Gemini Omni

Gemini Omni: 'create anything from anything' conversational video editor

Google DeepMind launched Gemini Omni, a multimodal 'create anything from anything' model debuting as Google's first conversational video editor. Unlike pure text-to-video systems, Omni is an iterative multi-turn editing model that combines Gemini intelligence, world knowledge, multimodal inputs and generative media, in the same way Nano Banana brought Gemini to interactive image editing. It is available in the Gemini app, Google Flow and YouTube, with API support coming soon.

Krea AI
New Models

Krea 2

Krea 2: Krea's first from-scratch foundation image model

Krea released Krea 2, its first foundation image model trained from scratch, built over six to seven months by nearly half the company. It focuses on aesthetic diversity, style control with up to 4 reference images, and moodboard-driven workflows, generating images in roughly 15 seconds. Co-founder and CEO Victor Perez joined the show to walk through it.

April 2026

Anthropic
Products & Apps

Claude Design

Anthropic ships Claude Design research preview, Figma stock drops 7%

Anthropic released Claude Design as a research preview running on Opus 4.7 at claude.ai/design, and Figma stock dropped 7% on the news. Alex generated a full ThursdAI brand kit including logo, design tokens, and the episode opener videos end-to-end inside Claude Design, then had Codex pick up the kit and produce a GPT-5.5 launch video in 9 minutes. Anthropic also added a new usage meter to Claude Max settings.

OpenAI
New Models

GPT-Image-2

OpenAI's GPT-Image-2 leaks on LM Arena under three codenames

OpenAI's GPT-Image-2 posted the biggest single jump ever recorded on Arena, sitting 200+ Elo points above the previous top image model even on medium reasoning. The thinking/reasoning image model generates functioning QR codes, pixel-perfect infographics, 4K output, multi-image character consistency, and equirectangular 360-degree images that Peter Gostev stitched into a walkable street-view reconstruction of ancient Babylon. It even produces screenshots of IDEs containing SVG code that actually renders, enabling a new design-then-implement meta with Codex.
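To put a 200-point gap in perspective, here is a minimal sketch that converts a rating difference into an expected head-to-head win rate. It assumes the standard Elo logistic curve with a 400-point scale. Arena's actual rating fit may differ, so treat the output as an approximation.

```python
def expected_win_rate(rating_gap: float) -> float:
    """Chance the higher-rated model wins one pairwise vote (standard Elo formula)."""
    return 1.0 / (1.0 + 10 ** (-rating_gap / 400.0))


for gap in (50, 100, 200):
    print(f"{gap:>3}-point gap -> {expected_win_rate(gap):.1%} expected win rate")
```

Under that assumption, a 200-point lead corresponds to the higher-rated model being preferred in roughly three out of four pairwise comparisons.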

March 2026

Luma AI
New Models

Uni-1

Luma Labs Uni-1 thinks and generates pixels simultaneously, #1 preference Elo

Luma Labs released Uni-1, an LLM-based image model that thinks and generates pixels simultaneously and claims the number-one human preference Elo. Unlike traditional diffusion workflows, you converse with it and iterate together toward results, and it can also generate infographics. It is a surprising pivot from Luma's video focus.

Modular
Products & Apps

Modular 26.2

Modular 26.2 runs FLUX.2 in under a second, 99% cheaper than Nano Banana

Modular shipped its 26.2 release with state-of-the-art image generation, running FLUX.2 in under one second (sub-300ms is claimed) at 99% lower cost than Nano Banana, plus upgraded AI coding with Mojo. Alex noted the surprise of an inference platform releasing model-level optimization and hoped the approach spreads to all image generation.

Phota Labs
Products & Apps

Phota Studio + API

Phota Labs launches Phota Studio + API with identity-preserving personalization

Phota Labs launched Phota Studio and an API around a photography-focused image model with identity-preserving personalization: upload a batch of your photos, it trains a personal model, and the generated images actually resemble you. Alex flagged the personalization as a real capability jump over the crowd of photo startups, for professional shots, photo fixes, and adding people to photos.

February 2026

January 2026

Black Forest Labs
New Models · Open weights

Flux 2 Klein

Black Forest Labs drops Flux 2 Klein, fast open-weights image model

Wolfram broke the news mid-show: Black Forest Labs released Flux 2 Klein, a fast 4B/9B image generation model with open weights under Apache 2.0. It is designed for near-real-time editing and style iteration, and Alex used it minutes later in his live Claude Cowork demo.

December 2025

Black Forest Labs
New Models

Flux 3

Flux 3 becomes the new gold standard for image generation

Flux 3 dropped in August and immediately became the gold standard for image generation, landing three years almost to the day after Stable Diffusion first went public. Wolfram used it as the yardstick for how far image AI traveled in those three years.

OpenAI
Major Features & Updates

GPT-4o native image generation

GPT-4o native image generation sparks Ghibli-mania

OpenAI shipped native image generation in GPT-4o, producing the viral Ghibli-style image wave and bringing AI image creation to the ChatGPT mainstream. Wolfram cited the 2025 paradigm shift in image generation as his release of the year.

Reve
Products & Apps

Reve image platform

Reve ships a 4-in-1 image creation and editing platform

Reve (rendered as 'RevA' in the episode) emerged in September as a four-in-one image creation and editing platform. Alex said he still uses it daily, making it one of the year's sleeper product hits.

OpenAI
New Models

GPT Image 1.5

OpenAI GPT Image 1.5: 4x faster, 20% cheaper, #1 on LMSYS Image Arena

OpenAI released GPT Image 1.5, an upgraded image generation model that is 4x faster and 20% cheaper than its predecessor. It debuted at #1 on the LMSYS Image Arena leaderboard, part of OpenAI's rapid-fire release week.

November 2025

Alibaba (Tongyi)
New Models · Open weights

Z-Image Turbo

Tongyi's Z-Image Turbo brings sub-second open image generation

Alibaba's Tongyi lab released Z-Image Turbo, a 6B-parameter open image generation model that produces images in under a second. It pushes open-source image generation toward real-time speeds at a fraction of the size of competing models.

6B Parameters

Black Forest Labs
New Models · Open weights

FLUX.2

Black Forest Labs releases FLUX.2, a 32B multi-reference image model

Black Forest Labs released FLUX.2, a 32B-parameter image model with open weights (FLUX.2-dev) that supports multi-reference image editing. It lets users combine multiple reference images and prompt edits with variables, a step up in controllable image editing.

32B Parameters

Google DeepMind
New Models

Nano Banana Pro

Nano Banana Pro generates 4K images with perfect text

Google's upgraded image model dropped as breaking news mid-show, adding visible thinking traces, 4K resolution output, and SynthID watermarking with C2PA metadata. Alex demoed it live by one-shotting an 8MB AI-news infographic with flawless text and pixel-accurate logos across the entire image. It also powers generative UIs in Gemini, building interactive dashboards with real data on the fly.

4K: First image model with flawless 4K output and perfect text

Alibaba (Qwen)
New Models · Open weights

Qwen Image Edit Multi-Angle LoRA

Qwen Image Edit gains Multi-Angle LoRA for camera control

A Multi-Angle LoRA for Qwen Image Edit landed, enabling camera-control style edits that re-render a scene from new angles. Available as a Hugging Face space and on fal, it shows the fast-moving open ecosystem building on Qwen's image editing models.

October 2025

Sourceful
New Models

Riverflow 1

Riverflow 1 tops the image-editing leaderboard

Sourceful's Riverflow 1 image-editing model took the top spot on the image-editing leaderboard. It is a notable result from a smaller lab in a category dominated by big-name image models.

September 2025

Reve
Products & Apps

Reve

Reve launches 4-in-1 AI visual platform taking on Nano Banana and Seedream

Reve launched a 4-in-1 AI visual creation platform combining image generation, editing, and related visual workflows in one app. The panel spent real time on it as a serious challenger to Nano Banana and Seedream in the AI image tooling race.

Tencent Hunyuan
Papers & Research · Open weights

Hunyuan SRPO

Hunyuan SRPO: preference optimization that supercharges diffusion models

Tencent Hunyuan published SRPO (Semantic Relative Preference Optimization), a post-training technique that significantly improves the output quality of diffusion image models. The team released weights on Hugging Face along with a project page and striking before/after comparisons.

May 2025

Black Forest Labs
New Models

FLUX.1 Kontext

Black Forest Labs drops FLUX.1 Kontext, SOTA image editing

Black Forest Labs, creators of Flux, released Kontext: three models (Pro, Max, and a 12B open-weights Dev in private preview) for consistent, context-aware text and image editing. Unlike GPT-image or Veo-style regeneration, Kontext keeps identity consistent across edits, adding what you ask for without changing your face every generation. The news broke during the show.

HiDream
New Models · Open weights

HiDream E1

HiDream E1: open-weights image model with standout Ghibli style

HiDream released E1, an open-weights image editing/generation model (Apache 2.0-style licensing) noted for beautiful Ghibli-style outputs. It ranks #4 on the Artificial Analysis image arena leaderboard, sitting among top contenders like Google Imagen and Recraft.

Runway
Major Features & Updates

Gen-4 References

Runway References brings character and scene consistency to Gen-4

Runway launched References for Gen-4 on all paid plans, letting creators supply reference images (characters, outfits, locations, even selfies) and use tags in prompts to keep those elements consistent across generations. It tackles AI video's biggest pain point, frame-to-frame identity drift, at no extra credit cost per run.

April 2025

OpenAI
APIs & Platforms

gpt-image-1

OpenAI's GPT Image generation lands in the API as gpt-image-1

OpenAI's powerful image generation capabilities, previously locked inside ChatGPT, are now available to developers via API under the official name gpt-image-1. This was the big one many developers were waiting for, opening up the viral image generation and editing capabilities for building AI art and image editing applications.
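For developers wondering what a call looks like, here is a minimal sketch using the official `openai` Python SDK. It assumes the SDK is installed and `OPENAI_API_KEY` is set in the environment. The helper names, prompt, and output path are illustrative and not from the show, and pricing, rate limits, and optional parameters are left out.

```python
import base64
from pathlib import Path


def save_b64_image(b64_data: str, path: str) -> int:
    """Decode a base64 image payload, write it to disk, return bytes written."""
    raw = base64.b64decode(b64_data)
    Path(path).write_bytes(raw)
    return len(raw)


def generate_image(prompt: str, path: str = "out.png") -> int:
    """Request one image from gpt-image-1 and save it locally."""
    # Imported lazily so the helper above works without the SDK installed.
    from openai import OpenAI

    client = OpenAI()  # picks up OPENAI_API_KEY from the environment
    result = client.images.generate(
        model="gpt-image-1",
        prompt=prompt,
        size="1024x1024",
    )
    # gpt-image-1 responses carry the image as base64 in `b64_json`.
    return save_b64_image(result.data[0].b64_json, path)
```

Calling `generate_image("A watercolor fox reading a newspaper")` would write `out.png` to the working directory, provided the account has access to the model.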

Tencent
New Models

Hunyuan 3D 2.5

Tencent's Hunyuan 3D 2.5 jumps to 10B params with PBR textures and rigging

Tencent updated its 3D generation model to Hunyuan 3D 2.5, now boasting 10 billion parameters, up from 1B. Tencent highlights massive leaps in precision with 1024-resolution geometry, high-quality textures with PBR support, and improved skeletal rigging for animation.

10B Parameters (up from 1B) · 1024 Geometry resolution

March 2025

Ideogram
New Models

Ideogram 3.0

Ideogram 3.0 launches with strong text, logos, and style references

Ideogram launched version 3.0 of its image generation model with another SOTA claim. It is particularly strong on text and logo rendering, photorealism, and style references, continuing Ideogram's edge in typography-heavy image generation.

OpenAI
Major Features & Updates

GPT-4o Native Image Generation

OpenAI enables native image generation in GPT-4o, internet goes Ghibli

OpenAI finally enabled GPT-4o's native auto-regressive image generation in ChatGPT, sparking the biggest mainstream AI buzz of the week as the internet ghiblified itself. Launched right after Gemini 2.5, it excels at instruction following, text rendering, and multi-turn editing, with viral demos ranging from ad mockups to a full Lord of the Rings trailer.

Google
Dev Tools

Gemini Co-Drawing

Gemini Co-Drawing demo uses native image output to help you draw

A Hugging Face space demo, Gemini Co-Drawing, uses Gemini's native image generation output to collaboratively complete and enhance your sketches as you draw. It showcases the new native image-output capability of Gemini 2.0 Flash in an interactive tool.

ByteDance
New Models

Seedream 2.0

ByteDance unveils Seedream 2.0 bilingual image generation foundation model

ByteDance released Seedream 2.0, a native Chinese-English bilingual image generation foundation model, alongside a technical paper. It emphasizes excellent text rendering (especially Chinese), cultural nuance, and human preference alignment, generating high-quality, culturally relevant images from prompts in either language.

Google DeepMind
Major Features & Updates

Gemini 2.0 Flash native image generation

Gemini Flash gains native image generation and conversational editing

Google enabled native image generation in Gemini Flash Experimental, letting users generate and iteratively edit images conversationally inside the same multimodal model. The crew demoed it live on stream, editing photos of themselves with natural-language instructions, and saw it as a preview of how creative tools like Photoshop will work.

January 2025