New Models
Ideogram 3.0
Ideogram 3.0 launches with strong text, logos, and style references
Ideogram launched version 3.0 of its image generation model with another SOTA claim. It is particularly strong on text and logo rendering, photorealism, and style references, continuing Ideogram's edge in typography-heavy image generation.
Major Features & Updates
GPT-4o Native Image Generation
OpenAI enables native image generation in GPT-4o, internet goes Ghibli
OpenAI finally enabled GPT-4o's native autoregressive image generation in ChatGPT, sparking the biggest mainstream AI buzz of the week as the internet ghiblified itself. Launched right after Gemini 2.5, it excels at instruction following, text rendering, and multi-turn editing, with viral demos ranging from ad mockups to a full Lord of the Rings trailer.
New Models
Reve Image
Reve emerges with SOTA diffusion image generation claims
Reve launched a new diffusion image generation model claiming state-of-the-art quality, reportedly beating heavyweights like Midjourney and Flux at roughly a penny per image. The previously low-profile lab made a splash with strong prompt adherence and image quality.
Dev Tools
Gemini Co-Drawing
Gemini Co-Drawing demo uses native image output to help you draw
A Hugging Face Space demo, Gemini Co-Drawing, uses Gemini's native image output to collaboratively complete and enhance your sketches as you draw. It showcases the new native image-output capability of Gemini 2.0 Flash in an interactive tool.
New Models
Seedream 2.0
ByteDance unveils Seedream 2.0 bilingual image generation foundation model
ByteDance released Seedream 2.0, a native Chinese-English bilingual image generation foundation model, alongside a technical paper. It emphasizes excellent text rendering (especially Chinese), cultural nuance, and human preference alignment, generating high-quality, culturally relevant images from prompts in either language.
Major Features & Updates
Gemini 2.0 Flash native image generation
Gemini Flash gains native image generation and conversational editing
Google enabled native image generation in Gemini 2.0 Flash Experimental, letting users generate and iteratively edit images conversationally inside the same multimodal model. The crew demoed it live on stream, editing photos of themselves with natural-language instructions, and saw it as a preview of how creative tools like Photoshop will work.
New Models
Image-01
MiniMax launches Image-01 text-to-image model at 1/10 the cost
MiniMax released Image-01, a versatile text-to-image model the company positions at roughly one tenth the cost of competing image generation offerings. It is available through MiniMax's hosted platform.
New Models, Open weights
CogView 4 (6B)
Zhipu AI open-sources CogView 4, a 6B text-to-image model
Zhipu AI released CogView 4, a 6B-parameter open text-to-image model in the CogView family, with code available on GitHub. It is notable as an open-weights image generation option with strong Chinese and English prompt support.