Products & Apps
AvatarFX
Character.AI opens early access to AvatarFX talking avatars
Character.AI announced AvatarFX, now in early access, which turns static images into speaking, emoting video avatars. It targets bringing characters to life for conversational and creative use cases.
New Models | Open weights
FramePack
FramePack generates 120-second videos on just 6GB of VRAM
FramePack, from ControlNet creator Lvmin Zhang (lllyasviel), is an open-source next-frame prediction approach for long video generation that runs on consumer hardware. It can generate videos up to 120 seconds long on as little as 6GB of VRAM by packing the input frame context into a fixed length.
120s Max video length | 6GB Minimum VRAM
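One way to see how a frame context can stay at a fixed length: if each older frame gets a smaller token budget than the frame after it, the total stays bounded however long the video runs. The Python sketch below assumes a simple halving schedule and a hypothetical `base_tokens` value. Neither is taken from the FramePack release, and the function names are illustrative only.

```python
def packed_context_lengths(num_frames: int, base_tokens: int = 1536) -> list[int]:
    """Token budget for each past frame, newest first.

    Assumption: the budget halves with every step back in time.
    This is an illustrative schedule, not FramePack's actual one.
    """
    return [base_tokens // (2 ** age) for age in range(num_frames)]


def total_context(num_frames: int, base_tokens: int = 1536) -> int:
    """Total context length after packing all past frames."""
    return sum(packed_context_lengths(num_frames, base_tokens))


if __name__ == "__main__":
    # The total grows at first, then saturates below 2 * base_tokens.
    for n in (1, 4, 16, 256, 3600):
        print(n, total_context(n))
```

Because the budgets form a geometric series, the packed context never exceeds twice the budget of the newest frame, which is the kind of property that keeps memory use flat as the video gets longer.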
New Models | Open weights
MAGI-1
Sand AI surprises with MAGI-1, a 24B streaming autoregressive video model
Sand AI released MAGI-1, a 24B autoregressive diffusion model for long-form, streaming video generation. It shows remarkable character consistency, which is often the Achilles' heel of AI video. It predicts video in 24-frame chunks with causal attention between them, enabling real-time streaming generation where compute doesn't scale with length. Nisten speculated that it could be a major step toward usable AI-generated movies by solving the face/character consistency problem.
24B Parameters | 24 Frames per autoregressive chunk
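The chunk-wise causal attention described above can be pictured as a block-causal mask: a frame may look at its own chunk and at earlier chunks, but never at later ones. The Python sketch below is a minimal illustration, not MAGI-1's implementation. It assumes frames within a chunk attend to each other freely, which the announcement does not state, and `block_causal_mask` is a hypothetical helper name.

```python
def block_causal_mask(num_frames: int, chunk_size: int = 24) -> list[list[bool]]:
    """Return mask[q][k] == True when query frame q may attend to key frame k.

    Assumption: attention is unrestricted inside a chunk and causal
    across chunks (a frame never sees a later chunk).
    """
    return [
        [(k // chunk_size) <= (q // chunk_size) for k in range(num_frames)]
        for q in range(num_frames)
    ]


if __name__ == "__main__":
    mask = block_causal_mask(num_frames=48, chunk_size=24)
    print(mask[0][23])   # same chunk: allowed
    print(mask[0][24])   # later chunk: blocked
    print(mask[24][0])   # earlier chunk: allowed
```

Since no chunk depends on anything after it, finished chunks can be streamed out while later ones are still being generated.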
New Models
Seaweed-7B
ByteDance publishes Seaweed-7B video generation foundation model
ByteDance publicly presented Seaweed-7B, a 7B-parameter video generation foundation model that delivers competitive video quality at a comparatively small size. Details and demos were published at seaweed.video.
Major Features & Updates
Veo 2
Veo 2 video generation hits GA in the API and Gemini App
Google made Veo 2 video generation generally available for developers and rolled it out in the Gemini App. The GA release brings Google's flagship text-to-video model out of preview and into production use.
New Models
Kling 2.0
Kling 2.0 Creative Suite launches
Kuaishou's Kling AI launched Kling 2.0 along with a broader Creative Suite, upgrading its video generation model and tooling. The release kept up the rapid pace in the closed-source video generation race during a packed vision and video week.
Papers & Research | Open weights
One-Minute Video Generation with Test-Time Training
Test-Time Training paper one-shots minute-long videos with consistent characters
Researchers published 'One-Minute Video Generation with Test-Time Training', which adds TTT layers to a pre-trained transformer so that it can generate minute-long videos in a single shot with remarkable character and scene consistency. The Tom & Jerry-style demos showed the most impressive long-form AI video consistency to date.
1 min Single-shot generated video length
Products & Apps
OmniHuman (via Dreamina)
ByteDance's OmniHuman image-to-avatar model goes public via Dreamina
ByteDance's impressive OmniHuman model, which turns a single image plus audio into a realistic talking avatar video, became publicly usable through the Dreamina (CapCut) website. The results land squarely in uncanny-valley territory, as Alex demonstrated with his own avatar thread.
Papers & Research
MoCha
Meta's MoCha generates movie-grade talking AI characters from speech and text
Meta GenAI researchers published MoCha, a model that generates stunningly realistic, movie-grade talking characters directly from speech plus text. Co-author Cong Wei joined the show to discuss the work, which points at AI actors entering Hollywood-quality territory.
New Models
Runway Gen-4
Runway Gen-4 announced with major gains in video consistency
Runway announced Gen-4, its next-generation video model focused on character and world consistency across shots. Example videos showed notably coherent characters and scenes, pushing AI video further toward usable filmmaking.