Google DeepMind

73 releases covered on ThursdAI · deepmind.google ↗

June 2026

Google DeepMind
New ModelsOpen weights

Gemma 4 12B

Google drops Gemma 4 12B, an encoder-free multimodal local model

Google released Gemma 4 12B, an encoder-free multimodal model under Apache 2.0 that targets 16GB VRAM local setups. Instead of bolting separate vision or audio encoders onto a language model, it uses one unified network, which LDJ and Yam argued makes smaller multimodal models cheaper, cleaner, and easier to run locally.

May 2026

Google
Products & Apps

Universal Cart / AP2 / UCP

Google launches Universal Cart, AP2 and UCP for agentic commerce

Google launched Universal Cart along with the AP2 and UCP protocols, infrastructure that lets AI agents shop and pay on a user's behalf. It is Google's play to standardize agent-driven commerce across merchants and payment flows.

Google
Products & Apps

Antigravity 2.0

Antigravity 2.0 becomes Google's central agentic coding harness

Antigravity 2.0 was positioned at I/O 2026 as the single agent harness powering agentic experiences across Google, from internal tooling to Search, Workspace and developer products. Born from the Windsurf acquisition, it evolved from an agent-first IDE into the through line for Google's agentic strategy, now exposed to external developers as well.

Google DeepMind
New Models

Gemini 3.5 Flash

Gemini 3.5 Flash launches at I/O as Google's agentic workhorse model

Google launched Gemini 3.5 Flash at I/O 2026 as a fast, determined workhorse model built for agentic loops rather than a budget-tier Flash like prior generations. It is rolling out across the Gemini app, Search AI Mode, the Gemini API, Google AI Studio, Antigravity and the Gemini Enterprise Agent Platform. Nisten noted unusual determinism in its behavior, and Logan Kilpatrick framed it as designed for the agentic era.

900M Gemini app users
Google DeepMind
New Models

Gemini Omni

Gemini Omni: 'create anything from anything' conversational video editor

Google DeepMind launched Gemini Omni, a multimodal 'create anything from anything' model debuting as Google's first conversational video editor. Unlike pure text-to-video systems, Omni is an iterative multi-turn editing model that combines Gemini intelligence, world knowledge, multimodal inputs and generative media, in the same way Nano Banana brought Gemini to interactive image editing. It is available in the Gemini app, Google Flow and YouTube, with API support coming soon.

Google DeepMind
APIs & Platforms

Managed Agents (Gemini API)

Gemini API gets Managed Agents with hosted sandboxes and the Interactions API

Google launched Managed Agents in the Gemini API, letting developers spin up hosted Antigravity agents with Linux sandboxes and persistent state. It ships alongside the next-generation Interactions API, which Logan Kilpatrick described as designed for agentic systems rather than the old tokens-in, tokens-out model interaction pattern.

Google
Products & Apps

Gemini Spark

Gemini Spark announced as a 24/7 proactive personal AI agent

Google announced Gemini Spark, a 24/7 personal AI agent that can proactively work across Google surfaces, framed on the show as Google's OpenClaw competitor. Access was not yet broadly available at announcement time, so the crew discussed it from the announcement rather than hands-on testing.

Google
Major Features & Updates

Google Search agentic capabilities

Google Search adds Gemini 3.5 Flash-powered agentic capabilities

Google Search is getting new Gemini 3.5 Flash-powered agentic capabilities, including a new AI-powered Search box and background information agents. The crew framed the rollout as a massive intelligence uplift across one of Google's largest surfaces, with billions of Search users getting frontier-model capabilities.

3.5B Google Search users

April 2026

Google DeepMind
New Models

Gemini 3.1 Flash TTS

Gemini 3.1 Flash TTS tops TTS Arena at 1,211 Elo with 70+ languages

Google released Gemini 3.1 Flash TTS, which leads TTS Arena at 1,211 Elo, supports 70+ languages with inline audio tags, and costs about $0.03 per 60 seconds, roughly 5x cheaper than ElevenLabs. Kwindla noted it is fully promptable like an LLM rather than limited to fixed tags, but its ~3 second time-to-first-token makes it batch-only for now rather than usable in live conversational pipelines.

1,211 TTS Arena Elo
Google DeepMind
New ModelsOpen weights

Gemma 4

Google releases Gemma 4 open-weights family under Apache 2.0

Google DeepMind's Gemma 4 launch crossed 10M+ downloads with over 1,000 Gemma-4-based fine-tunes on Hugging Face; the Gemma family totals 500M+ downloads. Omar Sanseviero says Gemma is the foundation for the next generation of Gemini Nano shipping on Pixel and Samsung, with the AI Edge gallery letting people run it locally on Android and iOS. It punched above its size on Arena's Pareto curve and is now live on W&B Inference.

Google DeepMind
New Models

Veo 3.1 Lite

Google launches Veo 3.1 Lite at $0.05/sec, cheapest video gen yet

Google released Veo 3.1 Lite, a lighter video generation tier priced at $0.05 per second at 720p, the cheapest video generation offering yet, with further price cuts announced for April 7. The panel framed it as a practical quality-versus-latency tradeoff tier for creator workflows.

March 2026

Google DeepMind
New Models

Gemini 3.1 Flash Live

Google drops Gemini 3.1 Flash Live: Gemini can see, hear, and talk to you

Google released Gemini 3.1 Flash Live, a realtime multimodal model that handles voice and vision interaction in a single model path instead of stitched pipelines. The panel framed it as a major upgrade for end-to-end voice and vision agents, with AI Studio and API availability as the immediate way to experiment.

Google DeepMind
New Models

Lyria 3 Pro

Google Lyria 3 Pro generates full 3-minute music tracks with structural control

Google DeepMind released Lyria 3 Pro, its most advanced music model, generating full 3-minute tracks with structural control over intros, verses, choruses, and bridges, and even composing music from images. The crew generated a drum-and-bass ThursdAI opener live with spot-on instruction following; output is SynthID watermarked and royalty-free, available to Gemini subscribers and via Producer AI.

Google Research
Papers & Research

TurboQuant

Google TurboQuant claims 6x KV-cache compression and 8x faster inference

Google Research published TurboQuant, a KV-cache quantization technique claiming 6x compression and 8x inference speedup with near-zero accuracy loss. The panel framed it as a potential unlock for LLM inference economics, while calling stock-market panic over the result premature without broader production validation.

TurboQuant KV-cache compression TurboQuant speedup claim
Google DeepMind
New Models

Gemini 3.1 Flash-Lite

Google launches Gemini 3.1 Flash-Lite with 1M context at 360 tok/s

Google launched Gemini 3.1 Flash-Lite, a fast and cheap model with 1M token context aimed at the instant/fast tier, running around 360 tokens per second. The panel flagged a material pricing jump versus the prior Flash-Lite generation but saw it as well suited for judge, guardrail, and orchestration workloads in agent systems.

360 tokens/sec Gemini 3.1 Flash-Lite speed

February 2026

Google DeepMind
New Models

Gemini 3.1 Pro

Gemini 3.1 Pro drops live with 44% HLE and 77% ARC-AGI at the same price

Google released Gemini 3.1 Pro minutes before the show, claiming 2.5x better abstract reasoning and improved coding and agentic capabilities at the same price point as its predecessor. It scores 44% on Humanity's Last Exam, 77% on ARC-AGI without a custom harness, and 68 on Terminal Bench, putting it at or near state of the art alongside Opus 4.6. In Nisten's live vibe-coding test it was blazingly fast but less polished than Opus 4.6 and Codex output.

44% Humanities Last Exam77% ARC-AGI
Google DeepMind
New Models

Lyria 3

Google DeepMind launches Lyria 3 music generation in the Gemini app

Google DeepMind launched Lyria 3, its most advanced AI music generation model, now available in the Gemini app. It generates 32-second high-fidelity music tracks with creative controls and can compose music from uploaded images. Google also published a prompt guide covering vocals, lyrics, and different styles.

January 2026

Google
Major Features & Updates

Chrome Auto-Browse

Google launches agentic Auto-Browse in Chrome with Gemini 3

Google unveiled Chrome Auto-Browse with Gemini 3 Nano integration, bringing agentic browsing to Pro and Ultra subscribers in the world's most-used browser with 4 billion daily users. Native browsing avoids Cloudflare bot detection, and Gemini's 2M context window suits long browsing sessions.

4B Chrome daily users
Google DeepMind
New Models

Genie 3 (Project Genie)

Google DeepMind launches Project Genie 3, real-time 24fps world model

Google DeepMind's Genie 3 generates interactive, controllable 3D worlds in real time at 24 frames per second, demoed live on the show with a spaceship exploration and paint persistence on walls. It ships alongside SIMA 2, a self-improving game-playing agent built on Genie 3, and is available to Gemini Ultra subscribers in the US with a one-minute session limit.

24 fps Genie 3 frame rate
Google DeepMind
New ModelsOpen weights

MedGemma 1.5

Google releases MedGemma 1.5 for offline medical imaging

Google released MedGemma 1.5, a small (4B-class) open model for medical use cases, compact enough to run offline for medical imaging. The panel stressed it is a different model class from Byte's giant M3 medical LLM and that the two pair well together rather than replacing each other.

Google
Major Features & Updates

Gemini Personal Intelligence

Gemini personal intelligence reasons across Gmail, YouTube, Photos, Search

Google shipped personalized AI in Gemini, letting it reason across a user's Gmail, YouTube, Photos, and Search history with explicit opt-in for US Pro and Ultra subscribers. Alex tested it live: it inferred he drives a Tesla Model Y from emails and noticed his recent Honda Odyssey searches, highlighting Google's data moat over OpenAI and Anthropic.

December 2025

Google DeepMind
New Models

Gemini TTS

Google ships a Gemini TTS model in its December run

As part of Google's December release wave, a Gemini TTS model shipped alongside realtime model updates. It rounded out Google's full-stack voice story heading into 2026.

Google DeepMind
New Models

VEO3

VEO3: native audio video generation crosses the uncanny valley

Google's VEO3 stunned everyone in Q2 with video generation that included native audio, which the crew credits with crossing the uncanny valley for AI video. It was a centerpiece of Google IO 2025 and of Google's comeback year.

Google DeepMind
New ModelsOpen weights

FunctionGemma

FunctionGemma: Google's 270M function-calling model for edge agents

Google released FunctionGemma, a tiny 270M-parameter open model specialized for function calling on-device. With a roughly 500MB RAM footprint and strong gains after fine-tuning for mobile actions, it points toward privacy-first local agents on constrained hardware.

Google DeepMind
New Models

Gemini 3 Flash

Gemini 3 Flash delivers frontier intelligence at $0.50/1M input tokens

Google launched Gemini 3 Flash, offering frontier-tier capability at flash-tier pricing of $0.50 per million input tokens. It scores 78% on SWE-bench Verified, beating larger models on some agentic tasks, and supports tool-calling at scale with up to 100 simultaneous function calls.

$0.50 per 1M Gemini 3 Flash input tokens78% SWE-bench Verified
Google DeepMind
Major Features & Updates

Gemini 3 Deep Think

Gemini 3 Deep Think hits 45.1% on ARC-AGI-2 with parallel reasoning

Google shipped Deep Think, a high-cost parallel reasoning mode for Gemini 3 that scored 45.1% on ARC-AGI-2. The panel framed it as Google pressing its advantage in the frontier race, where product integration and latency now matter as much as raw benchmark IQ.

45.1% ARC-AGI-2

November 2025

Google DeepMind
Dev Tools

Antigravity

Antigravity: Google's free agent-first IDE powered by Gemini 3 Pro

A free VS Code fork reimagined for agent-first coding, with an inbox-style Agent Manager for running multiple coding agents in parallel across a codebase. Browser integration lets agents control Chrome, take screenshots and videos of the running app, and self-debug. The free tier is powered by Gemini 3 Pro, with GPT-OSS 120B as the open-source alternative and Nano Banana for images.

Google DeepMind
New Models

Gemini 3 Pro

Gemini 3 Pro launches with record ARC-AGI-2 scores

Google's new frontier multimodal model with a 1M-token context window and huge reasoning gains, scoring 31.11% on ARC-AGI-2 (45.14% with Deep Think mode) — roughly double the previous SOTA — plus 81% on MMLU-Pro and major coding improvements. Amp switched to it as their default model on launch day, the first time they have ever switched defaults. Also rolling out across Gmail, Calendar, and AI Mode in Google Search.

45.14% ARC-AGI-2 (Deep Think)31.11% ARC-AGI-2 (standard)1M Token context window
Google DeepMind
New Models

Nano Banana Pro

Nano Banana Pro generates 4K images with perfect text

Google's upgraded image model dropped as breaking news mid-show, adding visible thinking traces, 4K resolution output, and SynthID watermarking with C2PA metadata. Alex demoed it live by one-shotting an 8MB AI-news infographic with flawless text and pixel-accurate logos across the entire image. It also powers generative UIs in Gemini, building interactive dashboards with real data on the fly.

4K First image model with flawless 4K output and perfect text

October 2025

Google DeepMind
Major Features & Updates

AI Studio Vibe Coding

Google AI Studio launches 'Vibe Coding' build experience

Google's Gemini AI Studio launched a 'Vibe Coding' experience at ai.studio/build, letting users build apps from natural-language prompts with Gemini. It puts Google into the rapidly crowding prompt-to-app space alongside the week's other coding-agent moves.

Google DeepMind
New ModelsOpen weights

C2S-Scale 27B

Google's C2S-Scale 27B validates a cancer hypothesis in living cells

Google released C2S-Scale 27B, a Gemma-based single-cell biology model that generated a novel cancer therapy hypothesis later validated in living cells. The show called this a bombshell example of AI contributing to real scientific discovery rather than just benchmarks.

Google DeepMind
New Models

Veo 3.1

Veo 3.1: Google's next-gen video model launches with cinematic audio

Google DeepMind shipped Veo 3.1, the next version of its video generation model with improved quality and cinematic audio. Senior PM Jessica Gallegos joined the show to discuss how the model and its product packaging (including Flow) are evolving video generation into a real user experience story.

September 2025

Google
Major Features & Updates

Gemini in Chrome

Google puts Gemini in Chrome with cross-tab AI assistance

Google shipped Gemini directly into Chrome, adding an AI assistant that works across tabs, a smarter omnibox, and safer-browsing features. It moves the browser itself into the AI interface race, putting an assistant in front of Chrome's massive user base.

Google DeepMind
New ModelsOpen weights

EmbeddingGemma

Google releases EmbeddingGemma, a 300M-param SOTA embedding model for RAG

Google released EmbeddingGemma, a 300M-parameter open embedding model that achieves state-of-the-art results for its size, aimed at RAG and on-device semantic search. It dropped as breaking news during the show, with browser-based demos like Semantic Galaxy showing it running fully client-side.

July 2025

Google DeepMind
Major Features & Updates

Gemini 2.5 Pro (free tier)

Gemini 2.5 Pro returns to Google's free tier

Google brought Gemini 2.5 Pro back to its free tier, making its flagship reasoning model available again to consumer users at no cost. A quick-hit item in the big-company segment of the show.

May 2025

Google DeepMind
Products & Apps

AlphaEvolve

AlphaEvolve: Gemini-powered coding agent for discovering new algorithms

Google DeepMind announced AlphaEvolve, a Gemini-powered coding agent that designs and evolves advanced algorithms, credited on the show as one of the week's mind-bending algorithmic-discovery stories. DeepMind opened an interest form for early access rather than shipping it broadly.

April 2025

Google DeepMind
New ModelsOpen weights

Gemma 3 QAT

Google ships Quantization-Aware Trained Gemma 3 models for consumer GPUs

Google released Quantization-Aware Training (QAT) versions of the Gemma 3 family, dramatically cutting memory requirements while preserving quality. The 27B model drops from a hefty 54GB to just 14.1GB, and even the 1B model goes from 2GB to about half a gig, making state-of-the-art open models runnable on consumer GPUs. Wolfram took the 4B QAT model for a spin in LM Studio on the show.

27B Gemma 3 27B QAT: 54GB down to 14.1GB1B Gemma 3 1B QAT: 2GB down to ~0.5GB4B 4B QAT model tested in LM Studio
Google
New Models

DolphinGemma

DolphinGemma: Google's audio model for decoding dolphin communication

Google, with Georgia Tech and the Wild Dolphin Project, announced DolphinGemma, a ~400M parameter audio model based on the Gemma architecture using SoundStream audio tokenization. Trained on decades of recorded dolphin clicks, whistles and pulses, it aims to decipher structure in dolphin communication and runs on a Pixel phone for field deployment.

Google DeepMind
Major Features & Updates

Veo 2

Veo 2 video generation hits GA in the API and Gemini App

Google made Veo 2 video generation generally available for developers and rolled it out in the Gemini App. The GA release brings Google's flagship text-to-video model out of preview and into production use.

Google
Also ReleasedOpen weights

Agent2Agent (A2A) protocol

Google announces A2A, an open agent-to-agent communication protocol

Google announced the Agent2Agent (A2A) protocol at Cloud Next, an open spec for agents from different vendors to discover and communicate with each other. The spec was published on GitHub with a long list of launch partners, including Weights & Biases.

Google DeepMind
Major Features & Updates

Official MCP support

Google announces official support for the Model Context Protocol (MCP)

Demis Hassabis announced that Google will officially support Anthropic's Model Context Protocol (MCP) in its models and SDKs. This was a major signal of MCP becoming the industry standard for connecting AI models to tools and data.

Google DeepMind
Benchmarks & Evals

Gemini 2.5 Pro USAMO results

Gemini 2.5 Pro scores 24.4% on USAMO olympiad math, crushing the field

New evaluation results published this week showed Gemini 2.5 Pro scoring 24.4% on the USA Math Olympiad (USAMO), problems so hard that most top models score under 5%. The result showcases a step change in frontier reasoning ability on competition mathematics.

24.4% Gemini 2.5 Pro USAMO score<5% typical score for other top models

March 2025

Google DeepMind
New Models

Gemini 2.5 Pro

Google reclaims #1 with Gemini 2.5 Pro thinking model

Google dropped Gemini 2.5 Pro, a thinking model that took the #1 spot as the best all-around LLM available, with massive jumps on benchmarks like AIME (up nearly 20 points) and GPQA. It inherits native multimodality and a 1M token context window, maintaining high accuracy even at 120k+ tokens on needle-in-a-haystack tests, with surprisingly low latency (~13 seconds on hard reasoning questions vs 45+ for others). Tulsee Doshi, head of product for Gemini models, joined the show to give the inside scoop.

20 point jump on AIME benchmark1M token context window13 seconds latency on hard reasoning questions (vs 45+ for others)
Google
Dev Tools

Gemini Co-Drawing

Gemini Co-Drawing demo uses native image output to help you draw

A Hugging Face space demo, Gemini Co-Drawing, uses Gemini's native image generation output to collaboratively complete and enhance your sketches as you draw. It showcases the new native image-output capability of Gemini 2.0 Flash in an interactive tool.

Google
Major Features & Updates

Gemini Deep Research, Canvas & Live Previews

Google makes Deep Research free, adds Canvas and Live Previews to Gemini

Google made its Deep Research agent free for Gemini users and shipped Canvas, a collaborative workspace with live previews for code and documents. Demos on the show included a playable Tetris game and a markdown word counter built and previewed directly inside Gemini.

Google
Major Features & Updates

NotebookLM Mind Maps

NotebookLM teases Mind Maps for visualizing sources

Google's NotebookLM team previewed Mind Maps, a feature that turns your uploaded sources into interactive visual maps of concepts. It was teased publicly by the team this week ahead of a wider rollout.

Google
Major Features & Updates

Google AI Studio YouTube link understanding

Google AI Studio adds native YouTube video understanding via link dropping

Google AI Studio now lets you drop a YouTube link and have Gemini natively understand the video. This unlocks video analysis, summarization, and support use cases without downloading or preprocessing the content.

Google
Major Features & Updates

Gemini Deep Research (free tier)

Google makes Deep Research free in the Gemini app, powered by Gemini Thinking

Google made its Deep Research agent free for everyone in the Gemini app and upgraded it to run on Gemini Thinking. In a live test on the show it browsed over 150 websites to compile a comprehensive answer, with a polished interface and export to Google Docs.

Google DeepMind
Major Features & Updates

Gemini 2.0 Flash native image generation

Gemini Flash gains native image generation and conversational editing

Google enabled native image generation in Gemini Flash Experimental, letting users generate and iteratively edit images conversationally inside the same multimodal model. The crew demoed it live on stream, editing photos of themselves with natural-language instructions, and saw it as a preview of how creative tools like Photoshop will work.

Google DeepMind
New ModelsOpen weights

Gemma 3

Google open sources Gemma 3, 1B-27B multimodal family with 128K context

Google released Gemma 3, an open-weights model family spanning 1B to 27B parameters with multimodal (text, image, video) capabilities, support for over 140 languages, and a 128K context window. The 27B model runs on a single GPU, with Sundar Pichai claiming competitors need roughly 10x the compute for similar performance. It shipped with day-one open source ecosystem support (Hugging Face, Ollama, Kaggle) plus ShieldGemma 2 for content moderation.

Google
Products & Apps

AI Mode & AI Overviews (Gemini 2.0)

Google announces AI Mode in Search powered by Gemini 2.0

Google announced AI Mode, a new conversational search experience in Google Search, alongside Gemini 2.0-powered upgrades to AI Overviews. Robby Stein, VP of Product for Google Search, joined the show for an exclusive interview about the launch, which brings full AI chat-style answers with follow-ups directly into Search.

Google
Dev Tools

Data Science Agent in Colab

Google ships Gemini-powered Data Science Agent in Colab

Google launched a Data Science Agent inside Google Colab, powered by Gemini, that can autonomously generate complete, working notebooks from natural language descriptions of an analysis task. It automates data loading, exploration, and modeling boilerplate for data scientists.

February 2025

Google DeepMind
APIs & Platforms

Veo 2 (via FAL API)

Google's Veo 2 video model becomes available via FAL API

Google DeepMind's Veo 2 video generation model became accessible to developers through FAL's inference API. This was the first broadly available API access to Veo 2, letting builders generate high-quality video from text prompts without waiting on Google's own product surfaces.

January 2025

Google DeepMind
New Models

Gemini 2.0 Flash Thinking 01-21

Google ships updated Gemini Flash Thinking with 1M context

Google released an updated Gemini Flash Thinking model (01-21) with a 1 million token context window, built-in code execution, and improved evals over the previous Thinking release. It pushes Google's reasoning-model line forward in the same week DeepSeek R1 landed.

1M Context window (tokens)