Everything AI Released in March 2025

60 releases covered live on the show — every model, product, paper and tool that mattered, with links and our analysis.

🧠 New Models (31)

Alibaba (Qwen) Mar 27, 2025

New Models · Open weights

Qwen2.5-Omni-7B

Qwen launches Omni 7B: sees, hears, reads, and talks back

Qwen released Qwen2.5-Omni-7B, an open-weights omni-modal model that perceives text, images, audio, and video, and generates both text and speech. It packs end-to-end multimodal perception and spoken output into a 7B parameter model available on Hugging Face.

7B parameters

Hugging Face ↗

🎙️ Hear our coverage →

#open-source #multimodal #voice-ai

DeepSeek Mar 27, 2025

New Models · Open weights

DeepSeek-V3-0324

DeepSeek silently drops V3-0324, 685B params under MIT license

DeepSeek silently updated their V3 base model with DeepSeek-V3-0324, a 685B parameter MoE released on Hugging Face under the MIT license. This is not R1 (their reasoning model) but the powerful base model R1 was built on, and supposedly the base for a future R2.

685B parameters

X announcement ↗ · Hugging Face ↗

🎙️ Hear our coverage →

#open-source #frontier-models

Google DeepMind Mar 27, 2025

New Models

Gemini 2.5 Pro

Google reclaims #1 with Gemini 2.5 Pro thinking model

Google dropped Gemini 2.5 Pro, a thinking model that took the #1 spot as the best all-around LLM available, with massive jumps on benchmarks like AIME (up nearly 20 points) and GPQA. It inherits native multimodality and a 1M token context window, maintains high accuracy even at 120k+ tokens on needle-in-a-haystack tests, and has surprisingly low latency (~13 seconds on hard reasoning questions versus 45+ seconds for other models). Tulsee Doshi, head of product for Gemini models, joined the show to give the inside scoop.

~20-point jump on AIME benchmark · 1M token context window · ~13 seconds latency on hard reasoning questions (vs 45+ seconds for others)

X announcement (Jeff Dean) ↗ · Official blog post ↗ · Try it at ai.dev ↗

🎙️ Hear our coverage →

#reasoning #architecture #frontier-models

Ideogram Mar 27, 2025

New Models

Ideogram 3.0

Ideogram 3.0 launches with strong text, logos, and style references

Ideogram launched version 3.0 of its image generation model with another SOTA claim. It is particularly strong on text and logo rendering, photorealism, and style references, continuing Ideogram's edge in typography-heavy image generation.

Ideogram 3.0 announcement ↗

🎙️ Hear our coverage →

OpenAI Mar 27, 2025

New Models

GPT-4o (2025-03-26)

GPT-4o gets an update, ties for #1 on LMArena beating GPT-4.5

OpenAI shipped a new GPT-4o checkpoint (2025-03-26) that jumped over GPT-4.5 to tie for #1 on LMArena. The update landed as the show was being written and was read as a direct response to Gemini 2.5's launch in the escalating frontier-model race.

🎙️ Hear our coverage →

#frontier-models #benchmarks

Reve Mar 27, 2025

New Models

Reve Image

Reve emerges with SOTA diffusion image generation claims

Reve launched a new diffusion image generation model claiming state-of-the-art quality, reportedly beating heavyweights like Midjourney and Flux at roughly a penny per image. The previously low-profile lab made a splash with strong prompt adherence and image quality.

X announcement (Taesung) ↗ · Decrypt coverage ↗

🎙️ Hear our coverage →

#image-gen #architecture

Canopy Labs Mar 20, 2025

New Models · Open weights

Orpheus 3B

Canopy Labs drops Orpheus 3B natural-sounding speech model

Canopy Labs released Orpheus, an open speech language model that produces natural, human-sounding speech, headlined by a 3B model with smaller variants (1B, 500M, 150M) in the family. Weights are on Hugging Face with a Colab for trying it out. It was discussed on the show with Daily.co CEO Kwindla Kramer in the voice AI segment.

Blog ↗ · HF ↗ · Colab ↗

🎙️ Hear our coverage →

#voice-ai #open-source

LG AI Research Mar 20, 2025

New Models · Open weights

EXAONE Deep 32B

LG open sources EXAONE and EXAONE Deep 32B reasoning model

LG AI Research open sourced its EXAONE family, headlined by EXAONE Deep 32B, a thinking/reasoning model. The release puts a large Korean lab's reasoning model in open weights on Hugging Face, and Alex published a live reaction video to the launch.

LG Blog ↗ · HuggingFace page ↗ · Alex Reaction Video ↗

🎙️ Hear our coverage →

#open-source #reasoning

Mistral AI Mar 20, 2025

New Models · Open weights

Mistral Small 3.1

Mistral Small 3.1 24B: open-weights multimodal model

Mistral released Mistral Small 3.1, a 24B-parameter open-weights model that adds multimodal (vision) capabilities to the Small line. Both instruct and base checkpoints were published on Hugging Face, making it a strong local multimodal option at the 24B size class.

Blog Post ↗ · HuggingFace page ↗ · Base Model on HF ↗

🎙️ Hear our coverage →

#open-source #multimodal #vision

NVIDIA Mar 20, 2025

New Models · Open weights

Canary 1B/180M Flash

NVIDIA Canary Flash: Apache 2 speech recognition and translation

NVIDIA released Canary 1B Flash and 180M Flash, Apache 2.0 licensed speech recognition and translation models. The permissive license makes them freely usable for commercial ASR and translation workloads.

🎙️ Hear our coverage →

#voice-ai #multilingual #open-source

NVIDIA Mar 20, 2025

New Models · Open weights

Llama-Nemotron (Super 49B, Nano 8B)

NVIDIA drops Llama-Nemotron reasoning models plus training dataset

NVIDIA released the Llama-Nemotron family, including Super 49B and Nano 8B reasoning models, announced around GTC. Alongside the open weights, NVIDIA published the Llama-Nemotron post-training dataset, giving the community both the models and the data recipe behind them.

Announcement ↗ · X ↗ · Llama-Nemotron HuggingFace Collection ↗ · Dataset ↗

🎙️ Hear our coverage →

#open-source #reasoning #training

OpenAI Mar 20, 2025

New Models

Next-gen audio models (gpt-4o-mini-tts & transcription)

OpenAI launches steerable voice model and two new transcription models

OpenAI launched a new emotionally steerable text-to-speech voice model plus two new transcription models, and the show ran a live watch party for the announcement. The TTS model can be instructed how to speak (tone, emotion, character) and is demoed at openai.fm; the models are available through the API for voice agents.

Blog ↗ · YouTube ↗ · openai.fm ↗ · Live watch party clip ↗

🎙️ Hear our coverage →

Roboflow Mar 20, 2025

New Models · Open weights

RF-DETR

Roboflow drops RF-DETR, a SOTA open-source object detection model

Roboflow released RF-DETR, a state-of-the-art real-time object detection model, announced as breaking news on the show by CEO Joseph Nelson. The model is fully open source on GitHub and targets practical, deployable computer vision workloads.

RF-DETR Blog Post ↗ · RF-DETR GitHub ↗

🎙️ Hear our coverage →

#vision #open-source

StepFun Mar 20, 2025

New Models · Open weights

Step-Video-TI2V

StepFun releases Step-Video-TI2V image-to-video model

Chinese lab StepFun dropped Step-Video-TI2V, an open text/image-to-video generation model. Weights are on Hugging Face with code on GitHub, adding another open-weights option to the fast-moving video generation space.

TI2V HuggingFace Space ↗ · TI2V GitHub ↗

🎙️ Hear our coverage →

#video-gen #open-source

Tencent Mar 20, 2025

New Models · Open weights

Hunyuan3D 2.0 MV & Turbo

Tencent updates Hunyuan3D 2.0 with MultiView and Turbo variants

Tencent updated its Hunyuan3D 2.0 image-to-3D model with an MV (MultiView) version that conditions on multiple input views, plus a faster Turbo variant. The show highlighted it as new SOTA for 3D generation, available to try in a Hugging Face space.

Hunyuan3D-2mv HF Space ↗

🎙️ Hear our coverage →

#world-models #open-source

Allen Institute for AI (Ai2) Mar 13, 2025

New Models · Open weights

OLMo 2 32B

AllenAI ships OLMo 2 32B, a fully open model that beats GPT-4o mini

The Allen Institute for AI released OLMo 2 32B, its biggest fully open model yet, with weights, code, and dataset all published under Apache 2.0. Announced by Nathan Lambert as a last-second addition, it reportedly beats GPT-3.5 and GPT-4o mini as well as leading open-weight models like Qwen and Mistral at its size.

X announcement ↗ · Blog ↗ · Try It ↗ · Follow-up tweet ↗

🎙️ Hear our coverage →

#open-source #research

ByteDance Mar 13, 2025

New Models

Seedream 2.0

ByteDance unveils Seedream 2.0 bilingual image generation foundation model

ByteDance released Seedream 2.0, a native Chinese-English bilingual image generation foundation model, alongside a technical paper. It emphasizes excellent text rendering (especially Chinese), cultural nuance, and human preference alignment, generating high-quality, culturally relevant images from prompts in either language.

Blog ↗ · Paper ↗

🎙️ Hear our coverage →

Cohere Mar 13, 2025

New Models · Open weights

Command A

Cohere Command A: 111B enterprise model with 256K context on just 2 GPUs

Cohere announced Command A, a 111B parameter open-weights model with a 256K context window, presented on the show by Cohere's Sandra Kublik. It runs on only two GPUs where models of this size typically require around 32, and is built for enterprise use: agentic tasks, tool use, multilingual performance, and secure private deployments.

🎙️ Hear our coverage →

#open-source #industry #agents

EuroBERT team Mar 13, 2025

New Models · Open weights

EuroBERT

EuroBERT: multilingual encoder models from 210M to 2.1B parameters

EuroBERT is a new family of multilingual encoder models ranging from 210M to 2.1B parameters, trained on a 5 trillion-token dataset across 15 languages with 8K context support. It targets European and global language NLP tasks like retrieval and RAG, where properly encoding non-English character sets matters.

🎙️ Hear our coverage →

#open-source #search #multilingual

Google DeepMind Mar 13, 2025

New Models · Open weights

Gemma 3

Google open sources Gemma 3, 1B-27B multimodal family with 128K context

Google released Gemma 3, an open-weights model family spanning 1B to 27B parameters with multimodal (text, image, video) capabilities, support for over 140 languages, and a 128K context window. The 27B model runs on a single GPU, with Sundar Pichai claiming competitors need roughly 10x the compute for similar performance. It shipped with day-one open source ecosystem support (Hugging Face, Ollama, Kaggle) plus ShieldGemma 2 for content moderation.

Blog ↗ · AI Studio ↗ · HF Collection ↗ · Hugging Face (27B) ↗

🎙️ Hear our coverage →

#open-source #multimodal #on-device

HPC-AI Tech Mar 13, 2025

New Models · Open weights

Open-Sora 2.0

Open-Sora 2.0: 11B open-source video model trained for $200K

Open-Sora 2.0 is an 11B parameter open-source video generation model that claims state-of-the-art results while costing only about $200,000 to train. The team claims performance approaching OpenAI's Sora on some benchmarks, underscoring how fast open-source video generation is improving.

🎙️ Hear our coverage →

#video-gen #open-source

Nous Research Mar 13, 2025

New Models · Open weights

DeepHermes 3 (24B / 3B)

Nous Research releases DeepHermes 24B and 3B hybrid reasoning models

Nous Research released DeepHermes hybrid reasoners at 24B (Mistral-based) and 3B sizes, models that can toggle between standard chat responses and long chain-of-thought reasoning. The 24B preview is available on Hugging Face as part of the week's wave of open-source reasoning model releases.

X announcement ↗ · Hugging Face ↗

🎙️ Hear our coverage →

#open-source #reasoning

Reka AI Mar 13, 2025

New Models · Open weights

Reka Flash 3

Reka Flash 3: 21B open-source reasoning model under Apache 2.0

Reka AI open sourced Reka Flash 3, a 21B parameter reasoning model released under an Apache 2.0 license and trained with the REINFORCE Leave-One-Out (RLOO) reinforcement learning technique. It excels at chat, coding, instruction following, and function calling, with Nisten calling it possibly one of the best ~20B models available.

Blog ↗ · Hugging Face ↗ · X announcement ↗

🎙️ Hear our coverage →

#open-source #reasoning

Remade AI Mar 13, 2025

New Models · Open weights

Wan 2.1 14B I2V LoRA video effects

Remade AI releases 8 open LoRA video effects for Wan 2.1

Remade AI published eight LoRA video effects for Alibaba's Wan 2.1 14B image-to-video model, including effects like squish, inflate, deflate, and cakeify. The open release shows video effects becoming trainable and customizable via LoRAs on top of open video models.

Hugging Face collection ↗

🎙️ Hear our coverage →

#video-gen #open-source

AI21 Labs Mar 6, 2025

New Models · Open weights

Jamba 1.6 Large & Mini

AI21 releases Jamba 1.6 Large and Jamba 1.6 Mini open-weights models

AI21 Labs released Jamba 1.6 in Large and Mini sizes, updating its hybrid SSM-Transformer (Mamba-based) model family with open weights on Hugging Face. The Jamba architecture targets long-context efficiency compared to pure transformer models.

Announcement (X) ↗ · Hugging Face ↗

🎙️ Hear our coverage →

#open-source #architecture

Alibaba (Qwen) Mar 6, 2025

New Models · Open weights

QwQ-32B

Qwen releases QwQ-32B reasoning model that matches R1 on some evals

Alibaba's Qwen team released QwQ-32B, an open-weights reasoning model that matches DeepSeek R1 on several evals despite being roughly 20x smaller at 32B parameters. Qwen tech lead Junyang Lin joined the show to announce it, and the episode dubbed it Alibaba's 'R1 killer' for bringing strong reasoning to a size that runs on consumer hardware.

Announcement (X) ↗ · Blog ↗ · Hugging Face ↗ · Chat Demo ↗

🎙️ Hear our coverage →

#open-source #reasoning

Cohere For AI Mar 6, 2025

New Models · Open weights

Aya Vision

Cohere For AI releases Aya Vision 8B and 32B open multilingual vision models

Cohere For AI released Aya Vision in 8B and 32B sizes, extending the multilingual Aya family with open-weights vision-language capabilities. The models target multilingual multimodal understanding across many languages.

Announcement (X) ↗ · Hugging Face Collection ↗

🎙️ Hear our coverage →

#open-source #vision #multilingual

ElectricAlexis (research) Mar 6, 2025

New Models · Open weights

NotaGen

NotaGen open symbolic music model generates classical sheet music

NotaGen is an open symbolic music generation model that produces high-quality classical sheet music rather than raw audio. The release includes code on GitHub, weights on Hugging Face, and a browser demo.

GitHub ↗ · Demo ↗ · Hugging Face ↗

🎙️ Hear our coverage →

#audio #open-source

MiniMax Mar 6, 2025

New Models

Image-01

MiniMax launches Image-01 text-to-image model at 1/10 the cost

MiniMax released Image-01, a versatile text-to-image model the company positions at roughly one tenth the cost of competing image generation offerings. It is available through MiniMax's hosted platform.

Announcement (X) ↗ · Try It ↗

🎙️ Hear our coverage →

Tencent Mar 6, 2025

New Models · Open weights

HunyuanVideo-I2V

Tencent releases HunyuanVideo-I2V open image-to-video model

Tencent finally shipped the long-awaited image-to-video version of HunyuanVideo, with open weights on Hugging Face and a hosted try-it experience. It lets users animate still images using one of the strongest open video generation models.

Announcement (X) ↗ · Hugging Face ↗ · Try It ↗

🎙️ Hear our coverage →

#video-gen #open-source

Zhipu AI (GLM) Mar 6, 2025

New Models · Open weights

CogView 4 (6B)

Zhipu AI open-sources CogView 4, a 6B text-to-image model

Zhipu AI released CogView 4, a 6B-parameter open text-to-image model in the CogView family, with code available on GitHub. It is notable as an open-weights image generation option with strong Chinese and English prompt support.

Announcement (X) ↗ · GitHub ↗

🎙️ Hear our coverage →

#image-gen #open-source

🚀 Products & Apps (5)

Arcee AI Mar 20, 2025

Products & Apps

Arcee Conductor

Arcee AI announces Conductor, an intelligent model router

Arcee AI's Lucas Atkins joined the show to announce Conductor, a model router that picks the best model (including Arcee's small specialized models) for each query. It targets cost and quality optimization by routing requests instead of sending everything to one large model.

🎙️ Hear our coverage →

#api #agents #infrastructure
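
The routing idea in the Conductor entry above can be illustrated with a minimal sketch. Everything in it is hypothetical: the model names, the prices, and the length-and-keyword heuristic are illustrative assumptions, not Arcee's actual routing logic, which the entry does not describe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str                  # hypothetical model identifier
    cost_per_1k_tokens: float  # hypothetical price in USD


# Hypothetical pool: one small, cheap model and one large, expensive one.
SMALL = Model("small-specialist", 0.0002)
LARGE = Model("large-generalist", 0.0100)

# Toy signal for "hard" queries; a real router would likely use a learned classifier.
HARD_KEYWORDS = ("prove", "debug", "refactor", "step by step")


def route(query: str) -> Model:
    """Pick the cheaper model unless this toy heuristic flags the query as hard."""
    text = query.lower()
    looks_hard = len(text.split()) > 50 or any(k in text for k in HARD_KEYWORDS)
    return LARGE if looks_hard else SMALL
```

Under these assumptions, a short factual question goes to the small model, while a request containing "debug" or running past 50 words goes to the large one. The cost saving comes from how often the cheap path is taken.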

Manus AI Mar 13, 2025

Products & Apps

Manus

Manus AI research agent has everyone talking

Manus is a new AI research agent (manus.im) that creates a to-do list, browses the web in a real Chrome browser, and generates files, described on the show as 'Operator on steroids' and seemingly powered by Claude 3.7 behind the scenes. The crew tested it live on a research task and praised its slick UI.

🎙️ Hear our coverage →

#agents #search

Elysian Labs Mar 6, 2025

Products & Apps

Auren

Elysian Labs launches Auren iOS app

Elysian Labs (from nearcyan) launched Auren, an iOS app offering an emotionally attuned AI companion experience. The launch drew attention for its polished consumer approach to AI companionship.

Announcement (X) ↗ · App ↗

🎙️ Hear our coverage →

Google Mar 6, 2025

Products & Apps

AI Mode & AI Overviews (Gemini 2.0)

Google announces AI Mode in Search powered by Gemini 2.0

Google announced AI Mode, a new conversational search experience in Google Search, alongside Gemini 2.0-powered upgrades to AI Overviews. Robby Stein, VP of Product for Google Search, joined the show for an exclusive interview about the launch, which brings full AI chat-style answers with follow-ups directly into Search.

Google Blog ↗ · Alex's Reaction (X) ↗ · Live Reaction Video ↗

🎙️ Hear our coverage →

#search #industry

Sesame Mar 6, 2025

Products & Apps

Sesame conversational voice demo (Maya)

Sesame's ultra-realistic conversational voice demo takes the world by storm

Sesame released a demo of its conversational speech model featuring the Maya voice, and its naturalness, with human-like pauses, laughs, and interruptions, went viral across the AI community. Alex recorded a reaction conversation with Maya showcasing how lifelike the voice model is.

Alex's Conversation with Maya (YouTube) ↗

🎙️ Hear our coverage →

✨ Major Features & Updates (10)

OpenAI Mar 27, 2025

Major Features & Updates

ChatGPT Advanced Voice Mode (semantic VAD)

OpenAI updates ChatGPT advanced voice mode with semantic VAD

Alongside the image generation launch, OpenAI quietly updated ChatGPT's advanced voice mode with semantic voice activity detection. The model now understands when you have actually finished speaking rather than cutting in on pauses, leading to much more natural conversation flow.

YouTube announcement ↗

🎙️ Hear our coverage →

OpenAI Mar 27, 2025

Major Features & Updates

GPT-4o Native Image Generation

OpenAI enables native image generation in GPT-4o, internet goes Ghibli

OpenAI finally enabled GPT-4o's native auto-regressive image generation in ChatGPT, sparking the biggest mainstream AI buzz of the week as the internet ghiblified itself. Launched right after Gemini 2.5, it excels at instruction following, text rendering, and multi-turn editing, with viral demos ranging from ad mockups to a full Lord of the Rings trailer.

X thread with examples ↗ · Ad threads ↗ · Full Lord of the Rings trailer ↗ · Native Image Generation System Card ↗

🎙️ Hear our coverage →

#image-gen #multimodal

OpenAI Mar 27, 2025

Major Features & Updates · Open weights

MCP support in OpenAI Agents SDK

OpenAI adopts Anthropic's Model Context Protocol - MCP won

OpenAI officially announced support for the Model Context Protocol (MCP) in its Agents SDK, effectively settling the agent tool-connectivity standards war in MCP's favor. This is possibly more impactful long-term than the week's flashier launches, since the entire ecosystem can now converge on one protocol for connecting models to tools and data.

OpenAI Agents SDK MCP docs ↗

🎙️ Hear our coverage →

#agents #coding
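
MCP messages are JSON-RPC 2.0 requests, which is part of why one protocol can be shared across SDKs and vendors. The sketch below builds the two requests a client typically sends to discover and invoke tools (`tools/list` and `tools/call`). The helper names, the `get_weather` tool, and its arguments are made up for illustration, and no transport or SDK is involved.

```python
import json
from itertools import count

_ids = count(1)  # monotonically increasing JSON-RPC request ids


def list_tools_request() -> dict:
    """Build a JSON-RPC 2.0 request asking an MCP server for its tools."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": "tools/list"}


def call_tool_request(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request invoking one tool by name."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


if __name__ == "__main__":
    # "get_weather" is a hypothetical tool, used only to show the message shape.
    print(json.dumps(list_tools_request()))
    print(json.dumps(call_tool_request("get_weather", {"city": "Paris"})))
```

An SDK that supports MCP hides these messages behind its own tool abstraction, but any compliant server sees the same wire format regardless of which client sent it.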

Cursor Mar 20, 2025

Major Features & Updates

Claude 3.7 MAX

Cursor ships Claude 3.7 MAX mode

Cursor shipped Claude 3.7 MAX, a mode giving the agent the full context window and higher tool-call limits with Claude 3.7 Sonnet. It is aimed at harder, longer coding tasks at premium usage-based pricing.

🎙️ Hear our coverage →

#coding #agents

Google Mar 20, 2025

Major Features & Updates

Gemini Deep Research, Canvas & Live Previews

Google makes Deep Research free, adds Canvas and Live Previews to Gemini

Google made its Deep Research agent free for Gemini users and shipped Canvas, a collaborative workspace with live previews for code and documents. Demos on the show included a playable Tetris game and a markdown word counter built and previewed directly inside Gemini.

X ↗ · Tetris game ↗ · Markdown-enabled word counter ↗

🎙️ Hear our coverage →

#agents #search #coding

Google Mar 20, 2025

Major Features & Updates

NotebookLM Mind Maps

NotebookLM teases Mind Maps for visualizing sources

Google's NotebookLM team previewed Mind Maps, a feature that turns your uploaded sources into interactive visual maps of concepts. It was teased publicly by the team this week ahead of a wider rollout.

🎙️ Hear our coverage →

#consumer-ai #agents

Google Mar 13, 2025

Major Features & Updates

Google AI Studio YouTube link understanding

Google AI Studio adds native YouTube video understanding via link dropping

Google AI Studio now lets you drop a YouTube link and have Gemini natively understand the video. This unlocks video analysis, summarization, and support use cases without downloading or preprocessing the content.

Google AI Studio ↗

🎙️ Hear our coverage →

#multimodal #coding

Google Mar 13, 2025

Major Features & Updates

Gemini Deep Research (free tier)

Google makes Deep Research free in the Gemini app, powered by Gemini Thinking

Google made its Deep Research agent free for everyone in the Gemini app and upgraded it to run on Gemini Thinking. In a live test on the show it browsed over 150 websites to compile a comprehensive answer, with a polished interface and export to Google Docs.

Try it at no cost ↗

🎙️ Hear our coverage →

#agents #search

Google DeepMind Mar 13, 2025

Major Features & Updates

Gemini 2.0 Flash native image generation

Gemini Flash gains native image generation and conversational editing

Google enabled native image generation in Gemini Flash Experimental, letting users generate and iteratively edit images conversationally inside the same multimodal model. The crew demoed it live on stream, editing photos of themselves with natural-language instructions, and saw it as a preview of how creative tools like Photoshop will work.

X announcement ↗ · AI Studio demo ↗

🎙️ Hear our coverage →

#image-gen #multimodal

xAI Mar 6, 2025

Major Features & Updates

Grok Voice

Grok Voice mode opens up to free users

xAI made Grok's voice mode available to free users, removing the paid-tier requirement. The expansion brings conversational voice AI to everyone on the Grok app.

Announcement (X) ↗

🎙️ Hear our coverage →

#voice-ai #industry

🔌 APIs & Platforms (4)

OpenAI Mar 20, 2025

APIs & Platforms

o1-pro API

OpenAI makes o1-pro available via API at $600 per 1M output tokens

OpenAI exposed its o1-pro reasoning model through the API for the first time, priced at $600 per million output tokens. The show jokingly framed the pricing as 'for oligarchs', but it makes OpenAI's highest-compute reasoning tier programmatically accessible.

🎙️ Hear our coverage →

#reasoning #api
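
To make the $600 per 1M output tokens figure concrete, here is a small cost calculation. It covers output tokens only, since that is the only price the entry above quotes; input-token charges are deliberately left out, and the function name is our own.

```python
OUTPUT_PRICE_PER_MILLION_USD = 600.0  # output-token price quoted in the entry above


def output_cost_usd(output_tokens: int) -> float:
    """Cost of generated tokens only; input tokens are billed separately."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return output_tokens * OUTPUT_PRICE_PER_MILLION_USD / 1_000_000
```

At this rate a 2,000-token answer costs about $1.20 in output tokens alone, and a 10,000-token answer about $6.00.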

Nous Research Mar 13, 2025

APIs & Platforms

Portal

Nous Research opens Portal, an inference API for Hermes models

Nous Research launched Portal, its new inference API service offering access to models like Hermes 3 Llama 70B and DeepHermes 3 8B directly via API. It marks another open-source lab standing up hosted API access to make its models more accessible.

🎙️ Hear our coverage →

#api #infrastructure

OpenAI Mar 13, 2025

APIs & Platforms

Responses API + Web Search, File Search, Computer Use tools

OpenAI launches Responses API with Web Search, File Search, and Computer Use

OpenAI announced a new agent-focused developer stack at a livestream: the Responses API, a new way to build with OpenAI designed for agentic workloads, plus an Agents SDK. It ships with three built-in tools: Web Search, a File Search tool providing built-in RAG over your files, and a Computer Use tool for agents that operate computer interfaces.

X announcement ↗ · Blog ↗

🎙️ Hear our coverage →

#agents #api #coding
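
As a rough sketch of how the built-in tools attach to a request, the helper below assembles keyword arguments in the shape used by the OpenAI Python SDK's `client.responses.create`. The tool type strings (`web_search_preview`, `file_search`) are the launch-era names as we understand them and may have changed since. The helper name, the default model, and the vector store id are placeholders. Computer Use is left out because, as far as we know, it needed a dedicated preview model and extra display parameters. Nothing is sent over the network.

```python
from typing import Any, Optional


def build_responses_request(
    prompt: str,
    vector_store_id: Optional[str] = None,
    model: str = "gpt-4o",  # placeholder model name
) -> dict[str, Any]:
    """Assemble kwargs for a Responses API call with built-in tools enabled."""
    # Web Search is always enabled in this sketch.
    tools: list[dict[str, Any]] = [{"type": "web_search_preview"}]
    if vector_store_id is not None:
        # File Search retrieves from files previously indexed in a vector store.
        tools.append({"type": "file_search", "vector_store_ids": [vector_store_id]})
    return {"model": model, "input": prompt, "tools": tools}
```

With the `openai` package installed and an API key configured, these kwargs would be passed along as `client.responses.create(**build_responses_request("...", "vs_..."))`, where the vector store id is whatever your own account returns.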

Mistral AI Mar 6, 2025

APIs & Platforms

Mistral OCR

Mistral announces state-of-the-art OCR API

Mistral AI announced Mistral OCR, a document-understanding API the company claims is state of the art at extracting text, tables, and equations from complex documents. It targets RAG and document-processing pipelines with structured markdown output.

🎙️ Hear our coverage →

🛠️ Dev Tools (6)

MLX Community (Prince Canuma) Mar 27, 2025

Dev Tools · Open weights

MLX-Audio v0.0.3

Prince Canuma releases MLX-Audio v0.0.3 for speech on Apple Silicon

Prince Canuma, creator of MLX-VLM, FastMLX, and MLX Embeddings, released MLX-Audio v0.0.3, an open-source library bringing speech and audio models to Apple Silicon via MLX. It makes powerful open-source TTS and audio models accessible locally on Mac hardware.

GitHub repo ↗ · Prince Canuma on X ↗

🎙️ Hear our coverage →

#voice-ai #open-source #on-device

Weights & Biases Mar 27, 2025

Dev Tools · Open weights

Weave MCP Server

W&B ships official Weave MCP server - talk to your evals

Weights & Biases shipped an official MCP server for Weave, its LLM observability and evaluation tool, letting agents and MCP clients query and analyze your evals directly. Morgan McGuire of the W&B Applied AI team demoed it on the show, with wandb Models integration coming soon so agents can monitor loss curves for you.

X announcement ↗ · GitHub repo ↗ · Example W&B report ↗

🎙️ Hear our coverage →

#agents #benchmarks #coding

Google Mar 20, 2025

Dev Tools

Gemini Co-Drawing

Gemini Co-Drawing demo uses native image output to help you draw

A Hugging Face space demo, Gemini Co-Drawing, uses Gemini's native image generation output to collaboratively complete and enhance your sketches as you draw. It showcases the new native image-output capability of Gemini 2.0 Flash in an interactive tool.

🎙️ Hear our coverage →

#image-gen #agents

Baidu Mar 6, 2025

Dev Tools

Miaoda

Baidu launches Miaoda no-code AI app building tool

Baidu introduced Miaoda, a no-code AI-powered build tool that lets users create applications without writing code. It joins the growing wave of AI-assisted app builders coming out of Chinese tech giants.

🎙️ Hear our coverage →

Cloudflare Mar 6, 2025

Dev Tools · Open weights

MCP servers on Cloudflare Workers

Cloudflare ships support for building MCP servers on Workers

Cloudflare published tooling and docs for building and deploying Model Context Protocol servers on Cloudflare Workers, riding the MCP wave sweeping the AI community. Senior PM Dina Kozlov joined the show's MCP deep dive to walk through it alongside MCP builder Jason Kneen.

Cloudflare Blog ↗

🎙️ Hear our coverage →

#agents #coding

Google Mar 6, 2025

Dev Tools

Data Science Agent in Colab

Google ships Gemini-powered Data Science Agent in Colab

Google launched a Data Science Agent inside Google Colab, powered by Gemini, that can autonomously generate complete, working notebooks from natural language descriptions of an analysis task. It automates data loading, exploration, and modeling boilerplate for data scientists.

Google Developers Blog ↗

🎙️ Hear our coverage →

#agents #coding

📄 Papers & Research (1)

ByteDance Mar 20, 2025

Papers & Research · Open weights

DAPO

ByteDance releases DAPO, an RL method that beats GRPO

ByteDance published DAPO, a reinforcement learning method for LLM post-training presented as an improvement over GRPO. The paper ships with an open GitHub implementation, making the technique reproducible for the open-source RL community.

X thread ↗ · GitHub ↗ · Paper ↗

🎙️ Hear our coverage →

#training #reasoning #research
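
For readers who want the gist of what DAPO changes relative to GRPO, here is a simplified single-sample sketch based on our reading of the paper, not the authors' implementation. GRPO scores each sampled answer relative to its group (reward minus group mean, divided by group standard deviation) and clips the policy ratio symmetrically. DAPO, among other changes, decouples the lower and upper clip bounds. The default values 0.2 and 0.28 are what we recall from the paper and should be treated as assumptions, as should the use of the population standard deviation.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """GRPO-style advantages: normalize each reward within its sample group."""
    mu, sigma = mean(rewards), pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def clipped_objective(ratio: float, advantage: float,
                      eps_low: float = 0.2, eps_high: float = 0.28) -> float:
    """PPO-style clipped surrogate with decoupled lower and upper bounds."""
    clipped_ratio = min(max(ratio, 1.0 - eps_low), 1.0 + eps_high)
    return min(ratio * advantage, clipped_ratio * advantage)
```

Setting `eps_high` equal to `eps_low` recovers the symmetric clipping used in GRPO. A larger upper bound lets positively rewarded tokens have their probability raised further before the clip takes effect.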

📊 Benchmarks & Evals (2)

ARC Prize Foundation Mar 27, 2025

Benchmarks & Evals

ARC-AGI 2

ARC-AGI 2 benchmark revealed, thinking models score just 4%

The ARC Prize Foundation revealed ARC-AGI 2, the next iteration of the abstract reasoning benchmark. Base LLMs score 0% and even thinking models only reach about 4%, showing how far current frontier models remain from human-level fluid intelligence.

0% base LLM score on ARC-AGI 2 · ~4% thinking model score on ARC-AGI 2

X announcement ↗

🎙️ Hear our coverage →

#benchmarks #reasoning

Roboflow Mar 20, 2025

Benchmarks & Evals · Open weights

RF100-VL

Roboflow launches RF100-VL benchmark for vision-language models

Alongside RF-DETR, Roboflow introduced RF100-VL, a new evaluation benchmark for vision-language models built from real-world detection datasets. It gives the community a grounded way to measure how well VLMs handle practical object detection tasks.

RF100-VL Benchmark ↗ · RF-DETR Blog Post ↗

🎙️ Hear our coverage →

#benchmarks #vision

🤝 Acquisitions (1)

Weights & Biases Mar 6, 2025

Acquisitions

CoreWeave acquisition of Weights & Biases

Weights & Biases is acquired by CoreWeave

CoreWeave announced it is acquiring Weights & Biases, the AI developer platform and ThursdAI's home company. The deal pairs W&B's experiment tracking, Weave, and models tooling with CoreWeave's AI cloud infrastructure.

W&B Announcement ↗

🎙️ Hear our coverage →

#industry #infrastructure
