Everything AI Released in January 2026

58 releases covered live on the show — every model, product, paper and tool that mattered, with links and our analysis.

🧠 New Models 27

Arcee AI
New Models · Open weights

Trinity Large

Arcee AI ships Trinity Large: 400B MoE trained in 33 days for $20M

Arcee AI's Trinity Large is a 400B-parameter MoE with 13B active parameters, trained on 17T tokens across 2000 B300 GPUs in 33 days for $20M. It has 512K native context (twice that of Kimi K2.5), is free on OpenRouter until February 2026, and the panel called it the largest open-source model from a Western lab.

400B Arcee Trinity Large · 512K Trinity native context
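
The Trinity Large figures imply a rough compute budget. The sketch below is back-of-envelope arithmetic only: it assumes all 2000 GPUs ran around the clock for the full 33 days and that the whole $20M went to GPU time, neither of which the source states.

```python
# Back-of-envelope arithmetic from the figures quoted above.
# Assumptions (not from the source): all 2000 GPUs ran 24 h/day for the
# full 33 days, and the entire $20M is attributed to GPU time.
GPUS = 2000
DAYS = 33
BUDGET_USD = 20_000_000
TOTAL_PARAMS_B = 400
ACTIVE_PARAMS_B = 13

gpu_hours = GPUS * DAYS * 24
usd_per_gpu_hour = BUDGET_USD / gpu_hours
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

print(f"GPU-hours: {gpu_hours:,}")
print(f"Implied cost per GPU-hour: ${usd_per_gpu_hour:.2f}")
print(f"Active share of parameters per token: {active_fraction:.2%}")
```

Under those assumptions the run comes to about 1.58M GPU-hours, and only about 3% of the weights are used for any single token.
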
Google DeepMind
New Models

Genie 3 (Project Genie)

Google DeepMind launches Project Genie, built on the real-time 24fps Genie 3 world model

Google DeepMind's Genie 3 generates interactive, controllable 3D worlds in real time at 24 frames per second, demoed live on the show with a spaceship exploration and paint persistence on walls. It ships alongside SIMA 2, a self-improving game-playing agent built on Genie 3, and is available to Gemini Ultra subscribers in the US with a one-minute session limit.

24 fps Genie 3 frame rate
Alibaba (Qwen)
New Models · Open weights

Qwen3-TTS

Qwen3-TTS: open-source TTS family with 97ms latency and voice cloning

Alibaba's Qwen team released Qwen3-TTS, a full open-source text-to-speech family under Apache 2.0 that dropped 30 minutes before the show. It spans 5 models from 0.6B to 1.7B parameters, with 97ms latency, voice cloning from just 3 seconds of audio, voice description prompting, and support for 10 languages.

97ms Latency
FlashLabs
New Models · Open weights

Chroma 1.0

FlashLabs Chroma 1.0: open-source real-time speech-to-speech under 150ms

FlashLabs released Chroma 1.0, billed as the world's first open-source end-to-end real-time speech-to-speech model, with voice cloning and latency under 150ms. The 4B parameter model is built on Qwen 2.5 Omni and released under Apache 2.0; its live demo with RAG and document upload impressed the whole panel.

Liquid AI
New Models · Open weights

LFM2.5-1.2B-Thinking

Liquid AI's LFM2.5-1.2B-Thinking: on-device reasoning under 900MB

Liquid AI released LFM2.5-1.2B-Thinking, a 1.2B parameter reasoning model that runs entirely on-device with under 900MB of memory. Its hybrid architecture with gated convolutions delivers 239 tokens/sec on an AMD CPU and 82 tokens/sec on a mobile NPU, making it practical for edge devices, Raspberry Pi, and older iPhones.

1.2B Parameters, under 900MB memory
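
The 1.2B-parameter and sub-900MB figures together bound how many bits each weight can occupy. A minimal sketch of that arithmetic, assuming "900MB" means 900 × 10^6 bytes and that the entire budget goes to weights (in practice activations and cache share it, so the true figure is lower):

```python
# Upper bound on storage per parameter implied by the quoted figures.
# Assumption: "900MB" means 900 * 10**6 bytes and all of it is weights;
# activations and caches share the real budget, so the true per-weight
# figure is lower.
PARAMS = 1.2e9
MEMORY_BYTES = 900e6

bits_per_param = MEMORY_BYTES * 8 / PARAMS
print(f"At most {bits_per_param:.1f} bits per parameter")
```

That ceiling is well below 16-bit weights, which suggests the quoted footprint refers to a quantized build.
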
Runway
New Models

Runway 4.5

Runway 4.5 launches with image-to-video and audio

Runway launched version 4.5 of its video generation model, adding image-to-video and audio support. It was mentioned in the week's news rundown as part of a busy week for vision and video releases.

Z.AI (Zhipu)
New Models · Open weights

GLM-4.7-Flash

GLM-4.7-Flash: 30B MoE local coding agent with only 3B active params

Z.AI released GLM-4.7-Flash, a 30B parameter MoE model with only 3B active parameters, designed as a local coding and agent assistant. It hits 59% on SWE-Bench Verified (approaching Sonnet 4's 64%) and runs at 120 tokens/sec on a stock Mac Studio M3 Ultra, fast enough to run Ralph autonomous coding loops even on CPU.

59% SWE-Bench Verified · 120 tps Speed on Mac Studio M3 Ultra
Black Forest Labs
New Models · Open weights

Flux 2 Klein

Black Forest Labs drops Flux 2 Klein, fast open-weights image model

Wolfram broke the news mid-show: Black Forest Labs released Flux 2 Klein, a fast 4B/9B image generation model with open weights under Apache 2.0. It is designed for near-real-time editing and style iteration, and Alex used it minutes later in his live Claude Cowork demo.

Byte
New Models · Open weights

M3

M3: 235B open-source medical LLM claims to beat GPT 5.2 on HealthBench

Byte released M3, a 235B parameter medical LLM fine-tuned from Qwen3 and licensed Apache 2.0. With only 22B active parameters, it is runnable at usable speeds on an M3 Ultra, and it claims to beat GPT 5.2 on HealthBench. Nisten suggested pairing it with smaller imaging models like MedGemma rather than treating them as substitutes.

235B M3 Medical LLM
Google DeepMind
New Models · Open weights

MedGemma 1.5

Google releases MedGemma 1.5 for offline medical imaging

Google released MedGemma 1.5, a small (4B-class) open model for medical use cases, compact enough to run offline for medical imaging. The panel stressed it is a different model class from Byte's giant M3 medical LLM and that the two pair well together rather than replacing each other.

Meituan (LongCat)
New Models · Open weights

LongCat Flash Thinking

Meituan's LongCat Flash Thinking: 560B MoE with 27B active, MIT licensed

Meituan released LongCat Flash Thinking, an open-source reasoning MoE with 560B total parameters and only 27B active, under an MIT license. It continued the run of large sparse Chinese open-weights models offering frontier-style reasoning at low active-parameter cost.

560B/27B LongCat Flash
Liquid AI
New Models · Open weights

LFM 2.5

Liquid AI LFM 2.5: ~1.2B on-device family with end-to-end audio

Liquid AI released LFM 2.5, a family of ~1.2B parameter on-device models spanning text, vision, and audio, announced at CES alongside AMD's Lisa Su. The models hit 239 tokens/sec on AMD CPU and 100 tokens/sec on iPhone 16 Pro Max, and include an end-to-end audio model that skips the traditional ASR-LLM-TTS pipeline entirely, running in as little as 8GB of RAM.

MiroMind AI
New Models · Open weights

MiroThinker 1.5

MiroThinker 1.5: 30B search agent beats trillion-param models

MiroMind AI released MiroThinker 1.5, a 30B parameter open-source search agent that achieves 56.1% on BrowseComp and 66.8% on BrowseComp Chinese, outperforming trillion-parameter models. It introduces 'interactive scaling' as a third scaling dimension beyond parameters and context, and is a fine-tune of Qwen 3 Thinking released with 147K open training samples.

Nous Research
New Models · Open weights

NousCoder 14B

NousCoder 14B: 7% LiveCodeBench jump in 4 days of RL training

Nous Research released NousCoder 14B, an open-source competitive programming model that gained 7% in LiveCodeBench accuracy after just four days of RL training on 48 NVIDIA B200 GPUs. Training used 24,000 verifiable problems, and the release ships under a full Apache 2.0 license with training code and a benchmark harness.

NVIDIA
New Models · Open weights

Alpamayo

NVIDIA Alpamayo: open-source reasoning self-driving models

NVIDIA announced Alpamayo at CES, a family of open-source reasoning-based self-driving AI models. The models perform end-to-end autonomous driving with explicit reasoning steps, such as identifying a jaywalker and stopping accordingly, and were demoed in a Mercedes-Benz.

NVIDIA
New Models · Open weights

Nemotron Speech ASR

Nemotron Speech ASR: 600M streaming model with 24ms latency

NVIDIA released Nemotron Speech ASR, a 600M parameter open-source streaming speech recognition model with 24ms median latency and support for 900 concurrent streams on a single H100. Kwindla Hultman Kramer of Daily.co demoed sub-500ms voice-to-voice latency using a three-model pipeline of Nemotron ASR, Nemotron Nano LLM, and Magpie TTS.

24ms Nemotron Speech latency
Upstage
New Models · Open weights

Solar Open 100B

Upstage Solar Open 100B: 102B MoE trained on 19.7T tokens

Upstage released Solar Open 100B, a 102B parameter MoE model with only 12B active parameters per token (129 experts, top-8 activation), trained on 19.7 trillion tokens including 4.5T synthetic via a 'data factory' approach. It outperforms GLM 4.5 Air on many benchmarks, features the SNAP PO reinforcement learning technique with a 50% training speedup, and delivers best-in-class Korean language performance.

102B Solar Open params
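
The "129 experts, top-8 activation" detail is what keeps the active parameter count low: a router scores every expert for each token and only the eight best actually run. The sketch below is a generic top-k gate, not Upstage's router. The random logits stand in for a learned router, and it assumes all 129 experts are routable, which the source does not specify.

```python
import math
import random

def route_top_k(logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are a softmax over the selected logits only, so they sum to 1.
    """
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in top)  # subtract the max for numerical stability
    exps = [math.exp(logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

NUM_EXPERTS = 129  # figure from the summary above
TOP_K = 8          # figure from the summary above

# Hypothetical router scores for one token; a trained model would compute these.
rng = random.Random(0)
router_logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
chosen = route_top_k(router_logits, TOP_K)

print(f"{len(chosen)} of {NUM_EXPERTS} experts run for this token")
```

The token's output would then be the weighted sum of the eight chosen experts' outputs, while the other 121 experts are skipped entirely.
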

🚀 Products & Apps 5

Anthropic
Products & Apps

Claude Cowork

Claude Cowork: Claude Code for non-developers, 100% written by Claude Code

Anthropic launched Claude Cowork, a research preview that brings Claude Code-style agentic workflows to non-technical users. It was built in a week-and-a-half sprint with 100% of the code written by Claude Code itself; it is Mac-only, requires a Max subscription, and includes a Chrome connector for browser automation. Alex demoed it live, adding Flux Klein support to an image extension project without looking at a single line of code.

100% Claude-coded Cowork
Anthropic
Products & Apps

Claude for Healthcare

Anthropic launches Claude for Healthcare with HIPAA compliance

Anthropic launched Claude for Healthcare, a HIPAA-ready offering, as the major labs push into medical AI. The panel noted Claude Opus 4.5's 92% score on MedAgentBench as part of Anthropic's healthcare positioning.

NVIDIA
Products & Apps

Vera Rubin

NVIDIA Vera Rubin platform: 5x Blackwell inference at CES 2026

Jensen Huang unveiled the Vera Rubin platform at CES 2026, NVIDIA's next-gen AI computer delivering 50 PFLOPS and 5x inference performance over Blackwell while adding only ~200W of power draw. It needs 75% fewer GPUs for 10 trillion parameter MoE training, packs 72 GPUs per rack with 20.7TB memory and 13 TB/s bandwidth, is 100% liquid cooled, and entered full production just four months after the B300.

5x Vera Rubin vs Blackwell · 75% Fewer GPUs needed
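
The rack-level memory figure can be turned into a per-GPU number. This is simple division, assuming the 20.7TB is split evenly across the 72 GPUs and that 1 TB is 1000 GB; the source gives only the rack total.

```python
# Per-GPU share of the rack-level memory quoted above.
# Assumptions: the 20.7 TB is split evenly across the 72 GPUs,
# and 1 TB = 1000 GB.
RACK_MEMORY_TB = 20.7
GPUS_PER_RACK = 72

gb_per_gpu = RACK_MEMORY_TB * 1000 / GPUS_PER_RACK
print(f"About {gb_per_gpu:.1f} GB per GPU")
```
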
OpenAI
Products & Apps

ChatGPT Health

OpenAI launches ChatGPT Health waitlist with health record sync

OpenAI launched a waitlist for ChatGPT Health, a privacy-first vertical for health conversations with connected health records and fitness apps including Apple Health, Function Health, MyFitnessPal, and Peloton. The panel noted LLMs are well-suited to medicine since there are only ~2,000 diseases and ~2,000 prescription drugs to master.

✨ Major Features & Updates 9

Anthropic
Major Features & Updates

MCP Apps

Anthropic launches MCP Apps: interactive UI inside Claude chat

Anthropic's MCP Apps render interactive, branded UI components (Box files, Figma, color pickers) directly within Claude conversations, evolving MCP from tools to embedded app experiences. It is protocol-based, so any app can integrate, letting brands reclaim identity from text-only LLM responses.

Google
Major Features & Updates

Chrome Auto-Browse

Google launches agentic Auto-Browse in Chrome with Gemini 3

Google unveiled Chrome Auto-Browse with Gemini 3 Nano integration, bringing agentic browsing to Pro and Ultra subscribers in the world's most-used browser with 4 billion daily users. Native browsing avoids Cloudflare bot detection, and Gemini's 2M context window suits long browsing sessions.

4B Chrome daily users
Browser Use
Major Features & Updates · Open weights

Browser Use Skill

Browser Use ships as an installable agent skill

Browser Use was released as an agent skill, installable via registries like Vercel's skills.sh. Wolfram flagged it as a signal of the broader shift away from MCP servers toward skills, since skills are easier to use with the CLI or API directly.

OpenAI
Major Features & Updates

ChatGPT Ads

OpenAI begins testing ads in ChatGPT Free and Go tiers

OpenAI announced it is testing ads in the ChatGPT Free and Go tiers, framing the rollout around user trust and transparency. The company also announced age detection models for the upcoming adult mode, putting the memory and personalization data of 900M weekly active users in a new light.

Chorus
Major Features & Updates · Open weights

Chorus Skills Support

Chorus adds agent skills support for every LLM via OpenRouter

Alex used a Ralph loop with Claude Code to add full agent skills support to Chorus, the open-source app that compares answers across multiple LLMs, in about 3.5 hours. The work added a settings panel, filesystem skill discovery, front-matter parsing, and cross-model skill injection, letting the same Claude-style skills run on GPT 5.2 Codex, Gemini, and any OpenRouter model.
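
The three pieces named above (filesystem discovery, front-matter parsing, cross-model injection) fit together roughly as follows. This is an illustrative sketch, not Chorus's code: the function names are hypothetical, the parser handles only flat `key: value` front matter, and it assumes each skill lives in a `SKILL.md` file with `name` and `description` fields.

```python
import tempfile
from pathlib import Path

def parse_front_matter(text):
    """Split a skill file into (metadata dict, body).

    Handles only flat `key: value` lines between two `---` markers;
    a real implementation would use a YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta = {}
    for idx, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            body = "\n".join(lines[idx + 1:]).strip()
            return meta, body
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}, text  # no closing marker: treat the file as plain text

def discover_skills(root):
    """Find every SKILL.md under root and keep those that declare a name."""
    skills = []
    for path in sorted(Path(root).rglob("SKILL.md")):
        meta, body = parse_front_matter(path.read_text(encoding="utf-8"))
        if "name" in meta:
            skills.append({"meta": meta, "body": body})
    return skills

def build_system_prompt(skills):
    """Render skills as plain text so any chat model can receive them."""
    parts = ["You have access to the following skills:"]
    for skill in skills:
        meta = skill["meta"]
        parts.append(f"## {meta['name']}\n{meta.get('description', '')}\n\n{skill['body']}")
    return "\n\n".join(parts)

# Demo with a throwaway skills directory (hypothetical skill content).
with tempfile.TemporaryDirectory() as tmp:
    skill_dir = Path(tmp) / "changelog"
    skill_dir.mkdir()
    (skill_dir / "SKILL.md").write_text(
        "---\nname: changelog\ndescription: Summarize recent commits\n---\n"
        "Run `git log` and group the changes by area.\n",
        encoding="utf-8",
    )
    found = discover_skills(tmp)
    prompt = build_system_prompt(found)

print(prompt)
```

Because the skills end up as ordinary prompt text, the same string can be sent to any model behind a chat-style API, which is the property that makes cross-model injection possible.
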

Google
Major Features & Updates

Gemini Personal Intelligence

Gemini personal intelligence reasons across Gmail, YouTube, Photos, Search

Google shipped personalized AI in Gemini, letting it reason across a user's Gmail, YouTube, Photos, and Search history with explicit opt-in for US Pro and Ultra subscribers. Alex tested it live: it inferred he drives a Tesla Model Y from emails and noticed his recent Honda Odyssey searches, highlighting Google's data moat over OpenAI and Anthropic.

Amazon
Major Features & Updates

Alexa+ on the Web

Amazon brings Alexa+ to the browser as a $20/month web chat

Amazon made Alexa Plus, its upgraded smart assistant, available as a web chat interface for $20/month. It supports free-flowing conversations without repeating the wake word, integrates with smart home devices via natural language, and can continue conversations across devices; voice on the web is coming later.

🔌 APIs & Platforms 1

🛠️ Dev Tools 6

Peter Steinberger
Dev Tools · Open weights

Clawdbot

Clawdbot: open-source self-improving personal AI assistant for macOS

Clawdbot, created by Peter Steinberger, is an open-source personal AI assistant that runs locally on your Mac and connects via WhatsApp, Telegram, or Discord. Its killer feature is self-improvement: ask it to learn something and it writes its own skill files, giving a single chat conversation control over multiple agents, persistent memory, voice messages, image generation, and browser automation on your actual computer.

Vercel
Dev Tools

skills.sh

Vercel launches skills.sh, an 'npm for AI agents'

Vercel launched skills.sh, a registry where you can browse and install agent skills from the command line for any agent, including Clawdbot. It hit 20K installs within hours, and releases like Browser Use shipping as a skill signal a broader shift from MCP servers toward skills.

Vercel
Dev Tools · Open weights

Next.js/React Skill Packs

Vercel releases official agent skill packs for Next.js and React

Vercel began releasing official agent skill packs for Next.js and React, packaging its framework expertise in the agent skills standard. Ryan Carson highlighted that you can point any skills-compatible coding agent at the pack and it installs the skills for you, an early sign of experts shipping domain knowledge as skills.

📄 Papers & Research 2

💰 Funding 2

xAI
Funding

xAI Series E

xAI raises $20B Series E at $230B valuation with NVIDIA backing

xAI raised a $20B Series E at a $230B valuation with NVIDIA and Cisco as strategic investors, even as Grok faced major backlash over its image model's lack of NSFW guardrails ('bikini-gate'). The company claimed 600M active users by counting all X users.

$20B xAI Series E

🤝 Acquisitions 3

OpenAI
Acquisitions

Klein team acqui-hire (Codex)

Klein team acqui-hired by OpenAI Codex

The Klein team was acqui-hired by OpenAI's Codex group following the viral 'imagine the smell' hackathon controversy. The move was discussed on the show as part of the growing Codex ecosystem, which Peter Steinberger used to build Clawdbot entirely.

OpenAI
Acquisitions

Torch Health

OpenAI acquires Torch Health to power ChatGPT Health

OpenAI acquired Torch Health as part of its push into healthcare with ChatGPT Health. The move came the same week Anthropic launched Claude for Healthcare, with both labs racing toward HIPAA-ready medical AI products.

NVIDIA
Acquisitions

Groq acquisition

NVIDIA acquires Groq team and licenses its tech for ~$20B

NVIDIA entered an exclusive licensing deal with Groq and acquired most of its team for approximately $20B. Groq's inference-optimized chips, created by former Google TPU lead Jonathan Ross, complement NVIDIA's training dominance as inference demand grows exponentially across AI use cases.

🌀 Also Released 3

Anthropic
Also Released

Claude Constitution

Anthropic publishes 90-page Claude Constitution values document

Anthropic published a roughly 90-page Constitution for Claude, a values document baked into the model at training and reinforcement learning time rather than a runtime system prompt. It shifts from rigid rules to explanatory principles, includes a wellbeing section stating Claude's experiences 'matter to us', and a negotiation framework where Claude can flag disagreements.

90 pages Claude Constitution length
OpenAI
Also Released

OpenAI x Cerebras Partnership

OpenAI inks $10B deal with Cerebras for 750MW of high-speed compute

OpenAI announced a $10 billion partnership with Cerebras for 750 megawatts of high-speed inference compute, with capacity starting in 2028. It extends OpenAI's pattern of locking in massive compute supply deals beyond its existing cloud partners.

$10B OpenAI × Cerebras
Ryan Carson
Also Released

Ralph Wiggum

Ralph Wiggum autonomous coding technique hits 1.2M views

Ryan Carson published a viral breakdown (1.2M views on X) of Ralph Wiggum, the autonomous coding technique created by Geoffrey Huntley: write a PRD, break it into atomic user stories with acceptance criteria in JSON, then run a bash loop in which a CLI agent picks the next story, codes it, commits, and repeats. The technique works with any CLI agent (Amp, Claude Code, Cursor CLI, Gemini CLI), compounds learning via agents.md, and won a YC hackathon running overnight on Sonnet 4.5.

1.2M Ralph article views
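
The loop at the heart of the technique is small enough to sketch. The version below is written in Python rather than bash, and the agent call is a stub that always succeeds; the story fields, function names, and sample stories are all hypothetical and only illustrate the pick, code, commit, repeat cycle described above.

```python
import json

# Hypothetical PRD contents: atomic stories with acceptance criteria.
PRD_JSON = """
[
  {"id": "S1", "title": "Add login form", "acceptance": ["form renders"], "done": false},
  {"id": "S2", "title": "Validate email", "acceptance": ["bad email rejected"], "done": false}
]
"""

def next_story(stories):
    """Pick the first story that is not finished, or None when all are done."""
    for story in stories:
        if not story["done"]:
            return story
    return None

def run_agent(story):
    """Stand-in for invoking a CLI agent on one story.

    A real loop would shell out to the agent here and check the acceptance
    criteria; this stub always reports success.
    """
    return True

def ralph_loop(stories, max_iterations=10):
    """Work through the stories one at a time, logging a 'commit' per success."""
    commits = []
    for _ in range(max_iterations):
        story = next_story(stories)
        if story is None:
            break
        if run_agent(story):
            story["done"] = True
            commits.append(f"{story['id']}: {story['title']}")
    return commits

stories = json.loads(PRD_JSON)
commit_log = ralph_loop(stories)
print("\n".join(commit_log))
```

The `max_iterations` cap is a design choice of this sketch: it keeps a story that never passes from spinning the loop forever during an unattended overnight run.
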