Search & Retrieval

Search products, deep research, RAG, embeddings, and retrieval systems. — 28 releases covered on the show.

May 2026

Google
Major Features & Updates

Google Search agentic capabilities

Google Search adds Gemini 3.5 Flash-powered agentic capabilities

Google Search is getting new Gemini 3.5 Flash-powered agentic capabilities, including a new AI-powered Search box and background information agents. The crew framed the rollout as a massive intelligence uplift across one of Google's largest surfaces, with billions of Search users getting frontier-model capabilities.

3.5B Google Search users

March 2026

Andrej Karpathy
Dev ToolsOpen weights

AutoResearcher

Karpathy open-sources AutoResearcher for autonomous ML experiments

Andrej Karpathy open-sourced AutoResearch, a framework that runs AI-driven ML experiments autonomously. Over two days it ran 700 experiments on nanochat GPT-2, stacked 20 improvements, and achieved an 11% training speedup. Tobi Lütke adapted it overnight for Shopify's Liquid templating engine for a 51% render-time improvement, and the repo hit 26K GitHub stars quickly.

700 AutoResearcher experiments run in 2 days (Karpathy)11% GPT-2 training speedup from stacked AutoResearcher improvements51% Shopify Liquid render time improvement using AutoResearcher
Mixbread
New Models

embed-large-v3

Mixbread embed-large-v3 beats Gemini Embedding 2

mixbread.ai dropped embed-large-v3, an embedding model that beats Gemini Embedding 2 on nearly every benchmark, including a jaw-dropping 98% vs 6.9% on structured-data tasks. Benjamin Clavie announced it live during the show.

98% Mixbread embed-large-v3 structured data benchmark score (vs 6.9% for Gemini)

February 2026

xAI
New Models

Grok 4.20

xAI silently drops Grok 4.20 with four 500B-param collaborating agents

xAI released Grok 4.20, a multi-agent system where four 500B-parameter agents collaborate in a multi-agent UI, with a $300/month Heavy tier scaling to 16 agents. No benchmarks or evals were released with the drop. The panel found it underwhelming for coding and day-to-day agent work but still top tier for deep research thanks to xAI's RAG over X data; Grok 4.1 Fast remains #8 on OpenRouter by API usage.

500B×4 Grok 4 20 Architecture

January 2026

MiroMind AI
New ModelsOpen weights

MiroThinker 1.5

MiroThinker 1.5: 30B search agent beats trillion-param models

MiroMind AI released MiroThinker 1.5, a 30B parameter open source search agent that achieves 56.1% on BrowseComp and 66.8% on BrowseComp Chinese, outperforming trillion-parameter models. It introduces 'interactive scaling' as a third scaling dimension beyond parameters and context, and is a fine-tune of Qwen 3 Thinking with 147K open training samples.

October 2025

DeepSeek
New ModelsOpen weights

DeepSeek-OCR

DeepSeek-OCR turns text into compressed vision tokens for massive contexts

DeepSeek open-sourced DeepSeek-OCR, a 3B model (~570M active parameters) that is less an OCR model and more a context-compression breakthrough: it renders text as images, compresses it up to 10x while retaining 97% decoding accuracy (60% even at 20x), and reads it back with a tiny vision decoder. The approach suggests text tokenization is far from optimal and points at vastly cheaper long-context processing; alphaXiv reportedly OCR'd all of arXiv for $1000 versus $7500 with MistralOCR, and a single H100 can process up to 200K pages.

97% decoding accuracy at 10x compression~570M active parameters (3B total)200K pages scannable on a single H100

September 2025

Alibaba (Tongyi Lab)
New ModelsOpen weights

Tongyi DeepResearch 30B-A3B

Tongyi DeepResearch: open-source A3B web agent rivals OpenAI Deep Research

Alibaba's Tongyi Lab open-sourced Tongyi DeepResearch, a 30B mixture-of-experts web research agent with only 3B active parameters. The lab claims parity with OpenAI's Deep Research on agentic search and report-writing tasks, and the weights are available on Hugging Face.

Alibaba (Tongyi Lab)
New ModelsOpen weights

WebWatcher-32B

Alibaba's Tongyi Lab open-sources WebWatcher vision-language research agent

Alibaba's Tongyi Lab open-sourced WebWatcher, a vision-language deep research agent that sets new state-of-the-art results on agentic browsing and research tasks. The 32B model combines visual understanding with web research capabilities and is available on Hugging Face.

Google DeepMind
New ModelsOpen weights

EmbeddingGemma

Google releases EmbeddingGemma, a 300M-param SOTA embedding model for RAG

Google released EmbeddingGemma, a 300M-parameter open embedding model that achieves state-of-the-art results for its size, aimed at RAG and on-device semantic search. It dropped as breaking news during the show, with browser-based demos like Semantic Galaxy showing it running fully client-side.

May 2025

Mistral AI
APIs & Platforms

Mistral Embed

Mistral ships new state-of-the-art embedding API

Mistral announced a new state-of-the-art embedding API. The release gives developers a SOTA option for retrieval and semantic search workloads served through Mistral's platform.

Anthropic
APIs & Platforms

Web Search API

Anthropic launches Web Search API for real-time retrieval in Claude

Anthropic released a Web Search API that gives Claude models real-time web retrieval, letting developers ground responses in current information directly through the API. It was covered among the week's big-company API updates.

OpenAI
Major Features & Updates

ChatGPT Shopping

ChatGPT adds shopping capabilities

OpenAI rolled out shopping features in ChatGPT, letting the assistant find and recommend products for users. Mentioned briefly in the big-companies roundup amid the week's OpenAI sycophancy drama.

April 2025

Nomic AI
New ModelsOpen weights

Nomic Embed Multimodal

Nomic Embed Multimodal: SOTA embeddings for visual documents

Nomic AI released Nomic Embed Multimodal, new 3B and 7B parameter embedding models built on Alibaba's Qwen2.5-VL. They achieve SOTA on visual document retrieval by embedding interleaved text-image sequences, ideal for PDFs and complex webpages. The 7B model ships under Apache 2.0 with open weights, code, and data; guest Zach Nussbaum discussed the release on the show.

3B parameters (smaller model)7B parameters (Apache 2.0 model)

March 2025

Google
Major Features & Updates

Gemini Deep Research, Canvas & Live Previews

Google makes Deep Research free, adds Canvas and Live Previews to Gemini

Google made its Deep Research agent free for Gemini users and shipped Canvas, a collaborative workspace with live previews for code and documents. Demos on the show included a playable Tetris game and a markdown word counter built and previewed directly inside Gemini.

EuroBERT team
New ModelsOpen weights

EuroBERT

EuroBERT: multilingual encoder models from 210M to 2.1B parameters

EuroBERT is a new family of multilingual encoder models ranging from 210M to 2.1B parameters, trained on a 5 trillion-token dataset across 15 languages with 8K context support. It targets European and global language NLP tasks like retrieval and RAG, where properly encoding non-English character sets matters.

Google
Major Features & Updates

Gemini Deep Research (free tier)

Google makes Deep Research free in the Gemini app, powered by Gemini Thinking

Google made its Deep Research agent free for everyone in the Gemini app and upgraded it to run on Gemini Thinking. In a live test on the show it browsed over 150 websites to compile a comprehensive answer, with a polished interface and export to Google Docs.

Manus AI
Products & Apps

Manus

Manus AI research agent has everyone talking

Manus is a new AI research agent (manus.im) that creates a to-do list, browses the web in a real Chrome browser, and generates files, described on the show as 'Operator on steroids' and seemingly powered by Claude 3.7 behind the scenes. The crew tested it live on a research task and praised its slick UI.

Google
Products & Apps

AI Mode & AI Overviews (Gemini 2.0)

Google announces AI Mode in Search powered by Gemini 2.0

Google announced AI Mode, a new conversational search experience in Google Search, alongside Gemini 2.0-powered upgrades to AI Overviews. Robby Stein, VP of Product for Google Search, joined the show for an exclusive interview about the launch, which brings full AI chat-style answers with follow-ups directly into Search.

February 2025

xAI
Major Features & Updates

DeepSearch

xAI launches DeepSearch, an agentic research feature with live X access

Alongside Grok 3, xAI launched DeepSearch, an agentic deep-research feature comparable to Perplexity or OpenAI's Deep Research, with a leg up on real-time information thanks to native access to X search. Alex's initial tests were underwhelming, nicknaming it 'Shallow Search' after it spent 34 seconds on a query where OpenAI's Deep Research took 11 minutes and cited 17 sources.

January 2025

Exa
Major Features & Updates

Exa DeepSeek Chat

Exa ships free DeepSeek R1 chat demo with web search

Exa integrated DeepSeek R1 into a free hosted chat demo that combines the reasoning model with Exa's web search. Mentioned in the tools section as a no-cost way to try R1 grounded with live search results.

Perplexity
Major Features & Updates

Perplexity Pro with R1

Perplexity adds DeepSeek R1 as a Pro reasoning model option

Perplexity integrated DeepSeek R1 into its Pro search product, letting subscribers choose R1 as the reasoning model behind answers. It was one of several tools that raced to host R1 on Western infrastructure within days of the model's release.

Anthropic
APIs & Platforms

Citations (Claude API)

Anthropic adds Citations to the Claude API

Anthropic launched a Citations capability in the Claude API, letting Claude ground its answers in provided source documents and return precise citations. It targets RAG and document-QA use cases where verifiable sourcing matters.

Perplexity
APIs & Platforms

Sonar Pro Search API

Perplexity ships Sonar Pro search API and an Android AI assistant

Perplexity released its Sonar Pro search-grounded API, giving developers programmatic access to Perplexity-style web-grounded answers, and also launched an AI assistant for Android. Two shipping moves that push Perplexity beyond its consumer answer engine.