Episode Summary

What a wild week. It started super slow, and it still felt slow as far as releases are concerned, but the most interesting story was yet another AI gone "rogue" (have you heard of "Kill the Boer"? If not, Grok will tell you all about it). Otherwise it seemed fairly quiet in AI land this week: besides another Chinese newcomer called AM-Thinking 32B that beats DeepSeek and Qwen, and Stability making a small comeback, we focused on distributed LLM training and GPT-4.1 in ChatGPT. We had a ton of fun on this episode, which was recorded from the Weights & Biases SF office (I'm here to cover Google I/O next week!). Let's dig in, because what looks like a slow week on the surface was anything but dull under the hood (TL;DR and show notes at the end, as always).

Hosts & Guests

Alex Volkov
Host · W&B / CoreWeave
@altryne
Dillon Rolnick
COO · Nous Research
@DillonRolnick
Nisten Tahiraj
Weekly co-host of ThursdAI · AI operator & builder
@nisten
Yam Peleg
Weekly co-host of ThursdAI · AI builder & founder
@Yampeleg
LDJ
Weekly co-host of ThursdAI · Nous Research
@ldjconfirmed

By The Numbers

  • 32B: A 32B dense LLM, openly released, that takes on the big mixture-of-experts models and comes out on top for math and code.
  • 85.3%: AM-Thinking v1 hits 85.3% on AIME 2024, 70.3% on LiveCodeBench v5, and 92.5% on Arena-Hard.
  • 25: It even runs at 25 tokens/sec on a single 80GB GPU with INT4 quantization.
  • 128k: And yes, they're already working on a multilingual RLHF pass and a 128k context window.

🔓 📆 ThursdAI - May 15 - Genocidal Grok, ChatGPT 4.1, AM-Thinking, Distributed LLM training & more AI news


📰 Why does xAI's Grok talk about White Genocide and "Kill the Boer"?

Just as we were getting over the ChatGPT glazing incident, folks started noticing that @grok, xAI's frontier LLM that also responds to replies on X, had started talking about White Genocide in South Africa and something called "Kill the Boer", with no reference to any of these things in the question!

  • Adding fuel to the fire are Uncle Elon's recent tweets about South Africa, and this specific change seems to be at least partly related to those views.
  • Remember, Grok was also meant to be a "maximally truth-seeking" AI!
  • I really hope this transparency continues!

🔓 Open Source LLMs: The Decentralization Tsunami

Open source starts with the kind of progress that would have been unthinkable 18 months ago: a 32B dense LLM, openly released, that takes on the big mixture-of-experts models and comes out on top for math and code. AM-Thinking v1 (which comes with a paper) hits 85.3% on AIME 2024, 70.3% on LiveCodeBench v5, and 92.5% on Arena-Hard.

  • It even runs at 25 tokens/sec on a single 80GB GPU with INT4 quantization.
  • The model supports a /think reasoning toggle (chain-of-thought on demand), comes with a permissive license, and is fully tooled for vLLM, LM Studio, and Ollama.
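Why does a 32B dense model fit on a single 80GB card once it's quantized to INT4? Here's a back-of-the-envelope sketch. It is my own arithmetic, not something from the AM-Thinking team: it counts weight storage only and ignores the KV cache, activations, and the small overhead of quantization scales, and the function name `weight_memory_gb` is just a hypothetical helper for illustration.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes).

    Assumption: every parameter is stored at the same bit width and there is
    no extra overhead (quantization scales, KV cache, activations).
    """
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 32e9  # "32B dense" from the text
GPU_GB = 80      # "single 80GB GPU" from the text

for name, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    weights_gb = weight_memory_gb(N_PARAMS, bits)
    headroom_gb = GPU_GB - weights_gb
    print(f"{name}: ~{weights_gb:.0f} GB of weights, ~{headroom_gb:.0f} GB of headroom")
```

Under these assumptions the INT4 weights come to roughly 16 GB, versus about 64 GB at FP16, which leaves most of the card free for the KV cache and long reasoning traces. Real-world numbers will be somewhat higher than this estimate.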

🔓 Other Open Source Standouts

The Falcon-Edge project slashes memory and compute requirements and enables inference on <1GB VRAM. If you're looking to fine-tune, you get pre-quantized checkpoints and a clear path to 1-bit LLMs. StepFun's 3D pipeline is a two-stage system that creates watertight geometry and then view-consistent textures, trained on 2M curated meshes.


🔓 Big Company LLMs & APIs: Models, Modes, and Model Zoo Confusion

OpenAI's GPT-4.1 series, previously API-only, is now available in the ChatGPT interface. Why does this matter? Because the UX of modern LLMs is, frankly, a mess: seven model options in the dropdown, each with its quirks, speed, and context length.

  • Most casual users don't even know the dropdown exists.

⚑ This Week's Buzz - Everything W\&B!

It's a busy time here at Weights & Biases, and I'm super excited about a couple of upcoming events where you can connect with us and the broader AI community. Fully Connected: Our very own 2-day conference is happening June 18-19 in San Francisco!

  • It's going to be packed with insights on building and scaling AI.

🔓 Vision & Video: Open Source Shines Through the Noise

We had a bit of a meta-discussion on the show about "video model fatigue": with so many incremental updates, it can be hard to keep track or see the big leaps. However, when a release like Alibaba's Wan 2.1 comes along, it definitely cuts through.

  • Alibaba, the team behind the excellent Qwen LLMs, released Wan 2.1, a full stack of open-source text-to-video foundation models.

📰 StepFun Step1X-3D: High-Fidelity 3D Asset Generation

StepFun released Step1X-3D, an open two-stage framework for generating textured 3D assets. It first synthesizes geometry and then generates view-consistent textures. They've also released a curated dataset of 800K assets.


📰 Wrapping Up This "Chill" Week

So, there you have it: another "chill" week in the world of AI! From Grok's controversial escapades to the inspiring decentralized training efforts and mind-bending algorithmic discoveries, it's clear the pace isn't slowing down. Next week is going to be absolutely insane.

TL;DR and show notes
  • Fully Connected - Weights & Biases premier conference - register HERE with coupon WBTHURSAI

  • AI Engineer - THANKSTHURSDAI 30% off coupon - register HERE

  • Hosts and Guests

  • Open Source LLMs

  • Big CO LLMs + APIs

    • OpenAI adds GPT 4.1 models in chatGPT

    • AlphaEvolve: Gemini-powered coding agent for algorithm discovery (Blog)

    • Google shutting off free Gemini 2.5 Pro API due to "demand" ahead of IO

    • ByteDance - Seed-1.5-VL-thinking 20B (Paper)

    • Anthropic Web Search API: real-time retrieval for Claude models (Blog)

    • What's up with Grok?

  • Vision & Video

    • Wan 2.1: open-source diffusion-transformer video suite (HF, GitHub, Tweet)

    • LTX distilled - near real time video (X)

  • Voice & Audio

    • Hailuo - MiniMax Speech tech report is out - best TTS out there (Paper)

    • Stability AI - Stable Audio Open Small 341M: on-device text-to-audio (X, Blog, Paper, HF)

  • AI Art & Diffusion & 3D

    • StepFun Step1X-3D - Towards High-Fidelity and Controllable Generation of Textured 3D Assets (HF, Demo, Dataset, report)

  • Tools & other notable AI things mentioned on the pod

    • The robots are dancing! (X)