Episode Summary
What a wild week! It started super slow, and it still felt slow as far as releases are concerned, but the most interesting story was yet another AI gone "rogue" (have you heard about "Kill the Boer"? If not, Grok will tell you all about it). Otherwise it seemed fairly quiet in AI land this week: besides another Chinese newcomer called AM-Thinking 32B that beats DeepSeek and Qwen, and Stability making a small comeback, we focused on distributed LLM training and ChatGPT 4.1. We had a ton of fun on this episode, which was recorded from the Weights & Biases SF office (I'm here to cover Google I/O next week!). Let's dig in, because what looks like a slow week on the surface was anything but dull under the hood (TL;DR and show notes at the end, as always).
In This Episode
- ThursdAI - May 15 - Genocidal Grok, ChatGPT 4.1, AM-Thinking, Distributed LLM training & more AI news
- Why does xAI's Grok talk about White Genocide and "Kill the Boer"?
- Open Source LLMs: The Decentralization Tsunami
- Other Open Source Standouts
- Big Company LLMs & APIs: Models, Modes, and Model Zoo Confusion
- This Week's Buzz - Everything W&B!
- Vision & Video: Open Source Shines Through the Noise
- StepFun Step1X-3D: High-Fidelity 3D Asset Generation
- Wrapping Up This "Chill" Week
ThursdAI - May 15 - Genocidal Grok, ChatGPT 4.1, AM-Thinking, Distributed LLM training & more AI news
Hey y'all, this is Alex! What a wild week! It started super slow, and it still felt slow as far as releases are concerned, but the most interesting story was yet another AI gone "rogue" (have you heard about "Kill the Boer"? If not, Grok will tell you all about it). Otherwise it seemed fairly quiet in AI land this week: besides another Chinese newcomer called AM-Thinking 32B that beats DeepSeek and Qwen, and Stability making a small comeback, we focused on distributed LLM training and ChatGPT 4.1. We had a ton of fun on this episode, which was recorded from the Weights & Biases SF office (I'm here to cover Google I/O next week!). Let's dig in, because what looks like a slow week on the surface was anything but dull under the hood (TL;DR and show notes at the end, as always).
Why does xAI's Grok talk about White Genocide and "Kill the Boer"?
Just as we were getting over the ChatGPT glazing incident (TK: add coverage link), folks started noticing that @grok, xAI's frontier LLM that also responds to X replies, started talking about White Genocide in South Africa and something called "Kill the Boer", with no reference to any of these things in the question!
- Adding fuel to the fire are Uncle Elon's recent tweets related to South Africa, and this specific change seems to be at least partly related to those views.
- Remember also, Grok was meant to be a "maximally truth-seeking" AI!
- I really hope this transparency continues!
Open Source LLMs: The Decentralization Tsunami
Open source starts with the kind of progress that would have been unthinkable 18 months ago: a 32B dense LLM, openly released, that takes on the big mixture-of-experts models and comes out on top for math and code. AM-Thinking v1 (paper linked in the show notes) hits 85.3% on AIME 2024, 70.3% on LiveCodeBench v5, and 92.5% on Arena-Hard.
- It even runs at 25 tokens/sec on a single 80GB GPU with INT4 quantization.
- The model supports a /think reasoning toggle (chain-of-thought on demand), comes with a permissive license, and is fully tooled for vLLM, LM Studio, and Ollama.
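Since the model is tooled for vLLM, the /think toggle would be driven from an ordinary chat request. A minimal sketch, assuming the `/think` and `/no_think` prompt markers and the model id (check the model card for the exact convention; these are my assumptions, not confirmed API):

```python
# Hedged sketch: prompt-level reasoning toggle for an AM-Thinking-style model.
# The "/think" / "/no_think" markers and the model id below are assumptions,
# not confirmed API - check the model card for the exact convention.

def build_messages(prompt: str, think: bool = True) -> list[dict]:
    """Prefix the user turn with a reasoning on/off marker."""
    toggle = "/think" if think else "/no_think"
    return [{"role": "user", "content": f"{toggle} {prompt}"}]

# With the model served via `vllm serve <model>`, the request would look like:
#   from openai import OpenAI
#   client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
#   client.chat.completions.create(model="AM-Thinking-v1",
#                                  messages=build_messages("Prove 2+2=4"))
msgs = build_messages("What is 17 * 24?", think=False)
print(msgs[0]["content"])  # -> /no_think What is 17 * 24?
```

The point is that chain-of-thought becomes a per-request choice rather than a separate model variant.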
Other Open Source Standouts
- The Falcon-Edge project slashes memory and compute requirements and enables inference on <1GB of VRAM.
- If you're looking to fine-tune, you get pre-quantized checkpoints and a clear path to 1-bit LLMs.
- StepFun's 3D pipeline is a two-stage system that creates watertight geometry and then view-consistent textures, trained on 2M curated meshes.
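To see why sub-1GB inference is plausible, here is my own back-of-the-envelope arithmetic (not figures from the Falcon-Edge release): ternary BitNet weights carry roughly log2(3) ≈ 1.58 bits per parameter.

```python
# Back-of-the-envelope check (my own arithmetic, not from the Falcon-Edge
# blog): ternary weights take ~log2(3) = 1.58 bits each, so even a
# 3B-parameter model's weights fit comfortably under 1 GB of VRAM.
import math

def bitnet_weight_gb(params: float, bits_per_weight: float = math.log2(3)) -> float:
    """Approximate weight storage in GB for a ternary-quantized model."""
    return params * bits_per_weight / 8 / 1e9

print(round(bitnet_weight_gb(1e9), 2))  # ~0.2 GB for a 1B model
print(round(bitnet_weight_gb(3e9), 2))  # ~0.59 GB for a 3B model
```

Activations and KV cache add overhead on top of this, but the weight budget alone shows why the <1GB claim is in the right ballpark.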
Big Company LLMs & APIs: Models, Modes, and Model Zoo Confusion
OpenAI's GPT-4.1 series, previously API-only, is now available in the ChatGPT interface. Why does this matter? Because the UX of modern LLMs is, frankly, a mess: seven model options in the dropdown, each with its own quirks, speed, and context length.
- Most casual users don't even know the dropdown exists.
This Week's Buzz - Everything W&B!
It's a busy time here at Weights & Biases, and I'm super excited about a couple of upcoming events where you can connect with us and the broader AI community. Fully Connected, our very own two-day conference, is happening June 18-19 in San Francisco!
- It's going to be packed with insights on building and scaling AI.
Vision & Video: Open Source Shines Through the Noise
We had a bit of a meta-discussion on the show about "video model fatigue": with so many incremental updates, it can be hard to keep track or to see the big leaps. However, when a release like Alibaba's Wan 2.1 comes along, it definitely cuts through.
- Alibaba, the team behind the excellent Qwen LLMs, released Wan 2.1, a full stack of open-source text-to-video foundation models.
StepFun Step1X-3D: High-Fidelity 3D Asset Generation
StepFun released Step1X-3D, an open two-stage framework for generating textured 3D assets: it first synthesizes geometry and then generates view-consistent textures. They've also released a curated dataset of 800K assets.
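The geometry-then-texture split can be sketched as a simple pipeline contract. Everything below (type names, function names, the placeholder values) is illustrative only, not Step1X-3D's actual API; it just shows the shape of the two stages:

```python
# Illustrative sketch only: these names are hypothetical, not the actual
# Step1X-3D API. It shows the contract of a geometry-then-texture pipeline.
from dataclasses import dataclass

@dataclass
class Mesh:
    vertices: int
    watertight: bool  # stage 1 aims for watertight geometry

@dataclass
class TexturedAsset:
    mesh: Mesh
    texture_views: int  # stage 2 aims for view-consistent textures

def stage1_geometry(prompt: str) -> Mesh:
    # Placeholder for the geometry-synthesis model.
    return Mesh(vertices=50_000, watertight=True)

def stage2_texture(mesh: Mesh, prompt: str) -> TexturedAsset:
    # Placeholder for the texture model, conditioned on the fixed mesh.
    return TexturedAsset(mesh=mesh, texture_views=6)

def generate_asset(prompt: str) -> TexturedAsset:
    mesh = stage1_geometry(prompt)       # stage 1: geometry first
    return stage2_texture(mesh, prompt)  # stage 2: textures on that geometry

asset = generate_asset("a ceramic teapot")
print(asset.mesh.watertight, asset.texture_views)  # -> True 6
```

Decoupling the stages is the design point: geometry quality and texture consistency can be trained and improved independently.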
Wrapping Up This "Chill" Week
So, there you have it: another "chill" week in the world of AI! From Grok's controversial escapades to the inspiring decentralized training efforts and mind-bending algorithmic discoveries, it's clear the pace isn't slowing down. Next week is going to be absolutely insane.
Fully Connected - Weights & Biases' premier conference - register HERE with coupon WBTHURSAI
AI Engineer - 30% off coupon THANKSTHURSDAI - register HERE
Hosts and Guests
Alex Volkov - AI Evangelist at Weights & Biases (@altryne)
Co-hosts - @yampeleg, @nisten, @ldjconfirmed
Guest - Dillon Rolnick - COO Nous Research (@dillonRolnick)
Open Source LLMs
AM-Thinking v1: 32B dense reasoning model (HF, Paper, Page)
Falcon-Edge: ternary BitNet LLMs for edge deployment (Blog, HF-1B, HF-3B)
Psyche: decentralized cooperative-training network from Nous Research (Website, GitHub, Tweet, Dashboard)
INTELLECT-2: globally decentralized RL training of a 32B reasoning model (Blog, Tech report, HF weights, PRIME-RL code)
Our coverage of Intellect-1 back in Dec (https://sub.thursdai.news/p/thursdai-dec-4-openai-o1-and-o1-pro)
HealthBench: OpenAI's physician-crafted benchmark for AI in healthcare (Blog, Paper, Code)
Big CO LLMs + APIs
OpenAI adds GPT-4.1 models in ChatGPT
AlphaEvolve: Gemini-powered coding agent for algorithm discovery (Blog)
Google shutting off free Gemini 2.5 Pro API due to "demand" ahead of IO
ByteDance - Seed-1.5-VL-thinking 20B (Paper)
Anthropic Web Search API: real-time retrieval for Claude models (Blog)
What's up with Grok?
Vision & Video
Voice & Audio
AI Art & Diffusion & 3D
Tools & other notable AI things mentioned on the pod
The robots are dancing! (X)