A-M Team
New Models, Open weights
AM-Thinking v1
AM-Thinking v1: 32B dense reasoning model beats bigger MoEs at math and code
AM-Thinking v1 is a 32B dense, open-weights reasoning LLM from a new Chinese team. It takes on much larger mixture-of-experts models and comes out ahead on math and code, with reported scores of 85.3% on AIME 2024, 70.3% on LiveCodeBench v5, and 92.5% on Arena-Hard. It supports a /think reasoning toggle, ships under a permissive license, is tooled for vLLM, LM Studio, and Ollama, and runs at 25 tokens/sec on a single 80GB GPU with INT4 quantization. A multilingual RLHF pass and a 128k context window are in the works.
32B dense parameters
85.3% AIME 2024
25 tokens/sec on a single 80GB GPU with INT4
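As a rough sanity check on the single-GPU claim, the weight footprint of a 32B dense model can be estimated from the bit width alone. The sketch below is back-of-the-envelope arithmetic, not a figure from the A-M Team: the helper name `weight_memory_gb` is hypothetical, and the estimate ignores the KV cache, activations, and the per-group scale overhead that real INT4 schemes add.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes).

    Hypothetical helper for illustration; counts raw weights only.
    """
    return num_params * bits_per_param / 8 / 1e9


PARAMS = 32e9  # 32B dense parameters, as stated in the summary

int4_gb = weight_memory_gb(PARAMS, 4)   # 4 bits per weight
fp16_gb = weight_memory_gb(PARAMS, 16)  # 16 bits per weight, for comparison

print(f"INT4 weights: ~{int4_gb:.0f} GB")
print(f"FP16 weights: ~{fp16_gb:.0f} GB")
```

Under these assumptions the INT4 weights take roughly a fifth of an 80GB card, which leaves the remainder for the KV cache and runtime overhead.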