Granite 4.1
IBM Granite 4.1: dense non-thinking models with top tool calling
IBM released the Granite 4.1 family of dense, non-thinking models in 3B, 8B, and 30B sizes under Apache 2.0. The family offers best-in-class tool calling: the 8B model scores 73 on BFCL (the Berkeley Function Calling Leaderboard). IBM also claims 20x token efficiency over Qwen3.5 9B. The models are live on W&B Inference at $0.05 per million input tokens and $0.10 per million output tokens, with a 128K context window.
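To put the quoted pricing in concrete terms, here is a minimal sketch of the per-request cost arithmetic. The function name `estimate_cost_usd` is hypothetical. Reading "128K" as 128,000 tokens and treating the context window as shared between input and output are both assumptions, not something the announcement specifies.

```python
# Prices quoted in the text, in USD per million tokens.
INPUT_USD_PER_MILLION = 0.05
OUTPUT_USD_PER_MILLION = 0.10

# Assumption: "128K context" is read as 128,000 tokens, shared by input and output.
CONTEXT_WINDOW_TOKENS = 128_000


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at the quoted prices."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if input_tokens + output_tokens > CONTEXT_WINDOW_TOKENS:
        raise ValueError("request exceeds the assumed 128K context window")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # A long-context request: 100,000 prompt tokens, 2,000 generated tokens.
    print(f"${estimate_cost_usd(100_000, 2_000):.4f}")
```

At these rates, even a request that fills most of the context window costs roughly half a cent, so for long prompts the input side dominates the bill despite its lower per-token price.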