Products & Apps
W&B iOS App
Weights & Biases launches native iOS app for monitoring training runs
W&B shipped its most-requested feature ever: a native iOS app for monitoring AI training runs with live metrics and push notifications for crash alerts. Practitioners can now keep an eye on long-running training jobs from their phone instead of staying glued to a dashboard.
Dev Tools · Open weights
W&B Agent Skills
Weights & Biases launches Agent Skills
Weights & Biases officially launched Agent Skills, installable via `npx skills add wandb/skills`. The launch coincided with Nemotron 3 Super becoming available on W&B Inference at $0.20 per 1M input tokens, which makes it one of the best price-performance options among 120B models.
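To put the quoted rate in concrete terms, here is a minimal sketch of the input-side arithmetic. It uses only the $0.20 per 1M input tokens figure from the item above. The function name and the example token count are hypothetical, and output-token pricing is not stated here, so it is left out.

```python
# Input-token price quoted in the item above (USD per 1M input tokens).
INPUT_PRICE_PER_MILLION_USD = 0.20


def input_cost_usd(input_tokens: int,
                   price_per_million: float = INPUT_PRICE_PER_MILLION_USD) -> float:
    """Estimate the input-side cost of a workload in USD.

    Hypothetical helper for illustration only: it covers input tokens and
    ignores output tokens, whose price is not given in the item.
    """
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    return input_tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    # Example: a made-up workload of 5M input tokens.
    print(f"${input_cost_usd(5_000_000):.2f}")
```

At this rate, input cost scales linearly: every additional million input tokens adds $0.20.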
Benchmarks & Evals
Wolf Bench
Wolfram previews Wolf Bench, a multi-metric agent eval from W&B
Wolfram Ravenwolf gave an early preview of Wolf Bench, a Terminal Bench-based evaluation framework from Weights & Biases that reports four metrics (average, best run, ceiling, and consistent floor) instead of a single score. It treats harness differences (Terminal Bench vs Claude Code vs OpenClaw) as a first-class factor and publishes benchmark cost and transparency details.
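As a rough illustration of why four numbers say more than one, the sketch below computes the four named metrics from repeated runs over the same task set. The definitions are assumptions, since the preview does not spell them out: "ceiling" is taken as the share of tasks solved in at least one run, and "consistent floor" as the share solved in every run. The function name and data layout are hypothetical, not Wolf Bench's actual code.

```python
from statistics import mean


def summarize_runs(runs: list[list[bool]]) -> dict[str, float]:
    """Summarize repeated benchmark runs over the same ordered task list.

    Each inner list is one run; each entry says whether that task passed.
    Metric definitions are ASSUMED for illustration, not taken from Wolf Bench.
    """
    if not runs or not runs[0]:
        raise ValueError("need at least one run with at least one task")
    n_tasks = len(runs[0])
    if any(len(run) != n_tasks for run in runs):
        raise ValueError("all runs must cover the same tasks")

    per_run_scores = [sum(run) / n_tasks for run in runs]
    per_task_results = list(zip(*runs))  # one tuple of outcomes per task

    return {
        "average": mean(per_run_scores),      # mean pass rate across runs
        "best_run": max(per_run_scores),      # pass rate of the strongest run
        # Assumed: tasks solved in at least one run.
        "ceiling": sum(any(t) for t in per_task_results) / n_tasks,
        # Assumed: tasks solved in every run.
        "consistent_floor": sum(all(t) for t in per_task_results) / n_tasks,
    }


if __name__ == "__main__":
    # Made-up outcomes: 3 runs over 4 tasks.
    runs = [
        [True, False, True, False],
        [True, True, False, False],
        [True, False, False, False],
    ]
    print(summarize_runs(runs))
```

Under these assumed definitions, a wide gap between ceiling and consistent floor signals an agent that can solve many tasks but does so unreliably, which a single averaged score would hide.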