ARC-AGI-3
ARC-AGI-3 launches: humans score 100%, frontier models under 1%
ARC Prize launched ARC-AGI-3, an interactive agentic reasoning benchmark built from turn-based puzzle games and designed to test human-like generalization in novel abstract environments. Humans hit a 100% pass rate, while top frontier models score under 1%. The panel welcomed the gap as a healthy reality check against "AGI is here" rhetoric and easy score inflation.