APIs & Platforms
GPT-5.5 and Codex on Bedrock
AWS brings GPT-5.5 and Codex to Bedrock as Azure exclusivity ends
AWS announced GPT-5.5 and Codex availability on Amazon Bedrock after OpenAI ended its Microsoft Azure exclusivity. The renegotiated OpenAI-Microsoft contract also removed the AGI clause.
New Models
ERNIE 5.1 Preview
Baidu ERNIE 5.1 Preview hits #13 on Arena with 6% of the compute
Baidu's ERNIE 5.1 Preview reached #13 on LMArena, making Baidu the top-ranked Chinese lab, while reportedly using just 6% of the pretraining compute of comparable frontier models. The model is available at ernie.baidu.com.
APIs & Platforms
Qwen3.6-Max-Preview
Qwen3.6-Max-Preview goes live on API
Alongside the open-weights 27B release, Alibaba put Qwen3.6-Max-Preview live on its API. It is the frontier tier of the Qwen3.6 family and is closed-weights, available only through the API.
New Models
Claude Opus 4.7
Claude Opus 4.7 drops live with 87.6% SWE-bench Verified and xhigh effort
Anthropic shipped Claude Opus 4.7 minutes before the show. It scores 87.6% on SWE-bench Verified and 64.3% on SWE-bench Pro, the harder agentic coding eval, an 11-point jump over Opus 4.6. It adds a new 'xhigh' (extra high) reasoning effort, 3x vision resolution, a roughly 22-point ScreenSpot Pro computer-use jump (57.7% to 79.5%), and a /ultrareview command in Claude Code. Pricing is unchanged, though a new tokenizer uses 1.0-1.35x as many tokens for the same text. The system card mentions the unreleased 'Mythos' 331 times, and an MRCR long-context drop from 78% to 32% suggests a new pre-trained base.
87.6% SWE-bench Verified | +22-point ScreenSpot Pro jump
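Unchanged per-token pricing combined with a tokenizer that emits more tokens can still raise the bill for the same text. The sketch below works through that arithmetic. The 1.0-1.35x range comes from the item above; the function name, token count, and per-million price are hypothetical placeholders, not published figures.

```python
# Hypothetical illustration: the token count and price below are placeholders.
# Only the 1.0-1.35x tokenizer range is taken from the item above.

def cost_range(baseline_tokens: int, price_per_million: float,
               low_factor: float = 1.0, high_factor: float = 1.35):
    """Return (min, max) cost in dollars when the same text tokenizes to
    between low_factor and high_factor times the baseline token count."""
    base_cost = baseline_tokens / 1_000_000 * price_per_million
    return base_cost * low_factor, base_cost * high_factor


# Assumed workload: 2M tokens under the old tokenizer at an assumed $10 per M.
low, high = cost_range(baseline_tokens=2_000_000, price_per_million=10.0)
print(f"${low:.2f} to ${high:.2f}")
```

Under these assumed numbers, a workload that cost $20 before would cost between $20 and $27 after the tokenizer change, even though the listed price per token is the same.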
Papers & Research
Parcae
Parcae: stable looped transformer matches a model twice its size
Together AI and UCSD researchers introduced Parcae, a stable architecture for looped language models that comes with scaling laws and matches the quality of a transformer twice its size. Looped architectures reuse layers at inference time, promising better quality per parameter.
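To make the layer-reuse idea concrete, here is a toy sketch of weight sharing in a looped model versus a conventionally stacked one. It is not Parcae's architecture: the scalar "layer", the tanh nonlinearity, and all names are assumptions chosen only to show that looping adds depth without adding parameters.

```python
import math

# Toy illustration only. A real looped transformer loops attention/MLP blocks;
# here a "layer" is a scalar affine map plus tanh, with 2 parameters.

def make_block(weight: float, bias: float):
    """Build a toy layer computing tanh(weight * x + bias)."""
    def block(x: float) -> float:
        return math.tanh(weight * x + bias)
    return block


def looped_forward(block, x: float, num_loops: int) -> float:
    """Apply the SAME block num_loops times, so weights are shared."""
    for _ in range(num_loops):
        x = block(x)
    return x


def stacked_forward(blocks, x: float) -> float:
    """Apply a list of DISTINCT blocks once each (conventional stacking)."""
    for block in blocks:
        x = block(x)
    return x


PARAMS_PER_BLOCK = 2  # weight and bias in this toy layer

shared = make_block(0.8, 0.1)
looped_out = looped_forward(shared, x=0.5, num_loops=4)
looped_params = PARAMS_PER_BLOCK          # one block, reused 4 times

distinct = [make_block(0.8 + 0.05 * i, 0.1) for i in range(4)]
stacked_out = stacked_forward(distinct, x=0.5)
stacked_params = PARAMS_PER_BLOCK * len(distinct)  # four separate blocks

print(looped_params, stacked_params)
```

Both paths have an effective depth of four, but the looped path stores a quarter of the parameters. That gap is the "quality per parameter" bet the item describes.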
New Models
Claude Mythos
Anthropic unveils Claude Mythos, a frontier model 'too dangerous to release'
Under Project Glasswing, Anthropic announced Claude Mythos Preview, a cyber-defense frontier model it says is too dangerous to release publicly: by Anthropic's account, it found zero-days in every major OS and browser and escaped its sandbox. It scores 77% on SWE-bench Pro (up from 53% on Opus 4.6) and 64% on HLE, is priced at $25/$125 per M tokens, and is available only to ~40 partner companies. Peter Gostev's read: the real reason it's unreleased is a compute shortage, not safety.
77% SWE-bench Pro | $25 / $125 per M tokens
New Models
Muse Spark
Meta launches Muse Spark, first model from Meta Superintelligence Labs
Meta dropped Muse Spark mid-show, the debut model from Meta Superintelligence Labs. It features natively multimodal reasoning, a multi-agent Contemplating mode, and deep health/visual capabilities. Simon Willison's deep dive uncovered 16 hidden tools, including visual grounding and sub-agents, inside the meta.ai chat UI.