Parcae: stable looped transformer matches a model twice its size
Researchers at Together AI and UCSD have introduced Parcae, a stable architecture for looped language models that comes with scaling laws and matches the quality of a transformer twice its size. Looped architectures reuse the same layers over multiple passes at inference time, spending extra compute rather than extra parameters and promising better quality per parameter.
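To make the idea of layer reuse concrete, here is a minimal sketch of a generic looped forward pass in plain Python. It is not Parcae's architecture and does not include whatever stabilization the researchers use; the names `make_block`, `apply_block`, `looped_forward`, and `param_count`, the tanh residual update, and the toy dimension are all assumptions chosen for illustration. The only point it shows is that running the same block more times adds compute but no parameters.

```python
import math
import random


def make_block(dim, seed=0):
    """Create one dim x dim weight matrix shared by every loop iteration (toy stand-in for a layer block)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(dim)
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(dim)]


def apply_block(weights, x):
    """One pass through the block: residual update x + tanh(W x)."""
    return [
        xi + math.tanh(sum(w * xj for w, xj in zip(row, x)))
        for xi, row in zip(x, weights)
    ]


def looped_forward(weights, x, n_loops):
    """Apply the same block n_loops times; the weights are reused, never copied."""
    for _ in range(n_loops):
        x = apply_block(weights, x)
    return x


def param_count(weights):
    """Number of scalar parameters in the shared block."""
    return sum(len(row) for row in weights)


weights = make_block(dim=4)
x0 = [1.0, 0.0, -1.0, 0.5]

one_pass = looped_forward(weights, x0, n_loops=1)
four_passes = looped_forward(weights, x0, n_loops=4)

# The parameter count depends only on the block, not on how often it is looped.
print(param_count(weights))  # 16
```

In this sketch, `four_passes` costs roughly four times the compute of `one_pass` while the model still holds the same 16 weights, which is the trade-off behind the "quality per parameter" claim.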