Multilingual & Translation

Multilingual models, translation, and language- or region-specific AI. — 18 releases covered on the show.

May 2026

ElevenLabs
New Models

Dubbing v2

ElevenLabs Dubbing v2 preserves your performance across 90+ languages

ElevenLabs launched Dubbing v2, an audio-to-audio dubbing model that translates voices across more than 90 languages while preserving cadence, expression, intonation, and even stutters. Alex's live demos, including dubbing Nisten into Hebrew and his own voice into multiple languages, were the brain-melting moment of the episode.

Tencent
New ModelsOpen weights

Hy-MT2

Tencent open-sources Hy-MT2 translation models under Apache 2.0

Tencent released the Hy-MT2 family of translation models under Apache 2.0, including a tiny 1.8B model that beats paid translation APIs like Microsoft's Translator, plus a larger 30B-A3B MoE variant. A small, free, locally-runnable model outperforming commercial translation services was one of the open-source wins of the week.

February 2026

Alibaba (Qwen)
New ModelsOpen weights

Qwen3.5-397B-A17B

Alibaba opens Qwen 3.5: 397B-param multimodal MoE with only 17B active

Alibaba released Qwen3.5-397B-A17B, billed as the first open-weight native multimodal MoE model, with 397B total parameters, just 17B active, 512 experts, and 262K native context extendable to 1M. It delivers 8.6-19x faster inference than Qwen3-Max and continues Qwen's strength in multilingual and medical tasks, scoring 52.5% on Terminal Bench, third place among open-source models. Nisten found coding still trails GLM-5.

397B Qwen 3.5 Parameters
Cohere Labs
New ModelsOpen weights

Tiny Aya

Cohere Labs releases Tiny Aya, a 3.35B multilingual model for 70+ languages

Cohere Labs released Tiny Aya, a 3.35B-parameter multilingual model family supporting 70+ languages that is small enough to run locally on phones. It extends Cohere's Aya line of open multilingual models, bringing broad language coverage to on-device deployments.

January 2026

Upstage
New ModelsOpen weights

Solar Open 100B

Upstage Solar Open 100B: 102B MoE trained on 19.7T tokens

Upstage released Solar Open 100B, a 102B parameter MoE model with only 12B active parameters per token (129 experts, top-8 activation), trained on 19.7 trillion tokens including 4.5T synthetic via a 'data factory' approach. It outperforms GLM 4.5 Air on many benchmarks, features the SNAP PO reinforcement learning technique with a 50% training speedup, and delivers best-in-class Korean language performance.

102B Solar Open params

October 2025

KAIST
New ModelsOpen weights

KORMo 10B

KAIST releases KORMo, a bilingual Korean/English 10B open model

KAIST published KORMo, a 10B parameter fully open bilingual model for Korean and English, with weights on Hugging Face and an accompanying paper. It continues the trend of strong national-language open models coming out of Korean labs.

September 2025

Swiss AI Initiative
New ModelsOpen weights

Apertus-8B / Apertus-70B

Switzerland launches Apertus-8B and 70B, fully open multilingual LLMs

The Swiss AI Initiative launched Apertus-8B and Apertus-70B, fully open multilingual LLMs trained on 15T tokens covering more than 1,800 languages. The release stands out for full openness (weights, data recipe, and training transparency) and unusually broad language coverage from a national effort.

Tencent
New ModelsOpen weights

Hunyuan-MT-7B

Tencent open-sources Hunyuan-MT-7B translation model after sweeping WMT2025

Tencent open-sourced Hunyuan-MT-7B, a 7B-parameter machine translation model, after it swept the WMT2025 translation competition. It gives the open-weights community a small, focused translation model that punches well above its size class.

July 2025

Baidu
New ModelsOpen weights

ERNIE 4.5

Baidu open-sources ERNIE 4.5, a 10-model multimodal family

Baidu open-sourced the ERNIE 4.5 series, a family of 10 models ranging from 424B down to 0.3B parameters with multimodal capabilities, reportedly beating o1 on DocVQA. The release marks a sharp reversal from Baidu's previous anti-open-source posture and another sign that Chinese labs are setting the pace in open source.

10 ERNIE 4.5 models
Huawei
New ModelsOpen weights

Pangu Pro MoE

Huawei's Pangu Pro MoE: 72B model trained entirely on Ascend NPUs

Huawei released Pangu Pro, a 72B-parameter MoE trained on its own Ascend NPUs rather than Nvidia or AMD hardware, hitting 1,528 tokens/sec and pretrained on 13T tokens. The panel framed it as the geopolitical open-model story of the week, showing how far Chinese compute stacks have advanced under sanctions.

Tencent
New ModelsOpen weights

Hunyuan-A13B-Instruct

Tencent ships Hunyuan-A13B: 80B MoE with only 13B active params

Tencent released Hunyuan-A13B-Instruct, an 80B-parameter MoE that activates only 13B parameters at inference while keeping a 256K context window. Built by the team with WizardLM lineage, it posts strong reasoning benchmarks and feels unusually practical for its class, though the panel flagged its license limits.

13B Hunyuan active params

May 2025

Alibaba (Qwen)
New ModelsOpen weights

Qwen 3

Alibaba open-weights the full Qwen 3 family under Apache 2.0

Alibaba released the entire Qwen 3 stack: two MoE models (235B total/22B active and 30B/3B active) plus six dense siblings from 32B down to 0.6B, all Apache 2.0 with day-one support in LM Studio, Ollama, vLLM, MLX and llama.cpp. The headline feature is a runtime hybrid 'thinking' toggle (/think and /no_think) that trades latency for reasoning depth. Trained on ~36T tokens with 128K context and 119-language coverage, the 235B MoE rivals DeepSeek-R1, o1, o3-mini and Gemini 2.5 Pro on coding and math.

235 B Flagship MoE total parameters (22B active)30 B Qwen3-30B-A3B hit 57 tok/s on a Mac with speculative decoding36 Trillions of pre-training tokens (2x Qwen 2.5)
Kyutai
New ModelsOpen weights

Helium-1

Kyutai releases Helium-1, a 2B European-language model plus dactory pipeline

Kyutai released Helium-1, a 2B-parameter model distilled from Gemma-2-9B and purpose-built for Europe's 24 official languages, under CC-BY 4.0. It sets a new state of the art for its size class on MMLU-EU, ARC-EU and FLORES translation while fitting in under 2GB VRAM for edge and phone deployment. They also open-sourced 'dactory' (MIT), their full Common Crawl data-processing pipeline that scores, dedups and tags webpages.

March 2025

NVIDIA
New ModelsOpen weights

Canary 1B/180M Flash

NVIDIA Canary Flash: Apache 2 speech recognition and translation

NVIDIA released Canary 1B Flash and 180M Flash, Apache 2.0 licensed speech recognition and translation models built as Llama finetunes. The permissive license makes them freely usable for commercial ASR and translation workloads.

EuroBERT team
New ModelsOpen weights

EuroBERT

EuroBERT: multilingual encoder models from 210M to 2.1B parameters

EuroBERT is a new family of multilingual encoder models ranging from 210M to 2.1B parameters, trained on a 5 trillion-token dataset across 15 languages with 8K context support. It targets European and global language NLP tasks like retrieval and RAG, where properly encoding non-English character sets matters.