Open Weights and Sovereign AI
What the Sovereign AI Index reveals about Llama, Mistral, Qwen, and the from-scratch models behind the world's sovereign AI projects.
If you haven’t had a chance to do so already, please check out the new Sovereign AI Index, published last month by the Center for a New American Security, where I’m an adjunct senior fellow. I co-authored the index with my colleagues Vivek Chilukuri and Ruby Scanlon, whose partnership I’m grateful for and have deeply enjoyed.
At a high level, the Index documents how governments outside the United States and China are building national AI capability — what many stakeholders call “sovereign AI.” The Index tracks 139 government-backed projects across 56 jurisdictions, each categorized by stack layer: compute, models, or data.
The Index is dense with insights and data that are useful across policy, business, and academia. One area that’s particularly rich is the evolution of sovereign AI in the model layer. Of the 139 projects, 50 — about 36 percent — are large language model efforts.
Inputs
Forty of the 50 model projects disclose a build approach: 17 are built from scratch, 19 on an existing open-weight base, and 4 do both. Across the 23 — 19 plus 4 — that build on a base, there are 36 disclosed base-model instances (several projects use more than one).
The distribution is concentrated in a few countries. U.S. firms account for nearly two-thirds of disclosed base models, primarily through three families — Meta’s Llama, Google’s Gemma, and OpenAI’s Whisper. The French share comes entirely from Mistral, and the Chinese from Alibaba’s Qwen.
Looking at the same data through a more purely corporate lens, Llama appears in 14 of the 36 instances — roughly 40 percent of the sovereign base-model market. Mistral and Mixtral together appear in 7 (19%). Gemma is in 6 (17%), Qwen in 5 (14%), and Whisper in 3 (8%).
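For readers who want to check those shares, the arithmetic can be reproduced from the disclosed instance counts (a minimal sketch; the counts are the ones reported above, and the five named families cover 35 of the 36 disclosed instances):

```python
# Disclosed base-model instances across the 23 sovereign projects
# that build on an existing open-weight base (36 instances total;
# several projects use more than one base model).
instances = {
    "Llama": 14,
    "Mistral/Mixtral": 7,
    "Gemma": 6,
    "Qwen": 5,
    "Whisper": 3,
}

TOTAL_INSTANCES = 36
for family, n in instances.items():
    print(f"{family}: {n}/{TOTAL_INSTANCES} = {n / TOTAL_INSTANCES:.0%}")
```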
Llama is the most-used disclosed base model in the Index’s sovereign model-project data. Some of the models in the Index that were built on Llama include Saudi Arabia’s ALLaM, Indonesia’s Sahabat-AI, Taiwan’s TAIDE, Greece’s Krikri, and Chile’s Latam-GPT.
Meta’s AI chief Alexandr Wang recently confirmed that the company’s new Muse Spark model underwent internal safety reviews of its biological, chemical, cyber, and loss-of-control capabilities (with chemical and biological capabilities crossing Meta’s “high risk” threshold pre-mitigation) and “in its current form is not suitable for open sourcing.” If Meta follows a more closed path for future frontier releases, many sovereign model projects will have to look elsewhere for their next open-weight base.
Mistral’s quiet year
The Index’s data shows that Mistral is the closest thing the world has to a non-U.S., non-Chinese frontier-class open-weight lab. Six sovereign model projects in the Index were built wholly or partly on Mistral or Mixtral, among them Denmark’s Danish Foundation Models, Bulgaria’s BgGPT, and Poland’s PLLuM.
PLLuM, announced in February 2025, was the most recent of these. Across the rest of that year of intense focus on European AI sovereignty (the Paris AI Action Summit, the launch of Mistral Compute at VivaTech, sustained French political attention on AI), no new sovereign model project in the Index was announced as Mistral-derived. The Mistral line in the dataset has flattened.
Mistral's footprint in sovereign cloud deployments does continue to grow, and Morocco's recent partnership with Mistral to develop national-language models in Arabic, Darija, and Amazigh shows Mistral remains a live option for new sovereign builds. But the data through December 2025 show that as a base model for new sovereign builds, Mistral plateaued last year.
Indeed, as Chart 3 suggests, European sovereign AI model output in 2025 came increasingly from from-scratch projects (including Switzerland's Apertus, Spain's ALIA, and the European Union's OpenEuroLLM) rather than from new Mistral-based fine-tunes.
Qwen’s growth year
What makes the Mistral plateau more striking is what happened with Chinese models. Qwen, Alibaba's open-weight family, appeared in exactly one sovereign model project before 2025: Singapore's SEA-LION, launched in Q4 2023. Then, beginning in Q1 2025, Qwen was added as a sovereign base at a rate of one new project per quarter, every quarter.
Three of the four 2025 Qwen-based projects are in jurisdictions that are key U.S. partners: two in Thailand and one in the UAE. The fourth is in Uganda, where the United States has recently sought to deepen its AI relationship. The most significant is K2-Think, the open-source reasoning model that MBZUAI and G42 released in September 2025: a flagship UAE sovereign AI model built on Alibaba's Qwen 2.5, arriving just months after the announcement of Stargate UAE.
It's worth noting that five Qwen-based projects out of the 23 that build on a disclosed base does not amount to a Chinese takeover of the open-weights ecosystem; Llama's use through 2025 was far greater. But if Qwen continued at a pace of one new project per quarter, and overall sovereign builds didn't accelerate dramatically, Chinese open weights could account for roughly a quarter of new sovereign model builds by the end of 2027. One challenge to that trajectory: Alibaba is signaling a more closed approach to its models. Like Llama's, Qwen's open future is uncertain.
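That projection is back-of-envelope, and it helps to make the assumptions explicit. The Index data discussed here does not state an overall run rate for new builds, so the sketch below hypothetically assumes about four new base-derived sovereign builds per quarter; under that assumption, one Qwen project per quarter works out to a quarter of new builds:

```python
# Hypothetical projection, not Index data. Assumes Qwen adds one new
# sovereign project per quarter through 2027, and that base-derived
# sovereign builds overall arrive at an assumed rate of four per
# quarter (an illustrative assumption, not a figure from the Index).
quarters_2026_2027 = 8
new_qwen_builds = 1 * quarters_2026_2027            # one per quarter
assumed_new_builds_total = 4 * quarters_2026_2027   # assumed run rate
share = new_qwen_builds / assumed_new_builds_total
print(f"Qwen share of new sovereign builds, 2026-2027: {share:.0%}")
```

A faster or slower overall run rate moves the share down or up accordingly; the point is only that a steady one-per-quarter pace is enough to make Chinese open weights a meaningful fraction of new builds.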
The bottom line is that the sovereign AI model layer picture isn’t static, and how it looks a year from now depends on choices being made by companies (including emerging actors like Reflection AI) and governments as they find alternative open models or decide that building from scratch is worth the investment.
Outputs
According to the Index, 43 of the 50 large language model projects being built by governments outside the United States and China (86 percent) release weights in some open form or have announced open releases that haven't yet shipped.
Twenty-six are permissively licensed. Eleven release weights publicly but with some restrictions attached (e.g., required attribution or noncommercial use only). Six are announced as open but not yet released.
Only three are clearly closed according to the public data: Brazil’s SoberanIA at launch, Oman’s Moein for internal government use, and Vietnam’s Viettel administrative pilot. For four projects, the release status is unclear based on public information.
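As a quick sanity check, the release-status counts above add up to the 50 model projects (a minimal tally of the figures as reported):

```python
# Release status of the 50 sovereign LLM projects, as reported above.
release_status = {
    "permissive license": 26,
    "open with restrictions": 11,
    "announced open, not yet released": 6,
    "closed": 3,
    "unclear": 4,
}

open_in_some_form = 26 + 11 + 6  # the three "open" buckets
total = sum(release_status.values())
print(f"{open_in_some_form}/{total} = {open_in_some_form / total:.0%} open in some form")
```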
This pattern is consistent across both major project types.
The from-scratch models — e.g., Italy’s Minerva, Germany’s SOOFI, India’s BharatGen — release open weights. The larger group of projects that use existing open-weight bases generally releases under the license of the base model. The details vary, but the broader pattern across model-centric sovereign AI projects is overwhelmingly in the direction of releasing weights.
What to watch for
As of the end of last year, the sovereign AI model layer still depended heavily on a few open-weight suppliers. Llama was the most-used base model, making Meta's next moves along the open-closed spectrum particularly important. Meta's Muse Spark decision, and Alibaba's signals that it is moving in a similar direction with Qwen, suggest the supply of open models for sovereign builders may be tightening. That could broaden adoption of Gemma and of Mistral, which plateaued in 2025 but remains very much in the mix. Sovereign builders may also turn to other open models or decide to invest in from-scratch training.
A tightening supply could make from-scratch projects more consequential in two ways. First, from-scratch builders that release their own weights would add to the open-weight pool available to fine-tuners (Japan, India, Germany, and Korea are all pursuing from-scratch builds aimed at the frontier). Second, countries that might otherwise have fine-tuned on Llama or Qwen could decide that building from scratch is the more reliable path, a trend Chart 3 suggests is already underway in Europe. Each of these strategies carries cost, execution, and capability risks. How sovereign builders navigate these risks is something the Index will continue to track in future quarterly updates.
The data in this post was drawn from the CNAS Sovereign AI Index (Chavez, Chilukuri, and Scanlon, CNAS). Project counts are current as of January 2026.
The full Index covers compute and data alongside the model-layer findings above, with project-level detail across 139 efforts in 56 jurisdictions. Please check it out and reach out with any feedback or questions you have.
Conflict of Interest Disclosure: The views I share in this essay are my own. Although I hold professional affiliations and advisory roles with several organizations, I don’t receive compensation or instructions from them (or any other entities) regarding the views I express in essays, panels, or other public and semi-public forums unless otherwise stated in a specific piece or setting. In some cases, I might receive an honorarium from the publisher or event organizer.