Designing a Sovereign AI system.

Cross-posted from the AI Impact Foundation blog. The data architecture and the AI stack are becoming the same architecture — here's what that means if you're building either.

A year ago, "sovereign AI" was a term you mostly heard from CIOs of national governments and a handful of European banks.

Today, it is in every enterprise AI procurement conversation.

Three things changed in 2024 and 2025: data residency became enforceable law in at least 40 countries; foundation-model providers consolidated to a handful of vendors, mostly American; and the post-cloud generation of enterprise architects realized that AI infrastructure and data infrastructure are no longer two distinct stacks.

If you are designing for enterprise — at scale, across borders, under regulators — you are now designing a sovereign AI system whether you intended to or not.

40+
Countries with enforceable data-residency laws in 2025. Fineable, not aspirational.
~80%
Share of frontier AI inference flowing through 3 to 4 US-based providers.
18mo
From "sovereign AI" as niche to "sovereign AI" as procurement default.

Why sovereign AI went from niche to mandatory.

Five forces converged. Each one is already showing up in 2025 RFPs.

01

Data residency law

EU AI Act, India's DPDP Act, China's Data Security Law, UAE National AI Strategy, Saudi Arabia's PDPL, plus 30+ others. These are not aspirational anymore — penalties are enforceable.

Consequence

Foreign-controlled AI inference on regulated data is a fineable offense in most jurisdictions. Vendor questions now include "where exactly do the prompts and embeddings rest?"

02

Vendor concentration

~80% of frontier AI inference flows through 3 to 4 US-based providers. Enterprises cannot accept single-vendor operational risk for workloads as load-bearing as AI infrastructure.

Consequence

BYOM and open-weights are no longer "nice to have." They are the baseline ask in any RFP.

03

Export controls

US AI Diffusion rules and counter-restrictions from China. The list of jurisdictions you cannot move model weights across is growing, not shrinking.

Consequence

Software architectures that assume model weights can move freely will need re-design. Sovereign on-prem is the architectural answer to this category of risk.

04

Dark-site demand

Air-gapped AI inference, once a defense-only requirement, is now spec for health systems, banks, energy operators, and government agencies.

Consequence

Every part of the AI stack — model serving, vector store, eval, telemetry — must work without internet. No phone-home. No external licensing checks.
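
One way to make "no phone-home" checkable is to lint the deployment configuration for endpoints that leave the perimeter. The following is a minimal sketch rather than a complete control: the config keys, the URLs, and the `INTERNAL_SUFFIXES` list are all hypothetical, and a real dark site would also enforce this at the network layer.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical: DNS suffixes this deployment treats as inside the perimeter.
INTERNAL_SUFFIXES = (".internal", ".local")


def is_internal(url: str) -> bool:
    """Return True if the URL's host is a private/loopback IP or an internal hostname."""
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: fall back to the hostname allow-list.
        return host.endswith(INTERNAL_SUFFIXES)
    return ip.is_private or ip.is_loopback


def external_endpoints(config: dict) -> list:
    """List the config keys whose endpoints would leave the perimeter."""
    return sorted(name for name, url in config.items() if not is_internal(url))


# Hypothetical stack configuration, for illustration only.
stack_config = {
    "model_serving": "http://10.0.4.12:8000/v1",
    "vector_store": "postgresql://db.corp.internal:5432/embeddings",
    "telemetry": "https://telemetry.vendor.example/ingest",
    "license_check": "https://licensing.vendor.example/verify",
}

violations = external_endpoints(stack_config)
print(violations)
```

A check like this fits naturally into an install-time preflight: any non-empty result is a component that would fail, or leak, once the site is disconnected.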

05

Cost crossover

API-based inference at enterprise scale is now genuinely more expensive than running your own infrastructure. The break-even point was crossed in 2024–2025 for any sustained workload.

Consequence

On-prem AI is now a CFO conversation, not just a CIO conversation. The procurement question shifted from "can we?" to "what are the run-rate savings?"
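
The run-rate question reduces to break-even arithmetic: API cost scales with token volume, while owned infrastructure is roughly a fixed monthly cost until capacity runs out. The sketch below illustrates that comparison. Every number in it is a hypothetical placeholder, not a figure from this post, and it assumes the on-prem cluster can actually serve the volume in question.

```python
def monthly_api_cost(tokens_per_month: float, price_per_million: float) -> float:
    """API inference: pay per token."""
    return tokens_per_month / 1_000_000 * price_per_million


def monthly_onprem_cost(capex: float, amortization_months: int, opex_per_month: float) -> float:
    """Owned infrastructure: amortized hardware plus fixed running cost."""
    return capex / amortization_months + opex_per_month


def break_even_tokens(capex: float, amortization_months: int,
                      opex_per_month: float, price_per_million: float) -> float:
    """Monthly token volume above which on-prem is cheaper than the API."""
    fixed = monthly_onprem_cost(capex, amortization_months, opex_per_month)
    return fixed / price_per_million * 1_000_000


# Hypothetical inputs, for illustration only.
CAPEX = 360_000.0           # hardware purchase
AMORTIZATION_MONTHS = 36    # depreciation window
OPEX = 5_000.0              # power, space, staff share per month
API_PRICE = 5.0             # currency units per million tokens

threshold = break_even_tokens(CAPEX, AMORTIZATION_MONTHS, OPEX, API_PRICE)
print(f"Break-even at {threshold:,.0f} tokens per month")
```

Below the threshold the API is cheaper; above it, the fixed cost of owned hardware wins. That is why the argument applies to sustained workloads rather than bursty ones.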

What sovereign AI actually means.

Sovereign AI is a stack that satisfies four properties simultaneously:

A "private cloud" is not sovereign. A "VPC deployment of someone else's model" is not sovereign. A "self-hosted model" without ops infrastructure around it is not sovereign.

Sovereign AI = all four properties, designed in from the start.

The architecture convergence.

This is the most important architectural shift of the decade, and most enterprise buyers do not yet see it: the data architecture and the AI architecture are becoming the same architecture.

2015–2022

Two stacks, side by side

  • Application layer
  • ML platform (SageMaker, Vertex, MLflow)
  • Model training / serving
  • … stitching, custom plumbing, drift …
  • Data warehouse / lake / lakehouse
  • Source systems

Two governance models, two security models, two lineage stories. The integration layer was where projects went to die.

2024 →

One converged stack

  • Application + agent layer
  • Vector + relational + feature store
  • ModelOps + AgentOps
  • Governance, lineage, security
  • Source systems

Vector stores became a column type. Feature stores became data-tier services. RAG made the data read part of every inference. The two stacks were always going to merge. They have.

The implication for AI vendors: if you do not think like a data platform vendor — governance, lineage, multi-tenancy, security, time-travel — you will spend the next five years stitching that work in at great cost.

The implication for data platform vendors: they have spent decades on exactly those problems. They arrive at AI from the right starting point. The next decade of enterprise AI is going to be played on data-platform turf.

The six-layer sovereign AI stack.

A sovereign AI system needs six capabilities, all designed to run in dark sites and across a wide range of hardware conditions.

Layer 1

Bring Your Own Models

Open-weights, fine-tunes, customer models, routing.

  • Open-weights support (Llama, Mistral, Qwen, DeepSeek, Falcon, Gemma)
  • Customer fine-tunes as first-class objects
  • Model registry with provenance & license metadata
  • Routing by task, cost, jurisdiction
Fails when bolted on: the customer pays integration tax with every new foundation model.
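
As a rough illustration of what "registry with provenance and license metadata" plus "routing by task, cost, jurisdiction" could look like, here is a minimal in-memory sketch. The class names, fields, model names, and costs are all hypothetical; a production registry would live in the governed data tier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelEntry:
    name: str
    license: str                  # license identifier recorded at registration
    provenance: str               # where the weights came from
    tasks: frozenset              # tasks the model is approved for
    jurisdictions: frozenset      # where the model may serve traffic
    cost_per_million_tokens: float


class ModelRegistry:
    def __init__(self) -> None:
        self._models = {}

    def register(self, entry: ModelEntry) -> None:
        self._models[entry.name] = entry

    def route(self, task: str, jurisdiction: str) -> Optional[ModelEntry]:
        """Pick the cheapest model approved for this task in this jurisdiction."""
        candidates = [
            m for m in self._models.values()
            if task in m.tasks and jurisdiction in m.jurisdictions
        ]
        return min(candidates, key=lambda m: m.cost_per_million_tokens, default=None)


# Hypothetical entries, for illustration only.
registry = ModelRegistry()
registry.register(ModelEntry("open-small", "permissive", "vendor release, checksum verified",
                             frozenset({"summarize", "classify"}), frozenset({"EU", "IN"}), 0.4))
registry.register(ModelEntry("open-large", "permissive", "vendor release, checksum verified",
                             frozenset({"summarize", "reason"}), frozenset({"EU"}), 2.0))
registry.register(ModelEntry("customer-finetune", "customer-owned", "fine-tuned in-cluster",
                             frozenset({"summarize"}), frozenset({"IN"}), 0.3))

choice = registry.route("summarize", "EU")
print(choice.name if choice else "no compliant model")
```

Returning `None` when no model is approved for a jurisdiction is deliberate: the router refuses rather than silently falling back to a non-compliant model.
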
Layer 2

Sovereign Vector Store

Co-located with the relational tier, governance-parity.

  • Co-located with relational store, not a sidecar
  • Hardware-accelerated where available, CPU-fallback where not
  • Lineage and governance parity with data tier
  • No leakage to external embedding APIs
Fails as a sidecar: two governance models, twice the audit surface.
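
To make "co-located, not a sidecar" concrete, the toy sketch below keeps embeddings in an ordinary column of a relational table, so the same tenant predicate governs both the rows and the vectors. It uses SQLite and brute-force cosine similarity purely for illustration. The table, tenants, and vectors are hypothetical, and a real platform would use a native vector type with an index.

```python
import math
import sqlite3
import struct


def pack(vec):
    """Serialize a list of floats into a BLOB of 32-bit floats."""
    return struct.pack(f"{len(vec)}f", *vec)


def unpack(blob):
    return list(struct.unpack(f"{len(blob) // 4}f", blob))


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents ("
    "id INTEGER PRIMARY KEY, tenant TEXT NOT NULL, body TEXT NOT NULL, embedding BLOB NOT NULL)"
)
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?, ?)",
    [
        (1, "bank_a", "loan policy", pack([1.0, 0.0, 0.0])),
        (2, "bank_a", "fraud playbook", pack([0.0, 1.0, 0.0])),
        (3, "bank_b", "loan policy (other tenant)", pack([1.0, 0.0, 0.0])),
    ],
)


def search(conn, tenant, query_vec, k=1):
    """Similarity search that applies the tenant filter in the same query as the vector read."""
    cur = conn.execute(
        "SELECT id, body, embedding FROM documents WHERE tenant = ?", (tenant,)
    )
    scored = [(cosine(query_vec, unpack(blob)), doc_id, body) for doc_id, body, blob in cur]
    scored.sort(reverse=True)
    return [(doc_id, body) for _, doc_id, body in scored[:k]]


print(search(conn, "bank_a", [0.9, 0.1, 0.0]))
```

The point of the design is that there is one access-control path: a row the caller cannot read is a vector the caller cannot retrieve.
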
Layer 3

ModelOps

Deploy, version, eval, monitor, rollback.

  • All operations without external calls
  • Customer-data evals that run in-cluster
  • Drift detection that does not phone home
  • Shadow and canary deployment patterns
Fails when it assumes a SaaS control plane: this is where "private cloud" stories collapse in a dark site.
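
Drift detection that does not phone home can be as simple as a statistic computed in-cluster over logged model inputs or scores. The sketch below uses the population stability index (PSI) as one possible choice; the post does not prescribe a method. The sample data is synthetic and the 0.2 alert threshold is a common rule of thumb, not a standard.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population stability index between a baseline sample and a live sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width when the baseline is constant

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Synthetic data, for illustration only.
baseline = [i / 100 for i in range(100)]        # roughly uniform on [0, 1)
shifted = [0.5 + i / 200 for i in range(100)]   # mass moved to the upper half

ALERT_THRESHOLD = 0.2  # rule-of-thumb value, an assumption
drift_score = psi(baseline, shifted)
print(f"PSI = {drift_score:.2f}, alert = {drift_score > ALERT_THRESHOLD}")
```

Nothing here needs an external service, which is the property that matters in a dark site: the baseline, the live sample, and the alert all stay in the cluster.
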
Layer 4

Agent Builder

Assemble agents from tools you trust.

  • Low-code surface for a curated tool catalog
  • Tool registry with security review and scopes
  • Memory and state in the customer's data tier
  • Human-in-the-loop checkpoints in ITSM flows
Fails without governance: every team ships sovereignty-violating agents with the best intentions.
Layer 5

Agent Ops

Telemetry, audit, permissions, kill switches.

  • Per-agent telemetry and cost attribution
  • Audit log of every agent decision and tool call
  • Permission boundaries per agent and user
  • Kill switches at every layer
ModelOps without AgentOps means you observe the model but not the autonomous behavior on top. Most incidents live here.
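
The Agent Ops bullets above can be sketched as a thin wrapper around every tool call: check the kill switch, check the scope, then record the outcome. This is a minimal illustration under stated assumptions. `AgentRuntime`, the scope strings, and the tool names are hypothetical and do not correspond to any particular agent framework.

```python
import json
import time
from dataclasses import dataclass, field


class AgentHalted(Exception):
    """Raised when a tool call is attempted after the kill switch is set."""


class PermissionDenied(Exception):
    """Raised when the agent lacks the scope a tool requires."""


@dataclass
class AgentRuntime:
    agent_id: str
    allowed_scopes: set
    killed: bool = False
    audit_log: list = field(default_factory=list)

    def kill(self) -> None:
        """Customer-controlled kill switch: blocks all further tool calls."""
        self.killed = True

    def call_tool(self, tool_name, required_scope, func, *args):
        if self.killed:
            self._audit(tool_name, "halted")
            raise AgentHalted(self.agent_id)
        if required_scope not in self.allowed_scopes:
            self._audit(tool_name, "denied")
            raise PermissionDenied(f"{self.agent_id} lacks {required_scope}")
        result = func(*args)
        self._audit(tool_name, "ok")
        return result

    def _audit(self, tool_name, outcome) -> None:
        # Every attempt is logged, including blocked ones.
        self.audit_log.append(json.dumps(
            {"ts": time.time(), "agent": self.agent_id, "tool": tool_name, "outcome": outcome}
        ))


# Hypothetical agent with read-only access to an ERP tool.
runtime = AgentRuntime("invoice-agent", {"erp:read"})
print(runtime.call_tool("lookup_invoice", "erp:read", lambda inv: {"id": inv}, 42))
```

Logging denied and halted attempts, not only successful calls, is what lets an auditor reconstruct what the agent tried to do as well as what it did.
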
Layer 6

Governance & Compliance

Audit-ready in your regulator's language.

  • Model cards, system cards, provenance — auto-generated
  • Mapping to EU AI Act, NIST AI RMF, ISO/IEC 42001
  • Audit-trail export in the regulator's vocabulary
  • Standing review gates that can block a launch
Bolted on after deployment, governance becomes a paperwork exercise the auditor sees through.
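
A review gate that can block a launch is easiest to reason about when it is code rather than a checklist. The sketch below generates a model card from structured metadata and refuses launch while any required item is missing. The field names and the specific checks are hypothetical and are not a mapping to any named regulation.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ModelCard:
    model_name: str
    version: str
    license: str
    data_provenance: str
    evals_passed: bool
    risk_review_signed_off: bool


def render_card(card: ModelCard) -> str:
    """Auto-generate the card as JSON from the registry metadata."""
    return json.dumps(asdict(card), indent=2, sort_keys=True)


def launch_blockers(card: ModelCard) -> list:
    """Return the reasons this model may not launch; an empty list means go."""
    blockers = []
    if not card.license:
        blockers.append("missing license metadata")
    if not card.data_provenance:
        blockers.append("missing data provenance")
    if not card.evals_passed:
        blockers.append("in-cluster evals not passed")
    if not card.risk_review_signed_off:
        blockers.append("risk review not signed off")
    return blockers


# Hypothetical candidate release, for illustration only.
candidate = ModelCard(
    model_name="customer-finetune",
    version="1.3.0",
    license="customer-owned",
    data_provenance="",
    evals_passed=True,
    risk_review_signed_off=False,
)

print(render_card(candidate))
print(launch_blockers(candidate))
```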

Designed for the world that actually exists.

Two design constraints most US-headquartered AI vendors quietly ignore. Either one will eliminate 60–80% of the addressable global market if you get it wrong.

The hardware reality outside hyperscale.

The newest GPU racks assume liquid-cooled data centers. Most of the world's data centers — particularly in India, Africa, Southeast Asia, Latin America — are air-cooled. Liquid-cooling retrofits are expensive and slow.

130 kW
Per-rack density assumed by the latest liquid-cooled GPU servers. Achievable in less than 20% of global data centers in 2025.
~30 kW
Per-rack density typical of air-cooled installations across India, Africa, Southeast Asia. What the rest of the world actually runs on.
100ms+
Inter-region latency in much of South Asia, sub-Saharan Africa, Andean Latin America. The "always-connected control plane" assumption breaks here.
Local
Power, language, regulation: not 240V, not English, not GDPR-equivalent. Designs that assume hyperscale conditions ship in five countries and stall everywhere else.

Real sovereign AI software is hardware-aware in two ways: topology-aware scheduling that maximizes throughput on air-cooled, lower-density racks, and inference patterns — smaller models, quantization, batching — that don't require the latest hardware to be cost-effective.
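
Some back-of-envelope arithmetic shows why quantization and rack density interact. The sketch below estimates weight memory at different precisions and how many servers fit in a rack power budget. The rack figures come from the numbers above; the 70-billion-parameter model and the 10 kW server draw are hypothetical, and the memory estimate covers weights only, not KV cache or activations.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 10^9 bytes)."""
    return params_billion * bits_per_weight / 8


def servers_per_rack(rack_kw: float, server_kw: float) -> int:
    """How many servers fit within a rack's power budget."""
    return int(rack_kw // server_kw)


PARAMS_BILLION = 70   # hypothetical model size
SERVER_KW = 10.0      # hypothetical per-server power draw

for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_memory_gb(PARAMS_BILLION, bits):.0f} GB")

for rack_kw in (30.0, 130.0):
    print(f"{rack_kw:.0f} kW rack: {servers_per_rack(rack_kw, SERVER_KW)} servers")
```

Under these assumptions, dropping from 16-bit to 4-bit weights cuts the weight footprint by a factor of four, which is the kind of saving that lets a lower-density rack carry a useful workload.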

It is also network-aware: offline-first deployment, local-first updates, graceful degradation under bandwidth constraints. The control plane cannot assume it can reach the mothership.

The kill-switch realization.

For decades, enterprises in the Global South (and, increasingly, in Europe and Asia) operated on an unspoken assumption: that their core software stack could safely run on US-controlled infrastructure, with US-controlled APIs, under US-controlled licensing. That assumption no longer holds.

The recognition that any US administration could, in principle, throttle access to a critical SaaS or AI service has shifted from theoretical to operational. CIOs, CDOs, and ministers of digital transformation in dozens of countries are now writing policies to remove single points of US-controlled failure from their stack.

Call it the kill-switch realization. It has seven direct architectural consequences for AI — each one already showing up in 2025 RFPs.

01

The AI stack must consolidate — and become frictionless

Today an enterprise pieces together model serving, vector store, MLOps, eval, agent runtime, and governance from six to twelve vendors. That stitching is fragile, expensive, and a sovereignty risk in itself. The platforms that win will offer one tightly integrated, frictionless stack — one install, one upgrade path, one audit surface.

02

Vector becomes a first-class data type

Not a sidecar service. Not a separate vendor. A column type in the relational store, with the same governance, lineage, and security model as every other piece of enterprise data. Anything else creates two governance domains and twice the audit surface.

03

The economics of GenAI have to work out

API inference at scale is now demonstrably more expensive than running your own. The on-prem economic argument has crossed the threshold. Sovereign AI is no longer a premium paid for governance — it is the cost-rational choice for sustained workloads. The CFO is now in the room.

04

Independence from US cloud providers

Not just dual-cloud. Not just multi-cloud. The ability to run the full stack on European hyperscalers, regional sovereign clouds, and on-prem — without losing functionality, governance posture, or supportability. The reference architecture cannot assume any one provider stays reachable.

05

Open-source and on-prem model support is table stakes

Llama, Mistral, Qwen, DeepSeek, Falcon, Gemma — every serious enterprise platform now has to support the open-weights catalog as a first-class option. Closed-API-only stories are losing every RFP that asks the sovereignty question, which is now most of them.

06

Fine-tuning on-prem

Not in someone else's cloud. Not via a managed service that reads your data into a foreign tenant. On the customer's own infrastructure, with the customer's own data, under the customer's own audit trail. The IP of the fine-tune is the customer's, and the data never leaves their perimeter.

07

Build and manage agents on-prem

Agent runtimes, agent ops, tool registries, agent telemetry — all in-house, all auditable, all kill-switchable by the customer rather than the vendor. An agent that depends on a foreign control plane to keep running is, by definition, not sovereign.

These seven are no longer preferences. They are becoming procurement filters. A platform that fails any one of them will struggle to win enterprise deals in 2026 and beyond.

The kill-switch realization isn't political. It's architectural. It is now a procurement filter.

What this means for the rest of the field.

If you are an enterprise architect, sovereign AI is part of your reference architecture, whether you have drawn it in yet or not. Start with the six-layer question.

If you are an AI vendor, the "deploy our model on our cloud and call our API" motion is ending for any workload above a certain size or sensitivity. The motion that replaces it is on-prem-friendly, BYOM, dark-site-ready, and audit-clean.

If you are a buyer, ask the six-layer question explicitly. Most vendors will pass three and fail three. That tells you the integration burden you are signing up for.

If you are a regulator, the architecture of trust is being defined right now. The companies that build it well will earn the policy frameworks they want. The ones who do not will get the frameworks no one wants.

The closing argument.

The sovereign AI shift is not a backlash to globalization. It is the same maturation that happened to the cloud a decade ago — once the technology matters enough, governance and locality become inseparable from architecture.

The architecture has converged. The hardware has globalized. The regulation has caught up. The vendor landscape is consolidating. The next eighteen months will decide which sovereign AI stacks become the reference architecture for the next decade.

That's the work. Come help us design it.

Meeta Vouk

Founder, AI Impact Foundation. VP of Product at Teradata. Adjunct professor at NC State. 22 patents, 20+ years building enterprise AI — and a permanent belief that the platforms treating data and AI as one architecture will win the next decade.

Designing sovereign AI inside your enterprise?

Enterprise architects, platform leaders, data-tier engineers, and policy people building the next reference architecture — I want to compare notes.