Compute Governance

“Compute governance is the insight that AI capability cannot be directly measured, but training infrastructure can be seen, counted, and controlled, which is why hardware has become the primary governance lever.” Compute governance refers to policy frameworks that use the computational resources required to train and run AI models (measured in floating-point operations, chip types, or data center capacity) as a practical proxy for AI capability, enabling governance interventions (export controls, notification requirements, safety evaluations) that do not require directly assessing model intelligence or dangerousness.

Executive Summary

Compute governance emerged as a distinct policy framework through the work of researchers at the Center for Security and Emerging Technology (CSET), Oxford Internet Institute, and Epoch AI, who argued that training compute provides a measurable, observable, and controllable proxy for AI capability. The framework has been institutionalized in three major governance instruments: the EU AI Act (10^25 FLOPs threshold for GPAI systemic risk), the US Executive Order on AI (reporting requirements for large training runs), and US export controls on advanced AI chips. Its core advantage is tractability: chips can be tracked, data centers can be inspected, and compute thresholds can be specified without requiring a definition of “dangerous AI.” Its core limitation is that efficiency improvements can deliver capable systems below the threshold.

The Strategic Mechanism

  • Compute thresholds as regulatory triggers: Governance obligations (safety testing, government notification, third-party auditing) are triggered when training runs exceed specified FLOPs thresholds, providing an objective, measurable criterion that avoids subjective capability assessment.
  • Hardware export controls: Restricting sales of advanced AI training chips (GPUs, TPUs, AI accelerators) to specific countries or entities limits the maximum compute available to restricted actors, constraining training scale. The US BIS controls on Nvidia H100/A100 exports to China (introduced October 2022, updated through 2024) are the primary implementation.
  • KYC for compute: Proposals to require cloud providers to implement “Know Your Customer” (KYC) verification for large AI training runs, similar to financial-services due diligence, would create an observable transaction record for large-scale training without requiring chip hardware controls.
  • Data center surveillance: Physical data centers for large-scale AI training are visible in satellite imagery and commercial data, enabling monitoring of training cluster construction and capacity expansion without on-site inspection.
  • Efficiency as the governance challenge: As demonstrated by DeepSeek’s reported training of a GPT-4-class model for $6 million (January 2025), algorithmic efficiency improvements can deliver capable systems at compute levels well below current governance thresholds, creating persistent threshold calibration challenges.
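The threshold-trigger logic above can be sketched numerically. A common rule of thumb from the scaling-law literature estimates training compute as C ≈ 6ND, where N is parameter count and D is training tokens; the parameter and token counts below are illustrative assumptions, not figures for any real model:

```python
# Estimate training compute with the common C ~= 6 * N * D approximation
# (N = parameters, D = training tokens) and check which governance
# thresholds a run would cross.

EU_AI_ACT_THRESHOLD = 1e25  # FLOPs: GPAI systemic-risk presumption
US_EO_THRESHOLD = 1e26      # FLOPs: federal notification requirement

def training_flops(params: float, tokens: float) -> float:
    """Rough training-compute estimate: ~6 FLOPs per parameter per token."""
    return 6 * params * tokens

def governance_triggers(flops: float) -> list[str]:
    """Return the compute-governance obligations a training run triggers."""
    triggers = []
    if flops >= EU_AI_ACT_THRESHOLD:
        triggers.append("EU AI Act: GPAI systemic-risk obligations")
    if flops >= US_EO_THRESHOLD:
        triggers.append("US Executive Order: federal notification")
    return triggers

# Hypothetical run: 500B parameters on 10T tokens -> 3e25 FLOPs,
# above the EU threshold but below the US reporting threshold.
flops = training_flops(500e9, 10e12)
print(f"{flops:.2e} FLOPs -> {governance_triggers(flops)}")
```

The same objectivity that makes this check trivial to implement is the framework’s selling point: no capability evaluation is needed, only an arithmetic estimate of the training run.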

Market & Policy Impact

  • The EU AI Act’s 10^25 FLOPs threshold for GPAI systemic risk classification was derived directly from compute governance research at the Oxford Internet Institute and CSET, representing the first legislative institutionalization of the framework.
  • The US Executive Order on AI (October 2023) requires companies to notify the federal government when training runs exceed 10^26 FLOPs, 10x the EU threshold, reflecting either a higher risk tolerance or a political compromise with industry.
  • Anthropic’s Responsible Scaling Policy uses internal “AI Safety Levels” tied to capability thresholds, while DeepMind’s Frontier Safety Framework uses similar compute-and-capability markers, illustrating industry adoption of compute governance logic for voluntary frameworks.
  • The proposed “KYC for Compute” framework, advocated by researchers at GovAI and CSET, would require cloud providers (AWS, Azure, Google Cloud) to verify the identity and purpose of clients conducting large training runs, creating a surveillance layer above the chip level.
  • China’s reported circumvention of H100 export controls through gray-market channels (documented in Bloomberg and Reuters reporting in 2023-2024) has forced regulators to extend controls to cloud access, not just hardware sales, expanding compute governance from chips to services.
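The “KYC for Compute” proposal described above can be illustrated with a minimal sketch. The threshold, customer fields, and verification steps here are hypothetical illustrations of the proposal’s logic, not any cloud provider’s actual API or any enacted regulation:

```python
from dataclasses import dataclass

# Hypothetical KYC-for-compute gate a cloud provider might run before
# provisioning a large training cluster. The threshold and fields are
# illustrative assumptions only.

KYC_FLOPS_THRESHOLD = 1e25  # illustrative trigger for enhanced due diligence

@dataclass
class TrainingRequest:
    customer_id: str
    identity_verified: bool   # e.g. business-registration check completed
    stated_purpose: str       # customer-declared use of the training run
    estimated_flops: float    # customer-declared compute estimate

def requires_kyc(req: TrainingRequest) -> bool:
    """Large runs trigger enhanced due diligence, mirroring financial KYC."""
    return req.estimated_flops >= KYC_FLOPS_THRESHOLD

def approve(req: TrainingRequest) -> bool:
    """Approve small runs outright; large runs need verified identity and purpose."""
    if not requires_kyc(req):
        return True
    return req.identity_verified and bool(req.stated_purpose.strip())

small = TrainingRequest("acme-labs", False, "fine-tuning", 1e22)
large = TrainingRequest("frontier-co", True, "frontier pretraining", 3e25)
print(approve(small), approve(large))
```

The design choice worth noting is that the gate sits at the transaction layer: the provider never inspects the model, only the declared scale of the run, which is what makes the proposal enforceable without chip-level hardware controls.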

Modern Case Study: The 10^25 FLOPs Threshold and Its Origins, 2022-2024

The compute thresholds that now anchor the EU AI Act (10^25 floating-point operations) and the US Executive Order on AI (10^26) emerged from a 2022 policy paper by researchers at Oxford Internet Institute and the Center for Security and Emerging Technology. The paper argued that training compute is the most tractable proxy for frontier AI capability and proposed threshold-based governance triggers. The approach was subsequently adopted by the UK’s Frontier AI Taskforce (later AI Safety Institute) in its model evaluation framework, incorporated into EU AI Act negotiations as the GPAI systemic risk criterion, and embedded in the US Executive Order’s reporting requirements. The episode illustrates how academic policy research can directly shape legislative language: a threshold derived from empirical analysis of training compute across known frontier models became the defining number in the world’s most consequential AI governance instruments within 18 months of publication. Whether the thresholds remain calibrated as efficiency improvements lower the compute cost of comparable capability is the central ongoing governance question.