This document is an informal early draft of a forthcoming paper


A key impediment to the international governance of AI systems is the difficulty of credibly verifying claims about what AI accelerators are used for. The existing literature on verifiable computing contains a range of privacy-preserving mechanisms that could be used to ensure that certain computational claims are correct—that is, that they are not spoofed—but an international verification regime would face the novel technical challenge of ensuring that such claims are comprehensive—that is, that virtually all workloads performed on a given set of accelerators have been declared. In this paper, we propose verifiable memory exhaustion (VMX), a protocol by which a compute cluster can demonstrate that it is physically incapable of performing large-scale undeclared workloads by continuously proving that virtually all of its available device memory is occupied by honest state—that is, the state that a given device would have in memory if it were performing its declared operations at the claimed times—rendering the cluster meaningfully less useful for performing other computations. To enforce this, a trusted pinging device periodically challenges the cluster’s devices to rapidly respond with succinct measurements of some of the state they currently have in memory; these measurements are then verified offline using privacy-preserving mechanisms. We develop the theoretical conditions under which our protocol renders it infeasible for accelerators to release honest state from memory, and we empirically estimate that a compute cluster subject to the protocol, while claiming to run inference on model X, would face a computational overhead of Z in training model Y. We conclude by discussing how VMX may be uniquely useful for enabling early international coordination on AI due to its light physical footprint.

1. Introduction

As frontier AI capabilities advance, it is becoming increasingly important for the international community to cooperate to place minimal guardrails on the development and deployment of powerful AI systems. However, parties to such agreements will face strong incentives to defect, which suggests we will need to develop robust mechanisms by which adversaries can verify one another’s compliance. Prior work has proposed personnel-based verification mechanisms such as interviews with engineers, site inspections, and audits of development processes, but these may fail to provide third parties with sufficient observability into what accelerators are being used for. There is also growing interest in the use of hardware-based verification mechanisms, which are appealing because they could directly observe properties of the computations being performed on accelerators—proposals here include on-chip mechanisms such as Confidential Computing and off-chip mechanisms such as network taps or analog sensors. However, hardware-based approaches may require extensive cooperation to deploy worldwide, and it may be challenging to develop such mechanisms on short timescales and secure them against tampering by state actors.

Software-based verification mechanisms could offer a more rapidly deployable alternative—if AI companies were to record detailed logs of the computations that they perform on their clusters, then it should be feasible to develop privacy-preserving mechanisms that allow third parties to verify that these logs are correct, even without trusting any of the company's personnel or hardware. One idea in the international verification literature is the joint development of "trusted clusters" that are secure enough to be given access to confidential information and model parameters so that they can re-execute a subset of claimed computations to check for consistency with the logs. Another idea would be to use zero-knowledge proof systems, compensating for their high overhead by only checking a very small subset of claimed computations. However, these approaches have a fundamental limitation: they verify that declared computations are correct but not that they are comprehensive. A cluster could generate valid proofs for its declared workload while simultaneously performing additional undeclared computations, and by default the verifier would have no way to detect this.
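To illustrate the spot-checking idea concretely, the following sketch shows how a verifier might sample a small fraction of declared log entries and re-execute them to check consistency with the claimed outputs. The `LogEntry` schema, the simulated `re_execute`, and the sampling rate are illustrative assumptions rather than part of any existing proposal.

```python
import hashlib
import random
from dataclasses import dataclass

# A minimal sketch of log spot-checking, under assumed names and schemas.

@dataclass
class LogEntry:
    op_id: str          # identifier of the declared operation (hypothetical field)
    inputs_hash: str    # commitment to the operation's inputs
    output_hash: str    # claimed hash of the operation's output

def re_execute(entry: LogEntry) -> str:
    """Stand-in for re-running a declared operation on a trusted cluster.
    Here the 'computation' is simulated as a hash of the declared inputs;
    a real verifier would reload the inputs and model parameters and rerun
    the actual kernels."""
    return hashlib.sha256(f"{entry.op_id}|{entry.inputs_hash}".encode()).hexdigest()

def spot_check(log: list[LogEntry], sample_rate: float = 0.01) -> bool:
    """Re-execute a random subset of declared computations and compare
    against the claimed outputs. This checks that declared computations
    are correct, but cannot detect computations that were never declared."""
    k = max(1, int(len(log) * sample_rate))
    sample = random.sample(log, k)
    return all(re_execute(entry) == entry.output_hash for entry in sample)
```

The limitation noted above is visible here: a passing spot check says nothing about computations that never appear in the log.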

To address this problem, we propose verifiable memory exhaustion (VMX): a protocol by which a compute cluster can demonstrate that it is physically incapable of performing more than a small amount of undeclared computation by continuously proving that virtually all of its available device memory is occupied by honest state—that is, the state that a given device would have in memory if the cluster were performing its workload as claimed. To verify this, a single trusted pinging device deployed within the data center periodically broadcasts challenges to accelerators, requesting succinct measurements of some expensive-to-recompute state they currently have in memory. Accelerators must respond within tight latency bounds—too short to recompute state on demand—proving they already held the requested state in memory. To prevent clusters from caching honest state in external storage and freeing up accelerator memory for undeclared work, the protocol populates all nearby storage with verifiable filler state. VMX thus converts what the literature has treated as primarily a hardware problem into primarily a software problem, requiring only one trusted device per data center and otherwise making good use of cheap intelligence to help develop the requisite software.
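To make the challenge-response structure concrete, here is a minimal sketch of one round of such a protocol under assumed parameters: the pinging device broadcasts a random nonce, the accelerator must return a succinct measurement (here, a hash over nonce-selected samples of its resident memory) within a latency deadline, and the verifier later recomputes the same measurement from the declared honest state. The sampling scheme, deadline, and data layout are illustrative assumptions, not the protocol itself.

```python
import hashlib
import os
import time

# Illustrative parameters (assumptions, not calibrated values).
NUM_SAMPLES = 64         # memory offsets sampled per challenge
SAMPLE_BYTES = 256       # bytes read at each sampled offset
DEADLINE_SECONDS = 0.05  # response deadline, too short to recompute state on demand

def sample_offsets(nonce: bytes, memory_size: int) -> list[int]:
    """Derive pseudo-random offsets from the challenge nonce so a device
    cannot predict which parts of memory will be measured."""
    offsets = []
    for i in range(NUM_SAMPLES):
        digest = hashlib.sha256(nonce + i.to_bytes(4, "big")).digest()
        offsets.append(int.from_bytes(digest[:8], "big") % (memory_size - SAMPLE_BYTES))
    return offsets

def measure(memory: bytes, nonce: bytes) -> str:
    """Succinct measurement: a hash over the nonce-selected memory samples."""
    h = hashlib.sha256(nonce)
    for offset in sample_offsets(nonce, len(memory)):
        h.update(memory[offset:offset + SAMPLE_BYTES])
    return h.hexdigest()

def challenge_round(device_memory: bytes, declared_honest_state: bytes) -> bool:
    """One round: broadcast a nonce, time the device's response, and check it
    against the measurement recomputed from the declared honest state."""
    nonce = os.urandom(32)
    start = time.monotonic()
    response = measure(device_memory, nonce)            # computed by the device
    elapsed = time.monotonic() - start
    expected = measure(declared_honest_state, nonce)    # recomputed by the verifier
    return elapsed <= DEADLINE_SECONDS and response == expected
```

In this toy version the verifier recomputes the measurement directly from the declared state; in the full protocol, that offline check would instead rely on privacy-preserving mechanisms, and the deadline would be set so that neither recomputing the state nor fetching it from slower storage can beat the clock.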

We make the following contributions:

2. Problem formulation

This is just a sketch


2.1. Setting