What would a compute monitoring plan look like? [Linkpost]

By undefined @ 2023-03-26T19:33 (+61)

This is a linkpost to https://arxiv.org/abs/2303.11341

Yonadav Shavit (CS PhD student at Harvard) recently released a paper titled "What does it take to catch a Chinchilla? Verifying Rules on Large-Scale Neural Network Training via Compute Monitoring".

The paper describes a compute monitoring regime that could allow governments to monitor training runs and detect deviations from training run regulations.

I think it's one of the most detailed public write-ups about compute governance, and I recommend AI governance folks read (or skim) it. A few highlights below (bolding mine). 

Abstract:

As advanced machine learning systems' capabilities begin to play a significant role in geopolitics and societal order, it may become imperative that (1) governments be able to enforce rules on the development of advanced ML systems within their borders, and (2) countries be able to verify each other's compliance with potential future international agreements on advanced ML development. This work analyzes one mechanism to achieve this, by monitoring the computing hardware used for large-scale NN training. The framework's primary goal is to provide governments high confidence that no actor uses large quantities of specialized ML chips to execute a training run in violation of agreed rules. At the same time, the system does not curtail the use of consumer computing devices, and maintains the privacy and confidentiality of ML practitioners' models, data, and hyperparameters. The system consists of interventions at three stages: (1) using on-chip firmware to occasionally save snapshots of the neural network weights stored in device memory, in a form that an inspector could later retrieve; (2) saving sufficient information about each training run to prove to inspectors the details of the training run that had resulted in the snapshotted weights; and (3) monitoring the chip supply chain to ensure that no actor can avoid discovery by amassing a large quantity of un-tracked chips. The proposed design decomposes the ML training rule verification problem into a series of narrow technical challenges, including a new variant of the Proof-of-Learning problem [Jia et al. '21].
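To make stage (1) a bit more concrete, here is a minimal sketch of the commit-then-verify idea behind weight snapshots, assuming a simple hash commitment. The function names, the nonce scheme, and the use of Python in place of on-chip firmware are my own illustrative choices, not details from the paper.

```python
# Illustrative sketch only: the paper proposes firmware-level logging on ML chips;
# this Python mock-up just shows the commit-then-verify idea with a hash commitment.
import hashlib
import numpy as np

def snapshot_commitment(weights: np.ndarray, nonce: bytes) -> str:
    """Hash the raw weight bytes together with a nonce to form a commitment
    that could be logged on-chip and later checked by an inspector."""
    h = hashlib.sha256()
    h.update(nonce)
    h.update(weights.tobytes())
    return h.hexdigest()

# Chip-side: occasionally snapshot the weights currently in device memory.
weights_in_memory = np.random.default_rng(0).standard_normal((1024, 1024)).astype(np.float32)
nonce = b"inspection-epoch-0001"  # hypothetical snapshot identifier
logged = snapshot_commitment(weights_in_memory, nonce)

# Inspector-side: given the claimed weights and nonce, recompute and compare.
assert snapshot_commitment(weights_in_memory, nonce) == logged
```

In the paper's framing, stage (2) then asks the Prover to supply enough information about the training run (a Proof-of-Learning-style transcript) for inspectors to check what training actually produced the snapshotted weights.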

Solution overview:

In this section, we outline a high-level technical plan, illustrated in Figure 1, for Verifiers to monitor Provers’ ML chips for evidence that a large rule-violating training occurred. The framework revolves around chip inspections: the Verifier will inspect a sufficient random sample of the Prover’s chips (Section 3.2), and confirm that none of these chips contributed to a rule-violating training run. For the Verifier to ascertain compliance from simply inspecting a chip, we will need interventions at three stages: on the chip, at the Prover’s data-center, and in the supply chain.

These steps, put together, enable a chain of guarantees:

Thus, so long as the Prover complies with the Verifier’s steps, the Verifier will detect the Prover’s rule-violation with high probability. Just as in financial audits, a Prover’s refusal to comply with the verification steps would itself represent an indication of guilt.
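To see why inspecting only a random sample of chips can still give high detection probability, here is a back-of-envelope calculation. The hypergeometric model and the specific numbers below are my own illustrative assumptions, not figures from the paper.

```python
# Back-of-envelope sketch (my own assumptions, not the paper's analysis):
# if n_violating of the Prover's n_chips were used in a rule-violating run and
# the Verifier inspects sample_size chips uniformly at random without replacement,
# the chance of catching at least one violating chip is
#   P(detect) = 1 - C(n_chips - n_violating, sample_size) / C(n_chips, sample_size).

def detection_probability(n_chips: int, n_violating: int, sample_size: int) -> float:
    """Probability that a uniform random sample of chips contains at least one
    chip that participated in the violating training run."""
    p_miss = 1.0
    for i in range(sample_size):
        remaining_clean = n_chips - n_violating - i
        if remaining_clean <= 0:
            return 1.0  # the sample cannot avoid every violating chip
        p_miss *= remaining_clean / (n_chips - i)
    return 1.0 - p_miss

# Example: 100,000 chips total, 4,000 used in the violating run, 200 inspected.
print(f"{detection_probability(100_000, 4_000, 200):.4f}")  # ≈ 0.9997
```

The intuition is that a training run large enough to violate a compute threshold must occupy many chips for a long time, so even a modest random audit is very likely to sample at least one of them.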


tamgent @ 2023-03-27T20:39 (+1)

Nice paper on the technical ways you could monitor compute usage, but governance-wise I think we're a very long way from anything that would make an approach like this remotely plausible (unless I'm missing something, which I may well be).

If we put aside question (2) in the abstract, getting international compliance, and just focus on (1), national governments regulating this for their own citizens, then this likely requires some kind of regulatory authority with the remit and the powers to do it. That includes information-gathering powers, which require companies by law to give specified information to the regulator. Such powers are common in regulation. However, we do not have AI regulators, or even tech regulators (with the exception of data protection, whose remit is more specific). We have a bunch of sector regulators, and some cross-sectoral ones (such as data protection, competition, etc.).

The closest regulatory regime I'm aware of that could legally do something like this is the EU's, via the AI Act, still in draft. This horizontal legislation, which is not sector-specific, will regulate all high-risk AI systems (the annexes stipulate examples of what counts as high-risk). However, it has not defined compute as a relevant risk parameter (to my knowledge; I think there is a new provision on General Purpose AI systems that could bring this in, so you might want to influence that, but I'm not sure what the capacity to enforce will look like).

No other Western government has a comparable AI regulation plan. The US has a voluntary risk management framework. The UK has a largely voluntary policy framework they're developing (although they are starting to introduce more tech regulation, some of which will include AI regulation).

Of course, there are parts of government other than regulators, and I'd really like it if 'compute monitoring' work started to pay attention to how differently these various parts might use such a capability. One advantage of regulators is that they have clear, specified remits, and transparency requirements that they routinely balance with confidentiality obligations. Other government departments may have more latitude and less transparency.