Frontier AI Regulation

By Zach Stein-Perlman @ 2023-07-10T14:30 (+56)

This is a linkpost to https://arxiv.org/pdf/2307.03718.pdf

This paper is about (1) "government intervention" to protect "against the risks from frontier AI models" and (2) some particular proposed safety standards. It's by Markus Anderljung, Joslyn Barnhart (Google DeepMind), Jade Leung (OpenAI governance lead), Anton Korinek, Cullen O'Keefe (OpenAI), Jess Whittlestone, and 18 others.

Abstract

Advanced AI models hold the promise of tremendous benefits for humanity, but society needs to proactively manage the accompanying risks. In this paper, we focus on what we term “frontier AI” models — highly capable foundation models that could possess dangerous capabilities sufficient to pose severe risks to public safety. Frontier AI models pose a distinct regulatory challenge: dangerous capabilities can arise unexpectedly; it is difficult to robustly prevent a deployed model from being misused; and, it is difficult to stop a model’s capabilities from proliferating broadly. To address these challenges, at least three building blocks for the regulation of frontier models are needed: (1) standard-setting processes to identify appropriate requirements for frontier AI developers, (2) registration and reporting requirements to provide regulators with visibility into frontier AI development processes, and (3) mechanisms to ensure compliance with safety standards for the development and deployment of frontier AI models. Industry self-regulation is an important first step. However, wider societal discussions and government intervention will be needed to create standards and to ensure compliance with them. We consider several options to this end, including granting enforcement powers to supervisory authorities and licensure regimes for frontier AI models. Finally, we propose an initial set of safety standards. These include conducting pre-deployment risk assessments; external scrutiny of model behavior; using risk assessments to inform deployment decisions; and monitoring and responding to new information about model capabilities and uses post-deployment. We hope this discussion contributes to the broader conversation on how to balance public safety risks and innovation benefits from advances at the frontier of AI development.

Executive Summary

The capabilities of today’s foundation models highlight both the promise and risks of rapid advances in AI. These models have demonstrated significant potential to benefit people in a wide range of fields, including education, medicine, and scientific research. At the same time, the risks posed by present-day models, coupled with forecasts of future AI progress, have rightfully stimulated calls for increased oversight and governance of AI across a range of policy issues. We focus on one such issue: the possibility that, as capabilities continue to advance, new foundation models could pose severe risks to public safety, be it via misuse or accident. Although there is ongoing debate about the nature and scope of these risks, we expect that government involvement will be required to ensure that such "frontier AI models” are harnessed in the public interest.

Three factors suggest that frontier AI development may be in need of targeted regulation: (1) Models may possess unexpected and difficult-to-detect dangerous capabilities; (2) Models deployed for broad use can be difficult to reliably control and to prevent from being used to cause harm; (3) Models may proliferate rapidly, enabling circumvention of safeguards.

Self-regulation is unlikely to provide sufficient protection against the risks from frontier AI models: government intervention will be needed. We explore options for such intervention, including granting enforcement powers to supervisory authorities and licensure regimes for frontier AI models.

Next, we describe an initial set of safety standards that, if adopted, would provide some guardrails on the development and deployment of frontier AI models. Versions of these could also be adopted for current AI models to guard against a range of risks. We suggest that, at minimum, safety standards for frontier AI development should include: conducting pre-deployment risk assessments; external scrutiny of model behavior; using risk assessments to inform deployment decisions; and monitoring and responding to new information about model capabilities and uses post-deployment.

Going forward, frontier AI models seem likely to warrant safety standards more stringent than those imposed on most other AI models, given the prospective risks they pose. Examples of such standards include: avoiding large jumps in capabilities between model generations; adopting state-of-the-art alignment techniques; and conducting pre-training risk assessments. Such practices are nascent today, and need further development.

The regulation of frontier AI should only be one part of a broader policy portfolio, addressing the wide range of risks and harms from AI, as well as AI’s benefits. Risks posed by current AI systems should be urgently addressed; frontier AI regulation would aim to complement and bolster these efforts, targeting a particular subset of resource-intensive AI efforts. While we remain uncertain about many aspects of the ideas in this paper, we hope it can contribute to a more informed and concrete discussion of how to better govern the risks of advanced AI systems while enabling the benefits of innovation to society.

Commentary

It is good that this paper exists. It's mostly good because it's a step (alongside Model evaluation for extreme risks) toward making good actions for AI labs and government more mainstream/legible. It's slightly good because of its (few) novel ideas; e.g. Figure 3 helps me think slightly more clearly. I don't recommend reading beyond the executive summary.

Unfortunately, this paper's proposals are unambitious (in contrast, in my opinion, to Model evaluation for extreme risks, which I unreservedly praised), such that I'm on-net disappointed in the authors (and may ask some of them whether they agree it's unambitious and, if so, why). Some quotes below, but in short: it halfheartedly suggests licensing. It doesn't suggest government oversight of training runs or compute. It doesn't discuss when training runs should be stopped/paused (e.g., when model evaluations for dangerous capabilities raise flags). (It also doesn't say anything specific about international action, but it's very reasonable for that to be out of scope.)

On licensing, it correctly notes that

Enforcement by supervisory authorities penalizes non-compliance after the fact. A more anticipatory, preventative approach to ensuring compliance is to require a governmental license to widely deploy a frontier AI model, and potentially to develop it as well.

But then it says:

Licensing is only warranted for the highest-risk AI activities, where evidence suggests potential risk of large-scale harm and other regulatory approaches appear inadequate. Imposing such measures on present-day AI systems could potentially create excessive regulatory burdens for AI developers which are not commensurate with the severity and scale of risks posed. However, if AI models begin having the potential to pose risks to public safety above a high threshold of severity, regulating such models similarly to other high-risk industries may become warranted.

Worse, on after-the-fact enforcement, it says:

Supervisory authorities could “name and shame” non-compliant developers. . . . The threat of significant administrative fines or civil penalties may provide a strong incentive for companies to ensure compliance with regulator guidance and best practices. For particularly egregious instances of non-compliance and harm ["For example, if a company repeatedly released frontier models that could significantly aid cybercriminal activity, resulting in billions of dollars worth of counterfactual damages, as a result of not complying with mandated standards and ignoring repeated explicit instructions from a regulator"], supervisory authorities could deny market access or consider more severe penalties [viz. "criminal sentences"].

This is overdeterminedly insufficient for safety. "Not complying with mandated standards and ignoring repeated explicit instructions from a regulator" should not be allowed to happen, because it might kill everyone. A single instance of noncompliance should not be allowed to happen, and preventing it requires something like oversight of training runs. Not to mention that denying market access or threatening prosecution is inadequate, and that naming-and-shaming and fining companies are totally inadequate. This passage totally fails to treat AI as a major risk. I know the authors are pretty worried about x-risk; I notice I'm confused.

Next:

While we believe government involvement will be necessary to ensure compliance with safety standards for frontier AI, there are potential downsides to rushing regulation.

This is literally true, but I think it tends to mislead the reader about the urgency of strong safety standards and government oversight.

On open-sourcing, it's not terrible; it equivocates but says "proliferation via open-sourcing" can be dangerous and

prudent practices could include . . . . Having the legal and technical ability to quickly roll back deployed models on short notice if the risks warrant it, for example by not open-sourcing models until doing so appears sufficiently safe.

The paper does say some good things. It suggests that safety standards should exist, and that they should include model evals, audits & red-teaming, and risk assessment. But it suggests nothing strong or new, I think.

The authors are clearly focused on x-risk, but they tone that down. This is mostly demonstrated above, but also note that they phrase their target as mere "high severity and scale risks": "the possibility that continued development of increasingly capable foundation models could lead to dangerous capabilities sufficient to pose risks to public safety at even greater severity and scale than is possible with current computational systems." Their examples include AI "evading human control" but not killing everyone, disempowering humanity, or other specific catastrophes.


I'd expect something stronger from these authors. Again, I notice I'm confused. Again, I might ask some of the authors, or maybe some will share their thoughts here or in some other public place.


Updates & addenda

Thanks to Justin, one of the authors, for replying. In short, he says:

I think your criticism that the tools are not ambitious is fair. I don't think that was our goal. I saw this project as a way of providing tools for which there is broad agreement and that given the current state of AI models we believe would help steer AI development and deployment in a better direction. I do think that another reading of this paper is that it's quite significant that this group agreed on the recommendations that are made. I consider it progress in the discussion of how to effectively govern increasingly powerful AI models, but it's not the last word either. :)

We also have a couple disagreements about the text.


Thanks to Markus, one of the primary authors, for replying. His reply is worth quoting in full:

Thanks for the post and the critiques. I won't respond at length, other than to say two things: (i) it seems right to me that we'll need something like licensing or pre-approvals of deployments, ideally also decisions to train particularly risky models.  Also that such a regime would be undergirded by various compute governance efforts to identify and punish non-compliance. This could e.g. involve cloud providers needing to check if a customer buying more than X compute [has] the relevant license or confirm that they are not using the compute to train a model above a certain size. In short, my view is that what's needed are the more intense versions of what's proposed in the paper. Though I'll note that there are lots of things I'm unsure about. E.g. there are issues with putting in place regulation while the requirements that would be imposed on development are so nascent. 

(ii) the primary value and goal of the paper in my mind (as suggested by Justin) is in pulling together a somewhat broad coalition of authors from many different organizations making the case for regulation of frontier models. Writing pieces with lots of co-authors is difficult, especially if the topic is contentious, as this one is, and will often lead to recommendations being weaker than they otherwise would be. But overall, I think that's worth the cost. It's also useful to note that I think it can be counterproductive for calls for regulation (in particular regulation that is considered particularly onerous) to be coming loudly from industry actors, who people may assume have ulterior motives. 

Note that Justin and Markus don't necessarily speak for the other authors.


GovAI has a blogpost summary.


Jess Whittlestone has a blogpost summary/commentary.


GovAI will host a webinar on the paper on July 20 at 8am PT.


Markus has a Twitter summary.


The paper is listed on the OpenAI research page and so is somewhat endorsed by OpenAI.