What are Responsible Scaling Policies (RSPs)?

By Vishakha Agrawal, Algon @ 2025-04-05T16:05 (+2)

This is a linkpost to https://www.lesswrong.com/posts/NiWxL7GJGG2fd22Jh/what-are-responsible-scaling-policies-rsps

This is an article in the featured articles series from AISafety.info, which writes introductory AI safety content. We'd appreciate any feedback.

The most up-to-date version of this article is on our website, along with 300+ other articles on AI existential safety.

METR[1] defines a responsible scaling policy (RSP) as a specification of “what level of AI capabilities an AI developer is prepared to handle safely with their current protective measures, and conditions under which it would be too dangerous to continue deploying AI systems and/or scaling up AI capabilities until protective measures improve.”

Anthropic was the first company to publish an RSP, in September 2023, defining four AI Safety Levels (ASLs) modeled loosely on biosafety levels. In abbreviated form: ASL-1 covers systems that pose no meaningful catastrophic risk; ASL-2 covers systems that show early signs of dangerous capabilities; ASL-3 covers systems that substantially increase the risk of catastrophic misuse; and ASL-4 and higher are not yet defined, being reserved for future, more capable systems.

Other AI companies[2] have released their own versions of such documents under various names, such as OpenAI's Preparedness Framework and Google DeepMind's Frontier Safety Framework.

RSPs have received both positive and negative reactions from the AI safety community. Evan Hubinger of Anthropic, for instance, argues that they are “pauses done right”; others are more skeptical. Objections include that RSPs relieve regulatory pressure, shift the "burden of proof" from those building capabilities onto those concerned about safety, and amount to promissory notes rather than actual policies.

  1. ^

    Formerly known as ARC Evals.

  2. ^

    Still more companies have committed to publishing such documents.