What are Responsible Scaling Policies (RSPs)?

By Vishakha Agrawal, Algon @ 2025-04-05T16:05 (+2)

This is a linkpost to https://www.lesswrong.com/posts/NiWxL7GJGG2fd22Jh/what-are-responsible-scaling-policies-rsps

This is an article in the featured articles series from AISafety.info, which writes introductory AI safety content. We'd appreciate any feedback.

The most up-to-date version of this article is on our website, along with 300+ other articles on AI existential safety.

METR[1] defines a responsible scaling policy (RSP) as a specification of “what level of AI capabilities an AI developer is prepared to handle safely with their current protective measures, and conditions under which it would be too dangerous to continue deploying AI systems and/or scaling up AI capabilities until protective measures improve.”

Anthropic was the first company to publish an RSP, in September 2023, defining four AI Safety Levels (ASLs) modeled loosely on biosafety levels. In abbreviated form: ASL-1 covers systems that pose no meaningful catastrophic risk; ASL-2 covers systems that show early signs of dangerous capabilities; ASL-3 covers systems that substantially increase the risk of catastrophic misuse; and ASL-4 and higher are not yet defined, being reserved for future, more capable systems.

Other AI companies[2] have released their own versions of such documents under various names, such as OpenAI's Preparedness Framework and Google DeepMind's Frontier Safety Framework.

RSPs have received both positive and negative reactions from the AI safety community. Evan Hubinger of Anthropic, for instance, argues that they are “pauses done right”; others are more skeptical. Objections include that RSPs relieve regulatory pressure, shift the "burden of proof" from those building capabilities onto those concerned about safety, and amount to promissory notes rather than actual policies.

  1. ^

    Formerly known as ARC Evals.

  2. ^

    Still more companies have committed to publishing such documents.