A Frontier AI Risk Management Framework: Bridging the Gap Between Current AI Practices and Established Risk Management

By simeon_c @ 2025-03-13T18:29 (+6)

This is a linkpost to https://arxiv.org/abs/2502.06656

We (SaferAI) propose a risk management framework which, if followed, we think would improve substantially upon existing Frontier Safety Frameworks. It borrows a range of practices and concepts from other areas of risk management to introduce conceptual clarity and to generalize some early intuitions that the field of AI safety arrived at independently.

To maintain readability for people in the AI field, we did not yet make the risk management framework fully adequate.[1]

To give you a taste, here are some of the risk management framework's unique features:

We summarize the risk management framework's components below:


We welcome feedback, here, by DM, or by email.

  1. ^

    One example: since the conditions of deployment and the number of possible instances of a model are significant risk factors, a fully adequate risk management framework would condition mitigations on joint (capabilities; deployment conditions) thresholds rather than on capability thresholds alone. For simplicity, and so that moving from current developers' risk frameworks remains reasonably feasible, we reserve that consideration for future updates of the framework, once the field has stepped up.