Classifying sources of AI x-risk

By Sam Clarke @ 2022-08-08T18:18 (+40)

There are many potential sources of x-risk from AI, and wide disagreement/uncertainty about which are the most important. To help move towards greater clarity, it seems valuable to have a better classification of potential sources of AI x-risk. This is my quick attempt to contribute to that. I don't consider it to be fully satisfying or decisive in any way. Suggestions for improvement are very welcome!

Summary diagram

See here for a more comprehensive version of the diagram.

Misaligned power-seeking AI

This is the most discussed source of AI x-risk (e.g. it's what people remember from reading Superintelligence). The worry is that highly capable and strategic AI agents will have instrumental incentives to gain and maintain power—since this will help them pursue their objectives more effectively—and this will lead to the permanent disempowerment of humanity. (More.)

AI exacerbates other sources of x-risk

As well as causing an existential catastrophe "in itself", AI technology could exacerbate other sources of x-risk (this section), or x-risk factors (next section).

AI-enabled dystopia

The worry here is that AI technology causes humanity to get stuck in some state that is far short of our potential. There are at least three ways that this could happen:

AI leads to deployment of technology that causes extinction or unrecoverable collapse

AI could lead to the development and deployment of technologies that cause an existential catastrophe, by enabling faster technological progress or altering incentives. For instance:

AI exacerbates x-risk factors

AI makes conflict more likely/severe, which is an x-risk factor

AI could make conflict more likely or severe for various reasons, for instance by:

Conflict is a destabilising factor which reduces our ability to mitigate other potential x-risks and steer towards a flourishing future for humanity, e.g. because it erodes international trust and cooperation.

(Note: if the conflict is sufficiently severe to cause extinction or unrecoverable collapse, then it's part of the above section, not this one. This section is about conflict as a risk factor, not the final blow.)

AI degrades epistemic processes, which is an x-risk factor

AI could worsen our epistemic processes: how information is produced and distributed, and the tools and processes we use to make decisions and evaluate claims. For example:

It's likely that a degradation of epistemic processes would reduce our ability to steer towards a flourishing future, e.g. by causing a decline in trust in credible multipartisan sources, which could hamper attempts at cooperation and collective action.

S-risks from conflict between powerful AI systems

As AI systems become more capable and integral to society, we may also need to consider potential conflicts between AI systems, and especially the results of strategic threats by powerful AI systems (or AI-assisted humans) against altruistic values. For example, if it's possible to create digital people (or other digital entities with moral patienthood), then advanced AI systems, even amoral ones, could be incentivised to threaten the creation of suffering digital people as a way of furthering their own goals. (More.)


  1. Using Ord's nomenclature from The Precipice, the "lame future" scenario is an instance of a desired dystopia, while the "stable totalitarianism" and "value erosion" scenarios are instances of an enforced dystopia and undesired dystopia, respectively.


Mauricio @ 2022-08-08T23:47 (+5)

Thanks for posting! Tentative idea for tweaks: my intuition would be to modify the middle two branches into the following:

Rationale:

+1 for not including ~"misaligned, non-power-seeking AI"; that seems to be a somewhat common misinterpretation of some AI concerns.

Edit: good point in the below response!

Sam Clarke @ 2022-08-09T10:15 (+3)

Thanks, I agree with most of these suggestions.

"Other (AI-enabled) dangerous tech" feels to me like it clearly falls under "exacerbating other x-risk factors"

I was trying to stipulate that the dangerous tech was a source of x-risk in itself, not just a risk factor (admittedly the boundary is fuzzy). The wording was "AI leads to deployment of technology that causes extinction or unrecoverable collapse", and the examples (which could have been clearer) were intended to be "a pathogen kills everyone" or "full-scale nuclear war leads to unrecoverable collapse".

Zach Stein-Perlman @ 2022-08-08T18:38 (+5)

I'm a fan of typologies/trees like this.

If you liked this post, you might also be interested in: 

jskatt @ 2022-08-16T19:06 (+3)

Definitely check out the paper "X-Risk Analysis for AI Research" (from the Center for AI Safety website).