ASI existential risk: reconsidering alignment as a goal

By Matrice Jacobine @ 2025-04-15T13:36 (+26)

This is a linkpost to https://michaelnotebook.com/xriskbrief/index.html

This is the text for a talk exploring why experts disagree so strongly about whether artificial superintelligence (ASI) poses an existential risk to humanity. I review some key arguments on both sides, emphasizing that the fundamental danger isn't about whether "rogue ASI" gets out of control: it's the raw power ASI will confer, and the lower barriers to creating dangerous technologies. This point is not new, but has two underappreciated consequences. First, many people find rogue ASI implausible, and this has led them to mistakenly dismiss existential risk. Second: much work on AI alignment, while well-intentioned, speeds progress toward catastrophic capabilities, without addressing our world's potential vulnerability to dangerous technologies.


Chris Leong @ 2025-04-17T09:20 (+2)

This article is extremely well written and I really appreciated how well he supported his positions with facts.

However, the article suggests that he doesn't quite understand the argument for making alignment the priority. This is understandable, as that argument is rarely articulated clearly. The core limitation of differential technology development/d/acc/deceleration is that these kinds of imperfect defenses only buy time (a judgment that can be supported with the sources he cites in the article). An aligned ASI, if it were possible, would be capable of a degree of perfection beyond that of human institutions, which would give us a stable long-term solution. Plans that involve less powerful AIs, or a more limited degree of alignment, mostly do not offer this.

Matrice Jacobine @ 2025-04-17T15:45 (+3)

Answering on the LW thread

Sharmake @ 2025-04-17T14:43 (+1)

I agree with most of this, though I have two big disagreements with the article:

  1. I think alignment is still important and net-positive, but I've come to think it's no longer the number one priority, for the reasons you raise.

  2. I think that, with the exception of biotech and maybe nanotech, no plausible technology in the physical world can actually become a recipe for ruin, unless we are deeply wrong about how the physical laws of the universe work. So we can defer that question to AI superintelligences.

The basic reason is that once you can build Dyson swarms, the fact that space is big and the speed of light is a hard barrier makes it very easy to close off parts of a network and prevent problems from spreading. And conditional on building aligned ASI, I think Dyson swarms are likely to be built within 100 years.


And even nanotech has been argued to be far less powerful than people think; @Muireall has made the case against it here:

https://muireall.space/pdf/considerations.pdf#page=17

https://forum.effectivealtruism.org/posts/oqBJk2Ae3RBegtFfn/my-thoughts-on-nanotechnology-strategy-research-as-an-ea?commentId=WQn4nEH24oFuY7pZy

https://muireall.space/nanosystems/