An Argument for Focusing on Making AI Go Well
By Chris Leong @ 2023-12-28T13:25 (+13)
Apologies for writing this up quickly, but otherwise it'd likely never be written at all; I've been wanting to write something like this up for at least the last year. If you think this is useful, feel completely free to copy it and write it up better.
TBH, I think the exact argument is less important than the meta-point about how to deal with uncertainty: 1) try to figure out some robustly true statements; 2) try to figure out which statements have an uncomfortably high chance of being true; 3) see what you get by combining the two.
Premise 1: AGI is possible
Likelihood: Pretty darn likely. People have said that AI would never be able to play chess, create a masterpiece, or write a symphony, and it looks like they're just wrong. The "humans are special" thesis doesn't seem to be a winning one.
Premise 2: By default, AGI has a reasonable chance of arriving in the next 30 or 40 years.
Likelihood: Seems pretty darn likely given the incredible progress in recent years. Some people worry that we might run out of data, but we haven't yet, and even when we do, there are tricks like data augmentation and synthetic data, not to mention hints that we can make progress by focusing more on data quality. Happy to give a fuller explanation in the comments.
Premise 3: The invention of AGI will be one of the most significant things to ever happen to humanity, at least on the scale of the industrial revolution
Likelihood: Almost certainly true. How could technology that can do practically anything we can, but faster, with access to all the information on the internet and the ability to learn from all its other instances in the world, not have an insanely large impact?
Premise 4: There's a reasonable chance that AGI ends up being one of the best things to ever happen to us.
Likelihood: Doubt is quite reasonable here. It may be that competitive dynamics mean there is no way we can develop AGI without it being a complete disaster. Otherwise, this mostly seems to follow from premise 3.
Premise 5: There's also a reasonable chance that the development of AGI ends up leading to unnecessary civilizational-level catastrophes (regardless of whether it ends up ultimately being for the best).
Likelihood: Again, it's quite possible to doubt that these catastrophes will be unnecessary. Maybe competitive dynamics make them inevitable?
Even putting aside control risks, there is a large number of plausible threat models: mass-hacking, biological weapons, chemical weapons, AI warfare, election manipulation, and great power conflict.
Some people have made arguments that the good guys win because they outnumber the bad guys, but how certain can we be of this? Seems quite plausible that at least one of these risks could have an exceptionally poor offense-defense balance.
Premise 6: We can make a significant difference here
Likelihood: Seems more likely than not, though again, it's quite possible that we're screwed no matter what due to competitive dynamics. Some people might argue that past attempts haven't gone so well and have even made the situation worse, but it's possible to learn from your mistakes, so I don't think we should conclude yet that we lack the ability to positively intervene.
Premise 7: If our understanding of which specific issues in AI are important changes, then most of our skills and career capital will be useful for other issues related to AI as well
Likelihood: Seems pretty high, although there's a decent argument that persuading people to switch what they're doing is pretty hard and that we're not immune to that. Things like strong technical AI knowledge, research skills, and qualifications seem generally useful. The same is true for political capital, relationships with political players, and political skill.
Therefore, focusing on ensuring that AI goes well is likely to be one of the highest-impact things we could focus on, even taking into account the uncertainty noted above.
Note that I've made a general claim about focusing on making sure AI goes well rather than a more specific claim about the x-risks/catastrophic risks that EA tends to focus on. I agree that the x-risks/catastrophic risks are the most important area to focus on, but that's a conversation for another day. Right now, I'm focusing more on things that are likely to get broad agreement.
Please note: this is an argument that there's a decent chance that the most important thing you could work on could be something related to AI, not an argument that it is likely to be net-positive to just pick a random area of AI and start working on it without taking a lot of time to think through your model of the world.
One possible counter-argument would be to claim that there are not just a few things at the same level of importance, but actually many. One approach to this would be to demonstrate that the above argument proves too much.
In any case, I think this is a useful frame for better understanding the AI x-risk/catastrophic risk position. I suspect that many people's views are driven by this often-unstated model. In particular, I suspect that arguments along these lines mean that the majority of takeover-risk folks - if persuaded that takeover risks were actually not going to be a thing - would still believe that something to do with AI would be the most important thing for them to focus on. These arguments become even stronger once you start taking personal fit into account.
yanni @ 2023-12-28T23:54 (+3)
Great post Chris, very clear. I'd like to add something of a bummer reply, to anyone reading:
Please don't work on AI Safety unless what is motivating you is the genuine desire to have a positive impact.
I think there is already a real failure mode where status-motivated people join the space because (1) of the attention it is getting among the general public, i.e. it is 'sexy', and (2) the people they respect are also in the space.
If this kind of person is put in the position of losing status for doing what they believe is good and true (e.g. Stanislav Petrov), then I don't trust them to make the right decisions.
Maybe I'll write a post about this...