When you plan according to your AI timelines, should you put more weight on the median future, or the median future | eventual AI alignment success? ⚖️

By Jeffrey Ladish @ 2023-01-05T01:55 (+16)

This is a question I'm puzzling over. My current answer: for decisions about AI alignment strategy, I will put more planning weight on the median futures where we survive, which makes my effective timelines longer for some planning purposes without removing the urgency.

I think that in most worlds where we manage to build aligned AGI systems, we managed to do this in large part because we bought more time to solve the alignment problem, probably via one of two mechanisms:

- human coordination efforts that slow down AI development, or
- solving alignment in a limited way and using those limited AI systems to buy more time.

I think we are likely to buy >5 years of time via one or both of these routes in >80% of worlds where we successfully build aligned AGI. 

I don't have as good an estimate of how long my AI timelines are for the median future | eventual AGI alignment success. 20 years? 60 years? I haven't thought about it enough to give a good estimate, but I think it's at least 10 years. That said, time bought for additional AI alignment work is not all equally useful. [1]
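One way to make the "median future | alignment success" notion concrete is to sample hypothetical futures and condition on survival. This is a toy sketch, not an estimate: the timeline range and the assumed link between longer timelines and alignment success are made-up numbers for illustration only.

```python
import random
import statistics

random.seed(0)

def sample_scenario():
    """Sample one hypothetical future: (years until AGI, alignment success?).
    All numbers are illustrative assumptions, not estimates from the post."""
    years = random.uniform(5, 60)
    # Toy assumption: longer timelines make alignment success more likely.
    p_success = min(0.9, years / 60)
    success = random.random() < p_success
    return years, success

scenarios = [sample_scenario() for _ in range(100_000)]

median_all = statistics.median(y for y, _ in scenarios)
median_given_success = statistics.median(y for y, ok in scenarios if ok)

print(f"median timeline (all futures):       {median_all:.1f} years")
print(f"median timeline | alignment success: {median_given_success:.1f} years")
# Conditioning on success shifts the median toward longer timelines, which
# is the post's reason for planning with longer effective timelines.
```

Under any model where success is more likely in longer-timeline worlds, the conditional median exceeds the unconditional one, which is the structural point the post is relying on.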

Implications for me:

Personal considerations

Terms and assumptions:

I'd love to hear how other people are answering this question for themselves, and any thoughts or feedback on how I'm thinking about it.

This post is also on LessWrong.

  1. ^

    I think the time bought by solving AI alignment in a limited way and using that to buy time, compared to the time obtained through human coordination efforts, is likely to make up a greater proportion of the time in the median world where we eventually solve alignment. However, I also think my own efforts matter less (though potentially still matter) in the use-AI-to-buy-time worlds. It's hard to know how to weight this, so for now I'm not distinguishing much between these two types of additional time.


Yonatan Cale @ 2023-01-05T11:51 (+3)

Using "median timeline" or "median outcome" seems like a mistake to me.

Just as "there's a 10% chance this new covid pandemic will be super lethal" does not mean "let's ignore it" (and also doesn't mean "we will die tomorrow, so nothing we do matters"), it instead means something like "there is more than one outcome to seriously prepare for."

See Scott Alexander's post, which inspired how I now think about such questions.

Jackson Wagner @ 2023-01-05T02:14 (+2)

I would assume it's most impactful to focus on the marginal futures where we survive, rather than the median? I.e., the futures where humanity barely solves alignment in time, or has a dramatic close call with an AI disaster, or almost fails to build the international agreement needed to suppress certain dangerous technologies, etc.

IMO, the marginal futures where humanity survives are the scenarios where our actions have the most impact: in futures that are totally doomed, it's worthless to try anything, and in futures that go absurdly well, our own contributions are similarly unimportant. Just as our votes are more impactful in a very close election, our actions to advance AI alignment matter most in the scenarios balanced on a knife's edge between survival and disaster.
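The knife-edge intuition can be sketched with a toy model. Suppose our efforts add a small fixed boost to a latent "preparedness" variable, and survival probability is a logistic function of that variable; both the functional form and the numbers are illustrative assumptions, not anything from the comment:

```python
import math

def p_survival(latent):
    """Toy logistic model: P(humanity survives) as a function of a latent
    'preparedness' variable. Purely illustrative numbers."""
    return 1 / (1 + math.exp(-latent))

EFFORT = 0.1  # assumed small boost to preparedness from our own efforts

# Impact of our effort = how much it moves the survival probability,
# evaluated at a range of baseline scenarios.
impacts = {x: p_survival(x + EFFORT) - p_survival(x) for x in [-4, -2, 0, 2, 4]}

for x, impact in impacts.items():
    print(f"base P(survive)={p_survival(x):.3f}  impact of effort={impact:.4f}")
# The impact is largest near P(survive) = 0.5 (the knife-edge futures) and
# tiny in futures that are nearly doomed or nearly assured.
```

This is the same structure as the close-election analogy: the derivative of the logistic is largest at the midpoint, so a fixed nudge changes the outcome probability most in the balanced scenarios.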

(I think that's the right logic for your altruistic AI safety research efforts, anyway. If you are making personal plans, like deciding whether to have children or how much to save for retirement, that's a different case with different logic.)