Should strong longtermists really want to minimize existential risk?

By tobycrisford 🔸 @ 2022-12-04T16:56 (+38)

Strong longtermists believe there is a non-negligible chance that the future will be enormous. For example, earth-originating life may one day fill the galaxy with digital minds. The future therefore has enormous expected value, and concern for the long term should almost always dominate near-term considerations, at least for those decisions where our goal is to maximize expected value.

It is often stated that strong longtermism reduces in practice to the goal: “minimize existential risk at all costs”. I argue here that this is inaccurate. I claim that a more accurate way of summarising the strong longtermist goal is: “minimize existential risk at all costs conditional on the future possibly being very big”. I believe the distinction between these two goals has important practical implications. The strong longtermist goal may actually conflict with the goal of minimizing existential risk unconditionally.

In the next section I describe a thought experiment to demonstrate my claim. In the following section I argue that this is likely to be relevant to the actual world we find ourselves in. In the final section I give some concluding remarks on what we should take away from all this.

The Anti-Apocalypse Machine

The Earth is about to be destroyed by a cosmic disaster. This disaster would end all life, and snuff out all of our enormous future potential.

Fortunately, physicists have almost settled on a grand unified theory of everything that they believe will help them build a machine to save us. They are 99% certain that the world is described by Theory A, which tells us we can be saved if we build Machine A. But there is a 1% chance that the correct theory is actually Theory B, in which case we need to build Machine B. We only have the time and resources to build one machine.

It appears that our best bet is to build Machine A, but there is a catch. If Theory B is true, then the expected value of our future is many orders of magnitude larger (although it is enormous under both theories). This is because Theory B leaves open the possibility that we may one day develop slightly-faster-than-light travel, while Theory A being true would make that impossible.

Due to the spread of strong longtermism, Earth's inhabitants decide that they should build Machine B, acting as if the speculative Theory B is correct, since this is what maximizes expected value. Extinction would be far worse in the Theory B world than the Theory A world, so they decide to take the action which would prevent extinction in that world. They deliberately choose a 99% chance of extinction over a 1% chance, risking all of humanity, and all of humanity's future potential.
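The arithmetic driving this choice can be sketched in a few lines. The 99%/1% credences come from the thought experiment; the specific future-value magnitudes are hypothetical placeholders, chosen only so that the value under Theory B is many orders of magnitude larger:

```python
# Toy expected-value calculation for the Anti-Apocalypse Machine.
# Probabilities are from the thought experiment; the value magnitudes
# are illustrative placeholders, not claims about the actual future.

P_THEORY_A = 0.99   # credence that Theory A is correct
P_THEORY_B = 0.01   # credence that Theory B is correct

VALUE_IF_A = 1e40   # assumed value of the future if Theory A is true
VALUE_IF_B = 1e50   # assumed value if Theory B is true (FTL travel possible)

# Each machine saves humanity only in the world whose theory it matches.
ev_build_a = P_THEORY_A * VALUE_IF_A  # survive only if Theory A is true
ev_build_b = P_THEORY_B * VALUE_IF_B  # survive only if Theory B is true

print(f"EV(build Machine A) = {ev_build_a:.2e}")
print(f"EV(build Machine B) = {ev_build_b:.2e}")
```

On these numbers, building Machine B dominates on expected value (1e48 vs roughly 1e40) even though it accepts a 99% chance of extinction; the conclusion is insensitive to the placeholder magnitudes as long as the Theory B future is more than about two orders of magnitude larger.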

The lesson here is that strong longtermism gives us the goal to minimize existential risk conditional on the future possibly being very big, and that may conflict with the goal to minimize existential risk unconditionally.

Relevance for the actual world

The implication of the above thought experiment is that strong longtermism tells us to look at the set of possible theories about the world, pick the one in which the future is largest, and, if it is large enough, act as if that theory were true. This is likely to have absurd consequences if carried to its logical conclusion, even in real world cases. I explore some examples in this section.

The picture becomes more confusing when you consider theories which permit the future to have infinite value. In Nick Beckstead's original thesis, On the Overwhelming Importance of Shaping the Far Future, he explicitly singles out infinite value cases as examples of where we should abandon expected value maximization, and switch to using a more timid decision framework instead. But even if strong longtermists are only reckless in large finite cases, that should still be enough for them to be forced to adopt extremely speculative scientific theories (using 'adopt' as a shorthand for 'act as if this theory were true').

Out of all our scientific knowledge, the 2nd law of thermodynamics is arguably one of the principles that is least likely to be proven wrong. But we can't completely rule out the possibility that counter-examples will one day be found. The 2nd law also puts strict limits on how big the future can be. I claim strong longtermists should therefore act as if the 2nd law will turn out to be false. The same goes for any other currently understood physical limit on our growth, such as the idea that information cannot travel faster than light.

This may well have practical implications for the work that strong longtermists are currently doing on existential risk. For example, perhaps it implies that they should be deeply distrustful of the scientific establishment. There are already people on the internet who claim to have built machines that violate the 2nd law of thermodynamics. The scientific establishment has so far largely ignored these amateur scientists' claims. If strong longtermists should act as if these amateur scientists are correct to reject the 2nd law, then that might mean putting less weight on the opinions of the scientific establishment, and more weight on the opinions of these amateurs.

It could be fairly objected that in a world where the 2nd law of thermodynamics is false, it is more likely to be overturned by mainstream physics than by a random YouTuber. If this is true, then perhaps the preceding claim that strong longtermists should distrust the scientific establishment goes too far. Nevertheless, strong longtermists should still act as if the 2nd law will one day turn out to be broken, or as if faster-than-light travel will one day turn out to be possible, since we can't rule these possibilities out completely, and they contain enormous expected value. I find it hard to believe that a commitment to such fundamental and unlikely beliefs would have no practical implications.

On a more practical level, conditioning on the future potentially being enormous should lead us to overestimate humanity's ability to coordinate to solve global problems, relative to the estimate we would make without this conditioning, since such coordination will surely be necessary for us to spread throughout the galaxy. This overestimate may then lead us to prioritise differently among the current existential risks we face than if we were just trying to minimize existential risk unconditionally.

Overall, I think we should expect the attempts of strong longtermists to reduce existential risk to be hindered, at least to some extent, if they are committed to adopting descriptions of the world which permit the largest possible future value, rather than descriptions of the world which are most likely to be correct.

I believe that the goals “minimize existential risk” and “minimize existential risk conditional on a possibly big future” are likely to conflict in practice, not just in principle.

Conclusion

Hopefully it is clear that this post is intended as a critique of strong longtermism, rather than as a recommendation that we should abandon the 2nd law of thermodynamics. I believe the takeaway here should be that possible futures involving enormous numbers of digital minds should feature less heavily in our prioritisation decisions than they do in the standard strong longtermist framework.


Zach Stein-Perlman @ 2022-12-04T21:01 (+8)

I of course agree that we should take into account the size of the future. I somewhat disagree with this:

the goals “minimize existential risk” and “minimize existential risk conditional on a possibly big future” are likely to conflict in practice

Do you have examples in mind? I can think of a couple related to anthropics, but their decision-relevance is unclear.

No matter what the universe is like, or whether we're in a simulation, or whatever, averting x-risk seems roughly equivalent to increasing future option value, which seems roughly equivalent to being able to make the most of the universe, whatever it's like.

tobycrisford @ 2022-12-05T18:03 (+7)

I tried to describe some possible examples in the post. Maybe strong longtermists should have less trust in scientific consensus, since they should act as if the scientific consensus is wrong on some fundamental issues (e.g. on the 2nd law of thermodynamics, faster than light travel prohibition). Although I think you could make a good argument that this goes too far.

I think the example about humanity's ability to coordinate might be more decision-relevant. If you need to act as if humanity will be able to overcome global challenges and spread through the galaxy, given the chance, then I think that is going to have relevance for the prioritisation of different existential risks. You will overestimate humanity's ability to coordinate relative to the estimate you would make without that conditioning, and that might lead you to, say, be less worried about climate change.

I agree that it makes this post much less convincing that I can't describe a clear cut example though. Possibly that's a reason to not be as worried about this issue. But to me, the fact that "allows for a strong future" should almost always dominate "probably true" as a principle for choosing between beliefs to adopt, intuitively feels like it must be decision-relevant.

Kinoshita Yoshikazu (pseudonym) @ 2022-12-05T08:20 (+2)

Not quite sure what "actual examples" we can possibly conjure up, but I suspect this is somewhat related to the issue of technology-related X-risks. 

MichaelStJules @ 2022-12-05T03:57 (+6)

Related, also with some relevant discussion in the comments: https://forum.effectivealtruism.org/posts/sEnkD8sHP6pZztFc2/fanatical-eas-should-support-very-weird-projects

tobycrisford @ 2022-12-05T18:05 (+1)

Thanks! Very related. Is there somewhere in the comments that describes precisely the same issue? If so I'll link it in the text.

MichaelStJules @ 2022-12-05T23:42 (+3)

I don't have any specific comment in mind to single out.

Geoffrey Miller @ 2022-12-04T19:33 (+6)

Toby -- interesting essay. But I'm struggling to find any rational or emotive force in your argument that 'strong longtermism tells us to look at the set of possible theories about the world, pick the one in which the future is largest, and, if it is large enough, act as if that theory were true'

The problem is that this leads to a couple of weird edge cases.

First, if we live in a 'quantum multiverse', in which there are quadrillions of time-lines branching off every microsecond into new universes, then the future is very very large indeed, but any decisions we make to influence it seem irrelevant, insofar as we'd make any possible decision in some branching time-line.

Second, the largest possible futures seem associated more with infinite religious afterlives than with scientifically plausible theories. Should 'strong longtermists' simply adopt Christian metaphysics, on the assumption that an infinite afterlife in heaven would be really cool, compared to any atheist metaphysics?

I'd welcome any thoughts about these examples. 

tobycrisford @ 2022-12-05T18:23 (+3)

Thanks for the comment! I have quite a few thoughts on that:

First, the intention of this post was to criticize strong longtermism by showing that it has some seemingly ridiculous implications. So in that sense, I completely agree that the sentence you picked out has some weird edge cases. That's exactly the claim I wanted to make! I also want to claim that you can't reject these weird edge cases without also rejecting the core logic of strong longtermism that tells us to give enormous priority to longterm considerations.

The second thing to say though is that I wanted to exclude infinite value cases from the discussion, and I think both of your examples probably come under that. The reason for this is not that infinite value cases are not also problematic for strong longtermism (they really are!) but strong longtermists have already adapted their point of view in light of this. In Nick Beckstead's thesis, he says that in infinite value cases, the usual expected utility maximization framework should not apply. That's fair enough. If I want to criticize strong longtermists, I should criticize what they actually believe, not a strawman, so I stuck to examples containing very large (but finite) value in this post.

The third and final thought I have is a specific comment on your quantum multiverse case. If we'd make any possible decision in any branch, does that really mean that none of our decisions have any relevance? This seems like a fundamentally different type of argument to the Pascal's wager-type arguments that this post relates to, in that I think this objection would apply to any decision framework, not just EV maximization. If you're going to make all the decisions anyway, why does any decision matter? But you still might make the right decision on more branches than you make the wrong decision, and so my feeling is that this objection has no more force than the objection that in a deterministic universe, none of our decisions have relevance because the outcome is pre-determined. I don't think determinism should be problematic for decision theory, so I don't think the many-worlds interpretation of quantum mechanics should be either.

Ariel G. @ 2022-12-04T18:56 (+3)

This was really well written! I appreciate the concise and to the point writing style, as well as a summary at the top.

Regarding the arguments, I think they make sense to me, although this is where the whole discussion of longtermism tends to stay pretty abstract, since we can't actually put real numbers on it.

For ex, in the spirit of your example - does working on AI safety at MIRI prevent extinction, while assuming a sufficiently great future compared to, say, working on AI capabilities at OpenAI? (That is, maybe a misaligned AI can cause a greater future?)

I don't think it's actually possible to do a real calculation in this case, and so we make the (reasonable) base assumption that a future with aligned AI is better than a future with a misaligned AI, and go from there.

Maybe I am overly biased against longtermism either way, but in this example it seems to me like the problem you mention isn't really a real-world worry, but only a theoretically possible Pascal's mugging.

Having said that, I still think it is a good argument against strong longtermism.