AI safety can be a Pascal's mugging even if p(doom) is high

By Elliott Thornley (EJT) @ 2026-04-25T16:20 (+29)

People sometimes say that AI safety is a Pascal’s mugging. Other people sometimes reply that AI safety can’t be a Pascal’s mugging, because p(doom) is high. Both these people are wrong.

The second group of people are wrong because Pascal’s muggings are about the probability that you make a difference, not about baseline risk. The first group of people are wrong because the probability that you personally avert AI catastrophe isn’t that small.

Here’s a story to show that Pascal’s muggings are about the probability that you make a difference. Imagine that God will flip a coin at the end of time. If the coin lands heads, He’ll send everyone to heaven. If the coin lands tails, He’ll send everyone to hell. Everyone knows this is what will happen.

In a dark alley, a stranger approaches you and tells you that he can make God’s coin land heads, thereby ensuring that everyone goes to heaven. He says he’ll do it if you give him your wallet. You assign a very low probability to this stranger telling the truth — 1 in a bajillion — but the stranger reminds you that 10 bajillion people will have their fates determined by God’s coin.

‘Hang on,’ you say, ‘This seems a lot like a Pascal’s mugging.’

‘Au contraire,’ says the stranger, ‘It can’t be a Pascal’s mugging. The outcome I’m promising to avert — everyone going to hell — is not low probability at all. p(hell) is 50%.’

Would this reply convince you to hand over your wallet? Of course not. Even though the baseline risk of everyone going to hell is high, the probability that you make a difference — getting everyone to heaven when they otherwise would have gone to hell — is extremely low. And it’s this latter probability that determines whether your situation is a Pascal’s mugging.
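The arithmetic behind the story can be sketched as a quick expected-value comparison. (The post never pins down what a 'bajillion' is, so the 10^12 below is a placeholder of my own, purely for illustration.)

```python
# Expected-value sketch of the mugging story.
# "bajillion" is deliberately vague in the post; 10**12 is a stand-in.
BAJILLION = 10**12

p_hell_baseline = 0.5                  # God's fair coin: the baseline risk is high
p_you_make_difference = 1 / BAJILLION  # chance the stranger really controls the coin
people_affected = 10 * BAJILLION       # souls whose fate the coin decides

# The mugging diagnosis turns on p_you_make_difference, not p_hell_baseline:
# a tiny probability of a difference times huge stakes is still only about
# ten people saved in expectation, despite the 50% baseline risk.
expected_people_saved = p_you_make_difference * people_affected
```

The point of the sketch: `p_hell_baseline` can be as high as you like without changing `expected_people_saved`, because only the probability that *you* flip the outcome enters the calculation.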

So when people say that AI safety is a Pascal’s mugging, you can’t just reply that p(doom) is high. You have to argue that p(you avert doom) is high.

All that said, I think p(you — yes, you — avert doom) is high, or at least high enough. The whole doom situation is really up in the air right now, and you’re at most like 4 degrees of separation from the big players: presidents, lab CEOs, and the like. You can influence someone who influences someone who influences someone. Your chances are way higher than 1 in a bajillion.


MichaelDickens @ 2026-04-25T20:25 (+5)

Is voting a Pascal's mugging?

Is searching for a breakthrough in cancer treatment a Pascal's mugging?

NickLaing @ 2026-04-26T08:18 (+2)

I'm not sure either of those is a great analogy. I'd say neither is, for different reasons.

Democracy only works properly if most people vote. Everyone who votes plays a role in maintaining the system and the norm of all people having real political power, even if their individual vote doesn't change the result. I don't buy the argument that evaluates voting purely through the effect of the marginal vote.

As a cancer researcher you have a decent chance of making an actual breakthrough, especially if you're working at a leading company or institution. Every year there are multiple meaningful breakthroughs that actually reduce the cancer DALY burden. It's hardly like AI safety, where it's both harder to make a difference and harder to know whether you have...

Jo_🔸 @ 2026-04-25T19:04 (+5)

Agree with the post and the bottom line, though I don't think it justifies focusing on AI safety, because of a disanalogy.

In your analogy, we assume that when we give the money to the mugger, they either make the coin more likely to land heads, or do nothing.

Meanwhile, in AI safety, small chances of averting doom come with small chances of causing doom - and it seems most of those who work in the field believe that some respected interventions actually increase p(doom). They just disagree about which interventions those are.
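To make the disanalogy concrete: when an intervention can push the outcome in either direction, what matters in expectation is the signed difference between the two probabilities. The numbers below are made up purely to illustrate the structure.

```python
# Sketch of a two-sided intervention. All figures are hypothetical.
stakes = 8 * 10**9  # e.g. roughly everyone alive today

p_avert = 2e-7  # your chance of tipping the outcome toward safety (made up)
p_cause = 5e-8  # your chance of tipping it toward doom (made up)

# Positive only because p_avert > p_cause here; if your estimates of the
# two probabilities swap, the very same intervention becomes net harmful
# in expectation -- unlike the mugger's offer, which is one-sided.
net_expected_lives = (p_avert - p_cause) * stakes
```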

Elliott Thornley (EJT) @ 2026-04-26T13:08 (+2)

Yep, when going into AI safety you should take into account p(you cause doom) along with p(you avert doom).

Michael St Jules 🔸 @ 2026-04-25T18:37 (+3)

The first group of people are wrong because the probability that you personally avert AI catastrophe isn’t that small.

What do you estimate it to be, given all of the other actors in the space focused on this binary outcome?

Also, how high should the probability difference be for you to think devoting your career to it makes sense, rather than taking minimal precautions with low opportunity costs, like how we think about seatbelts and insurance against very unlikely events?

simon @ 2026-04-26T11:00 (+1)

I think this is essentially a straw man?
Everyone I know who doesn't like donating to AI safety objects precisely because they think p(influencing the outcome positively) is too low.

Elliott Thornley (EJT) @ 2026-04-26T13:05 (+2)

That's my point! p(influencing the outcome positively) is the right thing to focus on, not p(doom).