#200 – What superforecasters and experts think about existential risks (Ezra Karger on The 80,000 Hours Podcast)

By 80000_Hours @ 2024-09-06T17:53 (+12)

We just published an interview: Ezra Karger on what superforecasters and experts think about existential risks. Listen on Spotify, watch on YouTube, or click through for other audio options, the transcript, and related links. Below are the episode summary and some key excerpts.

Episode summary

It’s very hard to find examples where people say, “I’m starting from this point. I’m starting from this belief.” So we wanted to make that very legible to people. We wanted to say, “Experts think this; accurate forecasters think this.” They might both be wrong, but we can at least start from here and figure out where we’re coming into a discussion and say, “I am much less concerned than the people in this report; or I am much more concerned, and I think people in this report were missing major things.”

But if you don’t have a reference set of probabilities, I think it becomes much harder to talk about disagreement in policy debates in a space that’s so complicated like this.

— Ezra Karger

In today’s episode, host Luisa Rodriguez speaks to Ezra Karger — research director at the Forecasting Research Institute — about FRI’s recent Existential Risk Persuasion Tournament, which aimed to come up with estimates of a range of catastrophic risks.


Producer: Keiran Harris
Audio engineering: Dominic Armstrong, Ben Cordell, Milo McGuire, and Simon Monsour
Content editing: Luisa Rodriguez, Katy Moore, and Keiran Harris
Transcriptions: Katy Moore

Highlights

Why we need forecasts about existential risks

Ezra Karger: I don’t think forecasting is the only approach you should use to address these problems. And I want to make sure people don’t think that, just because we’re talking about forecasts, forecasts are the be-all and end-all for solving problems that are really hard and complex.

But what I would say is that when decision-makers or normal people are thinking about complex topics, they are implicitly making and relying on forecasts from themselves and forecasts from others who are affecting their decisions. So if you look at recent discussions about existential risk, congressional hearings on risks from artificial intelligence, or workshops on artificial intelligence and biorisk, in many of these recent debates about policies, there’s an implicit assumption that these risks are nonzero, that these risks are big enough to matter for policy discussions.

But it’s very hard to find examples where people say, “I’m starting from this point. I’m starting from this belief.” So we wanted to make that very legible to people. We wanted to say, “Experts think this; accurate forecasters think this.” They might both be wrong, but we can at least start from here and figure out where we’re coming into a discussion and say, “I am much less concerned than the people in this report; or I am much more concerned, and I think people in this report were missing major things.” But if you don’t have a reference set of probabilities, I think it becomes much harder to talk about disagreement in policy debates in a space that’s so complicated like this.

And let me just make a quick analogy to inflation. Governments and researchers at the Federal Reserve carefully track expectations about inflation. We have multi-decade surveys, like the Survey of Professional Forecasters, where we regularly ask forecasters for their beliefs about what inflation will be, what GDP growth will be, what unemployment will be. And that’s been happening since the 1960s.

So if we’re going to continue to have discussions about existential risks, it seems useful to have forecasts that we in the future will track over time that tell us how people’s beliefs about risks are changing, and how people’s expectations about what policies might work well or poorly in this space are changing. And that’s the type of research we hope to do and build on in this report.

Headline estimates of existential and catastrophic risks

Ezra Karger: So when we thought about existential catastrophe, we split it into two types of existential catastrophe. We asked a set of questions about “extinction risk”: the likelihood that these domains would be responsible for human extinction at various dates; and then we also asked about what we called “catastrophic risk”: the likelihood that each of these risks would lead to the death of at least 10% of the world’s population within a five-year period. And we asked about these numbers over many time horizons. But let me focus on the numbers by 2100, which was the last date we asked about.

Focusing on total extinction risk, this is what people in this project said was the risk of human extinction by 2100 from any source. Domain experts — this is averaged across all of the experts in the project — said there was a 6% chance of human extinction by 2100. Superforecasters said there was a 1% chance of human extinction by 2100. So we can already see that there are major differences in beliefs about extinction risk.

Now, maybe we should pause there for a second and say these numbers seem very big, right? That is a large probability to put on an extinction risk event happening in the next 80 years.

So I do want to say, and maybe we can come back to this later, that we don’t know how to elicit forecasts in low-probability domains. It’s possible that these numbers are high or low, relative to the truth, but we think it’s very important to document what these numbers are and how they compare to each other.

Luisa Rodriguez: OK, sure. So with that caveat in mind, maybe these numbers are inflated because we’re talking about very hard-to-think-about things — like the probability of human extinction. But still, it’s a group of over 100 people who have thought some about these risks, and superforecasters put it at 1% and experts put it at 6% — so 6% chance that by 2100, humanity has gone extinct. How does that compare to other preexisting estimates of human extinction risks?

Ezra Karger: If we look at the academic literature, there have been some attempts to elicit forecasts about extinction risk. What we see is that for experts, this is roughly in line with what we’ve seen in previous work. No one has looked at what superforecasters thought, so we don’t have a good comparison. But superforecasters are on the lower end of forecasts that have been discussed in the academic literature before. And again, this could be because the superforecasters maybe don’t know enough about this topic, or it could be because experts are biased and maybe think that the risks are higher than they actually are.

Luisa Rodriguez: Yeah. OK, so that’s extinction risks. What were the forecasts for catastrophic risks? Which, again, means at least 10% of the population dying in a short period of time, by 2100.

Ezra Karger: So domain experts thought there was a 20% chance of this catastrophic event happening — of at least 10% of the world’s population dying within a short period by 2100. And superforecasters thought there was a 9% chance of that happening.

Those are large numbers. They’re larger than extinction risk, which makes sense. And they’re also maybe more similar: if you look at extinction risk, you see that experts were six times as concerned about extinction risk as superforecasters. Here, we see that experts are maybe twice as concerned as superforecasters.
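To make the comparison concrete, here is a quick arithmetic check of the ratios quoted above, using only the headline figures from the interview (a minimal Python sketch, included for readers who want to reproduce the comparison):

```python
# Headline XPT estimates by 2100, as quoted in the interview.
extinction = {"experts": 0.06, "superforecasters": 0.01}
catastrophe = {"experts": 0.20, "superforecasters": 0.09}  # >=10% of world population dies

# Ratio of expert to superforecaster concern.
print(round(extinction["experts"] / extinction["superforecasters"], 1))    # 6.0 -> "six times as concerned"
print(round(catastrophe["experts"] / catastrophe["superforecasters"], 1))  # 2.2 -> "maybe twice as concerned"
```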

What explains disagreements about AI risks?

Ezra Karger: In the XPT, we saw these major differences in belief about AI extinction risk by 2100: I think it was 6% for AI experts and 1% for superforecasters. Here we’ve accentuated that disagreement: we’ve brought together two groups of people, 22 people in total, where the concerned people are at 25% and the sceptical people are at 0.1%. So that’s a 250 times difference in beliefs about risk.

Luisa Rodriguez: Yeah. So really wildly different views. Then I think that you had four overarching hypotheses for why these two groups had such different views on AI risks. Can you talk me through each of them?

Ezra Karger: Definitely. We developed these hypotheses partially as a result of the X-risk Persuasion Tournament. The four hypotheses were the following.

The first was that disagreements about AI risk persist because there’s a lack of engagement among participants. So, we have low-quality participants in these tournaments; the groups don’t really understand each other’s arguments; the whole thing was just kind of blah.

The second hypothesis was that disagreements about AI risk are explained by different short-term expectations about what will happen in the world. So if hypothesis two is right, then we can hopefully find really good cruxes for why these groups disagree, and really good cruxes that will cause each group to update…

The third hypothesis was that disagreements about AI risk are not explained necessarily by these short-run disagreements, but there are different longer-run expectations. This may be more of a pessimistic hypothesis when it comes to understanding long-run risk, because it might say that we won’t actually know who is right, because in the short run, we can’t really resolve who’s correct, and no one’s going to update that much…

And then the last hypothesis, the fourth hypothesis, was that these groups just have fundamental worldview disagreements that go beyond the discussions about AI. And this gets back to maybe a result from the XPT, where we saw that beliefs about risk were correlated. You might think that this is just because of some underlying differences of belief about how fragile or resilient the world is. It’s not AI-specific; it’s not about beliefs about AI capabilities; it’s not about risks for misalignment — it’s about a belief that, like, regulatory responses are generally good or bad at what they’re doing…

Luisa Rodriguez: OK, so which of those hypotheses ended up seeming right?

Ezra Karger: So I think hypotheses one and two did not turn out to be right, and I think hypotheses three and four have significant evidence behind them. So I can maybe go through the evidence. That may be less exciting, because it would be great if hypothesis one or two had been right. But I was really excited to be able to differentiate these hypotheses, and figure out which ones had more evidence behind them.

Learning more doesn't resolve disagreements about AI risks

Ezra Karger: So, to talk about hypothesis one for a second: this was the idea that these disagreements about risk persisted because there wasn’t that much engagement among participants, or people didn’t disagree well. I think we can reject this hypothesis, but readers may disagree. This is very much a determination you should make after seeing how the disagreements went in our long descriptions of the arguments that people had. I think participants spent a lot of time understanding each other’s arguments, and people largely understood each other’s arguments, and engagement was pretty high quality.

There’s a criticism that was levelled at the XPT in a very interesting way, which is that these people aren’t engaging in a high-quality way. And you could just bring that criticism to this project as well, and say that people who were concerned or not concerned about AI risk weren’t really engaging in a way that was useful.

I think that criticism always applies to research projects like this, but I want to know what the limiting factor is. People in this project spent maybe 50 to 100 hours thinking about these topics. Is it the case that you think if they had spent 1,000 hours, they would have agreed? I don’t think there’s any evidence of that. I think they were really understanding each other’s arguments by the end of this project, and we saw very little convergence.

Luisa Rodriguez: Interesting. OK, so you saw very little convergence in that these two groups didn’t move that much toward each other at the end, which suggests that it’s not that they weren’t engaging. What was the evidence against hypothesis two?

Ezra Karger: Hypothesis two was the one I was saddest not to find strong evidence for. This was: can we find short-term disagreements or short-term differences in expectations that explain these long-run disagreements about AI? Much of this project involved giving these forecasters short-run forecasts to do and asking them to tell us how they would update if those short-term cruxes resolved positively or negatively.

And what we saw is that of the maybe 25-percentage-point gap in those initial beliefs, only about one percentage point of that was closed in expectation by the best of our short-term cruxes.

Luisa Rodriguez: Wow.

Ezra Karger: So what that means is, even if the sceptics and the concerned people had the best evidence from a specific question that they expected to have by 2030, they wouldn’t change their minds that much, and they wouldn’t converge that much.
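To make the expected-convergence arithmetic concrete, here is a minimal Python sketch of the kind of calculation being described. The 25%-versus-0.1% starting beliefs are from the interview; the crux probability and conditional forecasts are illustrative placeholders (chosen so the expected narrowing comes out near the one percentage point Ezra mentions), not figures from FRI’s report.

```python
# Toy illustration of the "expected convergence" calculation described above.
# The 25% vs 0.1% starting beliefs are from the interview; everything else
# (conditional forecasts, the crux probability) is a made-up placeholder.

def expected_gap(p_crux, concerned_if_yes, concerned_if_no,
                 sceptic_if_yes, sceptic_if_no):
    """Expected belief gap after the crux resolves, averaging over outcomes."""
    gap_yes = concerned_if_yes - sceptic_if_yes
    gap_no = concerned_if_no - sceptic_if_no
    return p_crux * gap_yes + (1 - p_crux) * gap_no

gap_now = 0.25 - 0.001  # roughly 25 percentage points

# Hypothetical conditional forecasts for one short-term crux resolving by 2030.
gap_2030 = expected_gap(
    p_crux=0.5,
    concerned_if_yes=0.27, concerned_if_no=0.23,
    sceptic_if_yes=0.02, sceptic_if_no=0.001,
)

print(f"Gap today: {gap_now:.3f}")
print(f"Expected gap after crux resolves: {gap_2030:.3f}")
print(f"Expected convergence: {gap_now - gap_2030:.3f}")  # roughly 0.01, about 1 point
```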

A lot of disagreement about AI risks is about when AI will pose risks

Ezra Karger: This maybe gets at a source of agreement that I didn’t expect: both the sceptics and the concerned people believe that “powerful AI systems” — and we define this as “AI systems that exceed the cognitive performance of humans in at least 95% of economically relevant domains,” so this is a big change — will be developed by 2100. The sceptics thought there was a 90% chance this would occur, and the concerned group thought there was an 88% chance this would occur.

Now, that’s a lot of agreement for people who disagree so much about risk. And I think there are a few things going on there. First is that we tried to define these questions really carefully, but what does it mean for AI systems to “exceed the cognitive performance of humans in greater than 95% of economically relevant domains”? We can both agree that this is a big deal if it happens, but it’s possible that the sceptics and the concerned people disagree about the extent to which that means that AI systems have really accelerated in ability.

One other place where the AI risk sceptics and the AI risk concerned groups really seem to agree is in what would happen with AI risk over the next 1,000 years. We defined a cluster of bad outcomes related to AI, and this included AI-caused extinction of humanity. It also included cases where an AI system, either through misuse or misalignment, caused a 50% or greater drop in human population and a large drop in human wellbeing.

What we found is that the AI risk concerned group thought there was a 40% chance that something from this cluster of bad outcomes would occur in the next 1,000 years, but the AI risk sceptics thought there was a 30% chance that something from this cluster of bad outcomes would occur in the next 1,000 years.

So if we connect that to the forecasts we’ve been talking about throughout this conversation, about what will happen with AI risk by 2100, what we’ll see is that both groups are concerned about AI risk, but they have strong disagreements about the timing of that concern. People who are concerned in the short run remain concerned about the long run and get more concerned about the long run if you accumulate those probabilities. But the people who are sceptical about AI risk in the short run are still concerned if you look at a broader set of bad outcomes over a longer time horizon.

Luisa Rodriguez: That does feel really, really huge. Because it feels to me like often when I either talk to people or hear people talk about why they’re not that worried about AI risk, it sounds to me like they sometimes have beliefs like, “We will do AI safety properly,” or, “We’ll come up with the right governance structure for AI that means that people won’t be able to misuse it.”

But this just sounds like actually, that’s not the main thing going on for even the sceptical group. It sounds like the main thing is like, “No, we’re not confident things will go well; we just think it’ll take longer for them to potentially go badly” — which does actually feel really action-relevant. It feels like it would point to taking lots of the same precautions, thinking really hard about safety and misuse. Maybe one group doesn’t feel like it’s as urgent as the other, but both think that the risks are just very genuine. So that’s really cool. Also, terrible news. I would just so prefer that the AI sceptics believed that AI poses no risk, and were correct.

Cruxes about AI risks

Ezra Karger: The two other cruxes that stood out for the concerned group were, first, whether there would be a major powers war: by 2030, would at least two major superpowers officially declare war and go to war for at least one year? This maybe gets at beliefs about the instability of the world system. So whether or not that happens, it would cause the concerned group to update dramatically on AI risk. This may reflect the fact that if major powers declare war on each other, the concerned people think this will accelerate investment in AI systems and will cause increases in risk from a variety of AI-related sources.

Luisa Rodriguez: Cool. So it’s like if people had been making predictions about nuclear war, they might have put them lower until World War II started, and then they might have all increased them because they were like, now we’re going to invest a bunch in this technology.

Ezra Karger: Exactly. Or another thing you could be worried about is — and there have been some recent policy reports on this — if AI increases biorisk, then investment in AI systems might increase biorisk. And if you think that a large-scale war will lead to a Manhattan Project–style effort by major powers to improve AI systems, and that then causes increases in AI-related biorisk, then that might cause you to update on risk overall.

Luisa Rodriguez: Got it.

Ezra Karger: The last crux that I want to mention for the concerned group was this question about whether an independent body like METR, which was previously called ARC Evals, would conclude that state-of-the-art AI models have the ability to autonomously replicate, acquire resources, and evade deactivation. This is the kind of scenario the AI risk concerned people are very worried about, so if this happens, or if it doesn’t happen, it will cause relatively large updates for the concerned group.

Luisa Rodriguez: Makes sense.

Ezra Karger: I also want to mention that this was what would cause the concerned group to update the most. It was also, interestingly, something that, if it happens, would cause the sceptical group to become more concerned about AI risk. Now, the sceptical group doesn’t think this is likely to happen. They gave something like a 1% chance that this happens. But if it does happen, their estimate of the risk goes from 0.1% up to one percentage point.

Luisa Rodriguez: So that is actually a thing that for both of them would make them much more worried — which is interesting, because it sounds like that means they kind of agree on one of the really scary mechanisms by which AI could end up causing really bad outcomes. A big component of it is that the sceptics just think that’s very, very unlikely, and so it would move them some.

Ezra Karger: Exactly. So it didn’t have what we call “high value of information” for the sceptics, because they thought it was so unlikely to occur, and so they don’t expect to update on it by 2030 because they don’t think it will happen. But if it does happen, they will update a lot. And I thought that was fascinating.
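A back-of-the-envelope way to see why this crux carries so little expected information for the sceptics, despite the large conditional jump, is to multiply how likely they think it is by how much they would move if it happened. Below is a minimal sketch using the numbers quoted above; the exact value-of-information metric FRI used may differ.

```python
# Why the autonomous-replication crux has low expected value of information for the
# sceptics, even though the conditional update is large. Numbers are the ones cited
# in the interview; this is the expected-movement intuition, not necessarily FRI's metric.

p_crux = 0.01        # sceptics' probability that the METR-style result occurs by 2030
risk_if_yes = 0.01   # sceptics' AI risk estimate if it does occur (about 1 percentage point)
risk_if_no = 0.001   # sceptics' estimate otherwise (their current ~0.1%)

expected_shift = p_crux * (risk_if_yes - risk_if_no)
print(f"Expected movement for the sceptics: {expected_shift:.5f}")  # 0.00009, i.e. under 0.01 points
```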

Is forecasting actually useful in the real world?

Luisa Rodriguez: Zooming out a bit on the kind of broad usefulness of forecasting, I feel like I’ve gotten the sense that at least some people kind of think forecasting isn’t actually that valuable in the real world. I have this sense that there was a lot of excitement about Phil Tetlock’s books, and then people were like, it’s actually not that practical to use forecasting. It’s like a fun game, but not useful in the real world. First, have you heard that argument? Second, do you think there’s any truth to that critique?

Ezra Karger: Yeah, I think I partially agree and partially disagree with that critique. So, first of all, I’ll say government agencies are using forecasts all the time, and people are using forecasts all the time. So I think this idea that forecasts themselves aren’t being used or aren’t being used well, I don’t think that’s right. If we look at improvements in weather forecasting, I think that’s just clearly saved lives in the past few years, relative to 100 or 200 years ago, when you saw these really costly natural disasters because people didn’t know when hurricanes were coming, for example.

Now, what we may be talking about more here is these subjective forecasts from random people. Like should we be using forecasts that people online have given about geopolitical events, or should we be using forecasts that people, or even experts on a topic, have given about events? And I do think there’s less evidence that those are useful yet.

What I would say is, Phil’s work in the Good Judgment Project, in these government-funded forecasting tournaments where we tried to understand how crowdsourced forecasts could improve accuracy relative to experts, showed that normal people could come up with forecasts that were as accurate or maybe more accurate than experts in some domains.

But they didn’t look at things like quality of explanation, for example. So if you’re a policymaker trying to make a decision, it’s very hard for you to say, “I’m going to rely on this black box number that came out of this group of people who we recruited online.” It’s much easier to say, “I have some analysts who think that these are the important mechanisms underlying a key decision I’m making.” And relying on that to make a decision I think feels more legible to people who are actually making decisions.

So I would partially agree and partially disagree with the criticism in your question. I think that government agencies are using forecasting. I’m involved in producing this short-run index of retail sales, where we just track retail sales, try to forecast how the economy is doing, and that gets used in our discussions at the Federal Reserve Bank of Chicago about how the economy is going. So that’s an example of a forecast being useful because we can very clearly state how the forecast is constructed using a model based on underlying data that we understand.

When you’re talking about these forecasts that are coming from people who aren’t also explaining their reasoning in very coherent ways or aren’t necessarily being incentivised to write detailed explanations that show that they have knowledge about a specific topic, I think we haven’t yet seen those forecasts being used.

Maybe one last point on this: after Phil’s work and other people’s work on these crowdsourced forecasts, there were attempts within the intelligence agencies in the US — and this has been documented publicly — to use forecasts, to try to use systems like the ones that Phil and others worked on. There’s this great paper by Michael Horowitz and coauthors arguing that the US intelligence community didn’t incorporate these prediction markets or these forecasts into their internal reporting, even though this research shows that those systems generated accurate predictions.

And the reasons were partially related to bureaucracy, partially related to incentives. So people didn’t really have incentives to participate to provide forecasts. If you provide a bad forecast, then maybe you look bad. If you provide a good forecast, maybe no one remembers. And also, the decision-makers were really trying to dig into underlying explanations and rationales, and they weren’t really ready to just take a number and run. And that might be a good thing, but I think that explains why some of these methods haven’t taken off in certain policy domains yet.


Reis @ 2024-10-18T05:26 (+1)

Hi, nice topic! Why haven't you talked about the Metaculus website (https://www.metaculus.com/questions/?has_group=false&topic=climate&order_by=-activity)? It seems very relevant to the forecasting landscape.