Summary of posts on XPT forecasts on AI risk and timelines

By Forecasting Research Institute, rosehadshar @ 2023-07-25T08:42 (+28)

In 2022, the Forecasting Research Institute (FRI) ran the Existential Risk Persuasion Tournament (XPT). Over the course of four months, 169 forecasters, including 80 superforecasters[1] and 89 experts, forecasted on various questions related to existential and catastrophic risk. Forecasters moved through a four-stage deliberative process that was designed to incentivize them not only to make accurate predictions but also to provide persuasive rationales that boosted the predictive accuracy of others’ forecasts.

Forecasters stopped updating their forecasts on 31st October 2022, and are not currently updating on an ongoing basis. FRI plans to run future iterations of the tournament, and open up the questions more broadly for other forecasters.

We're in the process of publishing a series of Forum posts on the XPT results. This post summarises all of the posts in that series on AI risk and AI timelines.

(You can see results from the tournament overall here.)

Posts in this series which relate to AI risk and timelines:

This post briefly summarizes the main findings across those posts.

A summary of the main results

AI risk

*Question details and resolution criteria are available here.

**Question details and resolution criteria are available here.

AI timelines

It’s unclear how accurate these forecasts will prove, particularly as superforecasters have not been evaluated on this timeframe before.[4]

The bio anchors model

*The most aggressive and most conservative estimates can be considered equivalent to a 90% confidence interval for the median estimate.[5]

Expected shape of AI impacts on society

  1. ^

    By superforecasters, we mean seasoned forecasters with a track record of predictive accuracy on shorter-run questions in forecasting tournaments held by the Good Judgment Project.

  2. ^

    See here for a discussion of the feasibility of long-range forecasting.

  3. ^

    Full question text (with details on criteria here):

    When will the first unified AI system meeting all of the following criteria be trained, tested, and publicly known of?

    1. Able to reliably pass a 2-hour adversarial Turing test.

    2. High competency at answering questions across diverse fields of expertise.

    3. High competency on interview-level problems in the APPS benchmark.

    4. Able to learn the classic Atari game “Montezuma’s Revenge” in the equivalent of 100 hours or less of real-time play.

  4. ^

    See here for a discussion of the feasibility of long-range forecasting.

  5. ^

    For the relevant questions in the XPT, forecasters were asked to provide their 5th, 25th, 50th, 75th, and 95th percentile forecasts. In this analysis we use the term ‘median’ to refer to analyses using the group’s median forecast for the 50th percentile of each question. We use the term ‘most aggressive’ to refer to analyses using the group medians for the 5th percentile estimate of the question relating to hardware costs, and the 95th percentile estimate for the questions relating to willingness to spend and algorithmic progress. (I.e., this uses the lowest plausible hardware costs and the highest plausible willingness to spend and algorithmic efficiency to give the highest plausible likelihood of TAI.) We use the term ‘most conservative’ to refer to analyses using the group medians for the 95th percentile estimate of the question relating to hardware costs, and the 5th percentile estimate for the questions relating to willingness to spend and algorithmic progress. (I.e., this uses the highest plausible hardware costs and the lowest plausible willingness to spend and algorithmic efficiency to give the lowest plausible likelihood of TAI.) The most aggressive and most conservative estimates can be considered equivalent to a 90% confidence interval for the median estimate. See here for context on which XPT questions map to which biological anchors inputs.
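
The scenario definitions in footnote 5 can be sketched in a few lines of Python. This is only an illustration of which percentile feeds which scenario: the key names, the `scenario_inputs` helper, and the placeholder numbers are all assumptions made for the example, and collapsing each group of questions into a single input is a simplification. None of the values are XPT forecasts.

```python
# Percentile of the group-median forecast used for each bio anchors input,
# per scenario, following the description in footnote 5.
# Key names are illustrative, not taken from the XPT materials.
SCENARIO_PERCENTILES = {
    "most_aggressive": {
        "hardware_costs": 5,          # lowest plausible hardware costs
        "willingness_to_spend": 95,   # highest plausible willingness to spend
        "algorithmic_progress": 95,   # highest plausible algorithmic efficiency
    },
    "median": {
        "hardware_costs": 50,
        "willingness_to_spend": 50,
        "algorithmic_progress": 50,
    },
    "most_conservative": {
        "hardware_costs": 95,         # highest plausible hardware costs
        "willingness_to_spend": 5,    # lowest plausible willingness to spend
        "algorithmic_progress": 5,    # lowest plausible algorithmic efficiency
    },
}


def scenario_inputs(group_medians, scenario):
    """Pick, for each input, the group-median value at the scenario's percentile."""
    return {
        name: group_medians[name][pct]
        for name, pct in SCENARIO_PERCENTILES[scenario].items()
    }


# Hypothetical placeholder values in arbitrary units (NOT XPT forecasts).
placeholder_medians = {
    "hardware_costs": {5: 1.0, 25: 2.0, 50: 3.0, 75: 4.0, 95: 5.0},
    "willingness_to_spend": {5: 10.0, 25: 20.0, 50: 30.0, 75: 40.0, 95: 50.0},
    "algorithmic_progress": {5: 100.0, 25: 200.0, 50: 300.0, 75: 400.0, 95: 500.0},
}

aggressive = scenario_inputs(placeholder_medians, "most_aggressive")
conservative = scenario_inputs(placeholder_medians, "most_conservative")
print(aggressive)
print(conservative)
```

Note that the two extreme scenarios deliberately mix low and high percentiles: the direction that makes TAI more likely is low for costs but high for spending and algorithmic progress.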


Vasco Grilo @ 2024-03-28T15:17 (+2)

Hi Rose,

The values for the standard deviation of the AI extinction risk seem too high. For example, the median and maximum AI extinction risk until the end of 2030 by superforecasters are 10^-6 and 10 % (pp. 269 and 272), and therefore the standard deviation has to be lower than 10 % (= 0.1 - 10^-6), but you report a value of 2.6 (p. 269). Maybe 2.6 is the standard deviation as a fraction of the mean (i.e. the coefficient of variation)?

Molly Hickman @ 2024-03-29T20:20 (+3)

Hi @Vasco Grilo, thanks for the question. That is the standard deviation in percentage points. The distribution is decidedly un-Gaussian, so the standard deviation is a little misleading.

We limited the y-axis range on the box-and-dot plots like the one on page 272 -- they're all truncated at the 95th percentile of tournament participants + a 5% cushion (footnote on page 18) -- so the max for Stage 1 for supers was actually 21.9%.

Here are a couple more summary stats for the superforecasters, for the 2030 question. The raw data are available here if you want to explore in more detail!

  stage count   mean    sd   min  median   max
  <int> <int>  <dbl> <dbl> <dbl>   <dbl> <dbl>
1     1    88 0.510  2.60      0 0.0001   21.9
2     2    57 0.378  1.74      0 0.0001   12  
3     3    16 0.0392 0.125     0 0.00075   0.5
4     4    69 0.180  1.20      0 0.0001   10  
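
For readers who want to see how a heavily right-skewed set of forecasts can produce a standard deviation several times the mean while the median stays tiny, here is a minimal Python sketch. The forecasts in it are synthetic and purely illustrative (they are not the XPT raw data), and the helper names are made up for the example. It also checks a general bound, Popoviciu's inequality, which says the population standard deviation can never exceed half the range.

```python
import statistics

# Synthetic, illustrative forecasts in percentage points (NOT the XPT raw data):
# most forecasters sit at a tiny value and a handful sit far out in the tail.
forecasts_pp = [0.0001] * 80 + [0.01, 0.1, 0.5, 1.0, 2.0, 5.0, 10.0, 21.9]


def summarize(values):
    """Summary statistics in the same units as the inputs (here, percentage points)."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample sd (n - 1 denominator), like R's sd()
        "min": min(values),
        "median": statistics.median(values),
        "max": max(values),
    }


def max_possible_sd(values):
    """Popoviciu's inequality: the population sd cannot exceed half the range."""
    return (max(values) - min(values)) / 2


stats = summarize(forecasts_pp)
print(stats)
print("upper bound on population sd:", max_possible_sd(forecasts_pp))
```

In a sample like this, a few tail values dominate both the mean and the standard deviation, which is why the median is the more informative summary.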

Vasco Grilo @ 2024-03-29T21:04 (+2)

Thanks for clarifying, and sharing the data, Molly!

"That is the standard deviation in percentage points."

I thought it was something else because you have "%" after the medians, but no "pp" after the standard deviations. For future occasions, you could add "pp" either after the standard deviations or in the headers.

"Here are a couple more summary stats for the superforecasters, for the 2030 question."

I am surprised to see the minimum AI extinction risk until the end of 2030 is 0 for all stages. I wonder whether the values were rounded, or whether you had discrete options for the values which could be inputted and some forecasters selected 0 as the closest value (on a linear scale) to their best guess. I think superforecasters predicting an astronomically low extinction risk would be fine, but guessing a value of exactly 0 would be a pretty bad sign, as one cannot be infinitely confident humans will not go extinct.

spreadlove5683 @ 2023-07-25T12:56 (+2)

Can someone give me the TLDR on the implications of these results in light of the fact that Samotsvety's group seemingly/perhaps had much higher odds for AI catastrophe? I didn't read the exact definitions they used for catastrophe, but:

Samotsvety's group (n=13) gave "What's your probability of misaligned AI takeover by 2100, barring pre-APS-AI catastrophe?" at 25%
(source https://forum.effectivealtruism.org/posts/EG9xDM8YRz4JN4wMN/samotsvety-s-ai-risk-forecasts)

Whereas XPT gave
"AI Catastrophic risk (>10% of humans die within 5 years)" for year 2100 at 2.13%

Without having read the exact definitions for "misaligned AI takeover" and still knowing that Samotsvety's prediction was conditional on pre-APS-AI catastrophe not happening, this still seems like a very large discrepancy. I know that Samotsvety's group was a much smaller n. n=13 vs n=88. How much weight should we give to Samotsvety's group's other predictions on AI timelines given the discrepancy in the risk prediction likelihoods?

rosehadshar @ 2023-07-25T14:29 (+4)

Good question.

There's a little bit on how to think about the XPT results in relation to other forecasts here (not much). Extrapolating from there to Samotsvety in particular:

  • Reasons to favour XPT (superforecaster) forecasts:
    • Larger sample size
    • The forecasts were incentivised (via reciprocal scoring, a bit more detail here)
    • The most accurate XPT forecasters in terms of reciprocal scoring also gave the lowest probabilities on AI risk (and reciprocal scoring accuracy may correlate with actual accuracy)
  • Speculative reasons to favour Samotsvety forecasts:
    • (Guessing) They've spent longer on average thinking about it
    • (Guessing) They have deeper technical expertise than the XPT superforecasters

I also haven't looked in detail at the respective resolution criteria, but at first glance the forecasts also seem relatively hard to compare directly. (I agree with you, though, that the discrepancy is large enough to suggest a large disagreement were the two groups to forecast the same question; I just expect that it will be hard to work out how large.)