Forecasting extreme outcomes

By AidanGoth @ 2023-01-09T15:02 (+46)

This is a linkpost to https://docs.google.com/document/d/1XP0SLWAhda0Mk277R3e9YyGb6Y0ftSSCRzhC-3aQSmE/edit?usp=sharing

This document explores and develops methods for forecasting extreme outcomes, such as the maximum of a sample of n independent and identically distributed random variables. I was inspired to write this by Jaime Sevilla’s recent post with research ideas in forecasting and, in particular, his suggestion to write an accessible introduction to the Fisher–Tippett–Gnedenko Theorem.

I’m very grateful to Jaime Sevilla for proposing this idea and for providing great feedback on a draft of this document.

Summary

The Fisher–Tippett–Gnedenko Theorem is similar to a central limit theorem, but for the maximum of random variables. Whereas central limit theorems tell us what happens on average, the Fisher–Tippett–Gnedenko Theorem tells us what happens in extreme cases. This makes it especially useful in risk management, where we need to pay particular attention to worst-case outcomes. It could also be a useful tool for forecasting tail events.
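
For readers who want the shape of the result before opening the linked document, here is a sketch of the standard textbook statement. The notation ($M_n$, $a_n$, $b_n$, $\xi$) is an assumption of this sketch and may differ from the notation used in the document itself.

```latex
Let $X_1, \dots, X_n$ be iid random variables and let
$M_n = \max(X_1, \dots, X_n)$. Suppose there exist constants
$a_n > 0$ and $b_n$ such that
\[
  \lim_{n \to \infty} P\!\left( \frac{M_n - b_n}{a_n} \le x \right) = G(x)
\]
for some non-degenerate distribution function $G$. Then, up to a change
of location and scale, $G$ is a generalised extreme value distribution:
\[
  G_\xi(x) =
  \begin{cases}
    \exp\!\left( -(1 + \xi x)^{-1/\xi} \right), & \xi \neq 0,\ 1 + \xi x > 0, \\
    \exp\!\left( -e^{-x} \right), & \xi = 0.
  \end{cases}
\]
```

The sign of the shape parameter $\xi$ picks out the three classical families: Fréchet ($\xi > 0$, heavy right tail), Gumbel ($\xi = 0$) and reversed Weibull ($\xi < 0$, bounded above). Note that the theorem only says what the limit must look like if a limit exists. It does not guarantee that one does.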

This document introduces the theorem, describes the limiting probability distribution and provides a couple of examples to illustrate the use (and misuse!) of the Fisher–Tippett–Gnedenko Theorem for forecasting. In the process, I introduce a tool that computes the distribution of the maximum of n iid random variables that follow a normal distribution centrally but with an (optional) right Pareto tail.
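
To make the idea behind such a tool concrete, here is a minimal Python sketch. It is not the tool from the linked document. It relies only on the fact that the maximum of n iid draws with CDF F has CDF F(x)^n. The function names, the parameters `tail_start` and `alpha`, and the particular way the Pareto tail is spliced onto the normal body are all assumptions made for illustration.

```python
from statistics import NormalDist


def max_cdf(x, n, mu=0.0, sigma=1.0, tail_start=None, alpha=None):
    """P(max of n iid draws <= x).

    Each draw is Normal(mu, sigma) below tail_start. Above tail_start
    (which must be positive), the survival function is assumed to decay
    as a Pareto power law with exponent alpha. This splice is a
    hypothetical construction, not the linked tool's definition.
    """
    base = NormalDist(mu, sigma)
    if tail_start is not None and x > tail_start:
        p_u = base.cdf(tail_start)  # probability mass below the splice point
        f = 1.0 - (1.0 - p_u) * (x / tail_start) ** (-alpha)
    else:
        f = base.cdf(x)
    # The maximum is <= x exactly when all n independent draws are <= x.
    return f ** n


def max_quantile(p, n, mu=0.0, sigma=1.0):
    """Quantile of the maximum in the normal-only case: solve F(x)^n = p."""
    return NormalDist(mu, sigma).inv_cdf(p ** (1.0 / n))


if __name__ == "__main__":
    n = 1000
    print("median of max, normal only:", max_quantile(0.5, n))
    print("P(max <= 4), normal only:  ", max_cdf(4.0, n))
    print("P(max <= 4), Pareto tail:  ", max_cdf(4.0, n, tail_start=2.0, alpha=3.0))
```

With the Pareto tail switched on, the probability that the maximum stays below a given threshold drops, which is the sense in which the assumed tail shape drives forecasts of extremes.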

I expect the time-poor reader to get most of the value from this document by reading the informal statement of the Fisher–Tippett–Gnedenko Theorem, the overview of the generalised extreme value distribution, and the shortest and tallest people in the world example, and then maybe making a copy and playing around with the tool for forecasting the maximum of n random variables that follow normal distributions with Pareto tails (consulting this as needed).


Marcel D @ 2023-01-09T17:55 (+6)

One thing I might recommend in a document like this is to make it clear up front with concrete real examples what the use case of this theorem is. You eventually mention something about forecasting extreme height, but I was a bit confused about that and some readers may not reach that. More generally, after a quick read I am still a bit unclear why I would want to use/know this concept.

For example, you could write something like “Suppose that you are trying to forecast [real thing of interest]. Some of the relevant variables at play here are [ABC]. A naive approach might be [x] or assume [y], but actually, according to this theorem, [____].”

Stephen Clare @ 2023-01-09T15:26 (+6)

Quick note: the Google Docs link you shared has suggestion privileges, which you might not want for a public doc.

David Roodman @ 2023-01-17T01:37 (+3)

I dug waaay into this topic when investigating geomagnetic storms. I found it quite interesting and useful. See 

https://www.openphilanthropy.org/research/geomagnetic-storms-using-extreme-value-theory-to-gauge-the-risk

https://www.openphilanthropy.org/research/updating-my-risk-estimate-for-the-geomagnetic-big-one

Sanjay @ 2023-01-09T18:36 (+2)

Thank you for this. It's a useful contribution, and I upvoted it.

I'd be interested in some discussion about when we'd expect this mathematics to be materially useful, especially when compared with other hard elements of doing this sort of forecast.

Example: if I want to estimate the extent to which averting a gigatonne of greenhouse gas (GHG) emissions influences the probability of human extinction, I suspect that the Fisher-Tippett-Gnedenko theorem isn't very important (shout if you disagree). Other considerations (like: "have I considered all the roundabout/indirect ways that GHG emissions could influence the chance of human extinction?") are probably more important.