Expected value and uncertainty without full Monte Carlo simulations
By Vasco Grilo🔸 @ 2024-01-05T08:57 (+12)
One can often calculate the expected value and uncertainty of expressions involving distributions without running full Monte Carlo simulations. If for nothing else, the following results may be useful for Fermi estimates. In the next sections:
- The input distributions X_1, X_2, ..., and X_N are independent, as is often assumed in Monte Carlo simulations.
- E and V are the expected value and variance.
Uncertainty of the product between independent lognormal distributions
If Y = X_1 X_2 ... X_N, and r_i is the ratio between the values of 2 quantiles of X_i (e.g. r_i = "95th percentile of X_i"/"5th percentile of X_i"), I think the ratio between the 2 same quantiles of Y (e.g. r_Y = "95th percentile of Y"/"5th percentile of Y") is r_Y = e^((ln(r_1)^2 + ln(r_2)^2 + ... + ln(r_N)^2)^0.5). This follows because ln(r_i) is proportional to the standard deviation of ln(X_i), and the variances of the logarithms add under independence. For the particular case where all input distributions have the same uncertainty, r_1 = r_2 = ... = r_N = r, and therefore r_Y = r^(N^0.5). This illustrates the point that performing point estimates with pessimistic and optimistic values overestimates uncertainty:
- If the ratio between the 95th and 5th percentiles of each of 3 independent lognormal distributions is r = 100, the naive approach will suggest the product would have an uncertainty (ratio between 95th and 5th percentile) of 100^3 = 10^6.
- However, the actual uncertainty of the product will be 100^(3^0.5) = 2.91*10^3, which is only 0.291 % of the above.
The naive approach would only make sense if the input distributions were perfectly (or very highly) correlated.
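As a rough check of the numbers above, here is a minimal Python sketch (standard library only). The function name product_quantile_ratio, the sample size and the seed are illustrative assumptions, not something from the post. It compares the closed-form ratio for the product of 3 independent lognormals with a small Monte Carlo estimate and with the naive product of ratios.

```python
import math
import random
from statistics import NormalDist


def product_quantile_ratio(ratios):
    """Ratio between two quantiles of a product of independent lognormals,
    given the ratio between the same two quantiles of each factor."""
    return math.exp(math.sqrt(sum(math.log(r) ** 2 for r in ratios)))


ratios = [100, 100, 100]                   # 95th/5th percentile ratio of each input
analytic = product_quantile_ratio(ratios)  # equals 100^(3^0.5)
naive = math.prod(ratios)                  # equals 100^3

# Monte Carlo check. For a lognormal, ln(p95/p5) = 2 * z95 * sigma,
# where z95 is the 95th percentile of the standard normal.
z95 = NormalDist().inv_cdf(0.95)
sigmas = [math.log(r) / (2 * z95) for r in ratios]

rng = random.Random(0)  # fixed seed, only so the run is repeatable
samples = sorted(
    math.prod(rng.lognormvariate(0.0, s) for s in sigmas)
    for _ in range(200_000)
)
p5 = samples[int(0.05 * len(samples))]
p95 = samples[int(0.95 * len(samples))]
simulated = p95 / p5

print(f"naive: {naive:.3g}, analytic: {analytic:.3g}, simulated: {simulated:.3g}")
```

The simulated ratio should land close to the analytic one and far below the naive one, although the exact figure depends on the seed and sample size.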
Sum of independent distributions
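If Y = X_1 + X_2 + ... + X_N:
- E(Y) = E(X_1) + E(X_2) + ... + E(X_N).
- V(Y) = V(X_1) + V(X_2) + ... + V(X_N).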
Product of independent distributions
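If Y = X_1 X_2 ... X_N:
- E(Y) = E(X_1) E(X_2) ... E(X_N).
- V(Y) = (E(X_1)^2 + V(X_1)) (E(X_2)^2 + V(X_2)) ... (E(X_N)^2 + V(X_N)) - E(X_1)^2 E(X_2)^2 ... E(X_N)^2.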
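For a product of independent distributions, the mean is the product of the means, and the variance is the product of the second moments (each equal to mean^2 + variance) minus the squared product of the means. A minimal Python sketch, where the helper name product_mean_var and the input numbers are made up for illustration:

```python
import math


def product_mean_var(means, variances):
    """Mean and variance of a product of independent distributions."""
    mean = math.prod(means)
    # E(X^2) = E(X)^2 + V(X) for each factor; second moments multiply
    # under independence.
    second_moment = math.prod(m * m + v for m, v in zip(means, variances))
    return mean, second_moment - mean ** 2


# Hypothetical inputs: X_1 with mean 2 and variance 1, X_2 with mean 3 and variance 4.
mean_y, var_y = product_mean_var([2, 3], [1, 4])
print(mean_y, var_y)  # 6 29
```

Here V(Y) = (4 + 1)*(9 + 4) - 36 = 29.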
Weighted sum of independent distributions
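If Y = w_1 X_1 + w_2 X_2 + ... + w_N X_N, where w_1, w_2, ..., and w_N are constants (which often add up to 1):
- E(Y) = w_1 E(X_1) + w_2 E(X_2) + ... + w_N E(X_N).
- V(Y) = w_1^2 V(X_1) + w_2^2 V(X_2) + ... + w_N^2 V(X_N).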
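For a weighted sum of independent distributions, the mean is the weighted sum of the means, and the variance is the sum of the variances weighted by the squared weights. Setting every weight to 1 recovers the plain sum. A minimal Python sketch, with a hypothetical helper name and made-up inputs:

```python
def weighted_sum_mean_var(weights, means, variances):
    """Mean and variance of a weighted sum of independent distributions."""
    mean = sum(w * m for w, m in zip(weights, means))
    # Constants come out of the variance squared; there are no covariance
    # terms because the inputs are assumed independent.
    variance = sum(w * w * v for w, v in zip(weights, variances))
    return mean, variance


# Hypothetical inputs: equal weights on 2 distributions with means 10 and 20,
# and variances 4 and 16.
mean_y, var_y = weighted_sum_mean_var([0.5, 0.5], [10, 20], [4, 16])
print(mean_y, var_y)  # 15.0 5.0
```

Note that the variance of the equally weighted average (5) is lower than the variance of either input, which is the usual diversification effect under independence.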
Other expressions
If can be expressed as a linear function of and , one can calculate and applying the results of the 3 previous sections. For example for :
Otherwise, it is probably better to run a full Monte Carlo simulation. That being said, one can also combine the results of the 3 previous sections with estimates obtained from Monte Carlo simulations, each involving only a single variable. To do this:
- Write as a linear function of , , ..., and . For example for :
- Write as a linear function of the above, , , ..., and . For the example above:
- Generate random samples of (e.g. with Guesstimate or Squiggle), and then compute and . For the example above, , , ..., , , , ..., and .
- Determine E(Y) and V(Y) using the expressions of steps 1 and 2 with the results obtained in step 3.
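The steps above can be sketched in Python as follows. Everything specific here is an assumption made for illustration and is not the example from the post: the expression Y = X_1^0.5 * X_2^2, the lognormal inputs, the sample size and the seed. Each transformed variable is sampled on its own, and the results are then combined with the product formulas for independent distributions.

```python
import math
import random


def mean_and_variance(samples):
    """Sample mean and (unbiased) sample variance."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return mean, variance


rng = random.Random(0)  # fixed seed, only so the run is repeatable
n = 100_000

# Steps 1 and 2 (hypothetical example): Y = X_1^0.5 * X_2^2, so
# E(Y) = E(X_1^0.5) * E(X_2^2), and V(Y) follows from the product formula.

# Step 3: single-variable Monte Carlo for each transformed input.
f1 = [math.sqrt(rng.lognormvariate(0.0, 0.5)) for _ in range(n)]  # X_1^0.5
f2 = [rng.lognormvariate(0.0, 0.3) ** 2 for _ in range(n)]        # X_2^2
m1, v1 = mean_and_variance(f1)
m2, v2 = mean_and_variance(f2)

# Step 4: combine analytically, relying on the independence of X_1 and X_2.
mean_y = m1 * m2
var_y = (m1 ** 2 + v1) * (m2 ** 2 + v2) - (m1 * m2) ** 2

print(f"E(Y) ~ {mean_y:.3f}, V(Y) ~ {var_y:.3f}")
```

For these particular inputs Y is itself lognormal, so the estimates can be compared with the exact values E(Y) = e^0.21125 and V(Y) = (e^0.4225 - 1)*e^0.4225. In general no such closed form exists, which is when this hybrid approach is useful.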
NunoSempere @ 2024-01-05T14:27 (+4)
In case it's of interest, you can see some similar algebraic manipulations here: https://git.nunosempere.com/personal/squiggle.c/src/branch/master/squiggle_more.c#L165, as well as some explanations of how to get a normal from its 95% confidence interval here: https://git.nunosempere.com/personal/squiggle.c/src/branch/master/squiggle.c#L73.
Vasco Grilo @ 2024-01-05T16:37 (+4)
Thanks for sharing, Nuño! Relatedly, I wrote about how to determine distribution parameters from quantiles.
TLDR: Feel free to download or make a copy of this Sheet to calculate the parameters of uniform, normal, loguniform, lognormal, pareto and logistic distributions (including the mean and median), based on the values of 2 quantiles.