Statistical foundations for worldview diversification
By Karthik Tadepalli @ 2024-08-23T11:55 (+48)
Note: this has been in my drafts for a long time, and I just decided to let it go without getting too hung up on details, so this is much rougher than it should be.
Summary:
- Worldview diversification seems hard to justify philosophically, because it results in lower expected value than going all in on the single worldview with the highest expected value.
- I show that you can justify worldview diversification as the solution to a decision problem under uncertainty.
- The first way is to interpret worldview diversification as a minimax strategy, in which you maximize the worst-case utility of your allocation.
- The second way is as an approximate solution to the problem of maximizing expected utility for a risk-averse decision maker.
Overview
Alexander Berger: ...the central idea of worldview diversification is that the internal logic of a lot of these causes might be really compelling and a little bit totalizing, and you might want to step back and say, “Okay, I’m not ready to go all in on that internal logic.” So one example would be just comparing farm animal welfare to human causes within the remit of global health and wellbeing. One perspective on farm animal welfare would say, “Okay, we’re going to get chickens out of cages. I’m not a speciesist and I think that a chicken-day suffering in the cage is somehow very similar to a human-day suffering in a cage, and I should care similarly about these things." I think another perspective would say, “I would trade an infinite number of chicken-days for any human experience. I don’t care at all.” If you just try to put probabilities on those views and multiply them together, you end up with this really chaotic process where you’re likely to either be 100% focused on chickens or 0% focused on chickens. Our view is that that seems misguided. It does seem like animals could suffer. It seems like there’s a lot at stake here morally, and that there’s a lot of cost-effective opportunities that we have to improve the world this way. But we don’t think that the correct answer is to either go 100% all in where we only work on farm animal welfare, or to say, “Well, I’m not ready to go all in, so I’m going to go to zero and not do anything on farm animal welfare.”
...
Rob Wiblin: Yeah. It feels so intuitively clear that when you’re to some degree picking these numbers out of a hat, you should never go 100% or 0% based on stuff that’s basically just guesswork. I guess, the challenge here seems to have been trying to make that philosophically rigorous, and it does seem like coming up with a truly philosophically grounded justification for that has proved quite hard. But nonetheless, we’ve decided to go with something that’s a bit more cluster thinking, a bit more embracing common sense and refusing to do something that obviously seems mad.
Alexander Berger: And I think part of the perspective is to say look, I just trust philosophy a little bit less. So the fact that something might not be philosophically rigorous... I’m just not ready to accept that as a devastating argument against it.
This note explains how you might arrive at worldview diversification from a formal framework. I don't claim it is the only way you might arrive at it, and I don't claim that it captures everyone's intuitions for why worldview diversification is a good idea. It only captures my intuitions, and formalizes them in a way that might be helpful for others.
Suppose a decisionmaker wants to allocate money across different cause areas. But the marginal social value of money to each cause area is unknown, or known only with error (e.g. because of uncertain moral weights or forecasts about the future), so they don't actually know how to maximize social value ex ante. What should they do?
I formally show that the answer depends on what you are trying to optimize. In particular, two reasonable criteria will lead to two different prescriptions:
- Bayes: maximizing expected value given a prior belief about the unknown parameters. A Bayesian decisionmaker would allocate all their money towards the cause area which is the best according to their ex-ante guess.
- Minimax: maximizing the worst-case value if we were extremely wrong about the unknown parameters. A minimax decisionmaker would allocate money across cause areas in proportion to each cause's best-case value. This resembles worldview diversification in that it allocates money across multiple cause areas.
So the conclusion of this argument is that if you believe in the minimax criterion for decisionmaking, you would arrive at worldview diversification as the right criterion.
Minimax decisionmaking may be unintuitive or unappealing to you on a philosophical level. So as a second approach, I also model the problem from the perspective of a risk-averse decisionmaker who maximizes expected utility from their allocation. I simulate this problem to compute the allocation that maximizes expected utility, and then I compare the minimax and Bayes allocations on how well they approximate this optimal allocation. In this view, either minimax or Bayes may be justified based on how well it approximates expected utility maximization, rather than on its own philosophical merits.
I find that minimax is more "robust" than Bayes, in the sense that it has a lower mean squared deviation from the optimal allocation. However, the most robust rule is actually a heuristic allocation, in which we simply allocate money across causes in proportion to our best guess about each cause's value. This offers another, more intuitive foundation for worldview diversification: allocating money in proportion to our best guess is the decision rule with the lowest MSE among those I compare.
Statistical decision theory
This section goes through the technical details of how I established the result that the Bayesian decision is to go all in on one cause, while the minimax decision is to split money across cause areas in proportion to their best-case values.
Who should read this? This section is primarily for skeptical readers who think that I am slipping assumptions by with fuzzy words, and that those assumptions will be laid bare when pinned to the wall as equations. It will not provide much insight if the intuitive explanation above satisfied you, or if you don't have any technical background (though I try to keep it straightforward).
Suppose a decisionmaker has a budget (normalized to 1) to allocate across two cause areas $A$ and $B$. Suppose the marginal social value of money to each cause is given by random variables $\theta_A, \theta_B$. Their problem can be expressed as how to allocate money in order to
$$\max_{x \in [0,1]} \; \theta_A x + \theta_B (1 - x)$$
A few things are being represented in this formalism:
- I'm assuming that the marginal social value of money is constant. If there were diminishing marginal value within each cause area, then even a Bayes decisionmaker would fund multiple cause areas, depending on how steeply that value declines. But while it is true that diminishing marginal value is a part of why funding multiple cause areas is good, I don't think we should sidestep the philosophical aspect of this debate. Personally, diminishing marginal value feels like a cop out, a reason to avoid engaging with the harder question of whether it is morally consistent to allocate money across multiple cause areas. I remove diminishing marginal value of money to let this philosophical question take center stage.
- I'm placing absolutely no restrictions on how the social value of money to each cause depends on other unknowns. You could imagine that maybe the key unknown is a forecast about the future $f$, but the marginal social value is some complicated and nonlinear transformation of that forecast, $g(f)$. That's fine. You can then define $\theta_A = g(f)$, and $\theta_A$ is now the random variable we focus on. Don't let the linearity of this expression in $\theta_A, \theta_B$ fool you into thinking we are making strong assumptions about the unknown parameters.
- I'm assuming there are only two causes to allocate money. Absolutely nothing would change if there were many causes, but the math would be mind-numbing, so I don't do it.
- Money to cause $A$ is defined as $x$; money to cause $B$ is implicitly the leftover budget, $1 - x$. It will be convenient to only have one decision variable running around rather than two.
This objective function is linear in money, so for known $\theta_A, \theta_B$, the only way to maximize it is at the extremes. If $\theta_A > \theta_B$, we should allocate all money to $A$, i.e. set $x = 1$. Vice versa if $\theta_B > \theta_A$, set $x = 0$. This gives us the optimal allocation for known $\theta$,
$$x^*(\theta) = \mathbf{1}\{\theta_A > \theta_B\}$$
But when $\theta_A, \theta_B$ are unknown, we can specify multiple approaches to this problem. If we label the objective above as $U(x; \theta) = \theta_A x + \theta_B (1-x)$, then we can define the risk from any allocation $x$ as
$$R(x, \theta) = U(x^*(\theta); \theta) - U(x; \theta)$$
This is the statistical decision theory approach to solving problems of decisionmaking under uncertainty. In Appendix A, I derive a concrete expression for risk in this setting:
$$R(x, \theta) = (\theta_A - \theta_B)(1-x)\,\mathbf{1}\{\theta_A > \theta_B\} + (\theta_B - \theta_A)\,x\,\mathbf{1}\{\theta_B > \theta_A\}$$
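To make this concrete, here is a minimal sketch in Python (which I'll use for code throughout; the numbers are made up purely for illustration):

```python
def utility(x, theta_a, theta_b):
    """Social value of giving a fraction x of the budget to cause A and 1 - x to cause B."""
    return theta_a * x + theta_b * (1 - x)

def risk(x, theta_a, theta_b):
    """Shortfall relative to the best possible allocation for known (theta_a, theta_b)."""
    return max(theta_a, theta_b) - utility(x, theta_a, theta_b)

# Made-up example: A is worth 3 per dollar, B is worth 1 per dollar.
# A 50-50 split forgoes half of the (3 - 1) gap.
print(risk(0.5, theta_a=3.0, theta_b=1.0))  # 1.0 = (3 - 1) * (1 - 0.5)
```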
Obviously, we would like to minimize risk explicitly, but this is infeasible when $\theta$ is unknown. In statistical decision theory, there are two main feasible risk criteria that you could try to minimize:[1]
- Bayes risk: $r_{\pi}(x) = \mathbb{E}_{\theta \sim \pi}[R(x, \theta)]$, i.e. we try to minimize expected risk given some prior distribution $\pi$ over possible values of $\theta$.
- Minimax risk: $\bar{r}(x) = \max_{\theta} R(x, \theta)$, i.e. we try to minimize the maximum risk we could suffer, across all values of $\theta$.
Let's look at what each approach implies.
Bayes risk minimization
Bayes risk is
$$r(x) = \mathbb{E}\big[(\theta_A - \theta_B)(1-x)\,\mathbf{1}\{\theta_A > \theta_B\}\big] + \mathbb{E}\big[(\theta_B - \theta_A)\,x\,\mathbf{1}\{\theta_B > \theta_A\}\big]$$
Take each part of this expression separately.
$$\mathbb{E}\big[(\theta_A - \theta_B)\,\mathbf{1}\{\theta_A > \theta_B\}\big] = P(\theta_A > \theta_B)\,\mathbb{E}\big[\theta_A - \theta_B \mid \theta_A > \theta_B\big]$$
where I use the law of iterated expectations to separate out the two terms inside the expectations, even though they are not independent.[2] This leaves the coefficient on each cause area as
$$V_A = P(\theta_A > \theta_B)\,\mathbb{E}[\theta_A - \theta_B \mid \theta_A > \theta_B], \qquad V_B = P(\theta_B > \theta_A)\,\mathbb{E}[\theta_B - \theta_A \mid \theta_B > \theta_A]$$
Thus, according to a Bayes risk minimizer, the marginal social value of a cause is the probability that it is better ($P(\theta_A > \theta_B)$ for cause $A$), weighted by how much better it is in the worlds where it is better ($\mathbb{E}[\theta_A - \theta_B \mid \theta_A > \theta_B]$). This gives us
$$r(x) = V_A\,(1-x) + V_B\,x$$
where I've dropped the prior $\pi$ from the expectations to reduce the notational clutter, and because the choice of $\pi$ doesn't matter for the conclusions.
The key feature of this expression is that it is still linear in money $x$. This makes sense; all we did is take the expectation of a linear function, so the result is still linear. But now it is feasible to optimize, since we do know the prior $\pi$. Since our objective is linear like before, it is still optimized at the extremes: we set $x = 1$ if $V_A > V_B$, and $x = 0$ if $V_B > V_A$ (and note that $V_A > V_B$ is equivalent to $\mathbb{E}[\theta_A] > \mathbb{E}[\theta_B]$). We may not know $\theta$, but if we know its distribution, or even have a prior about it, we should go all in on the cause that is the best in expectation. Thus, Bayes risk minimization tells us to fund the single cause we think is best and bin the others. In other words, worldview diversification is inconsistent with the Bayesian approach.
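As a quick sanity check on this logic, here is a rough Monte Carlo sketch (the log-normal prior is arbitrary, chosen just for illustration): Bayes risk is linear in $x$, so minimizing it over a grid always lands on a corner, and the corner it picks is the cause with the higher prior mean.

```python
import numpy as np

rng = np.random.default_rng(0)
# An arbitrary prior: theta_A and theta_B are log-normal with different location parameters.
theta_a = rng.lognormal(mean=0.5, sigma=1.0, size=100_000)
theta_b = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)

xs = np.linspace(0.0, 1.0, 101)
# Bayes risk of each allocation: expected shortfall from the ex-post best allocation.
bayes_risk = [np.mean(np.maximum(theta_a, theta_b) - theta_a * x - theta_b * (1 - x)) for x in xs]

print(xs[np.argmin(bayes_risk)])        # 1.0: go all in on one cause
print(theta_a.mean() > theta_b.mean())  # True: it's the cause with the higher mean
```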
I think this is very intuitive. If we had perfect information, I think we would unambiguously prioritize the best cause (subject to diminishing value of money, which I'm abstracting away from here). So the Bayesian approach is to simply approximate the best cause with our prior knowledge, and thus approximate this optimal prioritization. I think this is what EAs mean when they say things like "it's hard to philosophically justify worldview diversification"––we are so steeped in Bayesian thinking that any decision incompatible with it seems anti-philosophical, no matter how reasonable it is.
Minimax risk minimization
Minimax risk is
$$\bar{r}(x) = \max_{\theta} R(x, \theta)$$
For this to be well-defined, we need upper and lower bounds on $\theta_A, \theta_B$. Otherwise, the expression becomes infinite. Call these upper bounds $\bar{\theta}_A, \bar{\theta}_B$ and lower bounds $\underline{\theta}_A, \underline{\theta}_B$.
Minimax risk can be understood as the following thought experiment: how would a malevolent demon choose $\theta$ to make our cause allocation as ineffective as possible? The linear nature of this problem makes the solution simple: the demon would either select $\theta_A = \underline{\theta}_A$ and $\theta_B = \bar{\theta}_B$, or it would select $\theta_A = \bar{\theta}_A$ and $\theta_B = \underline{\theta}_B$. The lower bound is chosen to minimize the value created by our money, while the upper bound is chosen to maximize the counterfactual value created by our money if we had allocated it elsewhere. So our risk is either $(\bar{\theta}_B - \underline{\theta}_A)\,x$, or it is $(\bar{\theta}_A - \underline{\theta}_B)(1-x)$. Notice that the extreme choices of $\theta$ trigger one of the two indicators, but not the other, leading to these different expressions. Thus, minimax risk is
$$\bar{r}(x) = \max\big\{(\bar{\theta}_A - \underline{\theta}_B)(1-x),\; (\bar{\theta}_B - \underline{\theta}_A)\,x\big\}$$
This is a non-differentiable function, so we can't derive its minimizer in an easy way. But for a max function to be minimized, we can at least say that its two terms have to be equal. Why? Because if $(\bar{\theta}_A - \underline{\theta}_B)(1-x) > (\bar{\theta}_B - \underline{\theta}_A)\,x$, then we could increase $x$ a little, which would reduce the LHS and increase the RHS. We can choose a small enough increase in $x$ so that the LHS is still larger than the RHS. Since risk is the max of the LHS and the RHS, and the LHS is larger than the RHS in both cases, and we reduced the LHS––we reduced risk! So we could not have been at an optimum before. An identical argument implies that we wouldn't be at an optimum if the LHS was smaller than the RHS, either.
Therefore, minimax risk is minimized when
$$(\bar{\theta}_A - \underline{\theta}_B)(1-x) = (\bar{\theta}_B - \underline{\theta}_A)\,x \quad\Longleftrightarrow\quad x = \frac{\bar{\theta}_A - \underline{\theta}_B}{(\bar{\theta}_A - \underline{\theta}_B) + (\bar{\theta}_B - \underline{\theta}_A)}$$
This is a confusing expression to interpret, so let's consider a simpler special case. Say that we only assume no cause is harmful ($\underline{\theta}_A = \underline{\theta}_B = 0$). Then
$$x = \frac{\bar{\theta}_A}{\bar{\theta}_A + \bar{\theta}_B}$$
The minimax approach is to allocate money to each cause area in proportion to its upper bound value. This is a form of worldview diversification. We are optimally funding multiple cause areas.
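To see the formula in action, here is a small sketch with made-up bounds, confirming that the minimax allocation is proportional to the upper bounds and equalizes the two worst-case risks:

```python
import numpy as np

def minimax_allocation(ub_a, ub_b, lb_a=0.0, lb_b=0.0):
    """Allocation to A that equalizes the two worst-case risks, clipped to [0, 1]."""
    x = (ub_a - lb_b) / ((ub_a - lb_b) + (ub_b - lb_a))
    return float(np.clip(x, 0.0, 1.0))

# Made-up bounds: no cause is harmful, A's best case is 9, B's best case is 3.
x = minimax_allocation(ub_a=9.0, ub_b=3.0)
print(x)                      # 0.75 = 9 / (9 + 3): proportional to upper bounds
print((9.0 - 0.0) * (1 - x))  # 2.25: worst-case risk if A is great and we underfunded it
print((3.0 - 0.0) * x)        # 2.25: worst-case risk if B is great and we underfunded it
```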
It's important to note that in the specific case where $\underline{\theta}_A > \bar{\theta}_B$––i.e. the worst-case value of $A$ is better than the best-case value of $B$––even the minimax criterion would not recommend allocating any money to $B$. In general, trivial causes have upper bound values so much lower than the upper bound values of big causes that their allocations might round off to zero even if they were funded. So minimax does not require funding absolutely every cause area.
But it's also important to recognize that these upper tail values, which play such an important role in determining the optimal allocation, basically must be pulled out of a hat. Natural distributions do not feature an upper bound. The chance that funding the opera stops human extinction may be minuscule, but it isn't zero. This means that in practice, as decisionmakers, we have to make a judgment call about what the upper bound on a cause's value is. Ad hoc assumptions have successfully infiltrated our rigorous first-principles framework, and there's no way around that.
Expected utility maximization
I initially conceived of this effort as using statistical decision theory to think about different ways to allocate resources under uncertainty. But I realized I was overcomplicating it. What if I could microfound worldview diversification in a straightforward expected utility maximization problem?
Imagine that our utility function is now
$$u(x; \theta) = \frac{\big(\theta_A x + \theta_B (1-x)\big)^{1-\gamma}}{1-\gamma}$$
This is an isoelastic utility function with an inner linear combination. The linearity means that causes $A$ and $B$ are perfect substitutes in creating value for us - we don't care about combining them inherently. However, the isoelastic wrapper represents that we are risk-averse about how much value we create, with the degree of risk aversion governed by $\gamma$.[3] Then our decision problem under uncertainty is
$$\max_{x \in [0,1]} \; \mathbb{E}_{\theta}\left[\frac{\big(\theta_A x + \theta_B (1-x)\big)^{1-\gamma}}{1-\gamma}\right]$$
There's no closed-form solution for this in general, so I simulate it (a rough Python sketch of the procedure appears after this list):
- I compute the utility from each possible allocation, averaged over 1000 draws of $\theta_A, \theta_B$ from some log-normal distributions. I use this to calculate the optimal allocation given the distribution of $\theta$'s.
- I compute the Bayes decision rule by estimating the means of $\theta_A$ and $\theta_B$, and allocating all the money to whichever has the higher mean: $x = \mathbf{1}\{\mathbb{E}[\theta_A] > \mathbb{E}[\theta_B]\}$.
- I compute the minimax decision rule by allocating money to each cause proportional to its 95th percentile: $x = \frac{q_{95}(\theta_A)}{q_{95}(\theta_A) + q_{95}(\theta_B)}$.
- I throw in a heuristic decision rule, by allocating money to each cause proportional to its mean value: $x = \frac{\mathbb{E}[\theta_A]}{\mathbb{E}[\theta_A] + \mathbb{E}[\theta_B]}$.
- I repeat this process 1000 times, re-drawing the parameters of the distributions from which $\theta_A$ and $\theta_B$ are drawn.
- I compare the performance of all of these decision rules using two criteria:
- How often is a rule closer to the optimal allocation than the other rules?
- How large is the mean squared error (MSE) between a rule's chosen allocation and the optimal allocation?
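Here is a rough Python sketch of that procedure (not a verbatim reproduction of my code; the log-normal parameter ranges and the benchmark $\gamma = 1$ are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
GAMMA = 1.0  # benchmark degree of risk aversion: log utility

def crra(c, gamma=GAMMA):
    """Isoelastic utility of the total value created."""
    return np.log(c) if gamma == 1.0 else c ** (1 - gamma) / (1 - gamma)

def optimal_allocation(theta_a, theta_b, grid=np.linspace(0.0, 1.0, 101)):
    """Grid search for the allocation that maximizes expected isoelastic utility."""
    expected_u = [np.mean(crra(x * theta_a + (1 - x) * theta_b)) for x in grid]
    return grid[int(np.argmax(expected_u))]

sq_errors = {"bayes": [], "minimax": [], "heuristic": []}
wins = {"bayes": 0, "minimax": 0, "heuristic": 0}

for _ in range(1000):
    # Re-draw the log-normal parameters each replication, then draw the thetas.
    mu_a, mu_b = rng.normal(0.0, 1.0, size=2)
    sd_a, sd_b = rng.uniform(0.5, 1.5, size=2)
    theta_a = rng.lognormal(mu_a, sd_a, size=1000)
    theta_b = rng.lognormal(mu_b, sd_b, size=1000)

    x_star = optimal_allocation(theta_a, theta_b)
    q_a, q_b = np.percentile(theta_a, 95), np.percentile(theta_b, 95)
    rules = {
        "bayes": 1.0 if theta_a.mean() > theta_b.mean() else 0.0,         # all in on the higher mean
        "minimax": q_a / (q_a + q_b),                                     # proportional to 95th percentiles
        "heuristic": theta_a.mean() / (theta_a.mean() + theta_b.mean()),  # proportional to means
    }
    for name, x in rules.items():
        sq_errors[name].append((x - x_star) ** 2)
    wins[min(rules, key=lambda name: abs(rules[name] - x_star))] += 1

for name in sq_errors:
    print(f"{name}: closest to optimal in {wins[name]} runs, MSE {np.mean(sq_errors[name]):.4f}")
```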
My findings:
- Bayes is most often the rule closest to the optimal allocation, followed by minimax and then the heuristic.
- However, the heuristic rule has the lowest MSE, followed by minimax and then Bayes.
In other words, Bayes is most often right (that's why it's Bayes!) but when it's wrong, it's more wrong than other decision rules. Conversely, the heuristic rule is rarely the best, but it's never extremely wrong. If we want to avoid being extremely wrong, then MSE is the right criterion to target, and thus the heuristic rule (allocating money to causes proportional to our best guess at how good they are) is actually the best approximation to the expected-utility-maximizing decision.
I also find that:
- The degree of risk aversion matters for these conclusions. In my benchmark case, I choose $\gamma = 1$ to correspond to a logarithmic utility function. But for very low values of $\gamma$ (0.4), Bayes achieves lower MSE, because with such low risk aversion, the expected utility maximizing allocation is much more extreme.
- The choice of how you define "upper tail" when computing the minimax allocation really matters. When I choose different percentiles (e.g. 95th vs 90th vs 80th) I get very different efficacies of minimax decisionmaking. For example, if the "upper tail" is defined as the 80th percentile, minimax decisionmaking is much better than both Bayes and heuristic decisionmaking. This is troubling given the arbitrariness of what counts as an upper tail.
Conclusion
Worldview diversification is a useful and practical way for the EA community to make decisions. In this note I show that it also has reasonable statistical foundations in non-Bayesian decisionmaking. These foundations come from risk aversion – when we are risk averse, worldview diversification offers us a way to be "only a little wrong" at all times, and can thus be optimal.
Appendix: deriving risk
If $\theta_A > \theta_B$, the maximal utility is $\theta_A$ (i.e. the value of spending all money on $A$). If $\theta_B > \theta_A$, the maximal utility is $\theta_B$. Thus, the maximum utility is $\max\{\theta_A, \theta_B\}$. In contrast, the utility of any allocation $x$ is $\theta_A x + \theta_B (1-x)$. Thus, risk is
$$R(x, \theta) = \max\{\theta_A, \theta_B\} - \theta_A x - \theta_B (1-x)$$
If $\theta_A > \theta_B$, this expression becomes
$$R(x, \theta) = \theta_A - \theta_A x - \theta_B (1-x) = (\theta_A - \theta_B)(1-x)$$
Intuitively, our losses come from both the gap between the value of the two causes, and how much money we spend on the worse cause ($1-x$ is spending on $B$, which is the worse cause in this scenario). By a symmetric argument, if $\theta_B > \theta_A$, this expression is
$$R(x, \theta) = (\theta_B - \theta_A)\,x$$
Considering both of these cases, we can write risk as
$$R(x, \theta) = (\theta_A - \theta_B)(1-x)\,\mathbf{1}\{\theta_A > \theta_B\} + (\theta_B - \theta_A)\,x\,\mathbf{1}\{\theta_B > \theta_A\}$$
which is the expression derived in the main text.
Technically there is a third approach, minimax regret, but in the special case we consider, where you can achieve zero loss with knowledge of $\theta$, this is exactly the same as minimax. Also, if you're a real theorist, you might be complaining that I defined $R(x, \theta)$ as risk instead of as loss, but these are also the same thing when you can achieve zero loss with knowledge of $\theta$. ↩︎
Denote $D = \theta_A - \theta_B$ and $Z = \mathbf{1}\{\theta_A > \theta_B\}$. Then the expectation is $\mathbb{E}[DZ]$. By the law of iterated expectations, this can be rewritten as $\mathbb{E}\big[\mathbb{E}[DZ \mid Z]\big]$. But since $Z$ only takes on two values, 1 or 0, this can be expanded into $P(Z = 1)\,\mathbb{E}[D \cdot 1 \mid Z = 1] + P(Z = 0)\,\mathbb{E}[D \cdot 0 \mid Z = 0]$. The second term cancels out because of the zero, leaving the overall expression as $P(Z = 1)\,\mathbb{E}[D \mid Z = 1]$. Replacing these with our original terms, we get $P(\theta_A > \theta_B)\,\mathbb{E}[\theta_A - \theta_B \mid \theta_A > \theta_B]$. ↩︎
This utility function represents risk aversion because it is concave. See here for more explanation. ↩︎
SamiPetersen @ 2024-08-30T00:49 (+7)
Hi Karthik, thanks for writing this! I appreciate the precision; I wish I saw more content like this. But if you’ll allow me to object:
I feel there’s a bit of tension in you stating that “I don't think we should sidestep the philosophical aspect of this debate” while later concluding that “Worldview diversification is a useful and practical way for the EA community to make decisions.” Insofar as we are interested in a normatively satisfying foundation for diversifying donations (as a marginal funder), one would presumably need an argument in favour of something like minmax regret or risk aversion—on altruistic grounds.
Your results are microfoundations, as you write. Similarly, risk lovingness microfounds the prediction that a person will buy tickets for lotteries with negative monetary expectations. But that doesn’t imply that the person should do so. Likewise with risk aversion and diversification.
I think the important question here is whether e.g. risk aversion w.r.t. value created is reasonable.
In economics we’re used to treating basically any functional form for utility as permissible, so this is somewhat strange, but here we’re thinking about normative ethics rather than consumption choices. While it seems natural to exhibit diminishing marginal utility in consumption (hence risk aversion), it’s a bit more strange to say that one values additional wellbeing less the more lives have already been benefited. After all, the new beneficiary values it just as much as the previous one, and altruism is meant to be about them.
Here’s a thought experiment that brings out the counter-intuitiveness. Suppose you could pick either (i) a lottery giving a 60% chance of helping two people or else nobody, or (ii) a 100% chance of helping just one person but the lucky person is chosen by a fair coin. Then sufficient risk aversion will lead you to choose (ii) even though all potential beneficiaries prefer (i).
These aren’t meant as particularly good arguments for risk neutrality w.r.t terminal value; just pointers to the kind of considerations I think are more relevant to thinking about the reasonableness of altruistically-motivated funding diversification. There are others too, though (example: phil version / econ version; accessible summary here).
Karthik Tadepalli @ 2024-08-30T07:30 (+3)
Great points!
I feel there’s a bit of tension in you stating that “I don't think we should sidestep the philosophical aspect of this debate” while later concluding that “Worldview diversification is a useful and practical way for the EA community to make decisions.”
I say the former as a justification to avoid making an assumption (diminishing returns to money across causes) that would automatically support a balanced allocation of money without any other normative judgments. But I personally place high premium on decisions being "robustly" good so I do see worldview diversification as a useful and practical way to make decisions (to someone who places a premium on robustness).
In economics we’re used to treating basically any functional form for utility as permissible, so this is somewhat strange, but here we’re thinking about normative ethics rather than consumption choices.
I appreciate the push, since I didn't really mount a defense of risk aversion in the post. I don't really have a great interest in doing so. For one thing, I am axiomatically risk-averse and I don't put that belief up for debate. Risk aversion leads to the unpalatable conclusion that marginal lives are less worth saving, as you point out. But risk neutrality leads to the St Petersburg paradox. Both of them are slightly-contrived scenarios but not so contrived that I can easily dismiss them as irrelevant edge cases. I don't have solutions in mind (the papers you linked look interesting, but I find them hard to parse). So I don't feel passionately about arguing the case for risk-averse decisionmaking, but I still believe in it.
In reality I don't think anyone who practices worldview diversification (allocating resources across causes in a way that's inconsistent with any single worldview) actually places a really high premium on tight philosophical defenses of it. (See the quote at the start of the post!) I wrote this more for my own fun.
SamiPetersen @ 2024-08-30T09:49 (+4)
Thanks for the thoughtful reply!
I understand you don't want to debate risk attitudes, but I hope it's alright that I try to expand on my thought just a bit to make sure I get it accross well—no need to respond.
To be clear: I think risk aversion is entirely fine. My utility in apples is concave, of course. That's not really up for 'debate'. Likewise for other consumption preferences.
But ethics seems different. Philosophers debate what's permissible, mandatory, etc. in the context of ethics (not so much in the context of consumption). The EA enterprise is partly a result of this.
And choosing between uncertain altruistic interventions is of course in part a problem of ethics. Risk preferences w.r.t. wellbeing in the world make moral recommendations independently of empirical facts. This is why I see them as more up for debate. (Here's a great overview of such debates.)
We often argue about the merits of ethical views under certainty: should our social welfare function concavify individual utilities before adding them up (prioritarianism) or not (utilitarianism)? Similarly, under uncertainty, we may ask: should our social welfare function concavify the sum of individual utilities (moral risk aversion) or not (moral risk neutrality)?
These are the sorts of questions I meant were relevant; I agree risk aversion per se is completely unproblematic.
By the way, this is irrelevant to the methodological point above, but I'll point out the interesting fact that risk aversion alone doesn't get rid of the problem of the St Petersburg paradox:
- A $2^{-n}$ chance of winning £$2^n$ with linear utility: $\sum_{n=1}^{\infty} 2^{-n} \cdot 2^n = \infty$.
- A $2^{-n}$ chance of winning £$2^{2^n}$ with log (base 2) utility: $\sum_{n=1}^{\infty} 2^{-n} \cdot \log_2\!\big(2^{2^n}\big) = \sum_{n=1}^{\infty} 2^{-n} \cdot 2^n = \infty$.
Karthik Tadepalli @ 2024-08-30T16:36 (+1)
I don't mean to say that risk preferences in general are unimpeachable and beyond debate. I was only saying that I personally do not put my risk preferences up for debate, nor do I try to convince others about their risk preferences.
In any debate about different approaches to ethics, I place a lot of weight on intuitionism as a way to resolve debates. Considering the implications of different viewpoints for what I would have to accept is the way I decide what I value. I do not place a lot of weight on whether I can refute the internal logic of any viewpoint.