4. Why existing approaches to cause prioritization are not robust to unawareness

By Anthony DiGiovanni @ 2025-06-02T08:55 (+35)

We’re finally ready to see why unawareness so deeply undermines action guidance from impartial altruism. Let’s recollect the story thus far:

  1. First: Under unawareness, “just take the expected value” is unmotivated.
  2. Second: Likewise, “do what seems intuitively good and high-leverage, then you’ll at least do better than chance” doesn’t work. There’s too much room for imprecision, given our extremely weak evidence about the mechanisms that dominate our impact, and our track record of finding sign-flipping considerations.
  3. Third: Hence, we turned to UEV, an imprecise model of strategies’ impact under unawareness. But there are two major reasons comparisons between strategies’ UEV will be indeterminate: the severe lack of constraints on how we should model the space of possibilities we’re unaware of, and imprecision due to coarseness.

The EA community has proposed several approaches to “robust” cause prioritization, which we might think avoid these problems. These approaches don’t explicitly use the concept of UEV, and not all the thinkers who proposed them necessarily intended them to be responses to unawareness. But we can reframe them as such. I’ll outline these approaches below, and show why each of them is insufficient to justify comparisons of strategies’ UEV. Where applicable, I’ll also explain why wager arguments don’t support following these approaches. Afterwards, I’ll share some parting thoughts on where to go from here.

Why each of the standard approaches is inadequate

(See Appendix D for more formal statements of these approaches. I defer my response to cluster thinking to Appendix E, since that’s more of a meta-level perspective on prioritization than a direct response to unawareness. In this list, the “contribution” of a hypothesis h to the UEV for some strategy equals {value under h} × {probability of h given that strategy}. Recall that our awareness set is the set of hypotheses we’re aware of.)
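To fix intuitions about this bookkeeping, here’s a minimal sketch with made-up hypotheses, values, and probabilities (none of the numbers, nor the function name, come from the post; collapsing the imprecision into a single interval is also a simplification of the set-of-distributions model):

```python
# Toy illustration (hypothetical numbers): UEV = sum of contributions from hypotheses
# we're aware of, plus a catch-all contribution we can only bound within an interval.

def uev_interval(aware_contributions, catch_all_bounds):
    """aware_contributions: list of (value, probability) pairs, one per hypothesis
    in the awareness set. catch_all_bounds: (low, high) bounds on the catch-all's
    contribution. Returns (lower, upper) bounds on the UEV."""
    base = sum(value * prob for value, prob in aware_contributions)
    low, high = catch_all_bounds
    return (base + low, base + high)

# One strategy looks better than another on the awareness set, but the catch-all is
# unconstrained enough that the resulting UEV intervals overlap heavily.
print(uev_interval([(100, 0.3), (-20, 0.5)], (-500, 500)))  # (-480.0, 520.0)
print(uev_interval([(10, 0.4), (-5, 0.2)], (-500, 500)))    # (-497.0, 503.0)
```

The approaches below can be read as attempts to constrain the catch-all term, or to argue that comparisons go through despite it.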

Table 3. Outline of the standard approaches to strategy comparisons under unawareness.

  1. Symmetry: The contribution of the catch-all to the UEV is equal for all strategies.
    • Reasoning: Since we have no idea about the implications of possibilities we haven’t considered, we have no reason to think they’d push in one direction vs. the other, in expectation. Then by symmetry, they cancel out.
    • Sources: St. Jules and Soares.[1] Cf. Greaves’s (2016) discussion of “simple cluelessness”, and the principle of indifference.
  2. Extrapolation: There’s a positive correlation between (i) the difference between strategies’ UEV and (ii) the difference between the contributions of the awareness set to their UEV.
    • Reasoning: Even if our sample of hypotheses we’re aware of is biased, the biases we know of don’t suggest the difference in UEV should have the opposite sign to the difference in EV on the awareness set. And we have no reason to think the unknown biases push in one direction vs. the other, in expectation.
    • Sources: Cotton-Barratt, and Tomasik on “Modeling unknown unknowns”.
  3. Meta-extrapolation: Take some reference class of past decision problems where a given strategy was used. The strategy’s UEV is proportional to its empirical average EV over the problems in this reference class.
    • Reasoning: If there were cases in the past where (i) certain actors were unaware of key considerations that we’re aware of, but (ii) they nonetheless took actions that look good by our current lights, then emulating these actors will be more robust to unawareness.
    • Sources: Tomasik on “An effective altruist in 1800”, and Beckstead on “Broad approaches and past challenges”.
  4. Simple Heuristics: The simpler the arguments that motivate a strategy, the more strongly the strategy’s UEV correlates with the EV we would have estimated had we not explicitly adjusted for unawareness.
    • Reasoning: More conjunctive arguments have more points of possible failure and hence are less likely to be true, especially under deep uncertainty about the future. Then strategies supported by simple arguments or rules will tend to do better.
    • Sources: Christiano. This approach is sometimes instead motivated by Meta-extrapolation, or by avoiding overfitting (Thorstad and Mogensen 2020).
  5. Focus on Lock-in: The more a strategy aims to prevent a near-term lock-in event, the more strongly the strategy’s UEV correlates with the EV we would have estimated had we not explicitly adjusted for unawareness.
    • Reasoning: Our impacts on near-term lock-in events (e.g., AI x-risk and value lock-in) are easy to forecast without radical surprises. And it’s easy to assess the sign of large-scale persistent changes to the trajectory of the future.
    • Sources: Tarsney (2023), Greaves and MacAskill (2021, Sec. 4), and Karnofsky on “Steering”.
  6. Capacity-Building: The more a strategy aims to increase the influence of future agents with our values, the higher its UEV.
    • Reasoning: Our more empowered (i.e., smarter, wiser, richer) successors will have a larger awareness set. So we should defer to them, by favoring strategies that increase the share of future empowered agents aligned with our values. This includes gaining resources, trying to increase civilizational wisdom, and doing research.
    • Sources: Christiano, Greaves and MacAskill (2021, Sec. 4.4), Tomasik on “Punting to the future”, and Bostrom on “Some partial remedies”.

Symmetry

Key takeaway

We can’t assume the considerations we’re unaware of “cancel out”, because when we discover a new consideration, this assumption no longer holds.

According to this approach, we should think the catch-all affects all strategies’ UEV equally. After all, it appears we have no information either way about the strategies’ likelihood of leading to outcomes in the catch-all.

Response

Upon closer inspection, we do have information about the catch-all, as we saw when we tried to model it. When we update our model of the catch-all on the discovery of a new hypothesis, the symmetry breaks. Specifically:

A common intuition is, “If it really is so unclear how to weigh up these updates about the catch-all, don’t the optimistic and pessimistic updates cancel out in expectation?” But our actual epistemic state is that we have reasons pointing in both directions, which don’t seem precisely balanced. If we treated these reasons as balanced anyway, when we could instead suspend judgment, we’d introduce artificial precision for no particular reason.[2]

(How about an intuition like, “If I can’t tell whether the considerations I’m unaware of would favor A or B, these considerations are irrelevant to my decision-making”? I’ll respond to this below.)

Extrapolation

Key takeaway

We can’t trust that the hypotheses we’re aware of are a representative sample. Although we don’t know the net direction of our biases, this doesn’t justify the very strong assumption that we’re precisely unbiased in expectation.

Although the post-AGI future seems to involve many unfamiliar mechanisms, perhaps we can still get some (however modest) information about an intervention’s net impact based on our awareness set. And if these considerations are a representative sample of the whole set of considerations, it would make sense to extrapolate from the awareness set. (Analogously, when evaluating coarse hypotheses, we could extrapolate from a representative sample of fine-grained scenarios we bring to mind.)

Response

Our analysis of biased sampling showed that our awareness set is likely highly non-representative, so straightforward extrapolation is unjustified. What about the “Reasoning” for Extrapolation listed in Table 3? The claim seems to be that we should regard our net bias as zero ex ante, after adjusting for any (small) biases we know how to adjust for. But here we have the same problem as with Symmetry. If it’s unclear how to weigh up arguments for our biases being optimistic vs. pessimistic, then treating them as canceling out is ad hoc. The direction of our net bias is, rather, indeterminate.

Meta-extrapolation

Key takeaway

We can’t trust that a strategy’s past success under smaller-scale unawareness is representative of how well it would promote the impartial good. The mechanisms that made a strategy work historically could actively mislead us when predicting its success on a radically unfamiliar scale.

The problem with Extrapolation was that strategies that work on the awareness set might not generalize to the catch-all. We can’t directly test how well strategies’ far-future performance generalizes, of course. But what if we use strategies that have successfully generalized under unawareness with respect to more local goals? We could look at historical (or simulated?) cases where people were unaware of considerations highly relevant to their goals (that we’re aware of), and see which strategies did best. From Tomasik:

[I]magine an effective altruist in the year 1800 trying to optimize his positive impact. … What [he] might have guessed correctly would have been the importance of world peace, philosophical reflection, positive-sum social institutions, and wisdom. Promoting those in 1800 may have been close to the best thing this person could have done, and this suggests that these may remain among the best options for us today.

The underlying model here is: We have two domains, (i) a distribution of decision problems from the past and (ii) the problem of impartially improving overall welfare. If the mechanisms governing successful problem-solving under unawareness are relevantly similar between (i) and (ii), then strategies’ relative performance on (i) is evidence about their relative performance on (ii).

Response

I agree that we get some evidence from past performance, as per this model. But that evidence seems quite weak, given the potentially large dissimilarities between the mechanisms determining local vs. cosmic-scale performance. (See footnote for comments on a secondary problem.[3])

Concretely, take Tomasik’s example. We can mechanistically explain why a 19th century EA would’ve improved the next two centuries on Earth by promoting peace and reflection (caveats in footnote[4]). On such a relatively small scale, there’s not much room for the sign of one’s impact to depend on factors as exotic as, say, the trajectory of space colonization by superintelligences or the nature of acausal trade. Obviously, the next two centuries went in some surprising directions. But the dominant mechanisms affecting net human welfare were still socioeconomic forces interacting with human biology (recall my rebuttal of the “superforecasting track records” argument). So on this scale, the consistent benefits of promoting peace and reflection aren’t that surprising. By contrast, we’ve surveyed various reasons why the dominant mechanisms affecting net welfare for all sentient beings might differ from those affecting local welfare.

We might think the above is beside the point, and reason as follows:

Weak evidence is still more informative than no evidence. As long as past success generalizes to some degree to promoting the impartial good, we can wager on Meta-extrapolation. If we’re wrong, every strategy is (close to) equally effective ex ante anyway.

As we saw in the imprecision FAQ, however, this wager fails because of “insensitivity to mild sweetening”: When we compare strategies’ UEV, we need to weigh the evidence from past performance against other sources of evidence — namely, object-level considerations about strategies’ impact on certain variables (e.g., how they affect lock-in events). And, as shown in the third post, our estimates of such impacts should be highly imprecise. So the weak evidence from past performance, added on top of other evidence we have no idea how to weigh up, isn’t a tiebreaker.[5]
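As a stylized illustration of insensitivity to mild sweetening (the numbers and the finite stand-in for the representor are invented for this sketch): a strategy is only determinately better if it does at least as well under every admissible way of weighing the evidence, and a small extra consideration in its favor doesn’t change that when some admissible weighings still heavily favor the alternative.

```python
# Hypothetical sketch: each tuple gives the expected values of strategy X (backed by
# weak past-performance evidence) and strategy Y under one admissible way of weighing
# the evidence. X is determinately better only if it does at least as well under all.

representor = [
    (10.0, -30.0),  # weighing that trusts the past-performance evidence
    (-25.0, 15.0),  # weighing dominated by object-level lock-in considerations
    (2.0, 1.0),     # weighing on which the considerations roughly wash out
]

def determinately_better(pairs):
    return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)

print(determinately_better(representor))            # False: no determinate ranking
sweetened = [(x + 0.5, y) for x, y in representor]   # add a mild sweetening for X
print(determinately_better(sweetened))               # still False
```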

Simple Heuristics   

Key takeaway

The argument that heuristics are robust assumes we can neglect complex effects (i.e., effects beyond the “first order”), either in expectation or absolutely. But under unawareness, we have no reason to think these effects cancel out, and should expect them to matter a lot collectively.

We’ve just seen why heuristics aren’t justified by their past success. Nor are they justified by rejecting detailed world models,[6] nor by the reliability of our intuitions.

The most compelling motivation for this approach, instead, goes like this (Christiano;[7] see his post for examples): Informally, let’s say an effect of a strategy is some event that contributes to the total value of the world resulting from that strategy. E.g., speeding up technological growth might have effects like “humans in the next few years are happier due to greater wealth” and “farmed animals in the next few years suffer more due to more meat consumption”. The effects predicted by arguments with fewer logical steps are more likely, whereas complex effects are less likely or cancel out. So strategies supported by simple reasoning will have higher UEV.

Response

I’d agree that if we want to say how widely a given effect holds across worlds we’re unaware of, logical simplicity is an unusually robust measure. But this doesn’t mean the overall contribution of complex effects to the UEV is small in expectation. More precisely, here’s my understanding of how the argument above works:

The counterargument: In order to say that the UEV is dominated by 1-step effects, we need either (for each n ≥ 2):

  1. a reason to think the n-step effects average out to some small range of values, or
  2. a sufficiently low upper bound on our total credence in all n-step effects.

But all the reasons for severe imprecision due to unawareness that we’ve seen, and the counterargument to symmetric “canceling out”, undermine (1). And we’ve also seen reasons to consider our unawareness vast, suggesting that the n-step effects we’re unaware of are collectively significant. This undermines (2).
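To see the shape of the problem with made-up numbers (the counts, credences, and magnitudes below are assumptions for the sake of the sketch): unless we can bound either the number of n-step effects or our total credence in them, even individually negligible complex effects can collectively swamp the 1-step term.

```python
# Hypothetical numbers only. Suppose the 1-step effects contribute 10 units of value,
# and for each n >= 2 we can articulate roughly 50*n distinct n-step effects, each with
# credence 0.01 and magnitude about 5, whose signs we can't determine in advance.

one_step_contribution = 10.0

def complex_effects_envelope(max_n, count=lambda n: 50 * n, credence=0.01, magnitude=5.0):
    """Upper bound on how much the 2- through max_n-step effects could collectively
    add to or subtract from the total."""
    return sum(count(n) * credence * magnitude for n in range(2, max_n + 1))

print(complex_effects_envelope(5))  # 35.0 -- already several times the 1-step term
```

With numbers anything like these, neither condition (1) nor condition (2) holds, so the 1-step effects don’t dominate the UEV.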

Focus on Lock-in

Key takeaway

Even if we focus on near-term lock-in, we can’t control our impact on these lock-in events precisely enough, nor can we tease apart their relative value when we only picture them coarsely.

This approach is a more general version of the logic critiqued in our case study: The long-term trajectory (or existence) of Earth-originating civilization could soon be locked into an “attractor state”, such as extinction or permanent human disempowerment. Plausibly, then, we could have a large-scale, persistent, and easy-to-evaluate impact by targeting these attractor states. And these states seem foreseeable enough to reliably intervene on. So, trying to push toward better attractor states appears to score well on both outcome robustness and implementation robustness. (Which is why I think Focus on Lock-in is one of the strongest approaches to UEV comparisons, in theory.)

Response

Unfortunately, as illustrated by the case study, we seem to have severe unawareness at the level of both the relative value, and likelihood given our strategy, of different attractor states. To recap, if our candidate intervention is “try to stop AIs from permanently disempowering humans”:

Put differently: There’s a significant gap between “our ‘best guess’ is that reaching a given coarse-grained attractor would be positive” and “our concrete attempts to steer toward rather than away from this attractor will likely succeed and ‘fail gracefully’ (i.e., not land us in a worse-than-default attractor), and our guess at the sign of the intended attractor is robust”. When we account for this gap, we can’t reliably weigh up the effects that dominate the UEV.

This discussion also implies that, once again, we can’t “wager” on Focus on Lock-in. The problem is not that the future seems too chaotic for us to have any systematic impact. Instead, we’re too unaware to say whether, on net, we’re avoiding locking in something worse.

Capacity-Building

Key takeaway

This approach doesn’t help, for reasons analogous to those given for Focus on Lock-in.

Even if the direct interventions currently available to us aren’t robust, what about trying to put our successors in a better position to positively intervene? There seem to be two rough clusters of Capacity-Building (CB) strategies:

  1. High-footprint: Community/movement building; broad values advocacy; gaining significant influence on AGI development, deployment, or governance; aiming to improve civilizational “wisdom” (see, e.g., the conclusion of Carlsmith (2022)).
    • Argument: You should expect the value of the future to correlate with the share of future powerful agents who have your goals. This is based on both historical evidence and compelling intuitions, which don’t seem sensitive to the details of how the future plays out.
  2. Low-footprint: Saving money; selectively making connections with influential actors with very similar priorities; and doing (but only selectively sharing) research.
    • Argument: If your future self has more resources/information, they’ll presumably take actions with higher UEV than whatever you can currently do. Thus, if you can gain more resources/information, holding fixed how much you change other variables, your strategy will have higher UEV than otherwise.

Response

High-footprint. Once more, historical evidence doesn’t seem sufficient to resolve severe imprecision. Perhaps the mechanism “gaining power for our values increases the amount of optimization for our values” works at a very general level, so the disanalogies between the past and future aren’t that relevant? But the problems with Focus on Lock-in recur here, especially for implementation robustness: How much “power” “our values” stably retain into the far future is a very coarse variable (more details in footnote[8]). So, to increase this variable, we must intervene in the right directions on a complex network of mechanisms, some of which lie too far in the future to forecast reliably. Our overall impact on how empowered our values will be, therefore, ends up indeterminate when we account for systematic downsides (e.g., attention hazards, as in the case of AI x-risk movement building). Not to mention off-target effects on lock-in events. For example, even if pushing for the U.S. to win the AI race does empower liberal democratic values, this could be outweighed by increased misalignment risk.[9]

We might have the strong intuition that trying to increase some simple variable tends to increase that variable, not decrease it. Yet we’ve seen compelling reasons to distrust intuitions about our large-scale impact, even when these intuitions are justified in more familiar, local-scale problems.

Low-footprint. I’d agree that low-footprint CB could make our successors discover, or more effectively implement, interventions that are net-positive from their epistemic vantage point. That’s a not-vanishingly-unlikely upside of this strategy. But what matters is whether low-footprint CB is net-positive from our vantage point. How does this upside compare to the downsides from possibly hindering interventions that are positive ex post? (It’s not necessary for such interventions to be positive ex ante, in order for this to be a downside risk, since we’re evaluating CB from our perspective; see Appendix C.) Or other off-target large-scale effects we’re only coarsely aware of (including, again, our effects on lock-in)?

Real-world implementations of “saving money” and “doing research” may have downsides like:

Intuitively, these side effects may seem like hand-wringing nitpicks. But I think this intuition comes from mistakenly privileging intended consequences, and thinking of our future selves as perfectly coherent extensions of our current selves. We’re trying to weigh up speculative upsides and downsides to a degree of precision beyond our reach. In the face of this much epistemic fog, unintended effects could quite easily toggle the large-scale levers on the future in either direction.

Another option: Rejecting (U)EV?

Key takeaway

Suppose that when we choose between strategies, we only consider the effects we can weigh up under unawareness, because (we think) the other effects aren’t decision-relevant. Then, it seems arbitrary how we group together “effects we can weigh up”.

No matter how we slice it, we seem unable to show that the UEV of A dominates that of B, for any given A and B. Is there any other impartial justification we could offer for our choice of strategy?

The arguments in this sequence seem to apply just as well to any alternative to EV that still explicitly aggregates over possible worlds, such as EV with risk aversion or discounting tiny probabilities. This is because the core problem isn’t tiny probabilities of downsides, but the severe imprecision of our evaluations of outcomes.

Here’s another approach. Clifton notes that, even if we want to make choices based purely on the impartial good, perhaps we get action guidance from the following reasoning, which he calls Option 3 (paraphrasing):

I suspend judgment on whether A results in higher expected total welfare than B. But on one hand, A is better than B with respect to some subset of their overall effects (e.g., donating to AMF saves lives in the near term), which gives me a reason in favor of A. On the other hand, I have no clue how to compare A to B with respect to all the other effects (e.g., donating to AMF might increase or decrease x-risk; Mogensen 2020), which therefore don’t give me any reasons in favor of B. So, overall, I have a reason to choose A but no reason to choose B, so I should choose A.[10]

In our setting, “all the other effects” might be the effects we’re only very coarsely aware of (including those in the catch-all).
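To make the grouping worry from the key takeaway concrete, here’s a stylized sketch (the function and the numbers are invented for illustration): an Option-3 verdict depends entirely on which effects we put in the “can be weighed up” bucket.

```python
# Hypothetical sketch: an Option-3 verdict takes as input only the effects we claim we
# can determinately weigh up; the indeterminate remainder contributes no reasons.

def option3_verdict(weighable_effects):
    """weighable_effects: A-minus-B value differences for the bucketed effects."""
    total = sum(weighable_effects)
    return "choose A" if total > 0 else ("choose B" if total < 0 else "no verdict")

# Carving 1: bucket only the near-term, well-understood effects (e.g., lives saved).
print(option3_verdict([+5.0]))   # 'choose A'

# Carving 2: bucket instead a particular far-future effect we claim to be able to sign
# (e.g., making one astronomically good outcome less likely), suspending on the rest.
print(option3_verdict([-3.0]))   # 'choose B'
```

Same pair of strategies, opposite verdicts; without an argument that some decomposition of the effects is privileged, the rule by itself doesn’t settle anything.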

I have some sympathy for Option 3 as far as it goes. Notably, insofar as Option 3 is action-guiding, it seems to support a focus on near-term welfare, which would already suggest significant changes to current EA prioritization. I’m keen for others to carefully analyze Option 3 as a fully-fledged decision theory, but for now, here are my current doubts about it:

Conclusion and taking stock of implications

Stepping back, here’s my assessment of our situation:

So, I’m not aware of any plausible argument that one strategy has better net consequences under unawareness than another. Of course, I’d be excited to see such an argument! Indeed, one practical upshot of this sequence is that the EA project needs more rethinking of its epistemic and decision-theoretic foundations. That said, I think the challenge to impartial altruist action guidance runs very deep.

We might be tempted to say, “If accounting for unawareness implies that no strategy is better than another, we might as well wager on whatever looks best when we ignore unawareness.” This misses the point.

First, as we’ve seen in this post, wager arguments don’t work when the hypotheses you set aside say your expected impact is indeterminate, rather than zero.

Second, my claim is that no strategy is better than another with respect to the impartial good, according to our current understanding of epistemics and decision theory. Even if we suspend judgment on how good or bad our actions are impartially speaking, again, we can turn to other kinds of normative standards to guide our decisions. I hope to say more on this in future writings.

Finally, whatever reason we may have to pursue the strategy that looks best ignoring unawareness, it’s not the same kind of reason that effective altruists are looking for. Ask yourself: Does “this strategy seems good when I assume away my epistemic limitations” have the deep moral urgency that drew you to EA in the first place?

“But what should we do, then?” Well, we still have reason to respect other values we hold dear — those that were never grounded purely in the impartial good in the first place. Integrity, care for those we love, and generally not being a jerk, for starters. Beyond that, my honest answer is: I don’t know. I want a clear alternative path forward as much as the next impact-driven person. And yet, I think this question, too, misses the point. What matters is that, if I’m right, we don’t have reason to favor our default path, and the first step is owning up to that. To riff on Karnofsky’s “call to vigilance”, I propose a call to reflection.

This is a sobering conclusion, but I don’t see any other epistemically plausible way to take the impartial good seriously. I’d love to be wrong.


Appendix D: Formal statements of standard approaches to UEV comparisons

Recalling the notation from Appendix B, here’s some shorthand:

Then:

  1. Symmetry: For any pair of strategies s and s′, the contribution of the catch-all to UEV_p(s) equals its contribution to UEV_p(s′), for all p in our set of probability distributions P.
  2. Extrapolation: For any pair of strategies s and s′, the difference UEV_p(s) − UEV_p(s′) is positively correlated with the difference between the contributions of the awareness set to UEV_p(s) and to UEV_p(s′), for all p in P.
  3. Meta-extrapolation: Consider some reference class of past problems where the decision-maker had significant unawareness, relative to a “local” value function v. Let S_1 and S_2 be two categories of strategies used in these problems, and V_i be the average local value achieved by strategies in S_i. Then if s is relatively similar to S_1 and s′ is relatively similar to S_2, the difference UEV_p(s) − UEV_p(s′) has the same sign as V_1 − V_2, for all p in P.
  4. Simple Heuristics: Let C(s) be some measure of the complexity of the arguments that motivate s. Then for any strategy s, we have that UEV_p(s) correlates more strongly with the EV we would have estimated had we not explicitly adjusted for unawareness, the lower C(s) is (for all p in P).
  5. Focus on Lock-in: Let t_lag(s) be the amount of time between implementation of s and the intended impact on some target variable, and t_persist(s) be the amount of time the impact on the target variable persists. Then for any strategy s, we have that UEV_p(s) correlates more strongly with the EV we would have estimated had we not explicitly adjusted for unawareness, the higher t_persist(s) is and the lower t_lag(s) is (for all p in P).
  6. Capacity-Building: Let U*(s) = E_p[UEV_p′(s′)], where p′ represents the beliefs of some successor agent and s′ is the strategy this agent follows given that we follow strategy s. (So U*(s) is an element of the UEV, with respect to our values and the successor’s beliefs, of whatever the successor will do given what we do.) Then UEV_p(s) is increasing in U*(s) for all p in P,[12] and (we suppose) we can form a reasonable estimate of U*(s). In particular, capacity-building strategies tend to have large and positive estimates of U*(s).

Appendix E: On cluster thinking

One approach to cause prioritization that’s commonly regarded as robust to unknown unknowns is cluster thinking. Very briefly, cluster thinking works like this: Take several different world models, and find the best strategy according to each model. Then, choose your strategy by aggregating the different models’ recommendations, giving more weight to models that seem less likely to be missing key parameters.

Cluster thinking consists of a complex set of claims and framings, and I think you can agree with what I’ll argue next while still endorsing cluster thinking over sequence thinking. So, for the sake of scope, I won’t give my full appraisal here. Instead, I’ll briefly comment on why I don’t buy the following arguments that cluster thinking justifies strategy comparisons under unawareness (see footnotes for supporting quotes):

  1. “We can tell which world models are more or less likely to be massively misspecified, i.e., feature lots of unawareness. So strategies that do better according to less-misspecified models are better than those that don’t. This is true even if, in absolute terms, all our models are bad.”[13]

    • Response: This merely pushes the problem back. Take the least-misspecified model we can think of. How do we compare strategies’ performance under that model? To answer that, it seems we need to look at the arguments for particular approaches to modeling strategy performance under unawareness, which I respond to in the rest of the post.

  2. “World models that recommend doing what has worked well across many situations in the past, and/or following heuristics, aren’t very sensitive to unawareness.”[14]

    • Response: I address these in my responses to Meta-extrapolation and Simple Heuristics. The same problems apply to the claim that cluster thinking itself has worked well in the past.

Appendix F: Toy example of failure of implementation robustness

Recall the attractor states defined in the case study. At a high level: Suppose you know your intervention will move much more probability mass (i) from Rogue to Benevolent, than (ii) from Rogue to Malevolent. And suppose Benevolent is better than Rogue, which is better than Malevolent (so you have outcome robustness), but your estimates of the value of each attractor are highly imprecise. Then, despite these favorable assumptions, the sign of your intervention can still be indeterminate.

More formally, let:

And suppose an intervention shifts 10% of probability mass from Rogue to Benevolent, and 1% from Rogue to Malevolent.

Then, there’s some p in P (representing the worst-case outcome of the intervention) under which the change in UEV is negative.

So, the intervention is not robustly positive.
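To illustrate how this can happen, here is one hypothetical way of filling in the numbers (the value intervals and the 10%/1% bookkeeping below are assumptions for the sake of the sketch, not the post’s own figures):

```python
# Assumed-for-illustration value intervals for the attractor states, chosen so that
# Benevolent > Rogue > Malevolent under every admissible valuation (outcome robustness):
#   Benevolent in [15, 100], Rogue in [-10, 10], Malevolent in [-2000, -50].
# The intervention shifts 10% of probability mass from Rogue to Benevolent and 1%
# from Rogue to Malevolent.

def delta_uev(v_benevolent, v_rogue, v_malevolent):
    return 0.10 * (v_benevolent - v_rogue) + 0.01 * (v_malevolent - v_rogue)

print(delta_uev(100, -10, -50))    # 10.6: positive under an optimistic valuation
print(delta_uev(15, 10, -2000))    # -19.6: negative under a pessimistic valuation
# Since admissible valuations disagree about the sign, the change in UEV is not
# robustly positive, despite the favorable assumptions.
```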

References

Beckstead, Nicholas. 2013. “On the Overwhelming Importance of Shaping the Far Future.” https://doi.org/10.7282/T35M649T

Carlsmith, Joseph. 2022. “A Stranger Priority? Topics at the Outer Reaches of Effective Altruism.” University of Oxford.

Greaves, Hilary. 2016. “Cluelessness.” Proceedings of the Aristotelian Society 116 (3): 311–39.

Greaves, Hilary, and William MacAskill. 2021. “The Case for Strong Longtermism.” Global Priorities Institute Working Paper No. 5-2021, University of Oxford.

Mogensen, Andreas L. 2020. “Maximal Cluelessness.” The Philosophical Quarterly 71 (1): 141–62.

Tarsney, Christian. 2023. “The Epistemic Challenge to Longtermism.” Synthese 201 (6): 1–37.

Thorstad, David, and Andreas Mogensen. 2020. “Heuristics for Clueless Agents: How to Get Away with Ignoring What Matters Most in Ordinary Decision-Making.” Global Priorities Institute Working Paper No. 2-2020, University of Oxford.

  1. ^

     Quote (emphasis mine): “And if I expect that I have absolutely no idea what the black swans will look like but also have no reason to believe black swans will make this event any more or less likely, then even though I won't adjust my credence further, I can still increase the variance of my distribution over my future credence for this event.”

  2. ^

     See also Buhler, section “The good and bad consequences of an action we can’t estimate without judgment calls cancel each other out, such that judgment calls are unnecessary”.

  3. ^

     Assume our only way to evaluate strategies is via Meta-extrapolation. We still need to weigh up the net direction of the evidence from past performance. But how do we choose the reference class of decision problems, especially the goals they’re measured against? It’s very underdetermined what makes goals “relevantly similar”. E.g.:

    • Relative to the goal “increase medium-term human welfare”, Tomasik’s philosophical reflection strategy worked well.
    • Relative to “save people’s immortal souls”, the consequences of modern philosophy were disastrous.

    The former goal matches the naturalism of EA-style impartial altruism, but the latter goal matches its unusually high stakes. Even with a privileged reference class, our sample of past decision problems might be biased, leading to the same ambiguities noted above.

  4. ^

     Technically, we don’t know the counterfactual, and one could argue that these strategies made atrocities in the 1900s worse. See, e.g., the consequences of dictators reflecting on ambitious philosophical ideas like utopian ideologies, or the rise of factory farming thanks to civilizational stability. At any rate, farmed animal suffering is an exception that proves the rule. Once we account for a new large set of moral patients whose welfare depends on different mechanisms, the trend of making things “better” breaks.

  5. ^

More formally: Our set of probability distributions P (in the UEV model) should include some p under which a strategy that succeeded in the past is systematically worse for the impartial good (that is, the UEV is lower) than other strategies.

  6. ^

     Recap: Adherence to a heuristic isn’t in itself an impartial altruistic justification for a strategy. We need some argument for why this heuristic leads to impartially better outcomes, despite severe unawareness.

  7. ^

     Quote: “If any of these propositions is wrong, the argument loses all force, and so they require a relatively detailed picture of the world to be accurate. The argument for the general goodness of economic progress or better information seems to be much more robust, and to apply even if our model of the future is badly wrong.”

  8. ^

     “Power”, for example, consists of factors like:

    • access to advanced AI systems (which domains is it more or less helpful for them to be “advanced” in?);
    • influence over governments and militaries (which kinds of “influence”?);
    • memetic fitness;
    • bargaining leverage, and willingness to use it;
    • capabilities for value lock-in; and
    • protections from existential risk.

    I’m not sure why we should expect that by increasing any one of these factors, we won’t have off-target effects on the other factors that could outweigh the intended upside.

  9. ^

     We might try hedging against the backfire risks we incur now by also empowering our successors to fix them in the future (cf. St. Jules). However, this doesn’t work if some downsides are irreversible by the time our successors know how to fix them, or if we fail to empower successors with our values in the first place. See also this related critique of the “option value” argument for x-risk reduction.

  10. ^

     As Clifton emphasizes, this reasoning is not equivalent to Symmetry. The claim isn’t that the net welfare in “all the other effects” is zero in expectation, only that the indeterminate balance of those effects gives us no reason for/against choosing some option.

  11. ^

We might wonder: “Doesn’t the counterargument to Option 3 apply just as well to this approach of ‘suspend judgment on the impartial good, and act based on other normative criteria’?” It’s out of scope to give a full answer to this. But my current view is, even if the boundaries between effects we are and aren’t “clueless about” in an impartial consequentialist calculus are arbitrary, the boundary between impartial consequentialist reasons for choice and non-impartial-consequentialist reasons for choice is not arbitrary. So it’s reasonable to “bracket” the former and make decisions based on the latter.

  12. ^

     We take the expectation over s’ with respect to our beliefs p, because we’re uncertain what strategy our successor will follow.

  13. ^

     Quote: “Cluster thinking uses several models of the world in parallel … and limits the weight each can carry based on robustness (by which I mean the opposite of Knightian uncertainty: the feeling that a model is robust and unlikely to be missing key parameters).”​

  14. ^

     Quote (emphasis mine): “Correcting for missed parameters and overestimated probabilities will be more likely to cause “regression to normality” (and to the predictions of other “outside views”) than the reverse. … And I believe that the sort of outside views that tend to get more weight in cluster thinking are often good predictors of “unknown unknowns.” For example, obeying common-sense morality (“ends don’t justify the means”) heuristics seems often to lead to unexpected good outcomes…”


Magnus Vinding @ 2025-06-02T14:33 (+13)

Kudos for writing up this series, Anthony! :) It's not easy to face deeply unpleasant and inconvenient (potential) conclusions, let alone to stare them down and write about them.

I have a lot of thoughts on what you write, but for now I'll share just three of them (in separate comments).

Magnus Vinding @ 2025-06-02T15:50 (+10)

Toward the very end, you write:

“But what should we do, then?” Well, we still have reason to respect other values we hold dear — those that were never grounded purely in the impartial good in the first place. Integrity, care for those we love, and generally not being a jerk, for starters. Beyond that, my honest answer is: I don’t know.

You obviously don't exclude the following, but I would strongly hope that — beyond just integrity, care for those we love, and not being a jerk — we can also at a minimum endorse a commitment to reducing overt and gratuitous suffering taking place around us, even if it might not be the single best thing we can do from the perspective of perfect impartiality across all space and time. This value seems to me to be on a similarly strong footing as the other three you mention, and it doesn't seem like it stands or falls with perfect [or otherwise very strong] cosmic impartiality. I suspect you agree with its inclusion, but I feel like it deserves emphasis in its own right.

Relatedly, in response to this:

Ask yourself: Does “this strategy seems good when I assume away my epistemic limitations” have the deep moral urgency that drew you to EA in the first place?

I would say "yes", e.g. if I replace "this strategy" with something like "reducing intense suffering around me seems good [even] when I assume away my epistemic limitations [about long-term cosmic impacts]”. That does at least carry much of the deep moral urgency that motivates me. I mean, just as I can care for those I love without needing to ground it in perfect cosmic impartiality, I can also seek to reduce the suffering of other sentient beings without needing to rely on a maximally impartial perspective.

Anthony DiGiovanni @ 2025-06-09T08:28 (+6)

Thanks for this Magnus, I have complicated thoughts on this point, hence my late reply! To some extent I'll punt this to a forthcoming Substack post, but FWIW:

As you know, relieving suffering is profoundly important to me. I'd very much like a way to make sense of this moral impulse in our situation (and I intend to reflect on how to do so).

But it's very important that the problem isn't that we don't know "the single best thing" to do. It's that if I don't ignore my effects on far-future (etc.) suffering, I have no particular reason to think I'm "relieving suffering" overall. Rather, I'm plausibly increasing or decreasing suffering elsewhere, quite drastically, and I can't say these effects cancel out in expectation. (Maybe you're thinking of "Option 3" in this section? If so, I'm curious where you disagree with my response.)

The reason suffering matters so deeply to me is the nature of suffering itself, regardless of where or when it happens — presumably you'd agree. From that perspective, and given the above, I'm not sure I understand the motivation for your view in your second paragraph. (The reasons to do various parochial things, or respect deontological constraints, aren't like this. They aren't grounded in something like "this thing out there in the world is horrible, and should be prevented wherever/whenever it is [or whoever causes it]".)

Magnus Vinding @ 2025-06-13T12:58 (+2)

Yeah, my basic point was that just as I don't think we need to ground a value like "caring for those we love" in whether it has the best consequences across all time and space, I think the same applies to many other instances of caring for and helping individuals — not just those we love.

For example, if we walk past a complete stranger who is enduring torment and is in need of urgent help, we would rightly take action to help this person, even if we cannot say whether this action reduces total suffering or otherwise improves the world overall. I think that's a reasonable practical stance, and I think the spirit of this stance applies to many ways in which we can and do benefit strangers, not just to rare emergencies.

In other words, I was just trying to say that when it comes to reasonable values aimed at helping others, I don't think it's a case of "it must be grounded in strong impartiality or bust". Descriptively, I don't think that reflects virtually anyone's actual values or revealed preferences, and I don't think it's reasonable from a prescriptive perspective either (e.g. I don't think it's reasonable or defensible to abstain from helping a tormented stranger based on cluelessness about the large-scale consequences).

Anthony DiGiovanni @ 2025-06-14T15:25 (+2)

I've replied to this in a separate Quick Take. :) (Not sure if you'd disagree with any of what I write, but I found it helpful to clarify my position. Thanks for prompting this!)

Jim Buhler @ 2025-06-25T07:28 (+5)

I've just realized that I find your objections to Clifton's Option 3 much less compelling when applied to something like the following scenario I'm making up:

Four miles away from you, there's a terrorist you want dead. A sniper is locked on his position (accounting for gravity). No one has ever hit a target from this distance. The sniper will overwhelmingly likely miss the shot because of factors (other than gravity) affecting the bullet you cannot estimate from that far (the different wind layers, the Earth's rotation, etc.). You're tempted to think "well, there's no harm in trying" except that the terrorist is holding a kid you do not want dead and they cover exactly as much surface area the way they stand. Say your utility for hitting the target is +1 trillion and -1 trillion for hitting the kid, and you are an EV maximizer, and you're the one who has to tell the sniper whether to take the shot or to stand down. If you thought the sniper's shot was even a tiny bit more likely to hit the target than the kid, you would tell her to take it. But you don't think that! (Something like the principle of indifference seemingly doesn't work here.) You're (complexly) clueless and therefore indifferent between shooting and not shooting. But this was before you remembered that Emily, the sniper, told you the other day that her shoulder hurts every time she takes a shot. You care about Emily's shoulder pain much less than you care about where the bullet ends up (say utility -1 if she shoots). But well, doesn't that give you a very good Option-3-like reason to tell her to stand down?

If I take your objections to Option 3 and replace some of the words to make it apply to my above scenario, I intuitively find them almost crazy. Do you have the same feeling? Is there a relevant difference with my scenario that I'm missing? (Or maybe my strong intuition for why we should tell Emily to stand down is actually not because of Option 3 being compelling but something else?)

MichaelStJules @ 2025-06-27T00:53 (+4)

Against option 3, you write:

There are many different ways of carving up the set of “effects” according to the reasoning above, which favor different strategies. For example: I might say that I’m confident that an AMF donation saves lives, and I’m clueless about its long-term effects overall. Yet I could just as well say I’m confident that there’s some nontrivially likely possible world containing an astronomical number of happy lives, which the donation makes less likely via potentially increasing x-risk, and I’m clueless about all the other effects overall. So, at least without an argument that some decomposition of the effects is normatively privileged over others, Option 3 won’t give us much action guidance.

Wouldn't you also say that the donation makes these happy lives more likely on some elements of your representor via potentially decreasing x-risk? So then they're neither made determinately better off nor determinately worse off in expectation, and we can (maybe) ignore them.

Maybe you need some account of transworld identity (or counterparts) to match these lives across possible worlds, though.

Anthony DiGiovanni @ 2025-06-27T13:52 (+2)

Maybe you need some account of transworld identity (or counterparts) to match these lives across possible worlds

That's the concern, yeah. When I said ”some nontrivially likely possible world containing an astronomical number of happy lives”, I should have said these were happy experience-moments, which (1) by definition only exist in the given possible world, and (2) seem to be the things I ultimately morally care about, not transworld persons.[1] Likewise each of the experience-moments of the lives directly saved by the AMF donation only exist in a given possible world.

  1. ^

    (Or spacetime regions.)

Magnus Vinding @ 2025-06-02T15:12 (+3)

You write the following in the first post in the sequence (I comment on it here because it relates closely to similar remarks in this post):

if my arguments hold up, our reason to work on EA causes is undermined.

This claim seems to implicitly assume that perfect impartiality [edit: or very strong forms of impartiality] across all space and time is the only reason or grounding we could have for working on EA causes. But that's hardly the case — there are countless alternative reasons or moral stances that could ground support for EA (or work on EA causes). For example, one could be normatively near-termist, or one could gradually discount the interests of beings depending on how distant they seem to be from our potential to predictably help them (and one could then still aim to be effective and indeed impartial within the parameters of those frameworks).

Of course, one might disagree with views that aren't perfectly impartial, and argue that they're implausible, but that's different from saying that they can't coherently ground work on EA causes. (And indeed, the arguments raised in this series could be viewed as lending some support to such alternative views, or at least give us reason to be more curious about them.)

Anthony DiGiovanni @ 2025-06-02T16:07 (+2)

perfect impartiality across all space and time

 

Sorry this wasn't clear — my arguments throughout this sequence don't just apply to perfect impartiality. I use "impartiality" loosely, in the sense in the first sentence of the intro: "gives moral weight to all consequences, no matter how distant". (I couldn't think of a better, non-clunky term for this.) See also footnote 4 of the first post:

“Nontrivial moral weight to distant consequences” is deliberately vague. I mean to include not only unbounded total utilitarianism, but also various bounded-yet-scope-sensitive value functions (see Karnofsky, section “Holden vs. hardcore utilitarianism”, and Ngo)

I'm happy to grant that normative neartermism is immune to my arguments. When I said "our reason to work on EA causes", I meant "the reason to work on such causes that my target audience actually endorses."

Magnus Vinding @ 2025-06-02T16:31 (+3)

I use "impartiality" loosely, in the sense in the first sentence of the intro: "gives moral weight to all consequences, no matter how distant".

Thanks for clarifying. :)

How about views that gradually discount at the normative level based on temporal distance, like  or so? They would give weight to consequences no matter how distant, and still give non-trivial weight to fairly distant consequences (by ordinary standards), yet the weight would go to zero as the distance grows. If normative neartermism is largely immune to your arguments, might such "medium-termist" views largely withstand them as well?

(FWIW, I think views of that kind might actually be reasonable, or at least deserve some weight, in terms of what one practically cares about and focuses on — in part for the very reasons you raise.)

Anthony DiGiovanni @ 2025-06-08T09:31 (+2)

I think such a view might also be immune to the problem, depends on the details. But I don't see any non-ad hoc motivation for it. Why would sentient beings' interests matter less intrinsically when those beings are more distant or harder to precisely foresee?

(I'm open to the possibility of wagering on the verdicts of this kind of view due to normative uncertainty. But different discount rates might give opposite verdicts. And seems like a subtle question when this wager becomes too Pascalian. Cf. my thoughts here.)

Magnus Vinding @ 2025-06-13T11:50 (+3)

Why would sentient beings' interests matter less intrinsically when those beings are more distant or harder to precisely foresee?

I agree with that sentiment :) But I don't think one would be committed to saying that distant beings' interests matter less intrinsically if one "practically cares/focuses" disproportionally on beings who are in some sense closer to us (e.g. as a kind of mid-level normative principle or stance). The latter view might simply reflect the fact that we inhabit a particular place in time and space, and that we can plausibly better help beings in our vicinity (e.g. the next few thousands of years) compared to those who might exist very far away (e.g. beyond a trillion years from now), without there being any sharp cut-off between our potential to help them.

FWIW, I don't think it's ad hoc or unmotivated. As an extreme example, one might consider a planet with sentient life that theoretically lies just inside our future light cone from time t_now, such that if we travelled out there today at the theoretical maximum speed, then we, or meaningful signals, could reach them just before cosmic expansion makes any further reach impossible. In theory, we could influence them, and in some sense merely wagging a finger right now has a theoretical influence on them. Yet it nevertheless seems to me quite defensible to practically disregard (or near-totally disregard, à la asymptotic discount) these effects given how remote they are (assuming a CDT framework).

Perhaps such a position can be viewed from the lens of an "applicability domain": to a first approximation, the ideal of total impartiality is plausibly "practically morally applicable" on all of Earth and on and somewhat beyond our usual timescales. And we are right to strongly endorse it at this unusually large scale (i.e. unusual relative to prevailing values). But it also seems plausible that its applicability gradually breaks down when we approach extreme values.

Indeed, bracketing off "infinite ethics shenanigans" could be seen as an implicit acknowledgment of such a de-facto breakdown or boundary in the practical scope of impartiality. After all, there is a non-zero probability of an infinite future with sentient life, even if that's not what our current cosmological models suggest (cf. Schwitzgebel's Washout Argument Against Longtermism). Thus, it seems that if we limit infinite outcomes from dominating everything, we have already set some kind of practical boundary (even if it's a practical boundary of asymptotic convergence toward zero across an in-theory infinite scope). If so, it seems that the question is to clarify the nature and scope of that practical boundary, not whether it's there or not.

One might then say that infinite ethics considerations indeed count as an additional, perhaps also devastating challenge to any form of impartial altruism. But in that case, the core objection reduces to a fairly familiar objection about problems with infinities. If we make an alternative case, in which we assume that infinities can be set aside or practically limited, then it seems we have already de facto assumed some practical boundary.

Anthony DiGiovanni @ 2025-06-17T13:20 (+3)

In theory, we could influence them, and in some sense merely wagging a finger right now has a theoretical influence on them. Yet it nevertheless seems to me quite defensible to practically disregard (or near-totally disregard, à la asymptotic discount) these effects given how remote they are

Sorry, I'm having a hard time understanding why you think this is defensible. One view you might be gesturing at is:

  1. If a given effect is not too remote, then we can model actions A and B's causal connections to that effect with relatively high precision — enough to justify the claim that A is more/less likely to result in the effect than B.
  2. If the effect is highly remote, we can't do this. (Or, alternatively, we should treat A and B as precisely equally likely to result in the effect.)
  3. Therefore, we can only systematically make a difference to effects of type (1). So only those effects are practically relevant.

But this reasoning doesn't seem to hold up for the same reasons I've given in my critiques of Option 3 and Symmetry. So I'm not sure what your actual view is yet. Can you please clarify? (Or, if the above is your view, I can try to unpack why my critiques of Option 3 and Symmetry apply just as well here.)

Magnus Vinding @ 2025-06-02T16:16 (+2)

I meant "the reason to work on such causes that my target audience actually endorses."

I suspect there are many people in your target audience who don't exclusively endorse, or strictly need to rely on, the views you critique as their reason to work on EA causes (I guess I'm among them).

Magnus Vinding @ 2025-06-02T14:39 (+2)

The first thought I have is mostly an impression or something that stood out to me: it seems to me like the word choices here sometimes don't quite reflect the point being made or the full range of views being critiqued, arguably including the strongest competing views.

For example, when talking about heuristics that are supposed to be "robust", or strategies we can "reliably intervene on", or whether we can "reliably weigh up" relevant effects, etc, it seems to me that these word choices convey something much stronger than what would necessarily be endorsed by weaker and arguably more defensible types of views that differ from yours.

After all, views different from yours might agree that no heuristics are outright robust and that we can't reliably weigh up the relevant effects, while nevertheless holding something weaker, like that some heuristics are slightly better than others in expectation, or that we in general can do meaningfully (or even just marginally) better than nothing.

Of course, I get that language is always a bit vague, and I'm sure you meant to include a very broad range of views, including very "weak" ones. But at least in terms of how I read it, the language often seemed to invoke a much stronger and more narrow set of views than necessary.

Anthony DiGiovanni @ 2025-06-02T15:10 (+2)

Thanks Magnus — I'm not sure I understand your objection or which specific language choices in the posts seem too strong, yet.

nevertheless holding something weaker, like that some heuristics are slightly better than others in expectation, or that we in general can do meaningfully (or even just marginally) better than nothing

Can you say a bit more about why you think I haven't adequately responded to this perspective, in the sequence? ("Degrees of imprecision", "The 'better than chance' argument", and my response to "Meta-extrapolation" explain why I don't think "X is slightly better than Y in expectation" makes sense here.)

Magnus Vinding @ 2025-06-02T15:18 (+2)

I should probably have made it more clear that this isn't an objection, and maybe not even much of a substantive point, but more just a remark on something that stood out to me while reading, namely that the views critiqued often seemed phrased in much stronger terms than what people with competing views would necessarily agree with.

Some of the examples that stood out were those I included in quotes above.

Anthony DiGiovanni @ 2025-06-04T07:12 (+2)

Some of the examples that stood out were those I included in quotes above

I'm still confused, sorry. E.g., "reliably" doesn't mean "perfectly", and my hope was that the surrounding context was enough to make clear what I mean. I'm not sure which alternative phrasings you'd recommend (or why you think there's a risk of misrepresentation of others' views when I've precisely spelled out, e.g., the six "standard approaches" in question).