What to do about near-term cluelessness in animal welfare
By Anthony DiGiovanni @ 2025-10-08T20:56 (+87)
(Context: I'm not an expert in animal welfare. My aim is to sketch a potentially neglected perspective on prioritization, not to give highly reliable object-level advice.)
Summary
We seem to be clueless about our long-term impact. We might therefore consider it more robust to focus on neartermist causes, in particular animal welfare.[1] But if we also take seriously our deep uncertainty about our impact on animals, what implications does this have for animal welfare prioritization?
This post will explain:
- why I think we could be clueless about even the near-term impact of many animal welfare interventions (more);
- what criteria I think an intervention must satisfy to be robust to near-term cluelessness (more); and
- how these criteria compare to existing approaches to robustness (more).
Practical takeaways for cost-effectiveness analyses:
- Include estimates of key backfire effects (more), such as:
- large increases in populations of wild animals with net-negative lives;
- substituting consumption of larger animals with (a greater number of) smaller animals;
- increasing the efficiency of farming; and
- indirect pathways to the above backfire effects, e.g., through funging or untargeted capacity building.
- Do sensitivity analyses for the intervention's sign, across multiple probability distributions over variables in the cost-effectiveness estimate.
- The intervention should be positive-EV with respect to all distributions that seem reasonable, at least if we only consider the intervention's effects that are "robust" in a sense I explain below. (If we were to instead aggregate the distributions, the way we aggregate them might be arbitrary. More on this here.)
- Even if your "best guess" is that some intervention is more effective than others, you should deprioritize that intervention if its intended upsides…
- occur relatively far in the future;
- occur within a relatively complex system; or
- canât be empirically validated.
See footnote for some caveats and context.[2]
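As a rough illustration of the sensitivity analysis recommended above: estimate the intervention's EV under each of several distinct probability distributions over a key variable, and check whether the sign agrees. Everything here is hypothetical — the two distributions over the backfire effect, the fixed benefit of 0.2, and the Monte Carlo setup are illustrative assumptions, not estimates from this post.

```python
import random

random.seed(0)

def ev_of_intervention(backfire_dist, n=100_000, benefit=0.2):
    # Monte Carlo estimate of E[benefit - backfire] per animal affected.
    return sum(benefit - backfire_dist() for _ in range(n)) / n

# Two defensible-seeming distributions over the backfire effect
# (suffering added per animal, in the same units as `benefit`):
optimistic = lambda: random.uniform(0.0, 0.1)   # mean 0.05
pessimistic = lambda: random.uniform(0.0, 0.6)  # mean 0.30

signs = {"optimistic": ev_of_intervention(optimistic) > 0,
         "pessimistic": ev_of_intervention(pessimistic) > 0}
print(signs)  # {'optimistic': True, 'pessimistic': False}
```

Since the sign disagrees across the two distributions, this hypothetical intervention would not count as positive-EV with respect to all distributions that seem reasonable.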
Introduction
Compared to having a robustly positive long-term impact, reducing near-term animal suffering seems quite tractable. We can become relatively confident in (some) interventions' effects via feedback loops, the track records of similar interventions, and intuitions calibrated on problems of similar complexity.
Still, even just considering our effects on animals in the near term, we face epistemic challenges similar to those of longtermism, though much more manageable. Let's look at two general categories of these challenges.
First, a classic critique of long-term impact estimates is that they depend on made-up numbers (Greaves; Violet Hour; Masrani), based on intuitions that haven't been honed by relevant feedback. But suppose we try to weigh the upsides of broad animal welfare movement-building, such as increasing funding for welfare reforms, against the downsides from increasing wild animal suffering (Tomasik). Where do these weights come from? We might go with a rough Fermi estimate, if we had to commit to a precise guess. Yet there are alternatives to committing to a precise guess, which seem more appealing when our decision depends on which guess we pick, as I'll unpack below (see also this post).
Or, let's say we're confident that some charity A wouldn't increase expected wild animal suffering. But there's a prospective donor to A whose second-favorite funding opportunity is another charity B, which we worry would increase expected wild animal suffering. Then we need to assess the probability of funging, i.e., the possibility that if we fill charity A's funding gap, we'll make this other donor more likely to fund B (where they otherwise would've funded A anyway). (More on this later.) Perhaps we could try guesswork like GiveWell's below, but what's the justification for their precise estimate?
Our best guess is that the average counterfactual use of domestic government spending that could be leveraged by our top charities is ~75% as cost-effective as GiveDirectly. We think using this figure is a useful heuristic, which roughly accords with our intuitions (and ensures we're being consistent between charities), but we don't feel confident that we have a good sense of what governments would counterfactually spend their funds on, or how valuable those activities might be. (Snowden)
Second, take crucial considerations: We've discovered insights that flipped the sign of various interventions' near-term effects, not just long-term ones. E.g., reducing cow farming plausibly increases suffering by increasing wild animal populations (Shulman; Tomasik), or by shifting consumption to smaller animals (Charity Entrepreneurship). Since we've been surprised by crucial considerations before, it seems we should expect there are others we're currently unaware of, that is, unknown unknowns. Even lots of research won't necessarily turn up all the sign-flipping considerations, given the complexity of the system of different animals and human societies we're intervening on. How should we adjust our EV estimates to factor in unknown unknowns?
It's tempting to reply, "Unknown unknowns aren't action-relevant. Either we have no idea what they imply anyway, or they shrink the EV of everything by the same factor." However (as elaborated below),[3] it seems too strong to assume unknown unknowns conveniently cancel out "in expectation". For some interventions, then, it will be so ambiguous how to adjust for unknown unknowns that we end up clueless about them. Yet, when we look closely at other interventions' mechanisms of impact on animals, we may still have reasons to consider these interventions positive.
Effective animal suffering reducers therefore face at least a weak form of cluelessness. What kinds of interventions are better than inaction despite these challenges, that is, cluelessness-robust? I propose that we should only consider an intervention cluelessness-robust if:
- its positive effects that we consider decision-relevant (i.e., "robust") outweigh the robust backfire effects;
- our standard for "outweighing" doesn't depend on arbitrary estimates; and
- this standard accounts for unknown unknowns.
(Clearly, we care about cost-effectiveness, not just doing better than inaction. But the worry is that even if an intervention looks very cost-effective according to our precise best guess, it might be no better than inaction on closer inspection.)
Principles for a cluelessness-robust intervention
Throughout, the examples are not confident claims. They're mostly meant to be illustrative.
1. Accounts for robust backfire effects
Informally, an effect of an intervention is "robust" if we have reason to factor it into our decision-making.[4] So, robust backfire effects are those that we have at least as much reason to factor into our decisions as the robust upsides.
Example
You're unsure whether to donate to a charity that tries to make fish slaughter more humane via electrical stunning. Here are some backfire effects you might consider, and intuitively appropriate responses to them (I'm not yet claiming there's a rigorous way of justifying these intuitive responses):
- Non-robust backfire effect: Maybe promoting humane slaughter makes society more complacent about exploiting nonhumans, eventually leading to the mistreatment of astronomically many digital beings in the far future. On the other hand, we could also imagine that by expanding society's moral circle, promoting humane slaughter prevents the mistreatment of astronomically many future beings. We have "complex cluelessness" about how to weigh the first (very bad) effect against the second (very good) effect; that is, they're not precisely equally likely and severe. Nevertheless, it doesn't seem like our cluelessness about these effects should override the benefits to the fish, which we're not clueless about. Overall, then, the "increasing complacency in the far future" backfire effect doesn't seem robust.
- Robust backfire effect: It's plausible that stunning is botched frequently enough (van Pelt) that stunning interventions cause more pain, in expectation, for the population of fish they're intended to help. This backfire mechanism doesn't seem less decision-relevant than the upside mechanism, so it's robust. This doesn't mean the intervention is overall negative-EV, but, depending on the empirical details, it could mean the intervention isn't cluelessness-robust.
More detail
While these examples give the intuition, the hard part is making precise which effects we have "reason" to factor into our decisions (see here for one attempt). And many net-positive interventions will have some backfire effects. But it's worth deliberately keeping robust backfire effects in mind, because:
- We're clueless about the sign of our actions' long-term impact due to various speculative effects, both positive and negative. This means the case for doing animal interventions in the first place relies on "bracketing out" the far-future effects of our interventions, in some sense, from our decision-making. Why would we do that? Because of the intuition in the bullet point above: effects we're clueless about shouldn't override effects we're not clueless about. So we might be tempted to over-extend this reasoning, for example, by conveniently bracketing out any effects beyond the "first order". Instead, we should decide which effects to factor into our decision-making by some minimally ad hoc standard, such as the framework in the linked post, or, "are my beliefs about these effects strongly grounded in evidence, and am I not clueless overall about their sign?" (see Clifton).
- It's somewhat common to treat backfire effects as more speculative than intended effects by default. Yet, we're tackling particularly unfamiliar problems from an uncommon ethical perspective. That is, we're trying to make impartial tradeoffs among the welfare of various species, across both the human economy and natural habitats. So it's not surprising for unintended effects to compete with intended effects. (That said, insofar as the problems in animal welfare are considerably less unfamiliar than longtermist problems, we shouldn't be too pessimistic either.)
Some key potential backfire risks, which may or may not be robust depending on the specific intervention:
- Significantly increasing wild invertebrate populations, typically via reducing agricultural land use.
- The case for this effect being bad seems strongest under suffering-focused ethics, though people with other ethical views have also raised this concern.
- Reducing pasture, as opposed to cropland, is the most robustly bad instance of this effect: Although replacing cropland with natural habitats increases invertebrate populations, it could also prevent more painful deaths from pesticides.[5]
- Substitution of larger animals with smaller ones, also known as the small animal replacement problem.
- This includes not only advocacy that directly encourages consumers to eat fewer large farmed animals, but also, e.g., welfare reforms for larger farmed animals that could increase prices (Tomasik).
- Increasing farmed animal production via economic incentives.
- For example, if we push for a subsidy for some more humane farming method to make farmers strictly prefer that method, they might increase production.[6]
- Displacement effects.
- For example, by blocking the development of factory farms in one location, we might increase imports of meat from less humane or more efficient farms elsewhere (Gould).
- Funging of donations with work that could plausibly backfire in the above ways (recall the definition above).
- Funging generally looks more worrisome when our standards for robustness are unusually high, since we'd have less reason to expect an "efficient charity market" (Karnofsky). (See Tomasik for more discussion.)
- A more in-depth discussion of funging is out of scope for this post. For now, I'll just say that I'm not taking a particular stance on how to approach the problem, especially how to think about different levels of community-wide coordination.
2. Doesn't depend on arbitrary estimates
Example
You're deciding whether to fund a corporate campaign for a new chicken welfare reform. This reform appears less effective than the Better Chicken Commitment reforms (assume BCC reforms have hit diminishing returns), but still promising. Except, you wonder about the substitution effect: Maybe when the reform increases the price of chicken and eggs, many consumers will respond by eating a much larger number of fish and shrimp per calorie (Buhler).
Intuitively, you're not too worried about this downside, since you guess that a large fraction of substitution will be for much larger animals (cows and pigs). But you don't have very precise, conclusive evidence about the magnitude of the effect, even after consulting studies on price elasticities.[7] You try to roughly guess the expected increase in suffering-years of farmed fish and shrimp, for each suffering-year of chicken affected by the reform, and you come up with numbers from 0.5% to 25%. (The larger figure is partly due to the large number of days in captivity per farmed fish, compared to chickens (Welfare Footprint Institute).[8]) And you also feel pretty confused about how much suffering the reform would reduce per chicken. You suppose it might be as effective as reducing 30% of each chicken's suffering in expectation, but feel you could just as well say it's only 5%.
Given the 0.5% figure for increased farmed fish and shrimp suffering-years per chicken suffering-year, and the 30% figure for decreased chicken suffering, you'd net-reduce suffering-years by 29.5%. So the campaign would be good. But given the figures of 25% and 5%, respectively, you'd net-increase suffering-years by 20%, so the campaign would be bad. Then it looks like an arbitrary call overall whether you should fund the campaign.
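The arithmetic in this example can be written out as a tiny sketch (the function and variable names are mine, not the post's):

```python
def net_suffering_change(chicken_reduction, substitution_increase):
    # Net change in suffering-years per chicken suffering-year affected;
    # negative values mean the campaign reduces suffering overall.
    return substitution_increase - chicken_reduction

# Optimistic corner: 30% reduction per chicken, 0.5% substitution increase.
print(round(net_suffering_change(0.30, 0.005), 4))  # -0.295: net reduction of 29.5%

# Pessimistic corner: 5% reduction, 25% substitution increase.
print(round(net_suffering_change(0.05, 0.25), 4))   # 0.2: net increase of 20%
```

Because the sign flips between the two corners of the (seemingly defensible) parameter ranges, the campaign fails the arbitrariness test described below.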
More detail
Once we've accounted for an intervention's robust effects on welfare, we of course need to weigh up the positives vs. negatives: both the amount of welfare affected, and how likely each effect is. In cases like the example above, though, lots of different precise weights seem defensible. Committing to any one set of weights thus looks arbitrary. How should we respond in these cases? I think that if the net sign of the intervention's robust effects varies based on an arbitrary choice of weights, we can't call it cluelessness-robust.
Sub-principles:
- Not sensitive to comparisons of small probabilities. The intervention's sign shouldn't hinge on whether one very unlikely outcome is more likely than another. This is because our intuitions probably aren't calibrated to distinguish, say, 0.001 from 0.0001. So we might worry that the relative sizes of our small probability estimates are arbitrary. (This isn't risk aversion; see below.)
- Example: Genetic engineering to eliminate wild animals' capacity for pain without appreciably changing their behavior would, if successful, avert a massive amount of suffering. So at first, we might not worry about unlikely downsides of doing this work, like "the intervention actually does change the animals' behavior in ways that greatly increase other ('off-target') populations", or "we undermine the EA animal welfare movement's credibility by working on a controversial cause". But plausibly, "we get the intervention implemented despite pushback, with minimal effects on off-target population sizes" is also unlikely. If so, it could be difficult to robustly conclude the expected upsides from the latter outcome are larger than the downsides.
- Not sensitive to different reference classes for forecasting. We might assign the probabilities of the intervention's possible outcomes based on the past frequencies of relevantly similar outcomes. But if the intervention (or the problem it's meant to solve) is quite novel, we could have multiple plausible definitions of "relevantly similar outcomes" to extrapolate from, i.e., reference classes. The intervention's sign shouldn't differ depending on which reference class we choose.
- Example: Suppose we're considering using a "bad cop" tactic, in advocacy for some kind of animal no one has tried it for before (say, insects). And we're checking how likely this is to backfire, e.g., by making the industry or general public more hostile toward reformers. Across all the animal welfare campaigns on record in the movement's history, the tactic worked 15% of the time and backfired 5% of the time. But among campaigns for smaller animals, the tactic worked only 5% of the time and backfired 10% of the time. (Although the smaller-animal campaigns are more similar to our case, we have very little data on them, so arguably we should still put some weight on the broader set of campaigns.) Overall, then, it's not clear we should expect the tactic to be net-good in our case.
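As a sketch of this reference-class sensitivity: suppose (hypothetically) that a backfire is twice as bad as a success is good, and that we mix the two reference classes with some weight w on the smaller-animal class. The sign of the tactic's expected value then flips with w — both the badness ratio and the weights are illustrative assumptions:

```python
def expected_value(w_narrow, badness_ratio=2.0):
    # Success/backfire frequencies from the example above.
    broad = 0.15 - badness_ratio * 0.05    # all campaigns: worked 15%, backfired 5%
    narrow = 0.05 - badness_ratio * 0.10   # smaller animals: worked 5%, backfired 10%
    return (1 - w_narrow) * broad + w_narrow * narrow

print(expected_value(0.2) > 0)  # True: mostly trusting the broad reference class
print(expected_value(0.8) > 0)  # False: mostly trusting the narrow one
```

Since both weightings seem defensible, the tactic's sign depends on an arbitrary choice of reference class, so it fails this sub-principle.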
3. Accounts for unknown unknowns
Example
You're deliberating between two interventions for some species of farmed animal. You could fund either a welfare reform campaign, or a campaign to block the development of several new factory farms. Assume (generously!) that you have a robust estimate of the cost per individual animal affected, and that the only robust effects of either intervention are on this species.
On one hand, the welfare campaign is somewhat cheaper. But in the past you've come across ways that welfare reforms might not decrease as much suffering as initially expected, or might even increase it, e.g., effects of cage-free reforms on chicken injuries and mortality (Rufener and Makagon; Cotra), botched stunning, and potential increases in wild fish catch in response to decreased demand (St. Jules). You worry the same could apply to this reform, even if you can't think of a specific backfire mechanism. If the animals aren't born into factory farms at all, though, they definitely won't experience net-negative lives. Accounting for this, overall, you favor the development-blocking campaign.[9]
More detail
Suppose we've tried weighing up an intervention's robust effects, based on all the considerations we're aware of. And the intervention looks good under every reasonable weighing. Are we in the clear?
Not quite. We're likely unaware of some considerations about the robust effects. And we should expect that if we knew those considerations, we'd consider some effects more or less likely. Though it's unclear precisely how to adjust our current estimates to reflect that expectation, we should still do so. (Even when we try to make "conservative" adjustments, our intuitions can fail us pretty badly (Wildeford).) Therefore, if the net sign of the intervention's robust effects varies across different reasonable ways to adjust for unknown unknowns, it's not cluelessness-robust.
It might seem impossible to say anything principled or action-relevant about unknown unknowns. But the potential for unknown unknowns seems greater in some domains than others. Most notably, when an intervention's pathway to its intended benefits is more complex or involves more unfamiliar variables, we should expect there to be more unknown unknowns relevant to this pathway (more below). Arguably, then, we should prioritize interventions whose paths to impact we deeply, mechanistically understand.[10]
Sub-principles:
- Not dependent on predicting relatively long-term events or complex mechanisms. The causal pathway from "we start implementing the intervention" to "intended benefits" should be as quick and simple as feasible. This is because it's easier to deeply understand near-term, simple target variables: If we don't have a deep understanding of our target variable, unknown unknowns seem more likely to swamp our impact on this target. (Compare this to the discussion of "implementation robustness" in this post, and the longtermist strategy of focusing on preventing near-term lock-in events.)
- An important implication: If we expect the intervention's upsides to occur systematically later than the downsides, then it's especially plausible that the downsides are greater. For example, it's been suggested[11] that even if a farmed animal intervention greatly increases wild animal suffering in the near term, this downside will eventually be offset by a more animal-friendly society helping wild animals. This argument seems weaker when we note that the longer-term upsides are less robust to unknown unknowns.
- Examples:
- The intended benefits of broad moral circle expansion outreach come from various difficult-to-control behavior changes in the future, within a complex social system. So it might be hard to ensure the robust effects are net-positive. Even if outreach inspires more people to try to help animals, this isn't robustly positive if the popular interventions they're likely to support are not themselves robustly positive.
- Contrast this with, e.g., convincing a particularly thoughtful donor of invertebrate sentience, or paying farmers to use more humane pesticides (Tomasik; Grilo). (This isn't to say either of these strategies is, in practice, clearly robust all things considered. I only claim they're less prone to the particular problem above than broad moral circle expansion.)
- Testable and controllable. Ideally, we should be able to get feedback loops on the intervention's direct outcomes of interest (not just proxies), and to steer or terminate the intervention if we later conclude it's non-robust. These are generally well-known criteria for effective philanthropy (Kaufman; Liedholm). But they're especially important for cluelessness-robustness: If we don't get feedback on an intervention and/or it's uncontrollable, our models of the intervention's effects will be more coarse-grained, hence more brittle to unknown unknowns. And when our intuitions about the intervention's effects aren't honed by feedback loops, they're less likely to implicitly account for unknown unknowns. (That said, in some cases, sufficiently strong evidence about the simple causal pathways underlying an intervention's impact might be enough.)
- Examples:
- Again, outreach and field-building are pretty weakly controllable. And these activities don't tend to admit quick feedback loops on the outcomes we ultimately care about, i.e., getting robust interventions implemented, as opposed to the proxy measure of increasing the number of researchers interested in animal welfare.
- Most direct interventions for animals seem to allow for feedback loops, but this might vary across classes of interventions. E.g., suppose we support human life-saving charities to reduce wild invertebrate populations, but we're currently not confident in the net effect on these populations. The data we gain from supporting these charities would likely be too coarse-grained to change our degree of confidence.
Comparison to other approaches to robustness in cause prioritization
EAs, including those working in animal welfare specifically, have often remarked on the importance of robustness in some sense. Given this, I want to clarify two points. (This section isn't essential if you're short on time, but it will probably give important context.)
First, cluelessness-robustness aligns in some important ways with existing standards of robustness. So, at least to some degree, animal advocates with diverse epistemic perspectives could coordinate on cluelessness-robust interventions. In particular, here's how some principles in the list above relate to properties people typically consider indicative of robustness:
- "Accounts for unknown unknowns" and "Testable and controllable" → cluelessness-robust interventions aren't "speculative"; that is, they don't depend on a path to impact that's unprecedented, lacking in empirical evidence, and premised on a rough best guess.
- "Not sensitive to comparisons of small probabilities" → cluelessness-robust interventions are less prone (though not immune) to Pascal's muggings.
Second, however, there are also crucial differences. Hence, my framework of cluelessness-robustness is action-relevant for effective animal advocates. Namely:
- Cluster thinking and Tomasik's notion of robustness emphasize the importance of common sense. But evaluating interventions for cluelessness-robustness often requires taking counterintuitive arguments and beyond-first-order effects seriously. This is because we shouldn't expect our intuitions to precisely price in unknown unknowns, or accurately weigh up the interests of different species, etc.[12]
- When we have significant uncertainty about some question relevant to our impact, a common response is to research that question. On one hand, I would indeed recommend more research into (i) which interventions are cluelessness-robust, (ii) the cost-effectiveness of cluelessness-robust interventions, and (iii) empirical cruxes for (i) and (ii). On the other hand, research has low value of information if the intervention's impact is sensitive to very hard-to-access information, e.g., the dynamics of AGI takeoff. Based on the principles of cluelessness-robustness, we'd instead focus on interventions whose intended upsides aren't sensitive to factors we're resiliently clueless about.
- Cluelessness-robustness doesn't assume risk aversion (though they're compatible).[13] For example, even if we think it might be reasonable to assign very low probability to springtails being capable of intense suffering, an intervention focused on these animals could be cluelessness-robust. (At least, as long as we have a non-arbitrary basis for thinking we're systematically helping springtails rather than harming them.)
- Cluelessness-robustness goes beyond accounting for higher-order uncertainty or more sophisticated Bayesian analysis. Neither of these approaches (by themselves) addresses the deeper problems of sensitivity to arbitrary estimates, or unknown unknowns. This is a major reason I'm hesitant to take most grantmakers' back-of-the-envelope calculations at face value.
Conclusion and future directions
Many of us prioritize near-term animal welfare interventions, despite caring about the far future, because our epistemic standards rule out longtermist interventions. This post recommends that we consistently apply those epistemic standards to animal welfare interventions themselves. The high-level starting point is to check for backfire risks, and to make sure that when we say they're outweighed by the upsides, this verdict is robust to different reasonable judgment calls.
What implications do these standards have for our bottom-line prioritization? I'm not sure yet, and I'm keen to flesh this out more. Tentatively: Humane slaughter for invertebrates (or at least research into this) seems promising, if one can avoid funging with interventions that could be highly negative for some off-target animals. This is because:
- Invertebrate farming interventions are relatively robust to effects on wild animals and substitution effects. Invertebrate farming uses much less land,[14] and kills many more individuals per calorie, than farming of larger animals. If the intervention had a substitution effect via increased prices, consumers would switch to such larger animals (this includes replacement of feed insects with fishmeal). (However, it's possible that even substitution toward larger animals could make us clueless, if the relative moral weights of invertebrates vs. vertebrates are too ambiguous.)
- Compared to other welfare reforms, humane slaughter seems somewhat more robust to deep uncertainty about tradeoffs between different harms, and about the economics of farming:
- Reducing end-of-life suffering is less likely to have complicated side effects on target animals, including accidentally increasing farming efficiency, than reforms for longer-term quality of life.
- Methods that would simply kill farmed insects faster (e.g., grinding (Barrett et al.)) seem straightforwardly good for them.
- Given that the default funding landscape tends to neglect invertebrates, all else equal invertebrate work also seems more robust to funging. On the other hand, some less neglected opportunities might have much larger room for more funding.
(I think it's likely I'll conclude that other interventions are cluelessness-robust, too. It's just that humane invertebrate slaughter looks relatively good.)
The tractability of further desk research on cluelessness-robustness is unclear, but some potential next steps include:
- writing end-to-end audits of the most promising animal interventions, using a backfire risk checklist and checking for robustness of the intervention's sign over plausible values for different variables; and
- clarifying which effects (positive and negative) we should consider "robust", e.g., by developing the theory of "bracketing" introduced here.
Acknowledgments
Very few object-level considerations in this post are new (many of them I learned from discussions with Michael St. Jules). And I'm largely indebted to thinking by Jesse Clifton and Anni Leskelä for the high-level ideas here. Thanks to Michael St. Jules, Jim Buhler, Jesse Clifton, Clare Harris, Euan McLean, Joseph Ancion, and Mal Graham for helpful feedback and suggestions. This does not imply any of these people's endorsement of all my claims.
- ^
Some EA thinkers have argued that cluelessness in fact cuts against neartermist work (see, e.g., Greaves and Holness-Tofts). This post won't address such arguments. Instead, I'll assume a perspective on which: (i) neither neartermist nor longtermist interventions are justified based on their effects on expected total welfare (as I've argued here); but (ii) interventions might be justified based on their most "robust" effects on welfare, in a sense I unpack below.
- ^
Notes:
- I'll focus less on giving a philosophically rigorous case for this prioritization approach, and more on presenting the core intuitions behind the approach and its implications. See Clifton and these resources for more philosophical background.
- Regarding my perspective on what counts as a "backfire effect": I prioritize reducing (intense) suffering; I think it's plausible (though unclear) that most wild animals experience more suffering than happiness; and I think it's unclear whether (e.g.) farmed chickens suffer many orders of magnitude more per second than wild insects. But the general approach in this post doesn't depend on any of these views.
- ^
See the literature on "unawareness" for more. I give a precise explanation of why the "canceling out" assumption is implausible here. The gist is: Once you discover a new consideration, it seems you should update your beliefs on this fact. And this update will break the symmetry between optimistic and pessimistic unknown unknowns.
- ^
This isn't an ideal definition. I want to say that there's some property of effects that we call "robustness" (which we can try to make more precise, as discussed later in this section), and it's because these effects have this property that we have reason to factor them into our decision-making.
- ^
H/t Samy Sekar and Michael St. Jules.
- ^
H/t Michael St. Jules.
- ^
I'm definitely not confident that we can't, in fact, get robust evidence about the magnitude of this substitution effect.
- ^
H/t Michael St. Jules.
- ^
(More in-the-weeds note:) If the welfare reform isn't better than inaction, while the development-blocking campaign is, this doesn't immediately imply that you should prefer to fund development-blocking over the reform. As some toy math: Taking the "EV" restricted to just the robust effects, we might have EV(reform) = [-1, 1], EV(inaction) = 0, and EV(development-blocking) = 0.5. So development-blocking > inaction, but we can't compare the reform with either development-blocking or inaction! (See here for some technical discussion.) But I find it intuitive that, in general, we should prefer an option in proportion to how much better its robust effects are than each alternative. So, arguably we should prefer development-blocking because it's at least better than something else, even if the comparison to "inaction" per se isn't special. See St. Jules for a similar proposal in the context of difference-making risk aversion, though note that the view I'm suggesting here isn't inherently about difference-making risk aversion.
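This footnote's toy math can be made concrete with a small sketch. The intervals are the toy values above, and "strictly better" is one natural (assumed) comparison rule: option A beats option B only if A's whole interval lies above B's.

```python
def strictly_better(a, b):
    # a, b are (low, high) intervals of robust-effect EVs.
    # a is preferred to b only if every value in a exceeds every value in b.
    return a[0] > b[1]

reform = (-1.0, 1.0)
inaction = (0.0, 0.0)
blocking = (0.5, 0.5)

print(strictly_better(blocking, inaction))  # True
print(strictly_better(reform, inaction))    # False: incomparable
print(strictly_better(inaction, reform))    # False: incomparable
```

Under this rule, development-blocking beats inaction, while the reform is incomparable with both, matching the toy math above.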
- ^
Note also that in principle, adjusting for unknown unknowns can flip the net sign of an interventionâs robust effects! For example, suppose at first we expect the intervention to have:
- one moderately large positive robust effect, via a more familiar pathway; and
- one slightly smaller negative robust effect, via a much less familiar pathway.
So we consider the intervention to be negative. Once we adjust for unknown unknowns, the negative effect could end up not seeming robust anymore, because the pathway to this effect is so unfamiliar. We'd then consider the intervention positive.
- ^
See, e.g., Meyer Shorb, and John and Sebo. (I'm not claiming that these authors confidently endorse the argument above.) See also an analogous argument by Rouk for putting less weight on the small animal replacement problem.
- ^
In particular, I don't see the justification for Karnofsky's claims, "Correcting for missed parameters and overestimated probabilities will be more likely to cause 'regression to normality' (and to the predictions of other 'outside views') than the reverse," or, "I believe that the sort of outside views that tend to get more weight in cluster thinking are often good predictors of 'unknown unknowns.'" More on this here.
- ^
How does this square with the claim that cluelessness-robust interventions aren't prone to Pascal's mugging? The idea is: In cases I tend to find the most repugnantly "Pascalian" (your mileage may vary), we're usually making (implicit) comparisons of very low probabilities of upsides vs. downsides. The case for reducing the suffering of tiny animals doesn't seem to have this structure, since we won't make these animals worse off if it turns out they're not sentient.
- ^
See, e.g., the data for prawns here.
JoA @ 2025-10-09T06:43 (+17)
When discussing considerations around backfire risks and near-term uncertainty, it is common to hear that this is all excessive nitpicking, and that such discussion lacks action guidance, making it self-defeating. And it's true that raising salience of these issues isn't always productive because it doesn't offer clear alternatives to going with our best guess, deferring to current evaluators that take backfire risks less seriously, or simply not seeking out interventions to make the world a bit better.
Thus, because this article centers the discussion on the search for positive interventions through a reasonably actionable list of criteria, it has been one of my most valuable reads of the year.
I think the more time we spend exploring the consequences of our interventions, the more we realize that doing good is hard. But it's plausibly not insurmountable, and there may be tentative, helpful answers to the big question of effective altruism down the line. I hope that this document will inspire stronger consideration for uncertainty. Because the individuals impacted by near-term second-order effects of an action are not rhetorical points or numbers on a spreadsheet: they're as real and sentient as the target beneficiaries, and we shouldn't give up on the challenge of limiting negative outcomes for them.
saulius @ 2025-10-10T11:59 (+13)
This is great, thanks for writing this. It gave me some clarity about something I've been confused about for a long time.