Nuance in Proxies

By Kevin Xia 🔸 @ 2025-03-02T11:28 (+45)

This is a Draft Amnesty Week draft. It may not be polished, up to my usual standards, fully thought through, or fully fact-checked.

Commenting and feedback guidelines:

This draft lacks the polish of a full post, but the content is almost there. The kind of constructive feedback you would normally put on a Forum post is very welcome.

Draft Amnesty Context: In the last year or two, I have noticed several ways in which the proxies that we use to communicate in effective animal advocacy can lose important nuance, leading to misunderstandings. I have even noticed a couple of times where these misunderstandings have found their way into important evaluative work, but not yet in a crucial/decision-relevant way. I intended to turn these observations into a coherent line of thought with a broader point, using real-world examples as citations and hard data to back up how they can be misleading. However, waiting for an actual flaw to occur before posting something seems awfully pessimistic. I thus wanted to use Draft Amnesty Week as an opportunity to share these more anecdotal and only speculatively actionable observations, which makes further refinements feel less important. I also love and use proxies myself all the time, so I decided to add an “In defense of...” point to some of them. My observations are from the animal welfare space but may apply elsewhere as well. Thanks to @Johannes Pichler 🔸 and @Therese Veith🔸 for reviewing an even draftier version of this post. Thanks to @Toby Tremlett🔹 and the EA Forum team for hosting Draft Amnesty; it has done wonders in encouraging me. All mistakes are my own.

Nuance in Scale

The following proxies all relate to the “scale of a problem” one might work on and how framing it on different levels can lose different nuances, each of which may lead to the impact of the work at hand being inaccurately assessed.

The Way We Group Animals

When prioritizing interventions, we sometimes categorize animals into broad groups, such as "farmed animals," "lab animals," or "wild animals," and then prioritize at this level of categorization. Some nuance may be lost here, for example:

Minority farmed animals vs. majority lab animals: While farmed animals are often considered a higher priority due to their sheer numbers, disaggregating these groups shows that lab mice and rats may rival, or even outnumber, some farmed species such as cows. A quick, not-thoroughly-fact-checked scan suggests ~12-24 million lab mice/rats vs. ~31 million cows or ~27 million ducks in the US. A more controversial source even estimates ~115 million lab mice/rats, which is comparable to ~118 million pigs. Similar to how most farmed vertebrates are fishes and chickens, most lab animals appear to be mice and rats, so the farmed vs. lab framing can lose nuance when we compare farmed animals that are lower in number with lab animals that are higher in number.
Minority wild animals vs. majority farmed animals: Something similar might be observed in wild animals vs. farmed animals, since most wild animals for whom we have figures are insects and earthworms (I vaguely remember reading somewhere about aquatic invertebrates likely being more numerous?). I think the case here is less strong since wild mammals and birds still outnumber farmed land animals by 3 to 15/30, but I feel like we sometimes implicitly treat or communicate about any work on wild animals as magnitudes larger in scale than work on farmed animals. This may not be the case, also for other reasons (see next proxy).

In defense of this proxy: There may well be dynamics within these groups that influence one another. Working on farmed ducks may have downstream benefits for work on farmed chickens, which may not be the case, or less so, for work on lab mice/rats.

The Way We Use Groups of Animals

A related issue is that we sometimes use the total number of animals in a given category to estimate potential impact, rather than the number that are realistically reachable through an intervention.

For example:

The number of animals in a region vs. the number of animals affected: China is often considered the highest priority country to help farmed animals, to a large extent due to the scale of industrial animal agriculture there. As such, one might broadly assume that working in China is of high priority. But for the large number of farmed animals in China to translate into impact, the work would need to operate at a nationally relevant scale. If an intervention only affects a small fraction (say through individual outreach, reaching corporations that don’t cover a relatively large share of farmed animal production, or targeting individual institutions that don’t exercise nationwide influence), then the large population may not be a reliable proxy for actual impact.
- Analogously, working on farmed fish in a country without much fish farming may miss the mark regardless of whether the intervention leverages the nationwide scale - even if the country would otherwise be of high priority.
The number of animals in a group vs. the number of animals affected: Similarly, the sheer size of the group of animals an intervention targets doesn’t necessarily tell us how large the intervention is in scale. Working on wild animals, farmed insects or farmed aquatic animals, in and of itself, may not be a sufficient indicator that an intervention actually has a large-scale impact. For example, if the market dynamics in the space where you work are very decentralized and you need to address numerous stakeholders, each of whom influences a small portion of the issue you are addressing, you might end up being less cost-effective than a targeted intervention for, say, lab mice, if it were to turn out that there are far fewer stakeholders who have influence over many more animals.

In defense of this proxy: In the case of country prioritization, there may be undetected cultural influences that spread throughout the country, even with interventions that seemingly target non-nationwide issues. Countries large in scale may also often be very neglected, making work there important for understanding cultural nuance. In the case of group prioritization, large numbers in animal groups often come from animals being small-bodied or from other contexts in which the number of animals you will realistically affect correlates with the number of animals there are.

The Way We Use Numbers of Animals

Another potentially under-communicated nuance is the relationship between the number of animals and the extent of suffering. A focus purely on numbers might miss crucial differences in duration and intensity of suffering, as well as how much an intervention alleviates it.

For example:

Farmed animals vs. wild animals: Many more wild animals exist than farmed animals, but for many wild species, the intensity of their suffering remains unclear, with some even arguing that their lives may be net positive. By contrast, farmed animals often endure months or years of confinement and distress. The number of affected individuals alone does not determine an intervention’s impact; the duration and intensity of suffering matter significantly. More importantly, the sheer scale of suffering as a whole doesn’t determine the intervention's impact either. What matters is how much suffering you can alleviate. This may turn out to be relevant in combination with the previous two proxies: farmed fish and wild mammals don’t differ too much in scale, there may even be more suffering in farmed fishes, and you may quite plausibly be in a better position to alleviate suffering in farmed fishes through available interventions.
Chickens vs. Shrimp: The OG point and my first observation in this list was with regard to chicken vs. shrimp interventions. When it first came out, I wondered about the fact that in Rethink Priorities’ Cross-Cause Cost-Effectiveness Model, Cage-Free Chicken Campaigns were two orders of magnitude more effective than Shrimp Slaughter Campaigns (in fact, the Shrimp Slaughter Campaigns didn’t even necessarily outperform GHD campaigns). Now, whether this is accurate or not (see @Vasco Grilo🔸's work on SWP’s Humane Slaughter Initiative), and whether my explanation of this was right or not, a plausible mechanism that I used to explain this was that while Shrimp Farming may involve more animals and may even cause much more suffering, the available and scalable interventions may affect each of them to a much smaller degree.

Nuance in Contribution

Beyond the scale of interventions, another challenge arises in how/why we attribute impact. I initially intended for this to be its own post, in more detail and with all the other things you’d wish for your post to have, but I have been thinking about writing this for close to a year now, so at this point, I thought it might be more helpful to just throw it out here in this post for now.

The Way We Think of Contributing to Impact

In a context in which more than one actor has helped in making an impact, we may often think in terms of how big their respective roles were and split the impact accordingly. Say Organization A played twice as big a role as Organization B in securing a corporate commitment that helps 600,000 chickens; we may frame their impact as having saved 400,000 and 200,000 chickens respectively. However, the reason we make impact claims is not (or, more accurately, shouldn’t primarily be) credit and glory, but rather to inform future marginal impact. As such, it is not directly relevant how big of a role an organisation played, but more so, or only, whether it played a necessary role (or whether its work would continue to play a necessary role) in making impact happen. I think all of this is implied in the basic concept of counterfactual impact, but it can lead to weird dynamics, such as:

Several organizations may be equally justified in claiming the same impact, leading to perceptions of 'double counting.' It is entirely plausible that more than one actor was necessary to make something happen such that both can make full claims of impact. Assuming we know of both organizations above that their contributions were necessary, both can claim having helped 600,000 chickens, without needing to help 1,200,000 chickens in total. I think this becomes clear when we also factor in all other moving parts, such as the funding that made their work possible, the training that made their campaigns work, and even things as mundane as the tech infrastructure required or the actions leading to/influencing the commitment from the industry’s side; if claiming impact was a zero-sum game, their respective claims would reduce drastically and not be as informative about the value of pursuing their work. In actuality, assuming past performance predicts future impact, we can claim that funding Organization B will lead to ~600,000 chickens more being helped than otherwise would have.^[1]
An organization doing work that is sufficient to make something happen may not be able to claim any impact. For example, Organization A could consult an advocate and refer them to a grant which they later receive. A couple of weeks later, Organization B may refer them to the same grant, causing the peculiar situation that neither of the two organizations could/should claim impact for facilitating this outcome, since it would have happened without either one of them.^[2]
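
The two dynamics above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the idea of describing an outcome as a function of the set of actors involved are mine, and the grant referral is modelled as a hypothetical outcome worth 1 unit. Counterfactual impact is taken to be the outcome with everyone acting minus the outcome if one actor had stayed out.

```python
def counterfactual_impact(actor, actors, value):
    """Outcome with everyone acting minus the outcome if `actor` had stayed out."""
    everyone = frozenset(actors)
    return value(everyone) - value(everyone - {actor})


def both_necessary(coalition):
    # Corporate commitment example: 600,000 chickens are helped only if
    # Organization A and Organization B both act.
    return 600_000 if {"A", "B"} <= coalition else 0


def either_sufficient(coalition):
    # Grant referral example (hypothetical outcome worth 1 unit): the advocate
    # gets the grant if at least one of the two organizations refers them.
    return 1 if coalition & {"A", "B"} else 0


actors = ["A", "B"]
necessary_claims = {a: counterfactual_impact(a, actors, both_necessary) for a in actors}
sufficient_claims = {a: counterfactual_impact(a, actors, either_sufficient) for a in actors}

print(necessary_claims)   # {'A': 600000, 'B': 600000}
print(sufficient_claims)  # {'A': 0, 'B': 0}
```

In the first case the claims add up to 1,200,000 even though only 600,000 chickens were helped; in the second, neither organization has a counterfactual claim even though the outcome happened.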

Side note: As someone working in the Meta space, I find this one particularly interesting, because of all the indirect and obscure ways you can make necessary contributions. It may suggest that we ought to focus on particularly minor yet necessary contributions (assuming minor contributions -> less work -> less cost -> more cost-effective). In practice, this may not necessarily be the case and may be more complicated than just finding these minor contributions, but it’s an idea I haven’t heard much of yet.^[1]

In Defense of this proxy: There is a reasonable chain of assumptions that can make this proxy “close enough” in many cases (bigger contribution -> less likely replaceable -> more likely necessary). The “direct work” space may face fewer situations in which this distinction matters and the “meta” space may be MEL-savvy enough to take it into consideration, effectively eliminating the use case for clearing this up.^[3]

So what now?

Honestly, the hardest part of writing this post was figuring out what the call to action should be. A lingering thought kept saying: "Yeah, duh, proxies lose nuance—almost by definition." But ultimately, I realized that reading something like this earlier would have been useful for my own thinking. And in conversations with fellow advocates, these points were often considered insightful.

I don’t think we should use proxies less often, and as mentioned in the introduction, I haven’t even identified a case where misunderstandings about these proxies led to a concrete mistake. So consider this less of a call to action and more of a preemptive nudge.

If nothing else, I hope this serves as a friendly reminder: Proxies are useful, but when making decisions—whether in prioritization, funding, or assessing impact—it’s worth pausing to ask whether they’re still pointing in the right direction. :)

^[1]
Now, this may be worth its own post, but in practice, these dynamics can get much more complicated for, e.g., funding decisions. Imagine that Organizations A and B both have the same budget of $60,000 (and that their past performance is a perfect predictor of future impact). They can each individually claim that they help 10 chickens per $, but if a single funder were to fund both organizations, that funder would only help 5 chickens per $. Superficially, it seems that the most cost-effective decision requires careful coordination with other funders (and other entities involved) to select, from the pool of organizations one is interested in, the combination that will lead to the most cost-effective outcome. Ironically, in practice, it may very well be the case that the contribution size of one organization (in combination with its likelihood of being funded elsewhere) is a very practical way to approximate this dynamic. This, however, I am not sure about.
^[2]
Most meta organizations in the farmed animal space I have met are very much aware of these intricacies (shoutout to @lauren_mee for the most enjoyable nerd-outs about MEL in our line of work). This may be a non-issue here (see my defense of this proxy below).
^[3]
There may be a use case where readers need to cross-compare between direct and meta-level work. This is currently just an intuition.

Felix_Werdermann 🔸 @ 2025-03-05T00:50 (+5)

Thank you, @Kevin Xia 🔸 , for the text!
I also find the Shapley value very interesting for attributing impact—I wasn't familiar with it before, so thanks for the hint, @Vasco Grilo🔸 !

I think it depends on what decisions are being guided by the "impact share." If the goal is to determine how a donor should allocate their money, then in your first example, the Shapley value is probably more suitable than simple counterfactuals. However, if Organization A has already decided that it has fulfilled its role in securing a corporate commitment and now Organization B is deciding whether to do the same, then counterfactuals are useful here (which are identical to Shapley values with only one actor).

Even though the Shapley value is a good reference point for donors when distributing funds, I don’t think the best overall strategy is necessarily to donate to the charities with the highest cost-effectiveness in terms of Shapley value. Instead, donations should also be "coordinated." This becomes particularly clear in the second example: If Organization A and Organization B had nearly the same costs for referring to a grant, their cost-effectiveness would also be nearly the same, and a donor going purely by Shapley cost-effectiveness would most likely support either both charities or neither for that purpose. Yet it is obviously smarter to fund only one (or none) of them in this case.

One solution would be to first fund the project with the highest cost-effectiveness (in Shapley value) and then recalculate the Shapley values. In the second example, this would mean that first, Organization A or Organization B is funded (whichever has slightly lower costs), and then the other organization is no longer funded for this purpose.

However, in the first example, problems could arise if the total donation budget is insufficient to fund both organizations, meaning that in the end, the money has no effect at all.

Even though this scenario may seem unrealistic (since Organization A’s actions would likely still have a positive impact, even if Organization B does nothing), this problem also appears in a slightly modified model that may be more realistic. Let’s assume that if Organization A or Organization B acts alone, they would help 200,000 chickens. The Shapley value per organization would still be 300,000 chickens, but if the funds are not sufficient to support both organizations, funding one of them would only help 200,000 chickens. In that case, it would be better to fund a third charity, Charity C, which could help 250,000 chickens (in Charity C’s campaign, no other organizations would play a role).
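
As a rough check of the numbers in this comment, here is a small Python sketch that computes two-actor Shapley values using the textbook definition (each actor's marginal contribution, averaged over every order in which the actors could join). The function and variable names are illustrative assumptions; the chicken figures are the ones from the examples above.

```python
from itertools import permutations


def shapley_values(actors, value):
    """Average each actor's marginal contribution over all joining orders."""
    totals = {a: 0.0 for a in actors}
    orders = list(permutations(actors))
    for order in orders:
        coalition = frozenset()
        for actor in order:
            with_actor = coalition | {actor}
            totals[actor] += value(with_actor) - value(coalition)
            coalition = with_actor
    return {a: total / len(orders) for a, total in totals.items()}


def first_example(coalition):
    # 600,000 chickens are helped only if both A and B act.
    return 600_000 if coalition == frozenset({"A", "B"}) else 0


def modified_example(coalition):
    # Acting alone helps 200,000 chickens; acting together helps 600,000.
    return {0: 0, 1: 200_000, 2: 600_000}[len(coalition)]


print(shapley_values(["A", "B"], first_example))     # {'A': 300000.0, 'B': 300000.0}
print(shapley_values(["A", "B"], modified_example))  # {'A': 300000.0, 'B': 300000.0}

# With a budget that covers only one organization, the realized impact of
# funding A alone is what A achieves on its own, not A's Shapley value.
funding_a_alone = modified_example(frozenset({"A"}))
charity_c = 250_000
print(funding_a_alone < charity_c)  # True
```

The Shapley value is the same in both models, which is why it cannot by itself tell a budget-constrained donor whether funding a single organization beats funding Charity C.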

Vasco Grilo🔸 @ 2025-03-02T19:55 (+4)

Great post, Kevin! I like cost-effectiveness analyses because they require being explicit about the relationships between proxies and impact.

Nuance in Contribution

The relevant contribution of each influenceable actor is given by its Shapley value. Here is why it works in situations where counterfactual value fails, and here is a good explainer of Shapley value with Venn diagrams.

Felix_Werdermann 🔸 @ 2025-03-06T23:11 (+3)

On Nuance in Scale:

I actually found the point about the number of lab mice/rats quite interesting—I wasn’t really aware of it before, even though I did know that mice and rats make up the majority of lab animals and that the total number of farmed land animals is dominated by chickens.

Overall, however, I believe that the proxies mentioned here are quite reasonable. Specifically, in defense of them, I would add:

Minority farmed animals vs. majority lab animals: Work on farmed cows will also raise public awareness about the living conditions of farmed animals in general, which will ultimately benefit other species (e.g., chickens).
The number of animals in a region vs. the number of animals affected: It is always preferable to prioritize interventions that reach a large percentage of animals in a given country over those that do not. Even if some of these interventions have not yet been widely tested in certain countries or may not be directly implementable in the same way as in Western countries, a stronger focus on, for example, China could also lead to the discovery of large-scale intervention strategies within the country itself.
Farmed animals vs. wild animals: Generally, there are still many unresolved questions regarding wild animal suffering. However, I don’t think anyone claims that the number of animals is the sole determining factor in prioritization—it is simply one of several factors.

Vasco Grilo🔸 @ 2025-03-05T14:37 (+2)

Assuming we know of both organizations above that their contributions were necessary, both can claim having helped 600,000 chickens, without needing to help 1,200,000 chickens in total.

This problem cannot be mitigated by thinking probabilistically. If there is probability p_s_A (p_s_B) of organisation A (B) being successful acting alone, p_s of organisations A and B being successful acting together, p_A (p_B) of organisation A (B) acting, and impact N given success, the expected counterfactual value of:

A acting is CV_A = ((1 - p_B)*p_s_A + p_B*p_s - p_B*p_s_B)*N.
B acting is CV_B = ((1 - p_A)*p_s_B + p_A*p_s - p_A*p_s_A)*N.

The sum of the expected counterfactual values of A and B is CV = CV_A + CV_B = ((1 - p_A - p_B)*(p_s_A + p_s_B) + (p_A + p_B)*p_s)*N. This can be as large as 2*N when A and B can never succeed alone (p_s_A, p_s_B = 0), A and B always succeed acting together (p_s = 1), and A and B are certain to act (p_A, p_B = 1).

The problem is solved using Shapley values. The expected Shapley value of:

A acting is SV_A = ((1 - p_B)*p_s_A + p_B*p_s/2 - p_B*p_s_B)*N.
B acting is SV_B = ((1 - p_A)*p_s_B + p_A*p_s/2 - p_A*p_s_A)*N.

The sum of the expected Shapley values of A and B is SV = SV_A + SV_B = ((1 - p_A - p_B)*(p_s_A + p_s_B) + (p_A + p_B)/2*p_s)*N. This can only be as large as N when A and B can never succeed alone (p_s_A, p_s_B = 0), A and B always succeed acting together (p_s = 1), and A and B are certain to act (p_A, p_B = 1).
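
The expressions above can be checked numerically with a short Python sketch. The functions below simply transcribe the formulas as written in this comment (the function and argument names are illustrative assumptions, and the Shapley expression is not re-derived here), then evaluate them for the extreme case described, using N = 600,000 from the post's example.

```python
import math


def expected_counterfactual(p_other, p_s_self, p_s_other, p_s, n):
    # CV = ((1 - p_other)*p_s_self + p_other*p_s - p_other*p_s_other)*N
    return ((1 - p_other) * p_s_self + p_other * p_s - p_other * p_s_other) * n


def expected_shapley_as_written(p_other, p_s_self, p_s_other, p_s, n):
    # SV = ((1 - p_other)*p_s_self + p_other*p_s/2 - p_other*p_s_other)*N
    return ((1 - p_other) * p_s_self + p_other * p_s / 2 - p_other * p_s_other) * n


# Extreme case: neither organization can succeed alone, they always succeed
# together, and both are certain to act.
N = 600_000
p_A = p_B = 1.0
p_s_A = p_s_B = 0.0
p_s = 1.0

cv_sum = (expected_counterfactual(p_B, p_s_A, p_s_B, p_s, N)
          + expected_counterfactual(p_A, p_s_B, p_s_A, p_s, N))
sv_sum = (expected_shapley_as_written(p_B, p_s_A, p_s_B, p_s, N)
          + expected_shapley_as_written(p_A, p_s_B, p_s_A, p_s, N))

print(cv_sum == 2 * N)  # True
print(sv_sum == N)      # True


def counterfactual_sum_closed_form(p_a, p_b, p_s_a, p_s_b, p_s, n):
    # CV = ((1 - p_A - p_B)*(p_s_A + p_s_B) + (p_A + p_B)*p_s)*N
    return ((1 - p_a - p_b) * (p_s_a + p_s_b) + (p_a + p_b) * p_s) * n
```

The closed-form sum can be compared against the two individual terms for arbitrary probabilities with `math.isclose`, which is a quick way to confirm that the summed expression matches its parts.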

SummaryBot @ 2025-03-03T15:10 (+1)

Executive summary: Proxies are useful tools for prioritization and impact assessment in effective animal advocacy, but they often oversimplify complex issues, potentially leading to misunderstandings and suboptimal decision-making.

Key points:

Grouping Animals Can Oversimplify Prioritization: Broad categories (e.g., "farmed" vs. "lab" animals) may obscure meaningful distinctions in numbers, suffering, and intervention effectiveness.
Scale Proxies Can Mislead Impact Estimates: The total number of animals in a category (e.g., farmed in China) doesn’t always correlate with intervention effectiveness if only a small fraction is reached.
Numbers Alone Don’t Capture Suffering: Counting animals without considering suffering intensity and intervention scalability can lead to misplaced priorities (e.g., shrimp vs. chickens).
Attributing Impact Can Be Complex: Multiple organizations may justifiably claim full impact for the same outcome, creating a perception of "double counting," but focusing on counterfactual necessity is more informative.
Proxies Remain Useful but Require Caution: While proxies help decision-making, it’s crucial to periodically reassess whether they still accurately reflect impact and priorities.

This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.