A Critique of Animal Charity Evaluators (ACE)
By VettedCauses @ 2024-11-08T17:08 (+55)
This is a linkpost to https://vettedcauses.com/reviews/animal-charity-evaluators
Hi everyone,
Recently, I decided to read one of ACE’s charity evaluations in detail, and I was extremely disappointed with what I read. I felt that ACE's charity evaluation was long and wordy, but said very little.
Upon further investigation, I realized that ACE’s methodology for evaluating charities often rates charities as more cost-effective for spending more money to achieve the exact same results. This rewards charities for being inefficient and punishes them for being efficient.
ACE’s poor evaluation process leads to ineffective charities receiving recommendations, and many animals are suffering as a result. After realizing this, I decided to start a new charity evaluator for animal charities called Vetted Causes. We wrote our first charity evaluation assessing ACE, and you can read it by clicking the attached link.
Best,
Isaac
Animal Charity Evaluators @ 2024-11-08T23:39 (+80)
Thank you for spending time analyzing our methods. We appreciate those who are willing to engage with our work and help us improve the accuracy of our recommendations and reduce animal suffering as much as possible.
Based on previously received feedback and internal reflection, we have significantly updated our evaluation methods in the past year and will be publishing the details next Tuesday when we release our charity recommendations for 2024. From what we can tell from a quick skim, we think that our changes largely address Vetted Causes’ concerns here, as well as the detailed feedback we received last year from Giving What We Can (see also our response at the time) as part of their program that evaluates evaluators. Our cost-effectiveness analyses no longer use achievement or intervention scores, but rather directly calculate cost-effectiveness by dividing impact by cost, as you suggest. That being said, our work will never be perfect so we invite anyone reading this with the expertise to improve the rigor of our work to reach out, now or in the future.
Although your comments are related to methods that we no longer use, we’d like to spend more time understanding and engaging with them, learning from them, and potentially correcting any misconceptions. Unfortunately, we won’t have the opportunity to do so until after our charity recommendations are released next week. Additionally, it might be a comfort to know that for the past few months, Giving What We Can has been assessing ACE’s new evaluation methods along with a panel of other experts and that they intend to publish the results later this month.
Thank you.
- The ACE team
VettedCauses @ 2024-11-09T03:55 (+15)
Hi,
Thank you for your response!
we have significantly updated our evaluation methods in the past year and will be publishing the details next Tuesday when we release our charity recommendations for 2024. From what we can tell from a quick skim, we think that our changes largely address Vetted Causes’ concerns here, as well as the detailed feedback we received last year from Giving What We Can (see also our response at the time) as part of their program that evaluates evaluators. Our cost-effectiveness analyses no longer use achievement or intervention scores, but rather directly calculate cost-effectiveness by dividing impact by cost, as you suggest.
We are glad to hear that ACE has changed their evaluation methods, and we hope that the changes effectively address the concerns listed in our review.
We look forward to seeing ACE’s new charity recommendations when they are released next week.
Animal Charity Evaluators @ 2024-11-21T22:23 (+14)
Hi Isaac! Now that we’ve announced our 2024 Recommended Charities, we’ve had more time to process your feedback. Thanks again for engaging with our work.
As mentioned before, we’ve substantively updated our evaluation methods this year. This was informed in part by detailed feedback we received as part of Giving What We Can’s 2023 ‘Evaluating the Evaluators’ project, some of which aligns with your feedback.
One of these changes is that we now seek to conduct more direct cost-effectiveness analyses, rather than the 1-7 scoring method that we used last year. This more direct approach is possible in part thanks to Ambitious Impact’s recent work to allow quantification of animal suffering averted per dollar. Of course, these kinds of calculations are still extremely challenging, limited, and subject to significant uncertainties; we describe our methods and their limitations on our website. For example, while cost-effectiveness = impact divided by cost, it can be difficult to measure impact meaningfully in a way that is also quantifiable, so we rely on other criteria to help us make our assessments.
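The direct calculation ACE describes here, dividing impact by cost, can be sketched as follows. All numbers, names, and units are illustrative assumptions (e.g. a suffering-averted-per-dollar style metric of the kind Ambitious Impact's work enables); this is not ACE's published model:

```python
def cost_effectiveness(impact, cost_usd):
    """Direct cost-effectiveness: units of impact per dollar (higher is better)."""
    if cost_usd <= 0:
        raise ValueError("cost must be positive")
    return impact / cost_usd

# Hypothetical program: the same outcome (say, 1,500,000 suffering-adjusted
# days averted) achieved at two different price tags.
print(cost_effectiveness(1_500_000, 300_000))  # 5.0 per dollar
print(cost_effectiveness(1_500_000, 150_000))  # 10.0 per dollar
# Under a direct ratio, spending less for the same result strictly
# improves the score.
```

Note how this division automatically rewards efficiency, in contrast to the weighted-score concerns raised in the original review.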
Another major change was introducing a formal Theory of Change assessment to understand the reasoning, evidence base, and limitations around each charity’s main programs. In our 2023 Evaluations, we discussed these considerations in our Recommendations Decisions meetings but did not systematically incorporate them into our public reviews. Together, we think these changes allow for a more nuanced assessment of charities’ work and (we hope) more informative and accessible reviews.
Regarding the impact of our recommendations, this year, we conducted an assessment of ACE’s programs and our counterfactual influence on funding. As part of this work, we surveyed donors to our Recommended Charity Fund (RCF) and asked them where they’d donate if ACE didn’t exist. This indicated that over 60% of our RCF donors would donate less to animal charities if ACE were not to exist, of whom around 12% would not donate to animal charities at all. We aim to publish these influenced-giving reports on November 29th. We hope this reassures you that animals are not worse off because of ACE’s charity recommendations.
In terms of your specific feedback on last year’s methodology:
- ‘Charities can receive a worse Cost-Effectiveness Score by spending less money to achieve the exact same results’ / ‘Charities can receive a better Cost-Effectiveness Score by spending more money to achieve the exact same results’ / ‘Charities can rearrange their budget and achieve the exact same results (with the exact same total expenditures), but their Cost-Effectiveness Score can significantly change.’
- Your findings here are correct. Because the weighted averages in this model depended on the percentage of expenditure for each factor, they sometimes produced unintended and unhelpful results. In part due to this, we interrogated the outputs of our models in our Recommendation Decisions meetings at the time and considered cost-effectiveness scores alongside other decision-relevant factors (such as their Impact Potential and Room For More Funding), rather than taking cost-effectiveness as the only relevant factor to consider when evaluating charities or prioritizing giving opportunities. This was informed in part by each charity’s uncertainty scores, which helped inform how much weight to assign to cost-effectiveness and other criteria in our final recommendations decisions. As you would expect given their work, Legal Impact for Chickens’ uncertainty scores were among the highest of our 2023 evaluated charities. Of all our 2023 Recommended Charities, our Recommendations Decisions discussions played the biggest role for Legal Impact for Chickens given that our models were not as well-suited to their work compared to those for other charities.
- How we addressed this in our 2024 Evaluations: As noted above, we now do direct cost-effectiveness analysis rather than using a weighted factor model. We think the role of our Theory of Change-focused discussions in our 2023 Recommendation Decisions meetings should have been more systematic and more clearly communicated in our 2023 reviews, which is one reason why we introduced the new Theory of Change assessment this year.
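The weighted-average artifact conceded above can be illustrated with a small sketch. All scores and budgets here are hypothetical, and the function mirrors only the general shape of a budget-share-weighted model, not ACE's actual 2023 formula:

```python
def weighted_ce_score(programs):
    """Budget-share-weighted average of per-program scores.

    `programs` is a list of (score, spending_usd) pairs. Each program's
    score is weighted by its share of total expenditure, so routing more
    money to a high-scoring program raises the overall score even if
    outcomes are identical.
    """
    total = sum(spend for _, spend in programs)
    return sum(score * spend / total for score, spend in programs)

# Hypothetical charity with a high-scoring lawsuit and a lower-scoring
# side program, achieving identical outcomes at two budgets.
expensive = [(5.0, 200_000), (2.0, 100_000)]  # lawsuit costs $200k
frugal    = [(5.0, 2_000),   (2.0, 100_000)]  # same outcome for $2k

print(weighted_ce_score(expensive))  # ≈ 4.0
print(weighted_ce_score(frugal))     # ≈ 2.06 — spending less scores worse
```

This is the mechanism behind the review's finding: the cheaper charity's overall score falls because its high-scoring achievement now carries a small budget weight.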
- ‘Charities can have 1,000,000 times the impact at the exact same price, and their Normalized Achievement Scores and Cost-Effectiveness Score can remain the same.’
- This isn’t the case, but we didn’t publish the full details about our method for assessing the impact of books, podcasts, and other interventions, so we see why this wasn’t clear. Essentially for each intervention in our Menu of Interventions we identified proxies for its likely impact. For books, we had intended to include sales/views as well as a rating of the overall audience response/reviews. In practice, this wasn’t possible for various reasons given the wide variation in types of publication (e.g., some publications had not been released yet, or had been provided directly to the audience with no feedback collected), so we had to factor in such considerations on a more case-by-case basis in our Recommendations Decisions discussions. Issues such as this highlighted to us the inherent limitations of seeking to distill a charity’s work in a weighted factor model, given e.g. the large variation in tactics used by animal advocacy charities and the challenges involved in obtaining the necessary data to score their achievements based on pre-set criteria.
- We used additive rather than multiplicative scoring because our objective was to create weighted factor models that reflect the quality of achievements (rather than, e.g., estimating the number of animals helped by books being written). Since we’ve transitioned to directly estimating cost-effectiveness, we now use a straightforward multiplication for factors like “likelihood of implementation.”
- How we addressed this in our 2024 Evaluations: Same as the point above: we have now updated to a more direct cost-effectiveness analysis rather than using this weighted factor model, and have also introduced a new Theory of Change assessment.
- ‘Charities can increase their Normalized Achievement Scores and Cost-Effectiveness Score by breaking down actions into smaller steps, even if the overall results remain unchanged.’
- This actually isn’t the case (sorry if this wasn’t clear). Breaking down an achievement into smaller steps would drive up the ‘Achievement quantity’ score, but would be offset by lower ‘Achievement quality’ scores for each achievement. However, there was still a risk of this introducing inconsistency into the model, which is another reason why we updated our methods this year.
- ‘The most important factor in determining the Normalized Achievement Score of an intervention (Impact Potential Score) is decided before the intervention even begins. This makes the maximum Normalized Achievement Score for certain interventions relatively low, even if they have extremely high impact.’
- We developed this model because past evaluations have shown that the intervention type drives much of the impact of a charity’s achievements. Starting with a baseline intervention score and adjusting it still allows for particularly strong implementations to at least partially make up for a lower intervention score. That said, we agree with you on this model’s shortcomings. As with the cost-effectiveness model, we interrogated the model’s outputs in our Recommendation Decisions meetings and had a mechanism to weight Impact Potential lower in our decision-making when we were less certain about its relevance.
- How we addressed this in our 2024 Evaluations: We have updated this model and now only use it in a very limited way to supplement a qualitative assessment of charities’ work during the charity selection phase, rather than during the Evaluations themselves.
- ‘Legal Impact for Chickens did not achieve any favorable legal outcomes, yet ACE rated them a Recommended Charity.’
- When ACE considers impact for animals, we consider all the ways that animal suffering might be reduced when interventions are implemented. While they did not secure a litigation win, Legal Impact for Chickens’ Costco lawsuit garnered significant media attention that put pressure on the companies being litigated against. While their work is more ‘hits-based’ than some of our other Recommended Charities, we think the considerable impact of any future legal wins means high expected value for this work overall, especially now that funding from ACE, Open Philanthropy, and the EA Animal Welfare Fund has allowed them to hire more litigators. Check out Alene Anello’s recent EA Forum post for an update on Legal Impact for Chickens’ latest achievements.
Thanks again for your engagement with our evaluations. We hope you get in touch with us directly if you come across new evidence-based methods to meaningfully capture cost-effectiveness or to improve the evaluation of animal charities. We might also reach out to you via email in the coming weeks as we go through retrospectives and plan for next year’s evaluation. Because of the complexity of the animal welfare cause area, the many uncertainties and knowledge gaps in the field of charity evaluation, and the urgency and scope of suffering, we embrace productive collaboration.
Thank you.
- The ACE team
VettedCauses @ 2024-11-23T02:49 (+1)
Hi,
Thank you for taking the time to read our review and for responding to each of our points. We really appreciate ACE’s willingness to engage with feedback and acknowledge problems.
Regarding your clarifications related to the calculation of Normalized Achievement Scores:
‘Charities can have 1,000,000 times the impact at the exact same price, and their Normalized Achievement Scores and Cost-Effectiveness Score can remain the same.’
- This isn’t the case, but we didn’t publish the full details about our method for assessing the impact of books, podcasts, and other interventions, so we see why this wasn’t clear. Essentially for each intervention in our Menu of Interventions we identified proxies for its likely impact. For books, we had intended to include sales/views as well as a rating of the overall audience response/reviews. In practice, this wasn’t possible for various reasons given the wide variation in types of publication (e.g., some publications had not been released yet, or had been provided directly to the audience with no feedback collected), so we had to factor in such considerations on a more case-by-case basis in our Recommendations Decisions discussions.
We are glad to hear that ACE was accounting for these factors behind the scenes.
‘Charities can increase their Normalized Achievement Scores and Cost-Effectiveness Score by breaking down actions into smaller steps, even if the overall results remain unchanged.’
- This actually isn’t the case (sorry if this wasn’t clear). Breaking down an achievement into smaller steps would drive up the ‘Achievement quantity’ score, but would be offset by lower ‘Achievement quality’ scores for each achievement. However, there was still a risk of this introducing inconsistency into the model, which is another reason why we updated our methods this year.
Thank you for clarifying this. From the publicly available rubrics for calculating Achievement Quality Scores, it did not seem like breaking down an achievement into smaller steps would decrease the Achievement Quality Score at all. However, given that ACE was accounting for factors outside of the publicly available rubrics, it makes sense that this decrease could occur.
That being said, we believe it is important for ACE to fully disclose its methodology to the public and avoid relying on hidden evaluation criteria. This transparency would allow people from outside the organization to understand how ACE's charity evaluation metrics (i.e. Normalized Achievement Scores) were calculated.
We might also reach out to you via email in the coming weeks as we go through retrospectives and plan for next year’s evaluation. Because of the complexity of the animal welfare cause area, the many uncertainties and knowledge gaps in the field of charity evaluation, and the urgency and scope of suffering, we embrace productive collaboration.
We appreciate your openness to collaboration. Feel free to reach out to us at any time at hello@vettedcauses.com.
abrahamrowe @ 2024-11-08T17:39 (+34)
Did you ask ACE to review this before publishing? It seems like the kind of thing that would be worth getting feedback on before publishing. I didn't look at this for more than a couple minutes, but I saw immediately that there might be some conceptual disagreements between you and ACE - for example, I noticed that in your first example, you assume (I believe) that if LIC didn't spend 200k on the lawsuit against Costco, they wouldn't spend it on anything else. It's unclear to me that this is the counterfactual, or how ACE is conceptualizing those funds. There might be reasoning behind their decision-making that they could share, which would be useful to your critiques.
I also felt like this felt pretty politically motivated. Not sure if that is your intention, but paragraphs like this:
ACE's recommendations determine which animal charities receive millions of dollars in donations.[1] Thus far, we have reviewed 5 of ACE's "Top 11 Animal Charities to Donate to in 2024" and only one of them (Shrimp Welfare Project) appears to be an effective charity for helping animals. ACE's poor evaluation process leads to ineffective charities receiving recommendations, and many animals are suffering as a result.
Without any evidence feels pretty intense. ACE is kind of low hanging fruit to pick on in the EA space, so this read to me like more of that, without necessarily the evidence base to back it. Reading your report, I felt kind of like "oh, there are interesting assumptions here, would be interested to learn more", and not "ACE is doing an extremely bad job."
E.g. I think the questions that would be good to ask in a critique of ACE might be:
- If ACE didn't exist, how would the funds the direct be spent otherwise? Would that be better or worse for animals?
- Is historical track record / cost-effectiveness the only lens on which to evaluate charities?
- If the answer is yes, it seems very hard to start new things!
- I don't know if the LIC legal case is this, but celebrating the potential impact of promising bets that didn't pan out seems good to me.
I also think getting feedback on statements like this would be really helpful:
The correct formula for calculating cost-effectiveness is simply impact divided by cost. Rather than using this simple formula, ACE has elected to create a methodology that does not properly account for impact or cost.
I think ACE has wanted to do this at points in their history — my impression is just that it is incredibly difficult, so they've approached it from other angles instead. I also don't think it's clear to me that ACE's goal is to report cost-effectiveness. I think clarifying this with them, and getting a sense of why they don't do what you see as the simple approach would be useful for making this critique stronger. And, I don't think people should make giving decisions based only on historic cost-effectiveness - just because an opportunity was impactful doesn't mean the organization needs more funds to do that work, that it will scale, work in the future, etc.
I don't disagree that ACE might be directing funds to ineffective charities! I don't really think non-OpenPhil EA donors should give to farmed animal welfare, for example. But, I don't think it is obvious to me that ACE going away means money going to more effective charities - I expect it would mostly be worse - people giving to animal charities with basically no vetting.
That being said, critique of critical organizations is great in my opinion, so appreciate you putting this out there!
Joey 🔸 @ 2024-11-08T18:37 (+11)
"I don't really think non-OpenPhil EA donors should give to farmed animal welfare, for example." Wow, this is interesting! I would love to know what you mean by this?
abrahamrowe @ 2024-11-08T19:47 (+8)
(I responded privately to this but wrote up some related reflections a while ago here).
Lorenzo Buonanno🔸 @ 2024-11-08T22:15 (+10)
Having read your reflections, I'm still curious as to why you don't think non-OpenPhil donors should give to farmed animal welfare, if you feel comfortable sharing it publicly. I guessed four options, ordered from most to least likely, but I might have misunderstood the post
- We should donate to wild animal welfare instead, as it's more cost-effective
- There are no donation opportunities that counterfactually help a significant amount of farmed animals
- There is no strong moral obligation to improve future lives, and donations to farmed animal welfare necessarily improve future lives, as farmed animal lives are very short
- Tomasik-style arguments on the impact of animal farming on the amount of wild animal suffering
Is it a combination of these? As a concrete example, I'm curious if you believe that the Shrimp Welfare Project shouldn't be funded, should be funded by "non-EA" donors, or will be funded anyway and donors shouldn't worry about it.
By the way, thank you for nudging towards sharing evaluations with the evaluated organization before posting, I think it's a really valuable norm.
abrahamrowe @ 2024-11-08T22:29 (+10)
Thanks! My wording in the above message was imprecise, but I mean something like farmed vertebrates. SWP is probably among the two most important things to fund, in my opinion.
Basically I think the size of good opportunities in farmed animal advocacy is smaller than OpenPhil's grantmaking budget and there are few scalable interventions, though I don't think I want to go into most the reasons publicly. Given that they've stopped funding many of what I believe are more cost-effective projects, and that EA donors are basically the only people willing to fund those, EA donors should be mostly inclined to fund things OpenPhil can't fund instead.
So some combination of 1+2 (for farmed vertebrates) + other factors
VettedCauses @ 2024-11-10T05:09 (+5)
I also felt like this felt pretty politically motivated. Not sure if that is your intention, but paragraphs like this:
ACE's recommendations determine which animal charities receive millions of dollars in donations.[1] Thus far, we have reviewed 5 of ACE's "Top 11 Animal Charities to Donate to in 2024" and only one of them (Shrimp Welfare Project) appears to be an effective charity for helping animals. ACE's poor evaluation process leads to ineffective charities receiving recommendations, and many animals are suffering as a result.
Without any evidence feels pretty intense. ACE is kind of low hanging fruit to pick on in the EA space, so this read to me like more of that, without necessarily the evidence base to back it. Reading your report, I felt kind of like "oh, there are interesting assumptions here, would be interested to learn more", and not "ACE is doing an extremely bad job."
What claims did we make that we did not provide evidence for?
abrahamrowe @ 2024-11-10T10:03 (+7)
…we have reviewed 5 of ACE's "Top 11 Animal Charities to Donate to in 2024" and only one of them (Shrimp Welfare Project) appears to be an effective charity for helping animals. ACE's poor evaluation process leads to ineffective charities receiving recommendations, and many animals are suffering as a result.
I understand these are forthcoming, but no evidence is provided for this entire part - part of the reason I pushed on this is I think seeing your alternative evaluations would be very helpful for interpreting the strength of the critique of ACE. Without seeing them, I can’t evaluate the latter half of the quoted text. And in my eyes, if these are similar to the evaluation here of LIC, it’s pretty far from demonstrating that ineffective charities are receiving recommendations, etc. And, given that you’ve only evaluated <50% of their charities so far, it seems premature to make the overall claim. I think the overall claim is very possibly true, but again, I think to make the argument that animals are directly suffering as a result of this, you’d have to demonstrate that those charities are worse than other donation options, that donors would give to the better options, etc.
Sarah Cheng @ 2024-11-08T20:54 (+30)
Thank you for doing this work. I’m very supportive of productive criticism on the Forum. As a moderator, I’d like to recommend this post for tips on how to make criticism more productive. EA is a collective project, and I think that steps such as sharing this feedback with ACE directly and writing a less aggressive title for your post would improve the outcomes of this work.
VettedCauses @ 2024-11-08T20:58 (+29)
Thank you for your feedback. We will view the tips, and keep them in mind during our future reviews!
Edit: I have also changed the title of the post. For transparency, the original title was: Animal Charity Evaluators (ACE) is Extremely Bad at Evaluating Charities.
MichaelStJules @ 2024-11-08T19:29 (+23)
Their evaluation process has been updated (e.g. here), and I'm inclined to wait to see their new evaluations and recommendations before criticizing much, because any criticism based on last year's work may no longer apply. Their new recommendations come out November 12th.
FWIW, I am sympathetic to your criticisms, as applied to last year's evaluations. I previously left some constructive criticism here, too.
VettedCauses @ 2024-11-08T22:30 (+4)
Hi Michael,
ACE re-evaluates their Recommended Charities every two years. In our review of ACE, all charities mentioned were evaluated in 2023 (the most recent published review cycle). Therefore, every charity mentioned in our review will still be recommended in ACE's upcoming list of Recommended Charities.
When the new reviews come out, we will be sure to read them though!
Ben_West🔸 @ 2024-11-09T02:34 (+19)
Thanks for writing this. I feel like the following is the crux of your criticism of LIC:
ACE acknowledges the lawsuit was dismissed, but still celebrates this achievement. They note that this achievement would inspire similar lawsuits. Would it be good to inspire more lawsuits that cost $200,000 and are dismissed?
You state this as though the answer is "obviously no," but the answer feels extremely nonobvious to me. I note that you excluded some key things when quoting ACE:
LIC’s first lawsuit, a shareholder derivative case against Costco’s executives for chicken neglect, was featured on TikTok and in multiple media outlets, including CNN Business, Fox Business, The Washington Post, and Meatingplace... We thought the achievement has strong potential for indirect impact, and it received a high amount of media attention. - ACE
As of this writing, the Facebook fan page still has a post about the lawsuit pinned to the top, apparently because the owner decided to boycott after learning about the cruelty.
It sounds like the Costco board also had to take official action:
In a letter dated August 15, 2023, Costco’s board stated that it had “formed a Board committee to review and investigate the demand’s allegations.” LIC’s shareholder clients then met with investigators retained by the committee. - LIC
Is it worth $200k to get a bunch of bad publicity for Costco, force the board to form a committee and hire an investigator, etc.?
I don't know, I'm pretty willing to believe that the answer is "no", but it doesn't seem obvious to me. I could pretty easily believe that the CEO of the next company they sue would change their policies instead of having to deal with the embarrassment of asking the board to form a committee to investigate.
VettedCauses @ 2024-11-09T05:07 (+10)
Hi Ben,
Thank you for your response!
I will address your points, but first I would like to clarify what we believe the crux of the problem is with LIC being deemed a top 11 animal charity by ACE.
In Problem 1 of our review, we state the following:
We go on to detail how, if LIC had spent less than $2,000 on the lawsuit (saving over $200,000) and achieved the exact same outcome, ACE would have assigned LIC a Cost-Effectiveness Score of 1.8. The lowest Cost-Effectiveness Score ACE assigned to any charity in 2023 was 3.3. This means if LIC had spent less than $2,000 on the lawsuit, LIC's Cost-Effectiveness Score would have been significantly worse than that of any charity ACE evaluated in 2023.
Instead, LIC spent over $200,000 on the lawsuit, and ACE rewarded them for this inefficiency by giving them a Cost-Effectiveness Score of 3.7 and deeming LIC a top 11 animal charity.
This is the crux of the problem, and it is really an issue with ACE deeming LIC a top 11 animal charity, not with LIC itself. ACE elected to give LIC this distinction, and LIC merely accepted it.
I would also like to note that encouraging or valuing lawsuits that fail to state valid legal claims (but burden defendants or garner publicity) risks causing the legal system to take animal rights/welfare cases less seriously. If courts observe a pattern of weak or legally insufficient cases being filed for publicity or to burden the defendant, they will become skeptical of all animal rights/welfare lawsuits, even those with strong legal merit. Prior to being deemed a top 11 animal charity by ACE, every single lawsuit filed by LIC failed to state a valid legal claim.
I note that you excluded some key things when quoting ACE: LIC’s first lawsuit, a shareholder derivative case against Costco’s executives for chicken neglect, was featured on TikTok and in multiple media outlets, including CNN Business, Fox Business, The Washington Post, and Meatingplace... We thought the achievement has strong potential for indirect impact, and it received a high amount of media attention. - ACE
ACE’s review of LIC contains a section titled “Our Assessment of Legal Impact for Chickens’ Cost Effectiveness”, and the quote you have provided is not part of this section. Our entire review of ACE is about ACE incorrectly calculating cost-effectiveness; consequently, this is the section we decided to focus on. ACE’s review of LIC is over 5,000 words, and we cannot include every quote from ACE’s review of LIC.
Additionally, the quote you’ve provided gives no metrics to gauge how much media attention was received. If media attention is a strong justification for stating a $200,000 lawsuit that failed to state a valid legal claim is “particularly cost-effective” (as ACE put it), ACE should provide metrics regarding how much media attention was received. Ironically, the Facebook post you mentioned appears to have more metrics than ACE's review of LIC regarding the amount of media attention caused by the Costco lawsuit, since the Facebook post lists the number of likes and comments it received.
The Facebook fan page still as of this writing has a post about the lawsuit pinned to the top because apparently the owner decided to boycott after learning about the cruelty.
The Facebook post you referred to received 56 likes and 83 comments. To my understanding, the post is also not pinned to the top; it is simply the last post the Facebook page has made (it appears that the page has not posted in over 2 years). I do not think this is very strong evidence that LIC’s $200,000 lawsuit that was dismissed for failing to state a valid legal claim was “particularly cost-effective” (as ACE put it).
It sounds like the Costco board also had to take official action
Correct, the Costco board took official action by rejecting LIC’s demands.
Is it worth $200k to get a bunch of bad publicity for Costco [...]?
Could you please define what “a bunch of bad publicity for Costco” means? And could you provide evidence that this level of publicity was caused by LIC’s lawsuit?
Is it worth $200k to [...] force the board to form a committee and hire an investigator, etc.?
Costco’s board formed a committee to review and investigate LIC’s demands. The committee then recommended that the board reject the demand, which they did. This does not appear to be a very good outcome.
Is it worth $200k to get a bunch of bad publicity for Costco, force the board to form a committee and hire an investigator, etc.?
I don't know; I'm pretty willing to believe that the answer is "no", but it doesn't seem obvious to me. I could pretty easily believe that the CEO of the next company they sue would change their policies rather than deal with the embarrassment of asking the board to form a committee to investigate.
It is ACE’s job to write charity reviews that provide the empirics necessary to answer questions like the one you’ve asked. From your own statement, it seems like ACE has failed to do this. ACE did not provide metrics on how much media attention the Costco lawsuit caused, and did not provide any insight into how much of a burden it was to form a committee to review and investigate LIC's demands (I don’t recall ACE’s review even mentioning this).
Ben_West🔸 @ 2024-11-10T18:02 (+7)
- Yes, thank you, I understand that weighting by budget results in the phenomenon you described. I didn't comment on this since it sounds like ACE is planning to change it anyway.
- I was referring to the publicity listed in ACE's review. The stories appear to be about the lawsuit so I am not entirely sure what you mean by "could you provide evidence that this level of publicity was caused by LIC’s lawsuit". See e.g. CNN, Fox.
- To clarify: I don't care about causing burdens to Costco per se. The reason that burdens are relevant is because future companies might prefer to avoid that burden and instead change their policies. I agree it would be good to have a better model of when this would happen and would be excited for someone to make such a model!
Joey 🔸 @ 2024-11-08T18:34 (+16)
So, I have some mixed views about this post. Let's start with the positive.
In terms of agreement: I do think organizational critics are valuable, and specifically, critics of ACE in the past have been helpful in improving their direction and impact. I also love the idea of having more charity evaluators (even in the same cause area) with slightly different methods or approaches to determining how to do good, so I’m excited to see this initiative. I also have quite a bit of sympathy for giving higher weight to explicit cost-effectiveness models when it comes to animal welfare evaluations.
I can personally relate to the feeling of being disappointed after digging deeper into the numbers of well-respected EA meta organizations, so I understand the tone and frustration. However, I suspect your arguments may get a lot of pushback on tone alone, which could distract from the more important substance of the post and concepts (I’ll leave that for others to address, as it feels less important, in my opinion).
In terms of disagreement: I will focus on what I think is the crux of the issue, which I would summarize as: (a) ACE uses a methodology that yields quite different results than a raw cost-effectiveness analysis; (b) this methodology seems to have major flaws, as it can lead to clearly incoherent conclusions and recommendations easily; and (c) thus, it is better to use a more straightforward, direct CEA.
I agree with points A and B, but I am much less convinced about point C. To me, this feels a bit like an isolated demand for methodological rigor. Every methodology has flaws, and it's easy to find situations that lead to clearly incoherent conclusions. Expected value theory itself, in pure EV terms, has well-known issues like the St. Petersburg paradox, the optimizer's curse, and general model mistakes. CEAs in general share these issues and have additional flaws (see more on this here). I think CEAs are a super useful tool, but they are ultimately a model of reality, not reality itself, and I think EA can sometimes get too caught up in them (whereas the rest of the world probably doesn't use them nearly enough). GiveWell, which has ~20x the budget of ACE, still finds model errors and openly discusses how softer judgments on ethics and discount factors influence outcomes (and they consider more than just a pure CEA calculation when recommending a charity).
Overall, being pretty familiar with ACE’s methodology and CEAs, I would expect, for example, that a 10-hour CEA of the same organizations would be quite a bit further from the truth of the actual impact or effectiveness of an organization. It's not clear to me that spending equal time on pure CEAs versus a mix of evaluative techniques (as ACE currently does) would lead to more accurate results (I would probably weakly bet against it). I think this post overstates the importance of discarding a model due to a flaw that can be exploited.
A softer argument, such as “ACE should spend double the percentage of time it currently spends on CEAs relative to other methods” or “ACE should ensure that intervention weightings do not overshadow program-level execution data,” is something I have a lot of sympathy for.
VettedCauses @ 2024-11-08T19:02 (+7)
Hi Joey,
Thank you for taking the time to read our review!
(a) ACE uses a methodology that yields quite different results than a raw cost-effectiveness analysis; (b) this methodology seems to have major flaws, as it can lead to clearly incoherent conclusions and recommendations easily; and (c) thus, it is better to use a more straightforward, direct CEA.
I agree with points A and B, but I am much less convinced about point C.
I would like to point to Problem 1 and Problem 4 from the review:
- Charities can receive a worse Cost-Effectiveness Score by spending less money to achieve the exact same results.
- Charities can have 1,000,000 times the impact at the exact same price, and their Cost-Effectiveness Score can remain the same.
Effective giving is all about achieving the greatest impact at the lowest cost. ACE’s methodology is not properly accounting for impact, or for cost.
Using the equation impact / cost at least results in impact being in the numerator, and cost being in the denominator. To me, this alone makes a straightforward, direct CEA a better methodology than the one used by ACE.
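To make Problem 1 concrete, here is a toy sketch of how a capped per-program score weighted by budget share can fall when a charity becomes more efficient. The 7-point cap and budget-share weighting are illustrative assumptions, not ACE's exact formula:

```python
# Illustrative sketch only: the score cap of 7 and the budget-share
# weighting are assumptions for illustration, not ACE's exact formula.

def weighted_score(programs):
    """Overall score = sum of per-program scores weighted by budget share."""
    total_cost = sum(cost for _, cost in programs)
    return sum(score * cost / total_cost for score, cost in programs)

# Two programs: one already at the score cap of 7, one scoring 3.
before = [(7, 100_000), (3, 100_000)]

# The capped program now achieves the exact same results for half the money.
# Its score cannot rise above 7, but its budget share shrinks.
after = [(7, 50_000), (3, 100_000)]

print(weighted_score(before))  # 5.0
print(weighted_score(after))   # ~4.33: the score drops despite greater efficiency
```

In this sketch, the overall score falls from 5.0 to about 4.33 even though the charity achieved the same results for less money, because the capped program's score cannot rise to reflect its improved cost-effectiveness while its weight in the average shrinks. A direct impact/cost calculation would instead rise.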
To me, this feels a bit like an isolated demand for methodological rigor. Every methodology has flaws, and it’s easy to find situations that lead to clearly incoherent conclusions.
I absolutely agree that every methodology has flaws, and we did not mean to imply otherwise. However, the incoherent conclusions described in our review of ACE's methodology are not one-off instances. They are pervasive problems that affect all of ACE's reviews.
Thank you for your feedback!
Vasco Grilo🔸 @ 2024-11-10T20:20 (+12)
Great analysis, Isaac! I worry the Animal Welfare Fund (AWF) has similar problems (see below), but they are way less transparent than ACE about their evaluations, and therefore much less scrutable. Instead of mostly deferring to AWF, I would rather have donors look over ACE's evaluations, discuss their findings with others, and eventually publish them online, even if they spend much less time on these activities than you did.
AWF only runs cost-effectiveness analyses (CEAs) for a minority of applications. According to a comment by Karolina Sarek, AWF's chair, on June 28 (this year):
In the past, we tended to do CEAs more often if: a) The project is relatively well-suited to a back-of-the-envelope calculation b) A back-of-the-envelope calculation seems decision-relevant. At that time, a) and b) seem true in a minority of cases, maybe ~10%-20% of applications depending on the round, to give some rough sense. However, note that there tends to be some difference between projects in areas or by groups we have already evaluated versus projects/groups/areas that are newer to us. I'd say newer projects/groups/areas are more likely to receive a back-of-the-envelope style estimate.
Comparisons across grants also seem to be lacking. From Giving What We Can's (GWWC's) evaluation of AWF in November 2023 (emphasis mine):
Fourth, we saw some references to the numbers of animals that could be affected if an intervention went well, but we didn’t see any attempt at back-of-the-envelope calculations to get a rough sense of the cost-effectiveness of a grant, nor any direct comparison across grants to calibrate scoring. We appreciate it won’t be possible to come up with useful quantitative estimates and comparisons in all or even most cases, especially given the limited time fund managers have to review applications, but we think there were cases among the grants we reviewed where this was possible (both quantifying and comparing to a benchmark) — including one case in which the applicant provided a cost-effectiveness analysis themselves, but this wasn’t then considered by the PI in their main reasoning for the grant.
GWWC looked into 10 applications:
Of the 10 grant investigation reports we reviewed, three were provided by the AWF upon our general request for representative grants; two were selected by us from their grants database; two were selected by the AWF after we provided specifications; and three were selected by the AWF based on our request for grant applications by organisations that applied to both the AWF and ACE’s MG.
Karolina also said on June 28 that AWF has improved their methodology since GWWC's evaluation:
However, since then, we've started conducting BOTEC CEA more frequently and using benchmarking in more of our grant evaluations. For example, we sometimes use this BOTEC template and compare the outcomes to cage-free corporate campaigns (modified for our purposes from a BOTEC that accompanied RP's Welfare Range Estimates).
I do not doubt AWF has taken the above steps, but I have no way to check it. I think donating to ACE over AWF is a good way of incentivising transparency, which ultimately can lead to more impact.
KarolinaSarek🔸 @ 2024-11-11T17:08 (+17)
Hey Vasco! I agree that AWF should be more transparent, and since I started working on it full-time, we have more capacity for that, and we are planning to communicate about our work more proactively.
In light of that, we just published a post summarizing how 2024 went, what changes we recently introduced, and what we are planning. We touched on updates to our evaluation process as well. Here is the relevant section from that post:
"Grant investigations:
Updated grant evaluation framework: We've updated our systematic review process, enabling us to evaluate every application using standardized templates that vary based on the required depth of investigation. This framework ensures a thorough assessment of key factors while maintaining flexibility for grant-specific considerations. For example, for the deep evaluations (which are the vast majority of all evaluations), key evaluation areas include assessment of the project's Theory of Change, scale of counterfactual impact, likelihood of success, back-of-the-envelope cost-effectiveness and benchmarking, and the expected value of receiving funding. It also includes forecasting grant outcomes. You can read more about our process in the FAQ.
Introduced new decision procedures for marginal grants: We introduced an additional step in our evaluation that enables us to make better decisions about grants that are just below or just above our funding bar. Since AWF gives grants on a rolling basis rather than in rounds, it is important to have a process for this to ensure decisions are consistent."
We also slightly updated our website and added a new question to the FAQ - I'm copying that below:
"How Does the EA Animal Welfare Fund Make Grant Decisions?
Our grantmaking process consists of the following stages:
Stage 1: Application Processing. When we receive an application, it's entered into our project management system along with the complete application details, history of previous applications from the applicant, evaluation rubrics, investigator assignments, and other relevant documentation.
Stage 2: Initial Screening. We conduct a quick scope check to ensure applications align with our fund's mission and show potential for high impact. About 30% of applications are filtered out at this stage, typically because they fall outside our scope or don't demonstrate sufficient impact potential.
Stage 3: Selecting Primary Grant Investigator and Depth of the Evaluation. For applications that pass the initial screening, we assign investigators who are most suitable for a given evaluation. Based on various heuristics, such as the size of the grant, uncertainty, and potential risk, the Fund’s Chair also determines the depth of the evaluation.
Stage 4: In-Depth Evaluation. Every grant application undergoes a systematic review. For each level of depth of investigation required, AWF has an evaluation template that fund managers follow. The framework balances ensuring that all key factors have been considered and that evaluations are consistent, while leaving space for additional, grant-specific crucial considerations. For the deep evaluations (which are the vast majority of all evaluations), the primary investigator typically examines:
- Theory of Change (ToC) - examining how activities translate into improvements for animals and whether the evidence supports its merits
- Scale of counterfactual impact - assessing the problem's scale, neglectedness, and strategic importance
- Likelihood of success - evaluating track record, team competence, and concrete plans
- Cost-effectiveness and benchmarking - conducting calculations to estimate impact per dollar and compare it to relevant benchmarks
- Value of funding - analyzing counterfactuals and long-term sustainability
- Forecasting - forecasting the probability that the project will succeed or fail, and for what reasons (validity of the ToC or performance in achieving planned outcomes)
- In the case of evaluations that require the maximum level of depth, a secondary investigator critically reviews the completed write-up, raises additional questions and concerns, and provides alternative perspectives or recommendations.
Stage 5: Collective Review and Voting. After the evaluation, each application undergoes a thorough collective assessment. The Fund Chair and at least two Fund Managers review the analysis. All Fund Managers without conflicts of interest can contribute additional insights and discuss key questions through dedicated channels. Finally, each Fund Manager assigns a score, which helps us systematically compare the most promising grants.
Stage 6: Final Recommendation. Looking at the average score, the Fund Chair approves grants that are clearly above our funding bar and rejects those clearly below it. For grants near our funding threshold, we conduct another step in which all Fund Managers compare those marginal grants against each other to select the strongest proposals.
Once decisions are finalized, approved grants move to our grants team for contracting and reporting setup.
Throughout this process, we maintain detailed documentation and apply consistent standards to ensure we select the most promising opportunities to help animals most effectively."
Vasco Grilo🔸 @ 2024-11-11T20:00 (+2)
Thanks, Karolina! Great updates.
tobycrisford 🔸 @ 2024-11-08T17:59 (+12)
If you're correct in the linked analysis, this sounds like a really important limitation in ACE's methodology, and I'm very glad you've shared this!
In case anyone else has the same confusion as me when reading your summary: I think there is nothing wrong with calculating a charity's cost effectiveness by taking the weighted sum of the cost-effectiveness of all of their interventions (weighted by share of total funding that intervention receives). This should mathematically be the same as (Total Impact / Total cost), and so should indeed go up if their spending on a particular intervention goes down (while achieving the same impact).
The (claimed) cause of the problem is just that ACE's cost-effectiveness estimate does not go up by anywhere near as much as it should when the cost of an intervention is reduced, leading the cost-effectiveness of the charity as a whole to actually change in the wrong direction when doing the above weighted sum!
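The algebraic point above, that a budget-share-weighted sum of per-intervention cost-effectiveness equals total impact divided by total cost, is easy to check numerically. The figures below are made up purely for illustration:

```python
# Made-up figures, purely illustrative (not ACE's actual numbers).
interventions = [
    {"impact": 100.0, "cost": 50.0},
    {"impact": 30.0, "cost": 150.0},
]

total_cost = sum(i["cost"] for i in interventions)
total_impact = sum(i["impact"] for i in interventions)

# Direct calculation: impact per dollar across the whole charity.
direct = total_impact / total_cost

# Weighted sum: each intervention's cost-effectiveness, weighted by its
# share of total spending. The per-intervention cost terms cancel, so
# this equals the direct calculation.
weighted = sum(
    (i["cost"] / total_cost) * (i["impact"] / i["cost"]) for i in interventions
)

print(direct, weighted)  # 0.65 0.65
```

Because the two quantities are algebraically identical, any weighted-sum scheme that moves in the opposite direction from impact/cost must be distorting the per-intervention terms before weighting them, which is the claimed flaw.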
If this is true it sounds pretty bad. Would be interested to read a response from them.
Of course, the other thing that could be going on here, is that average cost-effectiveness is not the same as cost-effectiveness on the margin, which is presumably what ACE should care about. Though I don't see why an intervention representing a smaller share of a charity's expenditure should automatically mean that this is not where extra dollars would be allocated. The two things seem independent to me.
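The average-versus-marginal distinction can also be made concrete. Under a toy diminishing-returns curve (an illustrative assumption, not a model of any real charity), the cost-effectiveness of the next dollar can be far below the historical average:

```python
import math

def impact(dollars):
    # Toy diminishing-returns curve, an illustrative assumption only.
    return 100 * math.log1p(dollars / 10_000)

spent = 200_000
average_ce = impact(spent) / spent               # historical impact per dollar
marginal_ce = impact(spent + 1) - impact(spent)  # impact of the next dollar

# With diminishing returns, the marginal dollar buys less than the average.
print(marginal_ce < average_ce)  # True
```

Which is why, if a donor cares about where their next dollar goes, an evaluator's backward-looking average cost-effectiveness figure is only a proxy for the quantity that actually matters.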
VettedCauses @ 2024-11-09T21:14 (+5)
Hi Toby,
Thank you for your reply!
Of course, the other thing that could be going on here, is that average cost-effectiveness is not the same as cost-effectiveness on the margin, which is presumably what ACE should care about.
I'm not certain if by cost-effectiveness on the margin, you meant cost-effectiveness in the future if additional funding is obtained. If that's the case, the following information could be helpful.
ACE does 2 separate analyses for past cost-effectiveness, and room for future funding. For example, those two sections in ACE's review of LIC are:
- Cost Effectiveness: How much has Legal Impact for Chickens achieved through their programs?
- Room For More Funding: How much additional money can Legal Impact for Chickens effectively use in the next two years?
Our review focuses on ACE's Cost-Effectiveness analysis. Additionally, ACE states (under Criterion 2) that a charity's Cost-Effectiveness Score "indicates, on a 1-7 scale, how cost effective we think the charity has been [...] with higher scores indicating higher cost effectiveness."
tobycrisford 🔸 @ 2024-11-10T11:29 (+1)
This is very helpful, thanks!
MathiasKB🔸 @ 2024-11-08T18:29 (+11)
I strongly upvoted this post because I'm extremely interested in seeing it get more attention and, hopefully, a potential rebuttal. I think this is extremely important to get to the bottom of!
At first glance your critiques seem pretty damning, but I would have to put a bunch of time into understanding ACE's evaluations before I could conclude whether I agree with your critiques (I can spend a weekend day doing this and writing up my own thoughts in a new post if there is interest).
My expectation is that if I were to do this I would come out feeling less confident than you seem to be. I'm a bit concerned that you haven't made an attempt at explaining why ACE might have constructed their analyses this way.
But I'm pretty confused too. It's hard to think of much justification for the choice of numbers in the 'Impact Potential Score', and deciding the impact of a book based on the average of all books doesn't seem like the best way to approach things.
VettedCauses @ 2024-11-09T20:50 (+1)
Hi Mathias,
Thank you for your comment!
I can spend a weekend day doing this and writing up my own thoughts in a new post if there is interest
We would definitely be interested in hearing your thoughts. We've set post notifications on for your profile, and look forward to seeing your post!
David T @ 2024-11-08T19:28 (+8)
It feels like this needs a response from both ACE and Legal Impact for Chickens. (I'm not suggesting it should be a quick one; some things are important enough to warrant careful review. I agree with @abrahamrowe that it would probably have been better to ask for their comments before publishing.)
- I think it is possible for a charity focusing on taking legal action to be impactful without [consistent] legal success, which the review doesn't really acknowledge. A large part of the theory of change around suing corporate bad behaviour is the idea that it will deter bad behaviour in future, by making standards compliance more cost effective than defending lawsuits
- Deterrent effects however are a more complicated theory of change than actually winning cases and forcing actors to change. And it may be very difficult to have a deterrent effect if cases are typically dismissed.
- To that extent I'm quite surprised to learn that Legal Impact for Chickens apparently hasn't yet had any victories, based on what I had heard about that organization. I don't think this necessarily reflects badly on the organization, which is a young charity focused on a legal process which inevitably takes time. But it does mean the error bars for their impact are rather large, and could mean a nonzero possibility they aren't [yet] having an impact at all. It would be interesting to hear more about metrics used (both by LIC and ACE, and other charities with similar theory of change for that matter) to evaluate the impact of an unsuccessful lawsuit, and how substantial those are.
- Some of the questions raised about ACE's weightings are quite independent from the example given. It would be interesting to hear from ACE if and how the evaluation criteria for their [apparently mostly subjective] impact scoring take into account the idea that a charity could achieve a higher score by subdividing campaigns, and if and how they intend to update impact assessments in cases like the example of books either failing to reach a non-trivial number of people or being phenomenally successful even if the case they make for veganism was not originally assessed as particularly evidence-based.
I think this would have been an interesting contribution to the Animal Welfare vs GHD debate week. From the limited amount I read of it, it seemed that even people (on different sides of the debate) whose analysis was very thorough weren't taking into account the more straightforward possibility that some of the highlighted top animal advocacy charities simply weren't close to being as effective [yet] at achieving their goals as suggested, regardless of philosophical positions and empirical claims about welfare levels.
VettedCauses @ 2024-11-08T19:48 (+1)
Hi David,
Thank you for your reply!
I think it is possible for a charity focusing on taking legal action to be impactful without [consistent] legal success, which the review doesn't really acknowledge. A large part of the theory of change around suing corporate bad behaviour is the idea that it will deter bad behaviour in future, by making standards compliance more cost effective than defending lawsuits
I definitely agree that this is possible! However, as you said
it may be very difficult to have a deterrent effect if cases are typically dismissed.
ACE evaluated 3 “legal actions” in their review of LIC:
- 2 of the legal actions were dismissed under Rule 12(b)(6) for failing to state a valid legal claim. 12(b)(6) dismissals occur very early on in the legal process, making any legal expenses incurred by the Defendants relatively low. Additionally, encouraging or valuing lawsuits that fail to state valid legal claims but cost the defendant money risks causing the legal system to take animal rights/welfare cases less seriously. If courts observe a pattern of weak or legally insufficient cases being filed to burden defendants, they will become skeptical of all animal rights/welfare lawsuits--even those with strong legal merit.
- The 3rd legal action ACE evaluated was not actually a legal action, but rather a public comment submission (ACE still classified it as a legal action). The public comment was rejected, and it is difficult to see how this would have a positive impact.
I don't think this necessarily reflects badly on the organization, which is a young charity focused on a legal process which inevitably takes time.
ACE endorses LIC as a top charity. Currently, I don’t think this endorsement is justified given LIC’s track record, and I don't think ACE provided a very strong justification for it. Here is a quote from ACE's review of LIC:
- "We think that out of all of Legal Impact for Chickens' achievements, the Costco shareholder derivative case is particularly cost effective because it scored high on achievement quality."
The Costco shareholder derivative case cost LIC over $200,000 and was dismissed for failing to state a valid legal claim. It is difficult to understand why ACE thinks this is a particularly cost effective achievement.
Some of the questions raised about ACE's weightings are quite independent from the example given.
Could you elaborate on what you mean by this?
I think this would have been an interesting contribution to the Animal Welfare debate week.
I wasn’t aware of that week. Maybe we’ll be able to prepare something for it next year!
Thank you for your feedback!
David T @ 2024-11-08T20:11 (+2)
ACE endorses LIC as a top charity. Currently, I don’t think this endorsement is justified given LIC’s track record, and I don't think ACE provided a very strong justification for it.
I agree with this, and particularly agree that the quote you highlighted below does not seem like good justification. I also think your comment (elsewhere in this thread) that their track record is a "bad one" might be going a little too far.[1] As I say, I was surprised to find that LIC had not yet had any legal success, given that I'd heard about them mostly through hearing positive commentary on their cost effectiveness.
Could you elaborate on what you mean by this?
I meant that there were criticisms you raised about the overall methodology that had wider implications than just LIC. Possibly I could have worded that better.
I wasn’t aware of that week. Maybe we’ll be able to prepare something for it next year!
There was an animal welfare vs GHD debate week on this forum. Honestly, I hope they don't repeat it![2]
- ^
I think a charity aiming to encourage compliance that never filed any lawsuits unless they were almost certain to succeed would probably underperform too, and $200k is not an especially expensive legal case, though there are certainly more proven cost effective ways to save lives for that sort of money. That said, I haven't read the lawsuit and wouldn't know enough about relevant law to know whether the basis for dismissal was blindingly obvious or not...
- ^
I think there are probably more specific and less polarizing topics for debate. And polarizing topics are less likely to yield concrete results, which probably includes this one.
Daymoon @ 2024-11-18T06:28 (+4)
I recently explored a charity from ACE’s 2024 Recommended Charity list that initially seemed like a good fit for the region and cause I want to support. However, upon closer examination, I found significant gaps in the evidence provided to justify their inclusion. Specifically, the impact analysis lacked clear data on cost-effectiveness and how the total number of animals impacted was calculated.
The organization's focus on corporate pledges, while valuable, is hard to verify. Additionally, there was no evidence of systematic follow-up to ensure these commitments are implemented. Given that their last audited financial and activity reports were from 2022 (as published on the organisation's website), I am concerned about whether ACE's evaluation relied on up-to-date and complete information.
As someone passionate about animal welfare, I appreciate ACE’s mission, but I believe their evaluation process must ensure greater transparency, robust evidence, and accountability to maintain credibility and guide donors effectively.