animal welfare has an evidence problem

By matthes @ 2026-06-05T22:04 (+377)

Why I stopped donating to animal welfare charities but feel more motivated than ever to redirect money and talent to the cause.

I have wanted to write this post for a while. It is an uncomfortable thing to bring up. Many people in the animal welfare space are working really hard, and this post might leave some feeling defeated. But I think this is one of the most important things to talk about in animal welfare right now. My intention is not to be a downer or create infighting. Instead, I hope this post inspires lots of people to tackle this major neglected problem.

key takeaways

Even some of the most prominent animal welfare interventions have surprisingly weak evidence behind them. In some cases, the available evidence even suggests that the intervention may be causing harm.
Specifically:
- The very limited data on electrical shrimp stunning does not support a confident conclusion as to whether it is good or bad.
- We have mixed evidence on whether transitioning egg producers to cage-free improves welfare overall.
- We have evidence that the substitution effect of alternative proteins is weak, at best.
Significant additional funding and talent should be allocated to raise the bar for animal welfare interventions by building R&D infrastructure that can rapidly generate high-quality action-relevant research results.
This should be the #1 priority for new animal welfare funding, ahead of scaling existing work.

introduction

After I completed my fancy computer science PhD at Oxford that was supposed to set me up to work on AI risk, I decided to pivot into animal welfare.

I was excited to join a field that seemed to have lots of charities executing on interventions that (at least on the surface) appeared to be pretty straightforwardly good.

However, when I tried to look for high impact projects to support, I was surprised by the lack of good options.

I found that even the most well known (and well funded) interventions had limited evidence, sometimes pointing in the “wrong” direction.

In fact, I struggled to find any evidence-based interventions that I felt comfortable supporting.

I stopped donating to the cause.

three salient animal welfare interventions and their evidence bases

In this post I want to use three of the most salient interventions in animal welfare as examples: shrimp stunning, cage-free reforms, and alternative proteins.

All three interventions make intuitive sense, but their evidence bases are limited at best, and at worst suggest the interventions may be causing harm.

There will always be some uncertainty about the exact effects of a particular intervention. It is not always clear how to compare one harm to another. Collecting data in the real world at scale can be difficult.

But with each of these examples, ask yourself whether you would feel comfortable supporting an intervention on humans that had a similar amount and kind of evidence behind it.

shrimp stunning and slaughter

One of the most salient EA charities is the Shrimp Welfare Project (SWP). Their most well known programme is the Humane Slaughter Initiative (HSI)^[1]. In this programme, SWP provides electrical stunning equipment to Whiteleg shrimp producers for free if they commit to using it on a certain number of shrimp per year, and combine it with ice slurry slaughter. The first commitment was signed in 2023.

However, at the time, there had only been one study^[2] evaluating the effect of electric shock on Whiteleg shrimp. In this 2018 study, Weineck et al. compared ice slurry (sub 4°C) to in-water electric shock (10s at 120V across 17cm, saltwater) using heart rate data.

The study uses a very small sample size (N = 6 for each intervention) which makes me uncomfortable recommending any action based on it. But below are the key takeaways based on this limited data.

For ice slurry:

- Immersion in ice slurry caused a rapid and massive drop in heart rate “amplitude” within seconds.
- Returning shrimp to warm water after 5 minutes allowed regular heart activity to return.
- When shrimp first hit the ice slurry, they performed sudden full-body contractions (tail flips), but this also happened when the head was cut off first (check the supplementary material for a video).

For electric shock:

- An electric shock caused the heart to become arrhythmic (the recording equipment had to be turned off during the shock, so this is less clear).
- The shrimp recovered their ability to move after 5-10 minutes.
- Their heart activity never returned to normal, indicating “permanent damage or alteration in the function of the heart”.
- The electric shock also caused a strong tail flip.

Based on this data, it is unclear if electric shock followed by ice slurry provides any benefit over ice slurry alone, provided the animals are kept in ice slurry until they are fully dead. (It is unclear how long that would take, though.)

And yet, ice slurry is often regarded as “the bad way” to kill shrimp. In fact, Mercy for Animals has been actively campaigning against ice slurry slaughter^[3].

As a result of this pressure, all major UK supermarkets have now made commitments to improve shrimp slaughter, with most announcing plans to introduce electrical stunning^[4].

Many shrimp harvests do not use ice slurry, or do not use it properly (not cold enough, not long enough, etc.). Here^[5] are some example videos. In those cases, introducing electrical stunning actually has the potential to increase suffering, as the shrimp are likely to wake up and suffer from the shock damage until dying of asphyxiation or from their injuries. (And we don’t really know how long that takes.)

Finally, it is also unclear how the results of the study translate to industrial slaughter machines. Neither the authors of the study nor the manufacturers provide sufficient information on electrical parameters for a fair comparison. Additionally, one of the two stunners distributed - the Optimar machine - is a “dry” stunner that shocks shrimp out-of-water. It is unclear how in-water results would translate.

In 2021, Tesco and Hilton Seafood published a report^[6] on their use of a modified Optimar stunner in Vietnam. The results are a bit vaguely presented, but with at least one machine setting “a significant proportion” of shrimp did not “show signs of recovery” within 10 minutes. This gives some reassurance for this particular machine, although I know from talking to people in the field that this machine was a modified version of the commercially available one.

Earlier this year, a preprint^[7] of a second scientific study was released. It is not peer-reviewed yet, but available online. This study is also small (N = 4-6 per intervention) which again leads me to have limited confidence in conclusions. It compares electrical shock (2-3 V/cm for 1-20s, various combinations) followed by cold shock at different temperatures with cold shock alone (various temperatures) and electrical shock alone (various parameters). The authors in this study measured electrical activity at the supraesophageal ganglion (the “brain”) rather than the heart.

I want to flag that I found parts of the results section hard to parse and sometimes details seemed to contradict each other. But key insights include:

Electric shock only^[8]:

- At lower shock voltage and duration all shrimp recovered coordinated movement as well as their righting reflex within seconds or minutes.
- At a high shock voltage and duration some animals recovered coordinated movement, but none recovered their righting reflex. Eventually, all animals in this group died within 2 hours.

Ice slurry only^[9]:

- Cold ice slurry (-2.5°C - 0°C) led to a fast drop in signal from the neural electrodes. The temporal resolution of the plots is low (1 min interval), but the first post-immersion data point (median over N = 5, after 2 min (?)) at 0°C is already at ~10% of the pre-immersion signal.
- Slightly warmer water (2.5°C - 5°C) had much worse outcomes. On average, it took 5 minutes for a large drop in neural activity, and for most of the 30 min observation window, median neural activity stayed higher than in the colder group. This means that it’s really important to use a lot of ice and ensure sufficient mixing of the slurry.

Electric shock followed by ice slurry (0°C)^[10]:

- At lower shock voltage and duration, neural activity decreased on average, but sometimes increased. Only 3/6 animals had a 90% drop of neural activity within 30 minutes. Overall, this group had worse outcomes (more neural activity) than 0°C ice slurry alone.
- At higher shock voltage and duration, neural activity dropped below 10% of pre-stun levels within the first 2 min (the first data point) for all (N = 4) shrimp.

Overall, there are only two scientific studies on the topic of using electric shock on Whiteleg shrimp. Both have limited sample sizes. Both show some recovery from electric shock. Both find that immersion in proper ice slurry leads to a rapid drop in vital signs. Neither is representative of industrial stunning machines^[11].
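To make the sample-size concern concrete, here is a minimal sketch of how wide the uncertainty is for proportions observed in groups this small. It uses the Wilson score interval at roughly 95% confidence, which is one common choice and not something either study reports. The function name is hypothetical, and the two inputs are the counts quoted above (3 of 6 animals, and 4 of 4 animals).

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Lower-shock group: 3 of 6 animals reached a 90% drop in neural activity.
low, high = wilson_interval(3, 6)
print(f"3/6: {low:.2f} to {high:.2f}")

# Higher-shock group: 4 of 4 animals dropped below 10% of pre-stun levels.
low, high = wilson_interval(4, 4)
print(f"4/4: {low:.2f} to {high:.2f}")
```

Under these assumptions the intervals come out at roughly 0.19 to 0.81 for 3/6 and 0.51 to 1.00 for 4/4. Even a unanimous result in four animals is compatible with the true success rate being only about one in two.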

In conclusion, evidence for electrical stunning is extremely limited and we shouldn't feel comfortable recommending anything with confidence. However, if I were forced to interpret this small data set, I would expect that:

- A sufficiently strong electrical shock with proper ice slurry (which is hard to implement in practice) does not provide much improvement over proper ice slurry alone.
- Insufficient electrical stunning with proper ice slurry may be worse than ice slurry alone.
- Electrical stunning without proper ice slurry slaughter poses real potential for causing harm.

(Disclaimer: One of my main projects at the moment is working on improving shrimp slaughter with electric shock without having to rely on ice slurry. At this point in time, I am still optimistic that this is possible with sufficient R&D. We just can't be confident based on available data. I am hoping to make our findings public later this year.)

cage-free

Cage-free advocacy is a huge topic in the EA animal welfare world. Three of the four currently “featured grants” by the EA Animal Welfare Fund are for cage-free reform work^[12].

Coefficient Giving lists cage-free reform (alongside alternative proteins) as one of its four big grantee wins and has given out over $40 million in funds^[13] for it.

Cages being bad makes some intuitive sense. We also know that chickens will go out of their way to access nesting opportunities. Bare cages deprive chickens of many behaviours, such as perching, dust bathing, and foraging.

But what do we actually know about what happens when a real-world farm switches from caged to cage-free?

When the industry argues for cages, they often bring up that mortality is higher in cage-free systems. The most cited source for this is probably a field study conducted by the US Coalition for Sustainable Egg Supply^[14].

They compared conventional cages (CC, 6 hens per cage), enriched colony cages (EC, 60 hens per cage), and cage-free aviaries (AV) at a single commercial-scale egg farm in the US.

Hens in the cage-free system performed the most natural behaviours (flying, perching, dust bathing, foraging) and had stronger leg and wing bones. However, the study also found that cage-free systems had:

- more severe foot lesions
- more keel abnormalities
- increased aggression
- increased mortality

The mortality in cage-free systems was over twice as high as the others:

From the Research Results Report Appendix by CSES. A comparison of cumulative mortality.
CC = conventional cages; EC = furnished/enriched cages; AV = cage-free aviaries

The most common causes of death in the cage-free system were:

- hypocalcemia (low calcium)
- egg yolk peritonitis (egg yolk ends up in the wrong place in the body and causes inflammation and infection)
- dehydration
- vent cannibalism (other hens pecking at the cloaca; much more common in cage-free than caged, at 11.5% vs 0.7% relative mortality rate)
- emaciation
- rotten (corpse too rotten to assess cause of death)

These aren't sudden, painless deaths. Increased vent pecking itself is also a sign of increased environmental stress. Overall, this suggests that hens in the cage-free systems generally experienced more distress.

A counterpoint I sometimes hear is that the difference in mortality between caged and cage-free systems disappears as farmers gain experience with cage-free systems.

For this claim people typically cite a large meta-analysis from 2021^[15]. The authors use data from over 6,000 flocks across 16 countries to argue that “except for conventional cages, mortality gradually drops as experience with each system builds up: since 2000, each year of experience with cage-free aviaries was associated with a 0.35–0.65% average drop in cumulative mortality, with no differences in mortality between caged and cage-free systems in more recent years.”

However, I believe that it is an overstatement to say that there are “no differences in mortality” given the actual data. This is the relevant figure in the meta-analysis:

Time series of cumulative layer mortality rate (standardised to 60 week mortality and double arcsine transformed).

It depicts the FRA_16 data set collected by The French Poultry Technical Institute (ITAVI).

This data set:

- covers 15 years (2002 - 2016)
- is one of the two largest data sets in the meta-analysis in terms of hens included, accounting for about a third (32%) of all the hens in the meta-analysis
- was collected in the same country with the same methodology
- includes both caged and cage-free systems (including furnished cages, and free-range)

This makes it the most important data set in the meta-analysis when assessing trends.

The meta-analysis transformed the raw data for easier comparison with other data sets. This includes an adjustment to account for mortality correlating with age (farms that keep chickens alive for longer will have higher mortality), and a double arcsine square root transformation (not unusual for a meta-analysis).
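For readers unfamiliar with the transformation, below is a minimal sketch of the Freeman-Tukey double arcsine transform. It assumes the commonly used halved form; the meta-analysis may use a different scaling or software default. The function names and the flock numbers are invented for illustration and are not taken from any data set.

```python
import math


def double_arcsine(deaths: int, hens: int) -> float:
    """Freeman-Tukey double arcsine transform of the proportion deaths/hens.

    Assumption: halved form, so the result is close to asin(sqrt(p)).
    """
    return 0.5 * (
        math.asin(math.sqrt(deaths / (hens + 1)))
        + math.asin(math.sqrt((deaths + 1) / (hens + 1)))
    )


def double_arcsine_variance(hens: int) -> float:
    """Approximate sampling variance of the transformed value (halved form)."""
    return 1.0 / (4 * hens + 2)


# Hypothetical flock: 10,000 hens, 500 deaths (5% cumulative mortality).
t = double_arcsine(500, 10_000)
print(f"transformed value: {t:.4f}")
print(f"approx. variance:  {double_arcsine_variance(10_000):.2e}")

# A simple (approximate) back-transform recovers the original proportion.
print(f"back-transformed:  {math.sin(t) ** 2:.4f}")
```

The point of the transform is that the variance depends only on flock size, not on the mortality rate itself, which makes proportions near zero easier to pool. The cost is that plotted values are no longer percentages, so the transformed figure and the raw ITAVI chart are not directly comparable by eye.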

Here is the data set, as presented by ITAVI, including data from free-range systems (which together make up the majority of cage-free production in France):

From the 2017 ITAVI report. The vertical axis is in percent.
pondeuses sol = cage-free (indoor); pondeuses plein air = free-range; pondeuses label = free-range (premium label); pondeuses biologiques = organic; anciennes cages = conventional cages; cages aménagées = furnished cages

While it is true that mortality in cage-free systems dropped over the first few years of the data set (indicating that farmers gained experience with the new system), progress eventually slowed down. Mortality became noisy. And while the very last mortality data point for indoor cage-free systems is at an all-time low and close to that of furnished cages, the overall trend is less clear. Notably, outdoor cage-free systems actually saw an increase in mortality towards the end of the data set.

What about the other data sets in the study, though?

The other major contributor is USA_13, accounting for another third (33%) of the total hens included in the meta-analysis^[16].

However, USA_13 only covers a single year (2013) so cannot be used to assess trends.

In this data set (USA_13), mortality (cumulative at 60 weeks) is indeed statistically indistinguishable from cages. But, despite being the biggest data set overall, USA_13 doesn't even make it into the top 5 when ranking by number of hens in cage-free systems.

Let's list all data sets that compare at least one caged system with at least one cage-free system. For each source, I checked each possible pair-wise comparison for statistical significance using z-scores where standard errors were reported. Where errors weren't available, I made a note summarising the difference.

Data from the supplementary material of the meta-analysis. Cumulative transformed 60w mortality in percent.
CC = conventional cage; FC = furnished cage; AV = aviary (all types); ST = single tier aviary; MT = multi tier aviary
orange = evidence for higher mortality in cage-free systems; blue = evidence for higher mortality in caged systems; grey = no significant difference

When there was no significant difference, I highlighted the relevant evidence cell grey. When cage-free systems exhibited higher mortality, I used orange. And when caged systems had higher mortality, I used blue.
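The pairwise check described above can be sketched as follows. This is a minimal illustration and not the exact procedure behind the table: it assumes the two estimates are independent, uses a two-sided 5% threshold (|z| > 1.96), and the function name and the example numbers are hypothetical.

```python
import math


def compare_mortality(
    cage_free: float,
    se_cage_free: float,
    caged: float,
    se_caged: float,
    z_crit: float = 1.96,
) -> tuple[float, str]:
    """Two-sample z-test on two mortality estimates with standard errors.

    Returns the z-score and a colour label matching the table legend:
    orange = higher mortality in cage-free, blue = higher in caged,
    grey = no significant difference.
    """
    # Standard error of the difference, assuming independent estimates.
    se_diff = math.sqrt(se_cage_free**2 + se_caged**2)
    z = (cage_free - caged) / se_diff
    if z > z_crit:
        return z, "orange"
    if z < -z_crit:
        return z, "blue"
    return z, "grey"


# Hypothetical values, for illustration only (not from the meta-analysis).
z, label = compare_mortality(cage_free=8.0, se_cage_free=0.5, caged=5.0, se_caged=0.6)
print(f"z = {z:.2f} -> {label}")

z, label = compare_mortality(cage_free=6.0, se_cage_free=1.0, caged=5.5, se_caged=1.0)
print(f"z = {z:.2f} -> {label}")
```

Note that a "grey" result only means the difference was not detectable at this threshold. With large standard errors it does not show that the two systems have equal mortality.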

The majority of relevant data sets showed higher mortality in cage-free systems. Some found no significant difference. Only one study (STA_16) found that furnished cages had a higher mortality than single tier aviaries.

As a utilitarian, I am open to the idea that a shorter, happier life outweighs a longer, but otherwise worse life (even if it creates demand for more animals in the system). However, given that chickens don't just drop dead very suddenly, but instead die slowly from violence, disease, and starvation/thirst, cage-free systems appear more stressful.

The Welfare Footprint Institute (WFI) argues the opposite by adding up hours of pain experienced by an average chicken in different systems^[17]. With their model, cage-free aviaries have lower pain scores than conventional cages. However, these results are heavily dependent on what harms are included, how they are scored, and how different types of pain are weighed. Deprivation of natural behaviours accounts for most of the difference in scores. Chronic fear and stress from violence is not included.

Ultimately, what we need is long-term data on behavioural and physiological indicators of welfare in different systems.

WFI itself has highlighted a general lack of research in poultry welfare^[18].

While I expect that the best environment for a chicken does not involve a cage, I do not feel comfortable supporting a blanket push for cage-free reforms, given the currently available data on what happens when farms/countries switch.

(There is also a lot of talk about keel bone fractures (KBF) being more common in cage-free systems, but I feel quite unsure about the data on both prevalence and severity of KBFs, so decided to leave it out of this post in the end.)

alternative proteins

Alternative proteins are one of Coefficient Giving’s most funded animal welfare interventions. A year ago, CG had spent over $34 million total on grants in this space^[13].

The idea of alternative proteins is that we could reduce or even end animal farming and fishing by replacing animal products with equivalent alternatives.

However, we have little evidence for this substitution effect. I recommend Jacob Peacock’s paper on how “price-, taste-, and convenience-competitive plant-based meat would not currently replace meat” for a good summary of this literature^[19]. The report is a few years old now, but even then we already had multiple strong studies showing that the availability of equivalent substitutes does not lead to major changes in meat consumption.

~~See also this~~^[20] ~~more recent meta-analysis that came to a similar conclusion about alternative proteins and other meat reduction interventions.~~ Edit: See Seth's comment below!

Unlike the other two examples, alternative proteins are unlikely to cause harm. And it still seems possible that they might help in some circumstances and/or in the long run. But while I would personally love to see more fake meats in my supermarket, I do not think that we can consider it an evidence-based intervention by EA standards.

this is a field-wide problem

I worry that some people will come away from this post thinking that a small number of individuals are responsible for these particular examples and need to be identified.

All three intervention examples I give have been extensively discussed, recommended by charity evaluators, and financed by major funders.

These are some of the best interventions the field was able to identify.

And yet, their evidence bases seem to be very limited and/or contested.

my recommendations to funders

animal welfare should not be de-funded

Another major worry I have about writing this post is that it may result in funding leaving the animal welfare field. I do not believe that this is the correct response.

In fact, I think that addressing this issue will require a lot of additional resources. We need to build organisations that can rapidly identify and answer action-relevant research questions.

Many people are currently anticipating a major influx of money into animal welfare. By default, this may lead to funders lowering the bar for funding interventions. Instead, we should be using the money to raise the bar.

A recent forum post^[21] by @Zoë Sigle 🔹 (writing for Senterra Funders) suggests that major funders will allocate additional money by: scaling existing work (~50-60%), testing new interventions (~30-40%), and growing movement infrastructure (~10-20%).

I am concerned about this allocation plan. We just do not know that much about animal suffering and how to alleviate it. R&D should be the #1 priority right now, ahead of scaling existing work.

we should be taking ownership of the entire evidence pipeline

We (the EA animal welfare community) should be using significant resources to generate useful evidence. But simply funding the broad field of animal welfare science is likely to create scattered research results that are difficult to translate into action.

We should be involved at every stage of the process, including:

- generating actionable research questions
- designing experimental plans
- conducting the studies
- analysing the raw data
- interpreting the results of the analysis and translating them into actionable recommendations

I think entire organisations could and should be founded for this. Until now, this was simply not possible. Research is expensive and slow, especially at universities. But we're about to have the luxury to aim higher.

(To be clear, I don't claim that this isn't happening at all right now. There are grants being made to advance our understanding of animal suffering. But we haven't been able to be ambitious enough so far.)

when bet making doesn't make sense

In the absence of great funding options that can absorb lots of money, it makes sense to take some bets. It may make sense to fund a lot of proposals that only have, say, a 5% chance of working out.

I think that alternative proteins may fall into this bucket. While it doesn’t meet my personal donation bar for an evidence-based intervention, I understand why it has absorbed large amounts of philanthropic money earmarked for animal welfare. It might just eventually work.

But taking such bets is only appropriate if the risk of causing harm is sufficiently small.

I believe that we have not been sufficiently cautious when taking bets that could be causing significant direct harm to animals (beyond just the lost funding that could be spent elsewhere).

conclusion

To me, the evidence problem is the most important thing to work on in animal welfare right now. While I personally stopped donating to existing charities, I am hoping to redirect significant funds and talent to this.

Please get in touch if you think you might be able to help solve this challenge with talent/funding/ideas/connections.

I want to thank everyone who listened to me talk about this, including both those who warned me against writing/speaking publicly about it, and those who encouraged me to. Special thanks to everyone who read a draft of this post and gave me feedback.

I wrote this post myself, but used various LLMs to critique the draft while iterating.

I have a lot more thoughts and concerns on the details of all the studies I mention (and more that I don’t), but tried to keep my discussions somewhat brief to keep this readable. I may write separate posts about them in the future. I am happy to discuss details, but hope that the bulk of the conversation below this post can be about how we solve the evidence problem (rather than discussing the specific examples).

^{^}
https://web.archive.org/web/20260204045354/https://www.shrimpwelfareproject.org/humane-slaughter-initiative
https://web.archive.org/web/20260202232305/https://www.shrimpwelfareproject.org/blog/mou-with-mer-seafood
^{^}
Weineck, Kristin, et al. "Physiological changes as a measure of crustacean welfare under different standardized stunning techniques: Cooling and electroshock." Animals 8.9 (2018): 158.
http://dx.doi.org/10.3390/ani8090158
^{^}
https://web.archive.org/web/20240608074244/https://mercyforanimals.org/stoptescocruelty/
^{^}
https://www.seafoodsource.com/news/foodservice-retail/with-aldi-s-d-commitment-all-major-uk-supermarkets-have-set-time-bound-shrimp-welfare-standards

Tesco: "100% of our farmed Penaeus vannamei are electrically stunned by 2026" (https://www.tescoplc.com/media/4dgphwua/10447v11-en-tesco-farmed-decapod-crustacean-welfare-policy.pdf)

M&S: "There are now electrical stunners in place at M&S vannamei farms in Honduras, Vietnam and Thailand with plans for implementation in 2024. The use of electrical stunners will remove the use of ice slurry for warm-water prawns in these locations." (https://corporate.marksandspencer.com/sites/marksandspencer/files/marks-spencer/Aquaculture/MS-Decapod-Welfare-Policy-2024.pdf)

Sainsbury's: "In collaboration with shrimp welfare project and our partner supplier, we are trialling electrical stunning with all our fresh and frozen shrimp farmers and collaborating with Stirling University to verify this method at which point we will roll out fully to all our source shrimp farms, estimated end of 2026" (https://corporate.sainsburys.co.uk/sustainability/explore-by-a-z/responsible-sourcing-practices/responsible-seafood-sourcing/)
Waitrose: "In 2023 we started working with our supplier and a selection of our supplying farms to trial electrical stunning in our warm water prawns, with support from the Shrimp Welfare Project. From these learnings we will continue to improve the process and roll out to our entire warm water prawn supply chain by the end of 2026." (https://www.waitrose.com/ecom/content/sustainability/responsible-sourcing/fish-and-seafood)

Co-op: "We also recognise that the most humane method of slaughter for prawns is electrical stunning followed by mechanical killing and whilst this is not yet a standard practice across the global prawn industry, we are actively exploring its implementation. We are collaborating with our suppliers and specialists to identify how electrical stunning can be effectively introduced on prawn farms and in 2025, we will outline a comprehensive roadmap to integrate electrical stunning into our supply chain by 2027. Our goal is to implement this humane method as soon as possible, ideally ahead of schedule." (https://www.coop.co.uk/our-suppliers/farmers/fish)

Lidl: "low stress killing methods (electrical stunning) will be implemented by the end of 2026." (https://corporate.lidl.co.uk/sustainability/seafood)
^{^}
Some YouTube videos of harvests with no or poor ice slurry:
https://www.youtube.com/watch?v=2ZCvuZqTuMc
https://www.youtube.com/watch?v=_fdfZgvJY8M
https://www.youtube.com/watch?v=NON35bUFNts
^{^}
https://web.archive.org/web/20250106232055/https://www.compassioninfoodbusiness.com/media/7444897/tesco-and-hilton-seafood-case-study-improving-the-welfare-of-whiteleg-shrimp-at-harvest.pdf
^{^}
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6315379
^{^}
Behaviour recovery for shrimp that received an electric shock, only:
^{^}
Neural activity for shrimp that were immersed in ice slurry (without electric shock):
^{^}
Neural activity for shrimp that were immersed in 0°C ice slurry after electric shock:
^{^}
Additionally, both studies implanted conductive electrodes in the test animals. It is plausible that this significantly affects how current flows through the shrimp's body.

I also feel confused about what a signal from an electrode on a heart or a ganglion actually tells us. The plots of the recorded “power” are hard to interpret without a control signal to assess what the noise floor is.
^{^}
https://web.archive.org/web/20260427231047/https://funds.effectivealtruism.org/funds/animal-welfare
^{^}
I came to this number by adding up all grants in the grants database currently available online, using their own categorisation. https://coefficientgiving.org/grant-publishing-process/

I believe that the cage-free number is a (potentially big) underestimate, as many "Broiler Chicken Welfare" grants (which make up $80M) include cage-free work.
^{^}
https://www2.sustainableeggcoalition.org/document_center/download/public/CSESResearchResultsReport.pdf
https://www2.sustainableeggcoalition.org/document_center/download/final-results/ResearchResultsReportAppendix.pdf
^{^}
Schuck-Paim, C., Negro-Calduch, E. & Alonso, W.J. Laying hen mortality in different indoor housing systems: a meta-analysis of data from commercial farms in 16 countries. Sci Rep 11, 3052 (2021). https://doi.org/10.1038/s41598-021-81868-3
^{^}
^{^}
https://welfarefootprint.org/laying-hens/
^{^}
https://welfarefootprint.org/2023/07/07/major-gaps-in-poultry-welfare-research/
^{^}
https://rethinkpriorities.org/research-area/price-taste-and-convenience-competitive-plant-based-meat-would-not-currently-replace-meat/
^{^}
Green, Seth Ariel, Benny Smith, and Maya Mathur. "Meaningfully reducing consumption of meat and animal products is an unsolved problem: A meta-analysis." Appetite (2025): 108233.
https://doi.org/10.1016/j.appet.2025.108233
^{^}
https://forum.effectivealtruism.org/posts/Zo3uNBiZ5GN35hkfq/the-usd120-million-question-how-will-it-help-animals

cynthiaschuck @ 2026-06-07T11:26 (+214)

Hi Amanda, thank you for your thoughtful analysis. I do believe taking a step back and scrutinizing the evidence and direction where we're heading is extremely important, and I agree with your conclusion that increasing R&D and talent is very much needed. I also agree that the evidence gaps are enormous. As a research group working specifically to address them, we're very much aligned. In that same spirit, I share some considerations below on the cage-free transition example, split into topics to make the text lighter.

The problem with the CSES study as a reference for mortality / welfare conclusions.

The CSES study is a highly cited reference for the argument that cage-free aviaries are not necessarily better. It was funded by the American Egg Board and facilitated by another industry-funded organization focused on building consumer confidence and maintaining the industry's viability. Unfortunately, this study had multiple design flaws and biases in favor of cages (a more detailed analysis, done some years ago, is available here). A few examples:

Several observations point to stockmanship being ‘very deficient’ in the aviary (the farmer had never managed one), but not in cages, with which the farmer had decades of experience. Inspection frequency also seems to have been higher in cages.The research authors themselves declared they were still learning about what to do in the aviary during the research, which led to multiple failures. In fact, inspection was so infrequent/deficient that a large fraction of dead birds were found in advanced stages of decomposition.
Vet/disease treatment was employed for cages, but not aviaries
The overwhelming majority (~90%) of good practices for cage-free aviaries were *not* adopted (e.g., aviary chicks were raised in cages, with no development of skills to perch or navigate the aviary, densities were higher than typical, laying birds in aviaries were confined for hours every day, increasing the risk of feather pecking, breeds adapted for decades to cages were used in aviaries, chicks did not have access to litter, among many others poor management decisions - see a summary table in this document)
Key practices to reduce the risk of feather pecking / cannibalism in aviaries were not used (in fact, CSES management choices are known to promote these issues).
CSES mortality data does not add up: there are multiple inconsistencies in the dataset and in the mode of data collection/reporting, discussed here.

Mortality Data

Farm animal welfare in natural and more extensive systems, including cage-free aviaries, depends more heavily on good management practices and stockmanship. As such, we should expect greater variability in terms of mortality, as well as greater absolute mortality in the first production cycles following a transition. For this reason, the fact that there is a greater number of orange cells (mortality higher in aviaries) in the datasets of the meta-analysis is expected, as most are comparisons of established caged systems with newer cage-free systems. The studies span two decades during which cage-free systems underwent major changes. This is why we explicitly modeled the year of data collection as a predictor. Mortality rates are changing systematically over time. Modeling that trend directly, and then reporting recent mortality separately, is necessary for relevance. Another point is that brown-feathered genotypes (associated with higher mortality) are more common in cage-free systems, naturally increasing mortality because of breed, not system.

The data below, from an internal database from a breeder for countries around the world (in 2018), can also be useful.

Mortality as an indicator of welfare

Mortality may or may not correlate well with welfare. It is widely used because it is easy to measure, routinely collected, and economically important. However, mortality captures whether animals survive, not what their lives are like while they are alive.
The industry is often very good at keeping animals alive and productive until the end of the production cycle, even under conditions associated with extremely poor health and welfare (what we refer to as the "hospital bed effect"). Conversely, some more extensive or naturalistic systems may have higher mortality because animals are exposed to more hazards. Healthy pasture-raised animals, for example, may experience higher mortality due to predation. There may often be a trade-off between behavioral freedom and protection from mortality risks (‘children who never play outdoors are less prone to injuries and fatal accidents’, yet few would argue this is better for their well-being).
Mortality is most informative when comparing otherwise similar systems. When comparing systems that differ fundamentally in housing and behavioral opportunities, mortality should be interpreted together with other welfare metrics.

Life Quality/Well-being in cages vs cage-free

The 2021 WF analysis was very conservative (i.e. favored caged systems) in a number of ways as we discussed in the book and elsewhere. For example, we considered prevalence for ailments in cage-free systems as reported, without any adjustments for improvements over time (despite evidence that the frequency of various harms was going down, similar to mortality).

Also, we did not consider positive welfare (opportunities are naturally more frequent in cage-free systems) nor the longer lifecycle of caged hens (welfare is typically worse at the end of life), and end-of-life events such as induced molting, still practiced in many countries in caged systems.

We also did not consider the negative impacts of learned helplessness, lack of agency, and depression-like states in cages - there is now new evidence for such depressive states in caged hens.

Importantly, we did not make any adjustment for what I believe to be robust evidence that the pain from an injury or disease is perceived as more intense and longer-lasting in cages than in cage-free systems. Barren, confined environments disable multiple endogenous analgesic mechanisms while simultaneously activating several neurobiological pathways that intensify nociceptive signaling and delay healing. Should that be taken into account, it would further reduce time and intensity of pain in cage-free aviaries.

Fear in Cages vs Cage-free systems

Several studies have found that hens reared or housed in cage-free systems are less fearful than hens kept in conventional cages. Aviary-reared birds show reduced fear responses in tonic immobility, novel object, and novel environment tests, spend more time near humans and novel objects, use elevated areas more readily, and perform better in spatial memory tasks than cage-reared birds (Hansen et al., 1993; Tahamtani et al., 2015; Brantsaeter et al., 2016). These findings suggest that the more complex environments in cage-free systems may reduce fearfulness and improve behavioural adaptability. See box 9.2 of the laying book for more details.

Behavioural and Physiological Indicators as a standard for welfare-related decisions

Welfare is multidimensional, and cumulative experience matters. So unfortunately no single behavioural or physiological measure, or restricted set of measures, can provide a complete picture of welfare, not even for humans (for which calibration is possible). Behavioral, immunological, neurological, and physiological measures are valuable for inferring states associated with specific experiences, often at specific points in time. However, they are insufficient for overall welfare assessments, are confounded by multiple factors, and are more reliable for acute than for chronic harms (particularly immunology and physiology). Because they are typically species-, harm- and context-specific, they also do not enable comparisons across harms, systems and species. Several attempts have been made in recent years to design an umbrella measure of welfare (e.g., telomere length, cognitive bias tests), so far without success. That's not to say indicators are not useful: we rely heavily on them for our work, and believe more monitoring systems and research are badly needed. But for overall welfare assessments and system comparisons, as in the case of cage-free transitions, we need welfare metrics, which integrate evidence from multiple welfare dimensions, providing a stronger basis to infer cumulative experience. In case it's useful, here we discuss in more detail the differences between welfare metrics and welfare indicators.

Thank you again for this critical analysis!
Cynthia

Vasco Grilo🔸 @ 2026-06-10T20:11 (+4)

Hi Cynthia. Thanks for the great context.

I wonder how much the results of the CSES study would change if the management practices were similarly good for both conditions (instead of worse for the cage-free chickens). You replied to my related question below that "My [your] general sense is that option A leads to a greater welfare increase".

Relatedly, I [Vasco] wonder how much welfare varies within production systems. For example, I am interested in knowing which of the following results in a greater increase in welfare. Layers going from:
A. Median furnished cages in the European Union (EU) to median cage-free aviaries in the EU. By median furnished cages in the EU, I mean ones with higher welfare per chicken-year than 50 % of the furnished cages in the EU.
B. 10th percentile furnished cages in the EU to 90th percentile furnished cages in the EU.
Do you have a sense of how these compare?

Cynthia Schuck @ 2026-06-11T18:02 (+3)

Hi Vasco, my sense is that moving from cages (even if furnished) to cage-free aviaries is an improvement, for the reasons I mentioned in my earlier response. Right now, with the evidence gaps there are, it's very hard though to make a reliable distinction between 10th percentiles.

M. Y. Zuo @ 2026-06-17T21:34 (+3)

There are confounding issues with the term “cage free” since some folks frankly cheap out and actually introduce novel downsides that don’t usually exist with typical caged setups… which means it needs to be very precisely defined and checked on the ground.

LewisBollard @ 2026-06-06T18:49 (+101)

Thanks for writing this post! Thanks too for sharing it in advance, and sorry I didn't have time to review it. For the same reason, I won't have time now to go into the detail that I'd love to on your specific claims. (It's great to see such a vibrant discussion in the comments! I'd love to see more critical discussion of the evidence like this in the EA animal welfare space.) So I just wanted to share three high-level reflections.

First, I agree with you on the need for more evidence collection on the animal welfare outcomes of popular interventions. We've funded >$20M of such evidence collection work (e.g. the Welfare Footprint Institute, Rethink Priorities' review of evidence across multiple areas, the shrimp slaughter study you cite, a Guelph study on broiler genetics welfare outcomes, and a dozen other studies and meta-analyses). We've also done a ton of internal analysis on this, e.g. we have a complicated model of estimated rates of factory farm meat displaced by alt protein in various scenarios. And yet we need to do a lot more! I'm really excited that the likely inflow of more animal welfare funding will enable a lot more high-quality evidence generation. And I completely agree we should prioritize it.

Second, I think we need to be realistic about how much certainty that evidence collection will create -- and how quickly. I think your cage-free example is a good one. There have been 50+ studies and large-scale data collection efforts on relative mortality in caged vs. cage-free housing systems and we still don't have a clear answer on it. And yet mortality is just one of 20+ metrics we could look to as partial proxies for what we really care about: "what is the relative overall welfare of caged vs. cage-free birds?" I think we should also fund studies and commercial data collection on all those other metrics (or at least the ones current tech allows us to measure). But that will take many years and, even then, people will reasonably debate the relative importance of each of those 20+ metrics to the welfare of birds.

(I should note that I really admire the work you're doing to try to significantly accelerate the pace of this kind of research, starting with shrimp slaughter. I think I agree with you on everything you think should be funded to accelerate it further, and am confident we will fund that full agenda -- please tell me if there's something we're not covering. And I hope that advances in AI can accelerate the evaluation and syntheses of existing evidence too.)

Third, I think we need to continue to act to alleviate suffering even in the face of significant uncertainty. I'm jealous of global health direct service provision (e.g. anti-malarial bednets, cash payments) where I think you can have perhaps the closest to total certainty of positive impact of any area. But I think for basically any advocacy intervention in EA (whether in AI safety, bio, farm animal welfare, or even global aid advocacy) you have to accept significant uncertainty not just about the immediate impacts but also about second, third, etc. order effects. That's not an excuse for ignoring the uncertainty. But I think when a cluster of evidence points to a high likelihood that something is robustly net positive -- as I believe it does for cage-free (sorry I don't have time to go into all the specifics here!) -- then we should move forward on it. I think the opportunity costs of waiting for a higher confidence level of evidence are too high.

Thanks again for writing this! One of the things I love most about EA is the application of critical thinking and evidence to disrupt commonly accepted wisdom, and I think this is a good example of this. I disagree on some of the specifics -- wish I had the time to explain more -- but I'm really glad you wrote and published this.

NickLaing @ 2026-06-06T20:18 (+27)

A few comments here coming from a global health background.

"I think your cage-free example is a good one. There have been 50+ studies and large-scale data collection efforts on relative mortality in caged vs. cage-free housing systems and we still don't have a clear answer on it."

If there had been 50 studies which compared mortality between two global health interventions and the overall results were unequivocal, we would probably conclude that there was no major difference between the two, rather than say we didn't have a clear answer and needed more research. I would imagine this situation is more nuanced but I would like to understand that nuance.

Mortality in hens seems pretty easy and likely inexpensive to measure? All the hens are in a barn in one place with a 3 month lifespan. Compared to mortality in humans this measurement situation seems very easy. I would imagine farmers will be collecting this data for profitability reasons as well. I'm naive here so might be missing something important.

"But I think when a cluster of evidence points to a high likelihood that something is robustly net positive -- as I believe it does for cage-free (sorry I don't have time to go into all the specifics here!) -- then we should move forward on it."

The specifics are the crux for me. Before this post I thought it was super obvious that cage free was way better and it was just a magnitude-of-good question. After this post, I would appreciate a post on the forum actually laying out why there is a "high likelihood of cage free being robustly net positive". This is a post with specifics, so I think it's important to refute specifics as well as outline general principles. You're right there are some good comments on these specifics though, thanks especially @Vasco Grilo🔸.

I think it would be worth it if someone in your kind of position (if not you) did take the time to disagree on the specifics here.

LewisBollard @ 2026-06-07T14:47 (+46)

Thanks Nick. I'm really glad to see that Cynthia Schuck of the Welfare Footprint Project has just commented around the specifics on cage-free, so I'll defer to her on that. I'll just add a few quick thoughts on your comments here.

To clarify: the results are not unequivocal. Different studies point to very different results. This is both due to study design, but also due to lots of variation in actual conditions (lab vs on-farm, small farms vs. big farms, experienced farmers vs. inexperienced, healthy pullets vs. unhealthy, etc).
Mortality on-farm is indeed easy and cheap to measure, but very few farmers will allow researchers on-farm to measure this or publicly share their own results. So a lot of these studies end up being in unrepresentative lab conditions or on individual farms that may not be representative.
That said, I think the on-farm data that Cynthia cites (the "database from a breeder for countries around the world") is probably the best data we have. And it does indeed conclude there isn't a major difference between the two systems.

"Before this post I thought it was super obvious that cage free was way better and it was just a magnitude-of-good question. After this post, I would appreciate a post on the forum actually laying out why there is a "high likelihood of cage free being robustly net positive"."

I don't think this post should be a major update for you on whether cage-free is better than caged. It's just not clear that mortality is that good a proxy for total welfare. E.g. everyone agrees that mortality is highest in free-range and pasture-based farming, but few conclude from that that these systems are worse for total welfare than intensive confinement.
Given how hard it is to measure animals' actual subjective wellbeing, I think we should look to lots of different sources of evidence and reasoning, not just the few indicators we can objectively measure. My personal view that cage-free is better is probably based on something like: ~40% priors that pretty much all animals like to move around and have more space, ~30% preference studies showing that hens really value the things they can only access in cage-free (nesting boxes, perches, dustbathing, etc), ~10% my sense that pretty much all independent animal welfare scientists agree it's better, ~10% my experience with rescued battery cage hens who have a very strong preference to stay out of cages, ~10% that mortality is pretty much the only indicator suggesting it could be worse and actually that data is very messy. (TBC: not an official CG view, and it's possible I'd change those numbers a bunch on reflection. I share them just to suggest that resolving the mortality debate wouldn't be transformative for my views.)

Vasco Grilo🔸 @ 2026-06-10T19:48 (+6)

~30% preference studies showing that hens really value the things they can only access in cage-free (nesting boxes, perches, dustbathing, etc)

The above is an argument against barren battery cages, but not against all types of cages? All caged chickens in the European Union (EU) must have “a nest”, “litter such that pecking and scratching are possible”, and “appropriate perches of at least 15 cm”. Relatedly, I estimate moving hens from battery to furnished cages increases the welfare of chickens 70.6 % as much as moving hens from battery cages to cage-free aviaries.

Aidan Alexander @ 2026-06-06T23:47 (+7)

"If there had been 50 studies which compared mortality between two global health interventions and the overall results were unequivocal, we would probably conclude that there was no major difference between the two, rather than say we didn't have a clear answer and needed more research." -- True, but the goal of cage-free isn't reducing mortality, it's reducing suffering. Mortality is one possible (but deeply confounded) indicator of suffering we can look at. If, when doing so, we conclude that there is no major difference between mortality in caged and cage-free systems, this has little to no bearing on whether cage-free reduces suffering.

NickLaing @ 2026-06-07T04:43 (+7)

First I wasn't making any big claims about what's most important, I'm just responding to Lewis's comment there which confused me a bit, and suggesting that perhaps more research in that in particular might not be so useful.

On your comment, though, I would expect mortality to be important as one of the few objective measures we have, even if it is not a direct, ideal measure of suffering. If chickens are dying early, that could indicate health issues which might also cause suffering. Some of the same things that would kill a chicken would be heavily correlated with what makes them suffer, I imagine?

I agree it's not going to be the most important metric but it is objective at least.

geoffrey @ 2026-06-07T05:59 (+15)

A quick comment jumping off the >$20M of evidence collection work...

I've always been curious how much money has been spent to-date on global health research. If we make an (unrealistic) assumption that we could repeat GiveWell for animals, then we'd also have to fund the research that came before. Getting some kind of number, even an ultra-lower bound, would help anchor discussions on evidence-generation in animal welfare.

A starting point could be working backwards from malaria nets. What are the studies that are cited in GiveWell's spreadsheet model for malaria nets? How much did each of those studies cost? How much did it cost to train the researchers who conducted those studies? What's the bottom-line number we get from adding all this up?

Would love to hear if this exists / see someone research it

michaeljohnston0 @ 2026-06-16T16:08 (+1)

Well, back of the envelope, unvetted: global health medical R&D is $6B annually (so that's, every year, 300x the $20M all-time figure 😬). But also ... is cage vs. cage-free just a medical issue, or is it virtually everything salient about the environment? And would that make it more comparable to medical + economic + psychology + ... where would we stop?

Neglectedness, for these sentient beings that do not have access to democratic, economic, or substantial portions of charitable resources is... high.

P.S. I skimmed your profile and was heartened to read this post. I shifted more fully into animal welfare work a few years ago now and have continually been drawn in by learning things like just how neglected the cause area is and hence how much potential for more impact exists, but also how much investment is warranted.

david_reinstein @ 2026-06-09T17:14 (+4)

There have been 50+ studies and large-scale data collection efforts on relative mortality in caged vs. cage-free housing systems and we still don't have a clear answer on it.

Do you think this is a case of too many studies with small amounts of funding and limited researcher ability/willingness to follow up, and limitations to doing meta-analysis on these? Do you think that larger-scale, more systematic, multi-year larger team collaborative studies and projects could be more effective at generating reliable evidence?

Vasco Grilo🔸 @ 2026-06-06T19:55 (+4)

Hi Lewis.

But I think when a cluster of evidence points to a high likelihood that something is robustly net positive -- as I believe it does for cage-free (sorry I don't have time to go into all the specifics here!).

I think cage-free egg campaigns may easily harm soil ants and termites more than they benefit chickens.

Thanks again for writing this! One of the things I love most about EA is the application of critical thinking and evidence to disrupt commonly accepted wisdom, and I think this is a good example of this.

I very much agree.

Mjreard @ 2026-06-07T15:07 (+53)

However correct or compelling these critiques turn out to be,^[1] I want to praise the incredibly constructive framing here. Most 'criticism' on the Forum does not evince 1) the conviction that the underlying problem is still extremely important, 2) the resolve to continue making progress on it, and 3) the trust that actors will update on evidence/arguments. Wonderful.

^[1]
FWIW, my summary of the discourse is:
The evidence on shrimp stunning does seem terrifyingly thin
The cage-free controversy is comparatively well studied (but still too little), and very sensitive to assumptions about inputs to quality of life
Meat alternative uptake is a known issue. The question is how long we should keep trying things before deciding the theory of change is fundamentally flawed (much longer, imo!)

david_reinstein @ 2026-06-09T17:08 (+6)

I agree that this framing is very strong, but I also think that criticism on the forum is often very strong and constructive.

matthes @ 2026-06-05T22:05 (+33)

Vasco Grilo🔸 @ 2026-06-06T08:13 (+31)

Hi Amanda. Great post.

The Welfare Footprint Institute (WFI) argues the opposite by adding up hours of pain experienced by an average chicken in different systems^[17]. With their model, cage-free aviaries have lower pain scores than conventional cages. However, these results are heavily dependent on what harms are included, how they are scored, and how different types of pain are weighed. Deprivation of natural behaviours accounts for most of the difference in scores. Chronic fear and stress from violence is not included.

I wonder if you are overestimating the uncertainty here. I speculate there is like a 2/3 chance that cage-free layers have higher welfare than ones in furnished cages. The results from WFI's analysis are below. I think the lines cover the 95 % confidence intervals (CIs). Based on the harms considered, cage-free layers experience less annoying and hurtful pain than ones in furnished cages, and it is unclear whether they experience more or less disabling and excruciating pain. However, "The analysis primarily aimed to estimate the minimum welfare improvement associated with transitioning to cage-free housing. In this sense, we preferred to err on the side of caution than potentially overestimated reform benefits in any particular aspect. Thus, it is reasonable to anticipate that the actual benefits of this transition may surpass the estimates [below]" (see the 6 bullets in the piece elaborating on why). In agreement with this, Rethink Priorities (RP) got 3 animal welfare scientists to replicate 4 cumulative pain estimates from WFI, and found these "consistently erred on the conservative side".

We have very limited data on electrical shrimp stunning that doesn't support a confident conclusion as to whether it's good or bad.

Here is a review of stunning methods for shrimp from Nino O'Shea-Nejad published in February.

Overall, the evidence indicates that electrical stunning has greater potential than chilling to induce insensibility, while chilling may suppress behaviour without eliminating sensory processing. Crucial uncertainty remains regarding defining species-appropriate electrical stunning parameters and translating neural evidence into reliable operational standards. On this basis, the report outlines implications for research priorities, regulatory standards, and industry practices aimed at ensuring that humane stunning practices reliably induce insensibility.

Unlike the other two examples, alternative proteins are unlikely to cause harm.

Alternative proteins may decrease the welfare of farmed animals by decreasing the population of farmed animals with positive lives? I estimated cage-free layers and slower growth broilers have negative, but close to neutral lives, and I would not be surprised if they have positive lives. 64.3 % (= 1 - 0.357) of layers in the European Union (EU) were outside cages in 2025, and funding work on alternative proteins does not have an immediate effect. It could be that it mostly reduces the consumption of eggs in 10 years, when the fraction of cage-free layers will likely be higher, and mortality lower due to improvements from experience. So I can see, for example, Coefficient Giving's (CG's) grants to Dansk Vegetarisk Forening (DVF) leading to fewer layers in the EU with positive lives.

In addition, alternative proteins may harm wild animals?

Significant additional funding and talent should be allocated to raise the bar for animal welfare interventions by building R&D infrastructure that can rapidly generate high-quality action-relevant research results.

I agree impact-focussed funders underrate such research relative to supporting existing interventions. In addition, I think they underrate research on animal sentience, and, more broadly, on comparing welfare across species. I believe there is huge uncertainty, and ways of decreasing it. Here is some context about my uncertainty. In Bob Fischer’s book about comparing welfare across species, the tentative sentience-adjusted welfare range of shrimps is 8.0 % of that of humans. Welfare range is defined there as the difference between the maximum and minimum welfare per unit time among “realistic biological possibilities”. For sentience-adjusted welfare ranges proportional to “individual number of neurons”^“exponent”, and “exponent” from 0 to 2, which covers the best guesses that I consider reasonable, the sentience-adjusted welfare range of shrimps is 10^-12 to 1 times that of humans.
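To make the exponent sensitivity above concrete, here is a minimal Python sketch. It assumes only the power-law form described in the comment (welfare range relative to humans = neuron ratio raised to an exponent). The shrimp-to-human neuron ratio used here is not a measured value: it is backed out from the stated endpoint of 10^-12 at an exponent of 2, which implies a ratio of about 10^-6. The function and constant names are hypothetical.

```python
import math

# Assumption: ratio backed out from the comment's endpoint (10^-12 at exponent 2),
# not an independent neuron-count estimate.
IMPLIED_NEURON_RATIO = math.sqrt(1e-12)  # about 1e-6


def relative_welfare_range(neuron_ratio: float, exponent: float) -> float:
    """Welfare range relative to humans under a neuron-count power law."""
    return neuron_ratio ** exponent


if __name__ == "__main__":
    # Sweep the exponent over the 0-to-2 range mentioned in the comment.
    for exponent in (0.0, 0.5, 1.0, 1.5, 2.0):
        value = relative_welfare_range(IMPLIED_NEURON_RATIO, exponent)
        print(f"exponent {exponent:.1f}: {value:.1e} times the human welfare range")
```

Under these assumptions, each 0.5 step in the exponent shifts the result by about three orders of magnitude, which is why the choice of exponent dominates the estimate.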

Even some of the most prominent animal welfare interventions have surprisingly weak evidence behind them. In some cases, the available evidence even suggests that the intervention may be causing harm.

I would be curious to know your thoughts about effects on soil invertebrates. I would still have very little idea about whether cage-free egg campaigns, and electrically stunning shrimps increase or decrease welfare in expectation even if I was certain they increased the welfare of chickens and shrimps. I think their effects on soil invertebrates may be much larger or smaller than those on their target beneficiaries. This is another reason for prioritising research on comparing welfare across species.

matthes @ 2026-06-07T10:52 (+9)

Thanks for engaging!

Re: cage-free welfare estimates

As I say in the post, I think that the WFI analysis does not include some potentially significant harms (such as chronic stress/pain from violence or parasites - which are likely higher in cage-free systems).

I also think that it's not obvious how to score pain from behavioural deprivation, which accounts for a majority of the difference:

More generally, I think we currently just do not have enough data to make strong claims around the total cumulative pain chickens experience. Under "research gaps", WFI itself states the following:

Surprisingly little research has been dedicated to the understanding of the impact of different welfare challenges at the individual level (where suffering occurs). Little is known about clinical evolution of the various welfare issues affecting commercial layers (e.g. healing times, duration of exposure), the likelihood of different clinical outcomes, rates of recurrence, and how adversely welfare harms are perceived by the individuals affected. Similarly, knowledge on case-fatality rates, comorbidity patterns and the prevalence of different conditions over the laying cycle is scant.

The independent assessment that Rethink Priorities organised only looked at two layer hen harms: peritonitis in conventional housing; and fractures during depopulation and transport in conventional housing. Neither was related to behavioural deprivation.

Additionally:

Raters were restricted to the references that Welfare Footprint Institute cited in the chapter where the original estimates appeared. Therefore, disagreement among raters is due to how different scientists use the same evidence, rather than which evidence they examine in the first place.

Re: shrimp stunning

The review you link relies heavily on data from other species. My post addresses all the published evidence for Whiteleg shrimp.

matthes @ 2026-06-07T11:12 (+13)

I don't want to be overly dismissive of WFI here. They are trying to do something really difficult and important. They have spent years on this work.

However, I think many have been overly confident in acting on their conclusions when they themselves highlight a need for more research.

Vasco Grilo🔸 @ 2026-06-07T12:15 (+3)

I agree with all your points. I suspect your guess for the probability of cage-free systems being better than furnished cages is lower than my guess of 2/3. However, I do not think this matters much. In practice, I would still generally prioritise research on decreasing the uncertainty over cage-free egg campaigns.

As I say in the post, I think that the WFI analysis does not include some potentially significant harms (such as chronic stress/pain from violence or parasites - which are likely higher in cage-free systems).

It also excludes foot lesions and air quality, which are discussed in section "Important Consideration" of Chapter 9 of the book Quantifying Pain in Laying Hens by Cynthia Schuck-Paim and Wladimir Alonso from WFI. Here are their conclusions.

[...] the reporting of the relatively low incidence of the more severe and painful manifestations [of foot lesions] in layers [16,21–23] makes it unlikely that consideration of this harm would affect the estimates substantially to the point of changing any of the conclusions.
[...]
[...] it is not unreasonable to suppose that through the potentially detrimental effect on the respiratory system [24] and on mucous membranes, high concentrations of ammonia can lead to a prolonged state of discomfort [26]. This is an important welfare concern, which is likely to increase the estimated time in pain endured in cage-free facilities, depending on the prevalence of (cage and cage-free) facilities where manure is not regularly removed and the ventilation flow is insufficient, and on the time endured in discomfort (e.g. in temperate regions, higher levels of ammonia in cage-free relative to cage housing have been found in winter, but not in summer [27,28]).

Michael St Jules 🔸 @ 2026-06-06T13:17 (+29)

Thanks for writing this! I'm generally supportive of more critical review of evidence and analyses like this and would like to see more of it done, as well as more/better data collection, as you point to. I'm also glad you're working on new shrimp stunners and better validating them.

Re cage-free vs caged systems

My sense is that it's reasonable to infer that in the long-run, we don't know whether mortality rates are higher in caged or cage-free systems, and that the differences in average mortality rates are "small", based just on the graphs in this post and WFI's meta-analysis to which this post refers.

I agree that it's "an overstatement to say that there is “no differences in mortality” given the actual data" (in recent years, reading the graphs here).

I think it's still reasonable to treat them as if there's no expected difference in mortality rates in the long-run, but I also think it's reasonable to disagree with that, and this is a potentially important and fairly subjective judgement call, so should be flagged pretty explicitly. It could also mask differences in causes of mortality and associated harms by system, like the kind you point out, which could be important.

Cynthia Schuck-Paim from Welfare Footprint cited more US and EU data here, which show a very small difference in average mortality rates (3.75 in cage-free vs 3.64 in caged), much smaller than the standard deviation in mortality rates (2.62 and 2.34). I don't know what the "Standard" column in the table means, though. (Thanks to @mvolz for flagging WFI's responses.)

On the other hand, this could still allow that mortality rates increase substantially for several years as farms transition, and that might be quite bad, both in potentially extremely painful deaths and in indicating higher prevalences of other causes of suffering. I think it's worth being upfront about that, because there could be reasonable disagreement about whether it's worth it for the benefits of cage-free.

I think one trouble here is that WFI has to make a large number of judgement calls, given the available evidence. I think they are pretty transparent about them, but there can be a lot to check.

Some questions I have are:

1. If mortality rates between the systems do roughly converge on average in the long-run, but start higher early in the transition to cage-free, how long does the ~convergence take in practice across geographies, including in the Global South, especially for our marginal cage-free work now?
2. WFI's currently published analysis estimates generally less expected suffering in cage-free systems across pain intensity categories. What point in the transition period to cage-free are these estimates based on, what mortality rates do they reflect, and how representative would they be of transitions across geographies? @cynthiaschuck @Wladimir J. Alonso
3. How much do increased egg costs and the resulting reduced egg demand mitigate the potential harms of cage-free transitions? Could the reduction in number outweigh any potential increases in harms on average per hen?
4. (May be answered by 2 or 3.) Are we, on net, increasing or decreasing excruciating pain with cage-free transitions in expectation?
   1. Are we just counterfactually moving forward these potentially painful cage-free transitions, if and because industry would (be forced to) shift eventually anyway, and so reaping the benefits earlier? Would egg replacements and other alt proteins or a global catastrophe (e.g. from TAI) mean we don't get to reap the benefits after mortality rates would have converged?
   2. Does our other work focused on excruciating pain (e.g. humane chicken slaughter) make up for potentially more excruciating pain in cage-free systems?
   3. How are we making tradeoffs between excruciating and disabling pain? How should we make them? (Personally, I'm averse to increasing excruciating pain overall in my portfolio, with high uncertainty about how much worse it is than disabling pain, and would probably have a part of my portfolio dedicated to preventing excruciating pain.)
5. Does nest deprivation really cause (so much) disabling pain? The supporting evidence I've seen seems somewhat confounded, and I would assign a lower probability to disabling pain than in WFI's published analysis, but would guess there's still a decent chance of it. (I expect WFI to have updated estimates in their upcoming book.)

Aidan Alexander @ 2026-06-06T16:29 (+9)

“How much do increased egg costs and the resulting reduced egg demand mitigate the potential harms of cage-free transitions? Could the reduction in number outweigh any potential increases in harms on average per hen?” This is an important point. I’ve always seen part of the point of win-lose welfare reforms (unlike win-wins like FWI’s work) as increasing the price, thereby decreasing demand and increasing the competitiveness of substitutes. In a similar vein, I wonder how much potential increased mortality increases cost and therefore reduces demand, and how that reduction in demand nets out against the increased number of chicken-days needed per egg because of mortality.

NickLaing @ 2026-06-06T20:31 (+6)

"How much do increased egg costs and the resulting reduced egg demand mitigate the potential harms of cage-free transitions? Could the reduction in number outweigh any potential increases in harms on average per hen?"

This seems a weird, potentially epistemically dangerous situation. If a cage-free system is mostly better for chickens just because it is more expensive and drives down demand, then that's a dishonest play on our end. If we're going for better hen welfare, I don't think this should be part of our calculation.

Aidan Alexander @ 2026-06-06T16:24 (+4)

Lots of interesting ideas. This one I didn’t understand: “Does our other work focused on excruciating pain (e.g. humane chicken slaughter) make up for potentially more excruciating pain in cage-free systems?” If you’re willing, could you please explain this like I’m 5?

Michael St Jules 🔸 @ 2026-06-06T16:42 (+8)

Suppose cage-free increases excruciating pain compared to caged.

Is the total increase in excruciating pain across all (the sum total of) cage-free work from our community smaller than the total reduction in excruciating pain from all of our other work, e.g. CO2 stunning for chickens, other humane slaughter work, other welfare work, across species?

If yes, and if there are no other increases in excruciating pain (or they're small enough), then we could still say our community is preventing more excruciating pain than it's causing, overall. Our portfolio could still be robustly positive across different views on pain intensity tradeoffs. I think I'd be pretty satisfied with that, even if it is causing some excruciating pain (that is outweighed by reductions).

(Alternatively, we could try to compensate the specific animals we expect to cause more excruciating pain to, but that seems much harder on worldviews according to which excruciating pain matters way more than disabling pain.)

Aidan Alexander @ 2026-06-06T16:56 (+6)

Oh right got it. Thank you. But if (I emphasize: IF) we thought cage-free was negative we could choose to not do that bit right? The sign of the overall portfolio doesn’t seem relevant to that decision

Michael St Jules 🔸 @ 2026-06-06T17:06 (+4)

Yes, we could just remove it if we thought it was net negative overall.

Cage-free could turn out to be one of the best things we can do to reduce disabling pain, but slightly bad for excruciating pain, so that it's just unclear whether it's net good or bad under wide uncertainty about pain intensity tradeoffs. If it looks very good on views where disabling-excruciating pain tradeoffs are more modest and only somewhat bad and in the end outweighed on views with much more weight to excruciating pain, removing it from the portfolio could be a mistake.

See also my related post Hedging against deep and moral uncertainty.

Aidan Alexander @ 2026-06-06T17:58 (+6)

Got it, makes sense. I'm not sure it makes sense to think of these pain levels as discrete categories, but I think your point holds even if we're just using them to gesture at rough areas on a spectrum

Jilly MacKay @ 2026-06-17T13:55 (+24)

Hi matthes,

This is an extremely interesting post to read, thanks for putting the time and effort into it.

I'm new to the EA forum, one of our PhD students mentioned it to me and I'm so glad she did. I'm Jill, I'm an animal welfare researcher at the University of Edinburgh and I'm a specialist in research methodology. To get my cards on the table, I'm a huge proponent of open scholarship approaches, and advocate of greater adoption of open scholarship within animal welfare science. This paper might be a good precis of my biases in that area: https://doi.org/10.3389/fvets.2021.745779

Your recommendations regarding the entire evidence pipeline should all be incorporated into recommendations from the Manifesto for Open Science, to my thinking. I do think there is a 'gap' in your analysis, in that animal welfare science is highly interdisciplinary, but in my opinion has been slow to reckon with the impact of the reproducibility crisis on our understanding of human behaviour, which is frequently more needed than the fundamental biological research on suffering.

This has given me a lot to think about though, thank you, I may return!

Jill

Fai @ 2026-06-18T16:51 (+3)

Welcome to the EA forum! And thank you for your first contribution!

Jilly MacKay @ 2026-06-18T17:29 (+1)

Thanks! I'm enjoying reading all these insightful posts!

Ariel Simnegar 🔸 @ 2026-06-06T01:51 (+24)

Thanks for the well-argued and insightful post!

On alt proteins, if we ever substantially beat price parity (say by 50%), it’s just hard for me to see how we wouldn’t get mass consumer adoption. This comes from a model of most people’s stated preferences as downstream of what’s most convenient and protects their egos most. That’s the model I think best explains most people’s historical responses to moral catastrophes. Under this model, people maintaining they’d stick to factory farmed meat over cultivated meat is hopefully a temporary cope which will go away once cultivated meat becomes much more convenient than factory farmed meat. So for me, that model is the main reason why I still think alt proteins are a good use of funds. (Probably the stronger argument against alt proteins is that it may be unclear whether reducing meat consumption is good when accounting for wild animal effects!)

On cage-free campaigns and shrimp stunning, my main caveat would be that since animal welfare interventions affect such a huge number of individuals, I’d still expect the magnitude of their direct welfare effects (ignoring indirect effects) to be huge relative to global health. Of course, that only underscores your point that the sign deserves more research, and I look forward to reading comments from others far more knowledgeable than I am on this.

Would Rethink Priorities’ animal welfare research group look like a good investment under your views?

Jeff Kaufman 🔸 @ 2026-06-06T12:28 (+16)

On alt proteins, if we ever substantially beat price parity (say by 50%), it’s just hard for me to see how we wouldn’t get mass consumer adoption.

For an exact substitute like precision fermented egg whites I think I agree. But alt proteins are usually more like the difference between eating different animals (turkey bacon instead of pig bacon) and people often pay >>2x for preferred animals. Even if this gets down to only as different as cuts of meat within a given animal people often pay >>2x for specific cuts. And then people really love variety, so while I see a path to replacing say 2/3 of a typical person's meat consumption the remaining portion is far harder.

Ariel Simnegar 🔸 @ 2026-06-06T13:28 (+5)

I agree with your scenario. I still think it’s plausible that if 2/3 of most people’s meat consumption was replaced, people would come around to the idea that they can reduce factory farming at no cost to their convenience. Hopefully this would lead to an erosion of people’s defense mechanisms around animal welfare, helping animal welfare take off as a mass movement like historical successful ethical movements. But that’s totally speculative and not grounded in much other than desperation.

Also for the record, I hope you know I don’t consider you among “most people” here—I respect you as a person and the sincerity of your views on animal consciousness.

Vasco Grilo🔸 @ 2026-06-06T16:30 (+5)

Hi Ariel.

On cage-free campaigns and shrimp stunning, my main caveat would be that since animal welfare interventions affect such a huge number of individuals, I’d still expect the magnitude of their direct welfare effects (ignoring indirect effects) to be huge relative to global health.

Are you confident about this? I would say there is huge uncertainty. For sentience-adjusted welfare ranges proportional to "individual number of neurons"^"exponent", and "exponent" from 0 to 2, which covers the best guesses that I consider reasonable, GiveWell's top charities may increase the welfare of their target beneficiaries way more cost-effectively than cage-free egg corporate campaigns, and the Shrimp Welfare Project's (SWP's) Humane Slaughter Initiative (HSI).

Ariel Simnegar 🔸 @ 2026-06-06T17:36 (+9)

Hey Vasco! Taking into account moral uncertainty over the neuron count exponent, your plot would still make the animal interventions you listed look far higher EV than GiveWell. The probability mass where the exponent is between 0 and 1, making the animal interventions look several OOMs better than GiveWell, would swamp the cases where the exponent is >1.

(Yes, this runs into the two envelopes problem, but I think there are good arguments for using human welfare as the unit of account.)

Furthermore, I personally don’t find neuron count exponents >1 as plausible as you do. If I’m interpreting this plot from your linked source post correctly, for broiler chickens, exponent 1 implies welfare range 1/500 and exponent 2 implies welfare range 1/100,000. I agree that these numbers would make GiveWell look better, but I don’t find those welfare ranges intuitively plausible.

Vasco Grilo🔸 @ 2026-06-06T19:33 (+2)

Taking into account moral uncertainty over the neuron count exponent, your plot would still make the animal interventions you listed look far higher EV than GiveWell. The probability mass where the exponent is between 0 and 1, making the animal interventions look several OOMs better than GiveWell, would swamp the cases where the exponent is >1.

You are distributing the probability mass roughly evenly across the potential models (values of the exponent)? I worry about giving weights to models based on practically no evidence. In Bob Fischer's book about comparing welfare across species, there is just this justifying the weights (I read the whole book).

When we generated the mixture model, we assigned 60 percent weight to the simple additive model, 30 percent to the neurophysiological model, and 10 percent to the equality model. We did this because we suspect that collecting empirical data on the presence or absence of welfare-related traits is a more reliable methodology for generating welfare range estimates than using either the neurophysiological or equality models. However, the proper weight to give is the subject of a reasonable debate.

People usually give weights that are at least 0.1/"number of models", which is at least 3.33 % (= 0.1/3) for 3 models, when it is quite hard to estimate the weights. However, giving weights which are not much smaller than the uniform weight of 1/"number of models" could easily lead to huge mistakes. As a silly example, if I asked random people with age 7 about whether the gravitational force between 2 objects is proportional to "distance"^-2 (correct answer), "distance"^-20, or "distance"^-200, I imagine I would get a significant fraction picking the exponents of -20 and -200. Assuming 60 % picked -2, 20 % picked -20, and 20 % picked -200, one may naively conclude the mean exponent of -45.2 (= 0.6*(-2) + 0.2*(-20) + 0.2*(-200)) is reasonable. Yet, there is lots of empirical evidence against this which the respondents are not aware of. The right conclusion would be that the respondents have practically no idea about the right exponent because they would not be able to adequately justify their picks.
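The arithmetic in the silly example can be checked with a short sketch. The survey shares are the hypothetical ones from the paragraph above, and the variable names are purely illustrative.

```python
# Hypothetical survey shares from the example above: exponent -> fraction of respondents.
shares = {-2: 0.6, -20: 0.2, -200: 0.2}

# Naive "mixture" estimate: the share-weighted mean of the exponents.
mean_exponent = sum(exponent * share for exponent, share in shares.items())

# The common floor on model weights mentioned above: 0.1 / number of models.
min_weight = 0.1 / len(shares)

# How far the naive mean is from the truth: predicted change in force
# when the distance between the two objects doubles.
true_ratio = 2.0 ** -2
naive_ratio = 2.0 ** mean_exponent

print(f"mean exponent: {mean_exponent:.1f}")
print(f"minimum weight: {min_weight:.2%}")
print(f"force ratio at 2x distance: true {true_ratio}, naive {naive_ratio:.1e}")
```

The last line is only there to show that averaging over poorly justified models can land many orders of magnitude away from the well-supported answer.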

If I’m interpreting this plot from your linked source post correctly, for broiler chickens, exponent 1 implies welfare range 1/500 and exponent 2 implies welfare range 1/100,000. I agree that these numbers would make GiveWell look better, but I don’t find those welfare ranges intuitively plausible.

You are reading the graph correctly. Why do you find an exponent of 2 implausible? My position is not so much that I find it plausible. It is more that I do not know how to check the plausibility of values ranging from 0 to 2 or so, and therefore do not want to rule them out.

I’d still expect the magnitude of their direct welfare effects (ignoring indirect effects) to be huge relative to global health

You cannot rule out indirect effects if you are confident the exponent is 0 to 1? In this case, I estimate effects on soil invertebrates are much larger than those on target beneficiaries.

The graph above covers microarthropods (springtails and mites) and nematodes, which are not covered in Bob's book. However, I have very little idea about whether cage-free egg campaigns increase or decrease welfare due to potentially dominant effects on soil ants and termites alone. These are macroarthropods like shrimps and black soldier flies (BSFs), which are covered in Bob's book.

NickLaing @ 2026-06-06T08:16 (+21)

While reading this I felt quite confused and even a bit discombobulated. The cage-free campaign institutions tell me how many hens can be saved from cages for a dollar. I always interpreted that as just straightforwardly good, even if we weren't sure how much good. For Animal Welfare debate weeks and cross-cause prioritisation models we assume that these interventions hugely move the needle - we debate other things like moral weights and the importance of uncertainty...

But the best research institutions like @Rethink Priorities and @Ambitious Impact have models which calculate huge reductions in suffering per dollar directed towards these very interventions. They must have good evidence based models for this.

It seems to me very unlikely that these big, thoughtful orgs are directionally wrong, but the data presented here seems reasonable at a glance, so what's up?

I would love to read a detailed, data-based refutation/counter-piece to this, which lays out the best argument that uncaging hens and stunning shrimp really do make a clear, huge difference. Or even just direct me to the best stuff already written. I donate (small amounts) to cage-free and this article more than any other has made me rethink that.

Also looking forward to seeing some more detailed pushback on this in the thread too!

Jeff Kaufman 🔸 @ 2026-06-06T12:21 (+21)

It seems to me very unlikely that these big, thoughtful orgs are directionally wrong

That doesn't seem so unlikely to me. There are many patterns that push towards doing things over research: donors prefer it, volunteers prefer it, it is hard to justify research when that means doing nothing about atrocity today, things that looked good on BOTEC often don't get more investigation as reliance increases, etc. Add on top of this the poor epistemics of the animal welfare movement and I really wouldn't be surprised at all.

Ariel Simnegar 🔸 @ 2026-06-06T14:31 (+24)

I think the patterns you point to, which push towards “just doing things”, are all reasonable, but I’ll push back on the post you link for your “poor epistemics of the animal welfare movement” claim.

The linked post by Elizabeth argues that EA vegan advocacy has bad epistemics. Magnus Vinding’s comment on that post is the closest to my view. Briefly, I was confused by Elizabeth’s focus on a topic I consider pretty tangential and not load-bearing on any of the arguments for vegan advocacy (and even less so for AW at large). Is vegan advocacy really less truthseeking than the general public’s views on meat consumption? How do the human health effects trade off against animal effects? Seems like an isolated demand for rigor.

The epistemics of vegan advocacy also seem quite distinct from arguments for the importance of animal welfare, or prioritization between animal welfare charities, which seem to have convinced many EAs who are not vegan. So there’d be much more work to do to make the claim that EA AW at large has poor epistemics.

I came away from Elizabeth’s post agreeing that some vegan advocates should message better about health tradeoffs, but not seeing why that should update me on the load-bearing arguments for veganism (animal effects), or certainly on EA AW’s epistemics at large.

AnonymousTurtle @ 2026-06-06T23:42 (+4)

I also recommend reading Natalia's answers to Elizabeth's posts, both here and on LessWrong (you need to scroll down a bit there) instead of just reading the post uncritically.^[1]

I didn't get the sense that Natalia's epistemics were poorer than commenters in other cause areas, but I did get the sense that the evidence base available is weaker.

Regarding this post, as a non-expert I wouldn't be shocked if shrimp stunning and slaughter interventions turned out to be less valuable than we currently think, as it's such a new field and shrimp seem hard to study.
But for cage-free campaigns there's been a lot of research, analysis, and debate in EA for more than 5 years. My sense is that people studying these things have many reasons to believe that these interventions are very likely to be net positive (and are deeply aware of the significant downsides of cage-free systems, they just have compelling arguments for why the upsides compensate for those).^[2]

I also don't think there's much pressure by donors/volunteers/employees to invest in cage free campaigns compared to things like vegan advocacy, or promotion of plant-based defaults in schools/hospitals/workplaces. If anything I'd guess the opposite, and that senior people pushing for cage-free interventions really do it because of the data (and of course tractability)

^[1]
But I do think most EAs should get their ferritin levels tested: low iron impacts productivity, EAs are uniquely likely to be lacto-vegetarian, which according to Gemini is "one of the most difficult dietary patterns for maintaining adequate iron levels", and 80mg equivalent iron tablets are very cheap
^[2]
But it does seem that the actual estimated marginal cost-effectiveness numbers could be very different from what's often claimed

mvolz @ 2026-06-06T09:33 (+14)

I also wrote specifically on just cage free hens last year here:

https://forum.effectivealtruism.org/posts/soz6CerTGpEWFXLeZ/is-cage-free-really-the-most-humane-option-for-egg-laying

There are some responses from the Animal Welfare Project team in the comments there that respond directly to some of the overlapping points.

NickLaing @ 2026-06-06T09:53 (+7)

Thanks @mvolz, I had a read. Your article seems to be directionally similar to the OP. Is that the case? What's different?

Aidan Alexander @ 2026-06-06T16:38 (+17)

I really appreciate this post and am also very excited to see more primary research. In addition to the uncertainty about how a given change in farming practices impacts welfare, I think whether or not those changes are actually occurring is highly uncertain, and I’d like to see a greater investment in M&E at animal orgs to validate this.

More primary research could be complementary to this, for example, better monitoring of welfare indicators on farms and/or at slaughter could allow us to see whether the welfare is actually improving, which requires both that the intervention is actually occurring and that it’s helping animals when it occurs.

I do wonder about our ability to measure harms like stress and fear though. I think focusing just on mortality or health conditions might lead us to make false conclusions about which farming practices are better for animals. Does anyone know anything about our capacity to measure psychological distress from deprivation of natural behaviors vs fear of violence for example? (the former being one that drives a significant amount of the improvement that cage-free brings according to WFI, the latter being one Amanda points to being absent and potentially outweighing the former)

david_reinstein @ 2026-06-09T19:00 (+14)

One statistical/methodological point I’d add (something I always harp on). I don’t think “not statistically significant” should be directly cited as evidence for a lack of a difference. If the question is whether mortality differences are small enough to be decision-irrelevant, we’d want something closer to an equivalence test or Bayesian posterior over the mortality difference, plus a welfare model translating mortality causes, morbidity, behavioral deprivation, fear/stress, and transition dynamics into aggregate welfare burden.
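To make the equivalence-test idea concrete, here is a minimal sketch of a two one-sided tests (TOST) procedure under a normal approximation. The means and standard deviations are the US/EU mortality figures cited elsewhere in this thread (3.75 vs 3.64, SDs 2.62 and 2.34), treated here as percentages. The sample sizes and the equivalence margin of 1 percentage point are assumptions made up for illustration, as are the function names; none of them come from WFI or the post.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tost_p_value(mean_a, sd_a, n_a, mean_b, sd_b, n_b, margin):
    """TOST p-value (normal approximation) for H1: |mean_a - mean_b| < margin."""
    diff = mean_a - mean_b
    se = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    p_lower = 1.0 - normal_cdf((diff + margin) / se)  # H0: diff <= -margin
    p_upper = normal_cdf((diff - margin) / se)        # H0: diff >= +margin
    # Equivalence is claimed only if BOTH one-sided nulls are rejected.
    return max(p_lower, p_upper)


# Means and SDs from the thread; n and margin are HYPOTHETICAL.
p_large = tost_p_value(3.75, 2.62, 50, 3.64, 2.34, 50, margin=1.0)
p_small = tost_p_value(3.75, 2.62, 10, 3.64, 2.34, 10, margin=1.0)

print(f"50 flocks per system: TOST p = {p_large:.3f}")
print(f"10 flocks per system: TOST p = {p_small:.3f}")
```

Under these assumed inputs, the same observed difference supports "equivalent within 1 point" at the conventional 0.05 level with 50 flocks per system but not with 10, which is exactly the distinction that "not statistically significant" hides. The result depends heavily on the chosen margin, which is a welfare judgement and not a statistical one.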

A forest plot or explicit meta-analytic summary could also make the cage-free evidence easier to interpret than the table of pairwise significance checks.

Related: I wouldn't always treat small sample sizes or mixed statistical significance as automatically implying “no useful inference.” Small-N studies can be informative if underlying measurement noise is low. For example if I ask 4 people to taste a drink and they all wince deeply in pain and disgust, I'm going to be highly confident it tastes bad. If all 4 smile and praise it, I'll be fairly confident that it's at least tolerable.

McElreath's globe-tossing example illustrates how much we can sometimes learn from small samples.
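A rough sketch of the same grid-approximation idea, applied to the drink-tasting example above: with a flat prior over the share of people who would dislike the drink, how much do 4 out of 4 winces tell us? The flat prior, the independence of tasters, and all names here are illustrative assumptions.

```python
def grid_posterior(successes: int, trials: int, grid_size: int = 1001):
    """Grid-approximate posterior for a proportion under a flat prior."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    # Binomial likelihood up to a constant; the flat prior contributes nothing.
    unnormalised = [p**successes * (1 - p) ** (trials - successes) for p in grid]
    total = sum(unnormalised)
    return grid, [u / total for u in unnormalised]


def prob_above(grid, posterior, threshold: float) -> float:
    """Posterior probability that the proportion exceeds `threshold`."""
    return sum(w for p, w in zip(grid, posterior) if p > threshold)


# 4 of 4 hypothetical tasters wince.
grid, posterior = grid_posterior(successes=4, trials=4)
p_majority_dislike = prob_above(grid, posterior, 0.5)

print(f"P(more than half of people dislike it | 4 of 4 winced) = {p_majority_dislike:.2f}")
```

The exact answer under a flat prior is 1 - 0.5^5, about 0.97, so even four observations move the posterior a long way when they all agree. Noisier outcomes, as in the shrimp data, would leave a much wider posterior.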

(Still, in the shrimp case, it does seem like there is some substantial underlying variation unrelated to the different slaughter methods.)

david_reinstein @ 2026-06-09T18:57 (+13)

On alternative proteins, I’d adjust the claims. I agree we don't currently have strong, clean evidence that today’s plant-based products robustly displace animal products at the scale we’d want. But I’d describe the evidence as “limited, mixed, and hard to interpret” rather than “evidence that the substitution effect is weak, at best.”

Evidence in this domain is very hard to come by, and there are substantial doubts about the extent to which we can even reliably measure these things. See Unjournal's evaluation of Bray et al.; tl;dr, a large-scale grocery pricing experiment found own-price elasticity estimates much smaller than those that standard/cutting-edge demand-estimation methods generated on parallel data. (But other critiques also suggest doubts about the relevance of the experimental estimates.)

Also see this AI-curated evidence/discussion on the lack of validation evidence in this domain, as well as our earlier post showing plant-vs-animal estimates all over the map, and not clearly linked to methodology.

Challenges include:

- The (limited) evidence is mostly about current plant-based products and current meat-reduction interventions, not about genuinely price/taste/convenience-competitive products, cultivated meat, precision fermentation, etc.
- Measurement issues ... Scanner data misses food-away-from-home, substitution across animal products, longer-run habit formation, product introductions, and general equilibrium effects. And of course, price endogeneity.

And (as others mentioned) I'd separate the consideration of current plant-based products from future alternative proteins.

My prior is that if something becomes a close enough substitute and is cheaper / tastier / more convenient, it should substantially displace some animal consumption, though the magnitude and species mix are very uncertain.

But there's a chicken/egg problem in gathering evidence and funding: we probably won't get decisive ("revealed-preference") evidence until (e.g.) cultured meat products are actually on shelves and menus at scale, but getting them there requires investment under uncertainty. There's a potentially high upside here which we'd sacrifice if we're not willing to invest under uncertainty.

Henry Stanley 🔸 @ 2026-06-08T00:26 (+13)

Great post!

we already had multiple strong studies showing that the availability of equivalent substitutes does not lead to major changes in meat consumption.

This feels like your weakest argument. So far there aren’t “equivalent substitutes” to meat - only plant based imitations - so it’s not clear what these studies are telling us. The RP study you link to explicitly says it’s an analysis of plant-based meats and not cultured meat.

david_reinstein @ 2026-06-09T17:09 (+2)

I agree with you that "showing that the availability of equivalent substitutes does not lead to major changes in meat consumption" is stated too strongly.

There's a profound lack of evidence, and it's also very difficult to gain reliable evidence in this domain.

MHR🔸 @ 2026-06-08T12:21 (+11)

Thanks for the post! I was surprised to see how weak the academically-published evidence is regarding shrimp stunning effectiveness, and agree that it would be valuable on current margins for effective animal advocates to invest more in research.

For what it's worth, I read the Tesco-Hilton case study more optimistically than you do. You quote it as:

The results are a bit vaguely presented, but with at least one machine setting “a significant proportion” of shrimp did not “show signs of recovery” within 10 minutes.

But the full quote is

A significant proportion of prawns were in irrecoverable stun (stun-kill) on exit of the stunner as evidenced by transfer back to a controlled aqueous environment where none showed signs of recovery within the monitoring period (10 minutes plus handling time of 32 seconds)

(emphasis mine). That's not to say that we should rely on a single vague industry report when the academic evidence is conflicting, but I think the report does provide some evidence.

Jo_🔸 @ 2026-06-06T19:32 (+11)

A few years ago, spending on animal welfare reforms in EA was so limited that running rigorous, informative studies to reduce the movement's uncertainty might have cost more than the movement's spending on the campaigns themselves. But as you mention, now that the funding bottleneck might be easing, such studies would inform much larger amounts of spending, and the expected value of information looks much higher. It thus seems like a very good time to raise an issue that would have been much harder to make progress on in the past (though as Lewis Bollard's comment notes, the challenge is still daunting).

I found the post well-written and balanced, and the discussion in the comments mostly helpful. Strongly upvoted.

Artūrs Kaņepājs @ 2026-06-12T11:57 (+8)

Thanks for this post! FYI one cage-free reform that seems unambiguously good is abandoning sow crates, provided the space is increased enough to avoid piglet crushing. Have been looking into that over the past few months, will share more info probably in a month or so.

Fai @ 2026-06-11T08:24 (+7)

Thank you very much for the post! I have read some comments (and except for Cynthia's, mostly incompletely). I want to leave a comment that is meant to be a reply to some comments, but also possibly the post itself:

Some comments in the discussion, and perhaps the post implicitly, seem to treat global health as a point of high certainty — Lewis described it as "the closest to total certainty of positive impact of any areas." But I think that "certainty" is partly an artefact of where we stop scrutinising. Yes, we have strong evidence that bednets counterfactually avert statistical deaths. But we have much weaker evidence that the counterfactual life thereby preserved is net-positive over its remaining course, and weaker still that it's more net-positive than the resources spent would have produced elsewhere (even limiting the resources within just humans, or the global health cause). That second layer — the value of the outcome, not the efficacy of the intervention — usually gets carried by unstated assumptions rather than by data. (FWIW, part of the assumptions are philosophical. For instance, there are serious philosophers who think that each extra life year is a net-negative, regardless of people's preferences. Also, people who have their lives saved might go on to harm other humans, but some EAs and ethicists think we ought not consider this when it comes to saving kids.)

I want to be careful not to overstate this, because there are disanalogies: The human prior is genuinely stronger on the immediate impact level. It also seems that on the secondary or further levels, interventions targeting humans are often even less certain than AW ones. So I'm not claiming the two are equally uncertain.

But the (in)consistency point still bites. If AW has a major evidence problem vis-à-vis whether overall welfare was indeed improved, life-saving human interventions have it too — it just happens that we rarely turn the skepticism in that direction (welfare).

P.S. I'm aware of problems raised by population ethics and the meat-eating problem (it's a more productive framing than the version you heard), so not a novel observation in general. I'm raising my points narrowly because the comment section (or maybe just Lewis, and David Reinstein, plus the post implicitly?) leans on global health being the secure benchmark, in comparison to AW interventions seeking to improve welfare.

P.P.S. I used Claude 4.8 to help me check whether my points were already made by someone else here, and to help me draft the reply, which I then modified.

David Mathers🔸 @ 2026-06-11T10:27 (+11)

I find it somewhat hard to take "but what if it's good for third world children themselves to die" all that seriously as an objection. I think most anti-natalist philosophers would deny this. Isn't Benatar's view that it is bad for people to come into existence, but that it's often better for their lives to continue once they have started? In general, anti-natalists are not usually utilitarians, classical or negative.

Fai @ 2026-06-11T12:20 (+2)

My doubt was on the epistemics, and specifically on the estimation of welfare gain by an intervention.

Re: Benatar's view. He holds the view that the continuation of a life accrues harm. At the same time, he indeed also holds that it is overall better (or more like, less bad) for people's lives to continue once they have started, because death is even more significant harm.

I can't say how many anti-natalists are utilitarians of any sort, or the reverse. I am pretty sure many negative utilitarians think that the continuation of any sentient life is net negative.

Going back to Benatar's view and applying it to our subject matter. He would likely claim that:

Continuation of the lives of third-world children is a harm in itself, both because of the expected negative welfare, and also for some other non-utilitarian reasons.
Nonetheless, letting them die is still overall bad, because dying is an even greater harm.

Matt Goodman @ 2026-06-10T16:40 (+7)

A sample size of six individual shrimp for the stunning study is insane

VeryJerry @ 2026-06-05T23:01 (+7)

Thanks for bringing this up and going in-depth on the evidence. I've always felt uneasy with the cage free campaigns and shrimp stunning.

Part of that is definitely my more "abolitionist" viewpoint. I hear about those and think "wait how much money are we putting towards exploiting animals in a bit less bad way?" Of course an improvement is an improvement so if they are actually better then that is better, and there is benefit to getting momentum. It's easier to take a mile once you've gotten an inch, so to speak. But if we don't even actually know if the increment we're pushing is better, that's a problem that deserves to have alarm bells sounding.

That said, I don't really know what would be better tactically. Lately I'm pretty AI-pilled, thinking that if we can make AGI aligned with all sentient beings then that will be a huge benefit. But today's labs aren't gonna release a model that only ever suggests vegan recipes, and the future of AI is highly speculative.

Sometimes I wonder if we should just start doing door-to-door vegan advocacy, like the Jehovah's Witnesses and normie politics and deep canvassing (and street epistemology and "smart politics" https://www.lesswrong.com/posts/D2GrrrrfipHWPJSHh/book-review-how-minds-change ).

Tristan Katz @ 2026-06-06T06:51 (+3)

Totally related to this until you made your suggestion at the end 😂 I would rather lean in the direction of thinking political interventions deserve more attention, e.g. Animal Policy International, Animal Society or maybe even funding animal political parties? Normalizing animals in politics seems robustly good despite uncertainties about shrimp and cages.

VeryJerry @ 2026-06-06T13:22 (+1)

Yes, definitely in favor of more pro-animal politics! I think there are a lot of high impact things the movement can be doing. It's worth focusing on the ones we know, and researching new avenues and theories of change too.

david_reinstein @ 2026-06-09T18:39 (+6)

I’d frame the solution a bit differently from “the EA animal welfare community should take ownership of every stage of the evidence pipeline.”

I don’t think EA/AW people necessarily need to personally design every study, collect the data, or do all the econometrics/biology. But I do think the community needs to take responsibility for setting the agenda, creating the incentives, providing funding, clearly communicating the goals, priorities, and need for rigor to researchers, and enabling coordination.

And it would indeed help if the researchers intrinsically cared about animal welfare and thus about producing useful and accurate results. This makes the incentive alignment easier. But I think it can still work even if much of the work is done by people who aren't intrinsically interested in animal welfare or don't think about effectiveness in the same way - as long as they can understand and embody the priorities in their work.

I think entire organisations could and should be founded for this. Until now, this was simply not possible. Research is expensive and slow, especially at universities. But we're about to have the luxury to aim higher.

I agree that academic incentives and institutions on their own are not enough to ensure high-quality, credible work focused on animal welfare implications (although AW funders can leverage them to some extent). These are not well aligned with producing the kind of evidence funders want, and (e.g.) much existing agricultural/economic research is naturally oriented toward industry questions.

(Warning: self-promo) At The Unjournal, we're trying to do some of this: bringing together academic researchers, practitioners, and funders to help them understand their mutual priorities, abilities, and needs. See our “Pivotal questions”: an Unjournal trial initiative, e.g., this one on cultured meat, and this potential upcoming one on animal/plant product substitution.

We also try to aim this at the value of information, focusing on identifying "what specific operationalized questions (and what evidence) would productively change funding or implementation decisions", sourcing and evaluating the best evidence, fostering high-value discussion/debate, and eliciting and synthesizing beliefs and uncertainty.

I’d be excited to coordinate with others working in this direction.

(This overlaps with Jo’s comment somewhat, but I wanted to emphasize the institutional-design angle.)

Citrit @ 2026-06-06T01:43 (+6)

Yes, this!

To add, I also have issues with ACE's methodology & transparency. I don't have a background in stats or anything, but when I tried to independently verify ACE's evaluation of charity efficacy, I found several figures that seem (to me) not well-sourced. e.g., the 'source' column is mostly empty for ACE's AWO spreadsheet: https://docs.google.com/spreadsheets/d/1CAcmf4dXPk9tNWit25C7A7GJqEGNtnN-unXKZoyzvx4/edit

Compare with GiveWell's analysis: https://www.givewell.org/international/technical/programs/seasonal-malaria-chemoprevention#Sources

I'm still allocating my donations based on ACE's recommendations, but I'm really worried about how effective I'm being.

Jeanne Marie Jacqueline (JMJ/Evana) @ 2026-06-06T06:58 (+15)

Thank you for your insights @matthes @Citrit !

As I get more familiar with EA and will soon enter the workforce, I am considering how to decide on impactful future donations. I have been a vegetarian for 6 years and am greatly preoccupied by animal farming. Although the abolitionist perspective is attractive, it does not seem achievable to me, so I wish to orient donations towards animal welfare. This post makes me concerned about the lack of research and fake promises.

Aidan Alexander @ 2026-06-06T16:53 (+30)

I agree with the other, secondary Aidan’s comment (I am the original and true Aidan).

A common journey with EA is going from (1) not thinking we can know whether charity is helping or that it makes a difference which charity you choose to (2) learning about EA and being thrilled to discover that we can know and some charities are excellent to (3) getting into the weeds of the empirical and ethical uncertainties around the first, 2nd and nth order effects of a certain cause area and becoming disillusioned and pivoting to another cause to (4) realising that every cause is fraught with uncertainty and starting to feel a bit jaded about charity to (5) accepting that the world is super uncertain but that the expected impact of the best bets we have is promising enough to motivate ongoing work on them (especially if we double down on generating more primary evidence and doing good M&E to make our uncertainty go down with time)

I hope we don’t lose you at the 4th step ❤️

(This is not to say that all causes are equally uncertain, or that it’s guaranteed that the best bets in any particular cause will be compelling enough for you)

Angelina Li @ 2026-06-09T01:15 (+7)

I really like this description and it resonates with me. +1 to “I hope we don’t lose you at the 4th step” :)

It’s so desperately sad to have uncertainty about extremely important things, and yet we still need to choose how to act. It feels very productive to discuss where to allocate marginal resources between “reduce our uncertainty further” and “take the best step according to our current evidence base”, and I’m glad to see it being discussed.

Aidan Kankyoku @ 2026-06-06T12:09 (+15)

Congrats/good luck on soon entering the workforce!

I think OP is a valuable contribution to an ongoing discussion, but I'd encourage you not to update too aggressively on any one post. There are good reasons that some of the most rigorous organizations in the space (CG) have made cage-free a top priority and ex ante you should think it unlikely that they all converged without good reasons– see e.g. @Vasco Grilo🔸's comment above.

Aidan Alexander @ 2026-06-06T17:01 (+18)

“There are good reasons that some of the most rigorous organizations in the space (CG) have made cage-free a top priority and ex ante you should think it unlikely that they all converged without good reasons” — I agree, with the caveat that sometimes what looks like (A) converging opinions of a number of thoughtful actors is actually (B) a chain of deference where one actor has thought about it, and a second actor defers to them, and a third defers to the first two etc, with the nuance and uncertainty getting lost a bit more at every step in the chain. It can be hard to tell A and B apart from the outside

Craig Green 🔸 @ 2026-06-10T17:35 (+13)

Numerous EA-adjacent orgs arriving at the same conclusion about some issue may also be the result of re-circulating the same people. After one year of observing EA online, my impression is that EA is not that large or diverse a group of people in the grand scheme of things. Many people seem to be pretty tightly interconnected together, even to the point of being family with each other!

I actually think EA does a pretty good job of avoiding group-think relative to its homogeneity (see posts like OP's), but given the social dynamics of a smaller and tightly interconnected movement, it is important to constantly reinforce truth-seeking behaviors, including by direct questioning of established orthodoxy.

Maybe my characterization of EA is wrong though. I'm not someone who would know.

I think what really bothers me about OP's post is precisely the possibility that my donations are actually worsening animal welfare. That stings. I think it's important for recommenders of interventions to carefully consider how they are going to communicate about such things. I would want them to break out not only their general uncertainty about the effectiveness of the intervention, but the specific uncertainty about whether it actually causes harm. For those of us who are concerned a lot about harm reduction, seeing that as its own line item would be helpful.

Aidan Alexander @ 2026-06-11T16:26 (+20)

It's very fair to be concerned that your donations might be doing harm -- I can relate! A bittersweet consolation might be that this is a risk that applies to basically every charitable intervention, including highly evidence-based global health and development interventions. Something that increases a woman's income or autonomy can put her at greater risk of domestic violence for example. Changing someone's life can have impacts on the local and macro-economy, on those impacted by their diet and consumption patterns etc. -- the world is complicated and the risk of doing harm instead of good is always there.

This isn't to say we shouldn't work really hard to understand and prevent such harm! Just that "no chance of harm" is not a realistic bar to aim for

Aidan Kankyoku @ 2026-06-11T16:43 (+4)

Adding to what Other Aidan said, I think it's a mistake to think of the point on the spectrum from most to least helpful interventions where the real impact crosses from "very slightly helpful" to "very slightly harmful" as especially significant. If intervention A is worth 10,000 units, B is 1 unit, and C is -10 units, the difference between A and B matters much more than between B and C.

Craig Green 🔸 @ 2026-06-11T18:11 (+2)

I agree with you two. I don't have any delusions about avoiding all risk of causing harm or that harm avoidance is straightforwardly more important than providing benefits. I guess what I am saying is that the risk of causing harm is distinct from the risk that an intervention is ineffective, and it would be nice to see these broken out (as someone who works in data analytics and engineering, I realize the real world is not so simple). It seems to me that actively causing harm is a bit of a different thing than just being ineffective, and you would ideally reason about it specifically, rather than bundling it all together into one big effectiveness metric.

Especially since, while I don't think we should try to avoid all harm, people may have different moral weights about causing harm. For some people, they may be much more indifferent about causing harm relative to providing some benefit, whereas others may have a stronger bias towards "first, do no harm." Given that the tradeoff between these is something each individual must determine, it is better to separate it out in your model and allow people to discount the effectiveness according to their own priorities.
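One way to picture the "separate line item" idea is a summary that reports the probability of harm, the expected benefit, and the expected harm separately, and leaves the harm weight to the reader. This is only a sketch: the `ImpactSummary` type, the `summarize` helper, and the sample outcomes are hypothetical and do not come from any evaluator's actual model.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ImpactSummary:
    p_harm: float            # share of modelled outcomes that are net negative
    expected_benefit: float  # average upside, counting harmful outcomes as 0
    expected_harm: float     # average downside, counting beneficial outcomes as 0

    def weighted_value(self, harm_weight: float = 1.0) -> float:
        """Expected value when harm counts `harm_weight` times as much as benefit."""
        return self.expected_benefit - harm_weight * self.expected_harm


def summarize(outcomes: Sequence[float]) -> ImpactSummary:
    """Split modelled outcomes (e.g. Monte Carlo draws) into benefit and harm."""
    if not outcomes:
        raise ValueError("need at least one modelled outcome")
    n = len(outcomes)
    return ImpactSummary(
        p_harm=sum(1 for x in outcomes if x < 0) / n,
        expected_benefit=sum(max(x, 0.0) for x in outcomes) / n,
        expected_harm=sum(max(-x, 0.0) for x in outcomes) / n,
    )


# Hypothetical draws of net welfare effect per dollar, for illustration only.
draws = [10.0, 5.0, 0.0, -2.0, -8.0]
summary = summarize(draws)
print(summary)
print("harm weighted 1x:", summary.weighted_value(1.0))
print("harm weighted 3x:", summary.weighted_value(3.0))
```

With `harm_weight=1.0` this collapses to the usual single expected-value figure; a donor with a stronger "first, do no harm" preference can raise the weight and see how the ranking changes.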

Forumite @ 2026-06-06T19:41 (+5)

Thanks for the thoughtful and truth-seeking post.

One of the shrimp-related interventions that animal advocates have worked on is ending eyestalk ablation - the act of cutting the eyes off of living mother shrimp.

Shrimp Welfare Project work on this - https://www.shrimpwelfareproject.org/blog/what-is-eyestalk-ablation / https://www.shrimpwelfareproject.org/eyestalk-ablation-free

And animal campaigners have won some commitments to end this practice - https://www.globalseafood.org/advocate/lidl-gb-bans-eyestalk-ablation-for-farmed-prawns-joining-nine-other-u-k-retailers/

Curious to hear how people feel about this kind of work?

(I haven't done the maths or read any academic work on this topic, but my assumption is that this is probably a robustly good thing, and good value for money?)

david_reinstein @ 2026-06-09T17:11 (+2)

I don't have particular expertise in this, but your comment suggests to me that there may indeed be animal welfare interventions with more obviously robust positive impacts. On the one hand, the post did consider some of the most prominent interventions, but perhaps it did ignore some of the clearer wins (based on evidence or just fairly obvious intuition).

david_reinstein @ 2026-06-09T18:17 (+4)

Thanks again for this post. I left a bunch of Hypothes.is margin notes while reading: see here or install plugin. (I'll post some of these as regular forum comments now.)

I strongly agree with the central diagnosis: animal welfare seems under-supplied with decision-relevant evidence, especially compared to global health. ACE, Rethink, WFI, CG, SWP, etc. are doing serious work, but AW still lacks something analogous to the dense ecosystem around GiveWell-style global health evidence: funders, implementers, independent evaluators, academic researchers, field-data pipelines (and research evaluation institutions) all pulling in the same direction.

I don't think academia/journals in their current state provide the right incentives or platforms. We end up with many small-scale projects and papers, often not aimed at the key animal welfare outcomes, and a lack of follow-up. We need something more coordinated, directly aimed at effective animal welfare, collaborative, larger scale and longer term.

Seth Ariel Green 🔸 @ 2026-06-09T16:46 (+4)

Hi there

See also this^[20] more recent meta-analysis that came to a similar conclusion about alternative proteins and other meat reduction interventions.

👋 I am lead author of that paper!

Two studies we looked at addressed alt proteins directly:

Bianchi (2022). Replacing meat with alternative plant-based products (RE-MAP): a randomized controlled trial of a multicomponent behavioral intervention to reduce meat consumption. DOI: 10.1093/ajcn/nqab414

Acharya (2004). Nutritional changes among premenopausal women undertaking a soya based dietary intervention study in Hawaii. DOI: 10.1111/j.1365-277X.2004.00537.x

The second paper is a small sample and doesn't really address what we consider the modern generation of alt proteins. The first does. From that paper's abstract:

Methods

Adult volunteers who regularly consumed meat were recruited from the general public and randomized 1:1 to an intervention or control condition. The intervention comprised free meat substitutes* for 4 weeks, information about the benefits of eating less meat, success stories, and recipes. The control group received no intervention or advice on dietary change. The primary outcome was daily meat consumption after 4 weeks, assessed by a 7-day food diary, and repeated after 8 weeks as a secondary outcome...

Results

Between June 2018 and October 2019, 115 participants were randomized. The baseline meat consumption values were 134 g/d in the control group and 130 g/d in the intervention group. Relative to the control condition, the intervention reduced meat consumption at 4 weeks by 63 g/d (95% CI: 44–82; P < 0.0001; n = 114) and at 8 weeks by 39 g/d (95% CI: 16–62; P = 0.0009; n = 113), adjusting for sex and baseline consumption. The intervention significantly increased the consumption of meat substitutes without changing the intakes of other principal food groups.

* Participants selected the meat-free substitute foods from a catalogue containing the full range of products available in a major UK grocery store at the time of the trial. This included mycoprotein meat alternatives and vegetable- and pulse-based meat substitutes.

How much should one update on this study? IDK. But by its own lights, giving people alt proteins for free -- embedded in a larger intervention -- had meaningful effects.
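As a rough aid for sizing the reported effect, the figures quoted from the abstract above can be put on a relative scale. This is a back-of-the-envelope sketch, not part of the paper's analysis: dividing the adjusted effect by the control group's baseline is only a crude yardstick, and recovering a standard error from the interval assumes a symmetric normal-approximation 95% CI. The helper names are made up for illustration.

```python
CONTROL_BASELINE_G_PER_DAY = 134  # baseline meat consumption, control group (from the abstract)

# (point estimate, CI lower, CI upper) in g/day, as quoted in the abstract
REPORTED = {
    "4 weeks": (63, 44, 82),
    "8 weeks": (39, 16, 62),
}


def relative_reduction(effect_g: float, baseline_g: float) -> float:
    """Effect as a fraction of baseline consumption (crude scale, see lead-in)."""
    return effect_g / baseline_g


def se_from_ci95(lower: float, upper: float) -> float:
    """Approximate standard error implied by a symmetric 95% interval."""
    return (upper - lower) / (2 * 1.96)


for label, (effect, lower, upper) in REPORTED.items():
    share = relative_reduction(effect, CONTROL_BASELINE_G_PER_DAY)
    se = se_from_ci95(lower, upper)
    print(f"{label}: about {share:.0%} of baseline, implied SE about {se:.1f} g/day")
```

Read this way, the effect shrinks noticeably between week 4 and week 8, which is one reason to be cautious about extrapolating to longer horizons.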

However, when we test introducing alt proteins in a hypothetical, online-ordering, Chipotle-like environment, alas, we don't find much.

matthes @ 2026-06-09T20:09 (+1)

Thanks for flagging this!

I should have double-checked this reference. Have made an edit to redirect people to this comment.

Samo @ 2026-06-22T15:36 (+3)

Nice post.

Before we chose to campaign for a phase-out of cages for hens in Slovenia, we also wanted to be quite damn sure we were actually doing good for animals.

We eventually won a legislative phase-out of cages for layer hens. And the reason we felt confident enough to push for it was not just work by WFP, RP, OWA and posts on this forum.

EFSA, the main scientific authority on food safety in the EU, recommends phasing out cages on welfare grounds. Our own veterinary faculty and animal welfare centre experts in Slovenia came to the same conclusion.

So at least in our case, this was not based on vibes or claims by a single community (EA). It was based on a broader scientific and veterinary consensus. Or at least, I should say, an independent assessment of evidence by these institutions.

The Unjournal (bot) @ 2026-06-09T12:38 (+3)

We asked GPT 5.5-extra in Codex to do an evidence audit and discussion, because this seems to be a high-value post relevant to Unjournal's work on evidence quality and animal welfare-relevant economics. It reports:

We also added several public Hypothes.is annotations directly on the post text, tagged `gpt` and `evidence-audit`. Public annotations are visible here: Hypothes.is search for this post. You can also see them in context by installing the Hypothes.is browser extension.

My [GPT's] main takeaways from checking the cited and related sources:

Shrimp stunning: The evidence seems best read as parameter-sensitive rather than as a blanket case for or against electrical stunning. The recent Somerville et al. preprint and SWP review suggest electrical stunning may have more potential than chilling, but species-specific settings and industrial validation remain central uncertainties.
Cage-free reforms: The post seems strongest when framed as a critique of mortality-based inference and transition management. Mortality evidence is mixed and confounded, but that is not the same as saying the overall welfare evidence is equally mixed. Schuck-Paim et al. 2021, WFI’s laying-hen model, and Cynthia Schuck-Paim’s comments in this thread all seem relevant here.
Alternative proteins: I would frame this less as “we have evidence that substitution is weak” and more as “we have limited and hard-to-interpret evidence, plus serious measurement challenges.” The available evidence seems more directly about current plant-based meat and meat-reduction interventions; it does not settle future cultivated meat, precision fermentation, or genuinely cheaper/tastier/more convenient substitutes.

Sources checked included Weineck et al. 2018, Somerville et al. 2026, SWP’s stunning review, Schuck-Paim et al. 2021, WFI’s laying-hen work, Peacock’s RP report, and Green/Smith/Mathur’s 2025 meta-analysis.

David Reinstein: I added further comments manually in Hypothes.is and I'm working on a general response comment synthesizing that, also bringing in resources from our PQ/workshop on plant-based substitution. Let me know if the above is misleading or distracting.

The Unjournal (bot) @ 2026-06-09T12:51 (+1)

More coming, comments and replies like the images below. Please let me know if useful or annoying/inaccurate.

Michael Goff @ 2026-06-06T06:10 (+3)

This is a very important post, and thank you for writing it. Coming from the environmental world, this parallels my frustration that a huge amount of effort goes into projects for which the evidence is shockingly flimsy. A significant fraction, maybe even the majority, of environmental advocacy is for measures that are nearly useless or outright harmful.

If you will pardon a personal plug, I wrote a piece on my Substack blog recently that looks at the evidence at a high level, drawing heavily on the methodology of Rethink Priorities' Moral Weight Project. If we want to monetize animal welfare (in my piece, broiler chicken welfare), then estimates for the proper valuation span several orders of magnitude due to irresolvable philosophical differences. I am very grateful that the Moral Weight Project was done, but clearly we have a long way to go before we have reliable numbers that we can use.

I am particularly interested in your comments on alternative proteins, since they will be very relevant for a piece I am working on regarding meat taxation. A major focus will be to review what we know about the rebound effect and induced/latent demand to argue that, if they attain widespread market viability, alternative proteins will mostly augment the meat supply rather than replace it.

Naveeth Basheer @ 2026-06-09T16:11 (+2)

Interesting post! I definitely second the call to fund more quality animal welfare research. It's surprising how poor some research is, even when peer-reviewed. I especially like the specification of "action-relevant" research, as this is key. Especially when trying to implement interventions in LMICs, often lab studies aren't applicable.