Rethink Priorities’ Welfare Range Estimates

By Bob Fischer @ 2023-01-23T11:37 (+343)

Key Takeaways

Introduction

This is the eighth post in the Moral Weight Project Sequence. The aim of the sequence is to provide an overview of the research that Rethink Priorities conducted between May 2021 and October 2022 on interspecific cause prioritization—i.e., making resource allocation decisions across species. The aim of this post is to share our welfare range estimates.

This post builds on all the others in the Moral Weight Project Sequence. In the first, we explained how we understand welfare ranges and how they might be used to make cross-species cost-effectiveness estimates. In the second, we introduced the Welfare Range Table, which reported the results of a literature review covering over 90 empirical traits across 11 farmed species. In the third, we suggested a way to quantify the impact of assuming hedonism on our welfare range estimates. In the fourth, we explained why we’re skeptical of using neuron counts as our sole proxy for animals’ moral weights. In the fifth and sixth, we explained why we aren’t convinced by some revisionary ways that people try to alter humans’ and animals’ moral weights by proposing that there are more subjects per organism than we might initially assume. In the seventh, we argued that “animal-friendly” results shouldn’t be that surprising given the Moral Weight Project’s assumptions—nor are they a good reason to think that the Project’s assumptions are mistaken.

In what follows, we’ll briefly recap our understanding of welfare ranges and our proposed way of using them. Then, we’ll summarize our methodology and respond to some questions and objections.

How can we compare benefits to the members of different species?

Many EA organizations use DALYs-averted as a unit of goodness. So, the Moral Weight Project tries to express animals’ welfare level changes in terms of DALYs-averted. This lets people conduct standard cost-effectiveness analyses across human and animal interventions. (What follows is a compressed overview of our strategy. For more detail, please see our Introduction to the Moral Weight Project.)

In the context of a cost-effectiveness analysis, a “moral weight discount” is a function that takes some amount of some species’ welfare as an input and has some number of DALYs as an output. So, the Moral Weight Project tries to provide “moral weight discounts” for 11 commercially-significant species. The interpretation of this function depends on the moral assumptions in play. The Moral Weight Project assumes hedonism (welfare is determined wholly by positively and negatively valenced experiences) and unitarianism (equal amounts of welfare count equally, regardless of whose welfare it is). Given hedonism and unitarianism, a species's moral weight is how much welfare its members can realize—i.e., its members’ capacity for welfare. That is, everyone’s welfare counts the same, but some may be able to realize more welfare than others.

Capacity for welfare = welfare range × lifespan. An individual’s welfare range is the difference between the best and worst welfare states the individual can realize. In other words, assume we can assign a positive number to the best welfare state the individual can realize and a negative number to the worst welfare state the individual can realize. The difference between them is the individual’s welfare range.

We’re ultimately trying to convert changes in welfare levels into DALYs. So, the relevant “best” human welfare state is the average welfare level of the average human in full health. The relevant “best” animal welfare states will be analogous.

For simplicity’s sake, we assume that humans’ welfare range is symmetrical around the neutral point. So, if the “best” welfare state for a human is represented by some arbitrary positive number, then the “worst” welfare state is represented by the negation of that number. (For reasons we sketch below, this assumption matters less than you might think. For some preliminary thoughts on the symmetry assumption, see this report.)

Welfare ranges allow us to convert species-relative welfare assessments, understood as percentage changes in the portions of animals’ welfare ranges, into a common unit. To illustrate, let’s make the following assumptions:

  1. Chickens’ welfare range is 10% of humans’ welfare range.
  2. Over the course of a year, the average chicken is about half as badly off as they could be in conventional cages (they’re at the ~50% mark in the negative portion of their welfare range).
  3. Over the course of a year, the average chicken is about a quarter as badly off as they could be in a cage-free system (they’re at the ~25% mark in the negative portion of their welfare range). 

Given these assumptions, we can calculate the welfare gain of a cage-free campaign in DALY-equivalents averted: 

  1. Assuming symmetry around the neutral point, the negative portion of chickens’ welfare range is 10% of humans’ positive welfare range. (For instance, if humans’ welfare range is 100 and chickens’ welfare range is 10, humans range from -50 to 50 and chickens range from -5 to 5. So, the negative portion of chickens’ welfare range is still 10% of the positive portion of humans’ welfare range.)
  2. Given our assumptions about the welfare impacts of the two production systems, the move from conventional cages to aviary systems averts an amount of welfare equivalent to 25% of the average chicken’s negative welfare range. (Continuing with the numbers mentioned in the previous step, it moves chickens from -2.5 to -1.25).
  3. So, assuming symmetry around the neutral point, 25% of chickens’ negative welfare range is equivalent to 2.5% (10% × 25%) of humans’ positive welfare range. 
  4. By definition, averting a DALY averts the loss of an amount of welfare equivalent to the positive portion of humans’ welfare range for a year.
  5. So, assuming symmetry around the neutral point, the move from conventional cages to aviary systems averts the equivalent of 0.025 DALYs per chicken per year on average.
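
The arithmetic in the steps above can be sketched in a few lines of Python. This is a minimal illustration that assumes symmetry around the neutral point; the function and parameter names are hypothetical and are not taken from the BOTEC or any Rethink Priorities tool.

```python
def daly_equivalents_averted(relative_welfare_range, fraction_before, fraction_after):
    """DALY-equivalents averted per animal per year.

    Assumes welfare ranges are symmetric around the neutral point, so the
    animal's negative range is `relative_welfare_range` times the positive
    portion of humans' welfare range (which is one DALY per year).

    fraction_before / fraction_after: how far into the negative portion of
    its welfare range the animal sits before and after the intervention.
    """
    improvement = fraction_before - fraction_after
    return relative_welfare_range * improvement


# Illustrative assumptions from the text: chickens at 10% of humans' range,
# moving from the ~50% mark to the ~25% mark of their negative range.
result = daly_equivalents_averted(0.10, 0.50, 0.25)
print(round(result, 4))  # 0.025
```

Keeping the relative welfare range and the welfare change as separate inputs makes it easy to swap in other assumptions about either one.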

The symmetry assumption doesn’t matter for our welfare range estimates. Instead, it matters for estimates of the total number of DALY-equivalents averted. Suppose, for instance, that humans’ welfare range is 0 to 100 (on net, their welfare is always neutral or positive) whereas chickens’ welfare range is -9 to 1 (their welfare can be 9x worse than it can be good). Our estimate of chickens’ relative welfare range would be the same: 10%. However, such an asymmetry would obviously alter the amount of welfare represented by “25% of chickens’ negative welfare range” (0.225 DALYs per chicken per year on average vs. 0.025 DALYs per chicken per year on average). To make the implications clear, we’ve developed a farmed animal welfare cost-effectiveness BOTEC that allows users to input their own assumptions about the skews of animals’ welfare ranges to convert welfare changes into DALY-equivalents averted.

Some welfare range estimates

What follows are some probability-of-sentience- and rate-of-subjective-experience-adjusted welfare range estimates. These numbers are based on:

Species               5th-percentile   50th-percentile   95th-percentile
Pigs                  0.005            0.515             1.031
Chickens              0.002            0.332             0.869
Octopuses             0.004            0.213             1.471
Carp                  0                0.089             0.568
Bees                  0                0.071             0.461
Salmon                0                0.056             0.513
Crayfish              0                0.038             0.491
Shrimp                0                0.031             1.149
Crabs                 0                0.023             0.414
Black Soldier Flies   0                0.013             0.196
Silkworms             0                0.002             0.073

We provide the technical details in this document. We now turn to the more general methodology behind these numbers.

How did we estimate relative welfare ranges?

Given hedonism, an individual’s welfare range is the difference between the welfare level associated with the most intense positively valenced experience the individual can realize and the welfare level associated with the most intense negatively valenced experience that the individual can realize. So, we looked for evidence of variation in the capacities that generate positively and negatively valenced experiences.

Since there are no agreed-upon objective measures of the intensity of valenced states, we pursued a four-step strategy:

  1. Make some plausible assumptions about the evolutionary function of valenced experiences
  2. Given those functions, identify a lot of empirical traits that could serve as proxies for variation with respect to those functions
  3. Survey the literature for evidence about those traits
  4. Aggregate the results

There are many theories of valence, not all of which are mutually exclusive. For instance, some think that valenced experiences represent information in a motivationally-salient way (“That’s good” / “That’s bad” / “That’s really good” / etc.; Cutter & Tye 2011), others that valenced experiences provide a common currency for decision-making (“A feels better than B” / “C feels worse than D”; Ginsburg & Jablonka 2019), and others still that they facilitate learning (“If I do X, I feel good” / “If I do Y, I feel bad”; Damasio & Carvalho 2013). In all three cases, there are potential links between valence and conceptual or representational complexity, decision-making complexity, and affective (emotional) richness.

We conducted a large literature review for traits that could serve as indicators of conceptual or representational complexity, decision-making complexity, and affective richness, involving over 100 qualitative and quantitative proxies across 11 species. The literature review is available here. Descriptions of the proxies are available here (and for the “quantitative proxies” model, here).

We aggregated the results. However, aggregation raises lots of thorny methodological issues. So, we opted to build several models. For a variety of reasons, though, we ultimately opted not to include them all in our estimates: some could be accused of stacking the deck in favor of animals (the Equality Model), some were missing too much data (the Quantitative Model), and some involved assumptions that went beyond the key assumptions of the Moral Weight Project (the Grouped Proxy Model and the JND Model). We then took the remaining models and used Monte Carlo simulations to estimate the distribution of welfare ranges, as detailed here.
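
To make the idea of combining several models by Monte Carlo simulation concrete, here is a minimal sketch. It is not our actual implementation: the three models, their distributions, their parameters, and the equal weighting across models are all hypothetical placeholders chosen only to show the mechanics of sampling from a mixture and reading off percentiles.

```python
import random
import statistics

# Hypothetical placeholder models. Each takes a random number generator and
# returns one welfare range draw. Names and parameters are illustrative only.
HYPOTHETICAL_MODELS = {
    "model_a": lambda rng: rng.uniform(0.1, 0.6),
    "model_b": lambda rng: rng.betavariate(2, 5),
    "model_c": lambda rng: rng.triangular(0.0, 1.0, 0.3),
}


def simulate_mixture(models, n_draws=10_000, seed=0):
    """Draw from an equal-weight mixture of the models (an assumption)."""
    rng = random.Random(seed)
    samplers = list(models.values())
    return [rng.choice(samplers)(rng) for _ in range(n_draws)]


def summarize(draws):
    """Return the 5th, 50th, and 95th percentiles of the simulated draws."""
    cuts = statistics.quantiles(draws, n=100)  # 99 cut points
    return {"p5": cuts[4], "p50": cuts[49], "p95": cuts[94]}


draws = simulate_mixture(HYPOTHETICAL_MODELS)
summary = summarize(draws)
print(summary)
```

Sampling a model first and then sampling from it is what lets the resulting percentiles reflect disagreement between models as well as uncertainty within each one.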

Jason Schukraft estimated that there’s a ~70% chance that there exist morally relevant differences in the rate of subjective experience and a ~40% chance that CFF values roughly track the rate of subjective experience under ideal conditions. So, we applied a credence-discounted adjustment to our welfare range estimates by the CFF for a given species. Since this proxy suggests that some animals have a faster rate of subjective experience than humans, it supports greater-than-human welfare range estimates on some models. 

Finally, we adjusted our estimates based on our best guess estimates of the probability of sentience. We generated those estimates by extending and updating Rethink Priorities’ Invertebrate Sentience Table and then aggregating the results as detailed here.

Questions about and objections to the Moral Weight Project’s methodology 

“I don't share this project’s assumptions. Can't I just ignore the results?”

We don’t think so. First, if unitarianism is false, then it would be reasonable to discount our estimates by some factor or other. However, the alternative—hierarchicalism, according to which some kinds of welfare matter more than others or some individuals’ welfare matters more than others’ welfare—is very hard to defend. (To see this, consider the many reviews of the most systematic defense of hierarchicalism, which identify deep problems with the proposal.)

Second, and as we’ve argued, rejecting hedonism might lead you to reduce our non-human animal estimates by ~⅔, but not by much more than that. This is because positively and negatively valenced experiences are very important even on most non-hedonist theories of welfare.

Relatedly, even if you reject both unitarianism and hedonism, our estimates would still serve as a baseline. A version of the Moral Weight Project with different philosophical assumptions would build on the methodology developed and implemented here—not start from scratch.

“So you’re saying that one person = ~three chickens?”

No. We’re estimating the relative peak intensities of different animals’ valenced states at a given time. So, if a given animal has a welfare range of 0.5 (and we assume that welfare ranges are symmetrical around the neutral point), that means something like, “The best and worst experiences that this animal can have are half as intense as the best and worst experiences that a human can have”—remembering that, in this context, the welfare level associated with “best experiences that a human can have” is the average welfare level of the average human in full health, which, presumably, is lower than the most intense pleasure humans are physically capable of experiencing. 

Because we’re estimating the relative intensities of valenced states at a time, not over time, you have to factor in lifespan to make individual-to-individual comparisons. Suppose, then, that the animal just mentioned—the one with a welfare range of 0.5—has a lifespan of 10 years, whereas the average human has a lifespan of 80. Then, humans have, on average, 16x this animal’s capacity for welfare; equivalently, its capacity for welfare is 0.0625x a human’s capacity for welfare.
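
The lifespan comparison above follows directly from the formula capacity for welfare = welfare range × lifespan. The short sketch below reproduces it; the function name is hypothetical, and the inputs are the illustrative numbers from the text, not estimates for any real species.

```python
def capacity_for_welfare(welfare_range, lifespan_years):
    """Capacity for welfare = welfare range x lifespan."""
    return welfare_range * lifespan_years


# Humans are the reference point, so their welfare range is 1 by definition.
human = capacity_for_welfare(1.0, 80)
animal = capacity_for_welfare(0.5, 10)

print(human / animal)  # 16.0
print(animal / human)  # 0.0625
```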

However, while there are decision-making contexts where total capacity for welfare matters, they aren’t the most pressing ones. In practice, we rarely compare the value of creating animal lives with the value of creating human lives. Instead, we’re usually comparing either improving animal welfare (welfare reforms) or preventing animals from coming into existence (diet change → reduction in production levels) with improving human welfare or saving human lives. Whatever combination we consider, total capacity for welfare isn’t relevant. Instead, we want to know things like how much suffering we can avert via some welfare reform vs. how many years of human life will this intervention save. Welfare ranges can be helpful in answering the former question.

“I can’t believe that bees beat salmon!”

We also find it implausible that bees have larger welfare ranges than salmon. But (a) we’re also worried about pro-vertebrate bias; (b) bees are really impressive; (c) there's a great deal of overlap in the plausible welfare ranges for these two types of animals, so we aren't claiming that their welfare ranges are significantly different; and (d) we don’t know how to adjust the scores in a non-arbitrary way. So, we’ve let the result stand. (We’d make similar points in response to: “I can’t believe that octopuses beat carp!”)

“Even granting the project’s assumptions, it seems obvious that [insert species] have much smaller welfare ranges than you’re suggesting. If the empirical evidence doesn’t demonstrate that, isn’t it a problem with the empirical evidence?”

No. First, the empirical evidence is our only objective guide to animals’ abilities, one that helps us avoid the twin mistakes of anthropomorphism (attributing human characteristics to nonhumans) and what Frans de Waal calls “anthropodenial” (i.e., “the a priori rejection of shared characteristics between humans and animals”). So, we’re inclined to defer to it.

This deference, plus the assumption of hedonism, does a lot of work in explaining our estimates. Given our deference to the empirical literature, we aren’t positing differences if we can’t cite justifications for them. Given hedonism, lots of apparent differences between humans and animals don’t matter, as they’re irrelevant to the intensities of the valenced states. So, if our results seem counterintuitive, it may be that implicit disagreements about these assumptions explain that reaction.

Second, recall that we’re treating missing data as evidence against sentience and for larger welfare range differences. So, while the empirical evidence is limited, we aren’t using that fact to stack the deck in animals’ favor—quite the opposite.

Third, even if the results are counterintuitive, that is not necessarily a reason to reject the estimates (as we argue here). After all, it’s an open question whether we should trust any of our intuitions about animals’ ability to generate welfare, especially if those intuitions are driven by thinking about the practical implications of these estimates. There are many, many other assumptions that need to be in place before these estimates have any practical implications at all. So, if the practical implications are counterintuitive, those other assumptions are just as much to blame.

“I’m skeptical that [insert proxy] has much to do with welfare ranges.” 

In some cases, we share that skepticism; we readily grant that the proxy list could be refined. However, there is either a version of hedonism or a theory about valenced states on which each of the proxies bears on differences in welfare ranges. We couldn’t resolve all those theoretical issues in the time available. Moreover, we could reject certain proxies if we had independent ways to check whether our welfare range estimates are accurate. Plainly, though, we don’t. So, it’s best to err on the side of inclusiveness. Indeed, the proxy list could be expanded. We opted for a fairly inclusive approach to the proxies, which made the project enormous. Still, there are many other traits that could have been included and, in some cases, perhaps ought to have been included in a list of this length.

If we can make progress on the relevant theoretical issues, we can refine our proxy list. Until then, we’re navigating uncertainty by incorporating as many reasonable approaches as possible.

“How could there be as many ‘unknowns’ as you’re suggesting? After all, in this context, ‘not-unknown’ just means ‘above or below 50% however slightly’—and surely that’s a low bar.”

We thought it was important to have domain experts review the literature whenever possible. However, domain experts are academics. Academics are socialized into a community where it’s inappropriate to make some positive claim (“Pigs have this trait” or “pigs lack that trait”) without being able to establish that claim to the satisfaction of their peers. There are good reasons to value this socialization in the present case. For instance, it’s difficult to predict which traits an organism will have based on its other traits. Moreover, it’s difficult to predict whether one kind of organism will have a trait because a related kind of organism does. Still, even though the probability ranges we mentioned earlier establish a very low bar for “lean yes” and “lean no” (above and below 50%, respectively), we defaulted to “unknown” when we couldn’t find any relevant literature. Even if our approach is defensible, other reasonable literature reviewers may have had more “lean yes” and “lean no” assessments than we did. 

“You’re assessing the proxies as either present or absent, but many of them obviously come either in degrees or in qualitatively different forms.”

This is indeed a limitation; we readily acknowledge that many of the proxies are relatively coarse-grained. Consider a trait like reversal learning: namely, the ability to suppress a reward-related response, which involves stopping one behavior and switching to another. This trait comes in degrees: some animals can learn to suppress a reward-related response in fewer trials; and, having learned to suppress a reward-related response at all, some can suppress their response more quickly. A more sophisticated version of the project would account for this variation. 

However, it isn’t clear what to do about it, as the empirical literature doesn’t provide straightforward ways to score animals on many of these proxies. This problem might be solvable in the case of reversal learning specifically, since we can, at the very least, measure the rate at which the animal learns to suppress the reward-related response. In other cases, the problem is much harder. For instance, parental care is obviously different in humans than in chickens. But we don’t see how to quantify the difference without making many controversial assumptions that, in all likelihood, will simply smuggle in a range of pro-human biases. So, given the current state of knowledge, the present / absent approach seems best.

“It isn’t even clear to me that [insert species] are sentient. So, why should I accept your estimate of their (ostensible) welfare range?”

You shouldn’t. Instead, you should adjust our probability-of-sentience-conditioned estimate based on your credence in the hypothesis that [insert species] are sentient.

That being said, there is deep uncertainty about consciousness generally and sentience specifically. In the face of that uncertainty, we think there’s no good argument for assigning a credence below 0.3 (30%) to the hypothesis that normal adult pigs, chickens, carp, and salmon are sentient. Likewise, we think there’s no good argument for assigning a credence below 0.01 (1%) to the hypothesis that normal adult members of the invertebrate species of interest are sentient. So, skepticism about sentience might lead you to discount our estimates, but probably by fairly modest rates.
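
One simple way to apply your own credence is as an expected-value adjustment: multiply a welfare range that is conditional on sentience by your probability that the species is sentient. The sketch below assumes that reading. The function name and the conditional value of 0.4 are hypothetical; only the 0.3 credence floor comes from the text above.

```python
def sentience_adjusted(conditional_welfare_range, credence_in_sentience):
    """Expected welfare range given a credence that the species is sentient.

    Treats the welfare range as zero if the species is not sentient.
    """
    if not 0.0 <= credence_in_sentience <= 1.0:
        raise ValueError("credence must be between 0 and 1")
    return conditional_welfare_range * credence_in_sentience


# Hypothetical conditional-on-sentience welfare range of 0.4.
skeptical = sentience_adjusted(0.4, 0.3)   # credence at the suggested floor
confident = sentience_adjusted(0.4, 0.9)   # a more confident reader

print(round(skeptical, 3), round(confident, 3))
```

Because the adjustment is linear, a reader at the 0.3 floor ends up with an estimate one third the size of a reader at 0.9, which is a discount of well under an order of magnitude.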

“Your literature review didn’t turn up many negative results. However, there are lots of proxies such that it’s implausible that many animals have them. So, your welfare range estimates are probably high.”

This is a good objection. However, it isn’t clear how aggressively to discount our results because of it. After all, we know so little about animals’ lives. In many cases, no one has cared enough to investigate welfare-relevant traits; in many other cases, no one knows how to investigate them. Moreover, the history of research on animals suggests that we’ll be surprised by their abilities. So, of the unknown proxies for any given species, we should expect to find at least some positive results, and perhaps many positive results. The upshot is that while it might make sense to discount our estimates by some modest rate (e.g., 25%–50%), we don’t think it would be reasonable to discount them by, say, 90%, much less 99%.

In any case, we should stress that we aren’t inflating our estimates: we’re just following what seems to us to be a reasonable methodology, premised on deferring to the state of current knowledge. As we learn more about these animals, we should—and will indeed—update.

In future work, we could make inferences about proxy possession from more distant taxa. Or, we could try using a modern missing data method to account for any potential systematic trends in why some species-model pairs have no extant evidence.

“Shouldn’t you give neuron counts more weight in your estimates?”

We discuss neuron counts in depth here. In brief, there are many reasons to be skeptical about the value of neuron counts as proxies for welfare ranges. Moreover, some ways of incorporating neuron counts would increase our welfare range estimates for invertebrates, not decrease them. So, we already regard the weight currently assigned as a kind of compromise with community credences. 

“You don’t have a model that’s based on the possibility that the number of conscious systems in a brain scales with neuron counts (i.e., 'the Conscious Subsystems Hypothesis')."

We discuss the conscious subsystems hypothesis in depth here. The conscious subsystems hypothesis is a highly controversial philosophical thesis. So, given our methodological commitment to letting the empirical evidence drive the results, we decided not to include this hypothesis in our calculations.

How confident are we in our estimates and what would change them?

No one should be very confident in any estimate of a nonhuman animal’s welfare range. We know far too little for that. However, we’re reasonably confident about some things.

Given hedonism and conditional on sentience, we think (credence: 0.7) that none of the vertebrate nonhuman animals of interest have a welfare range that’s more than double the size of any of the others. While carp and salmon have lower scores than pigs and chickens, we suspect that’s largely due to a lack of research.

Given hedonism and conditional on sentience, we think (credence: 0.65) that the welfare ranges of humans and the vertebrate animals of interest are within an order of magnitude of one another.

While humans have some unique and impressive abilities, those abilities have histories; they didn’t just pop into existence when humans came on the scene. Many nonhuman animals have precursors to these abilities (or variants on them, adapted to animals’ particular ecological niches). 

Moreover, and more importantly, it isn’t clear that many of these impressive abilities make much difference to the intensity of the valenced states that humans can realize. Instead, humans seem to realize a much greater variety of valenced states. If hedonism is true, though, variety probably doesn’t matter; intensity does the work.

Given hedonism and conditional on sentience, we think (credence: 0.6) that all the invertebrates of interest have welfare ranges within two orders of magnitude of the vertebrate nonhuman animals of interest. Invertebrates are so diverse and we know so little about them; hence, our caution.

As for what would change our mind, the main thing is research on the proxies. In principle, research on the proxies could alter our welfare range estimates significantly. Right now, the proxies are fairly coarse-grained and we aren’t confident about their relative importance. If, for instance, we were to learn there are ten levels of reversal learning and that shrimp only reach the second, that could significantly alter our results. Likewise, if we were to learn that having a self-concept is 10x more important than parental care when it comes to estimating differences in welfare ranges, that could significantly alter our results.

Conclusion

Our view is that the estimates we’ve provided are placeholders. Our estimates will change as we learn more about all animals, human and nonhuman. They will change as we learn more about the various traits we share with nonhuman animals and the various traits we don’t share with them. They will change with advances in comparative cognition, neuroscience, philosophy, and various other fields. We’re under no illusions that we’re providing the last word on this topic. Instead, we’re providing a starting point for more rigorous, empirically-driven research into animals’ welfare ranges. At the same time, we’re offering guidance for decisions that have to be made long before that research is finished.

 

Acknowledgments

This research is a project of Rethink Priorities. It was written by Bob Fischer. For help at many different stages of this project, thanks to Meghan Barrett, Marcus Davis, Laura Duffy, Jamie Elsey, Leigh Gaffney, Michelle Lavery, Rachael Miller, Martina Schiestl, Alex Schnell, Jason Schukraft, Will McAuliffe, Adam Shriver, Michael St. Jules, Travis Timmerman, and Anna Trevarthen. If you’re interested in RP’s work, you can learn more by visiting our research database. For regular updates, please consider subscribing to our newsletter.


Joel Tan (CEARCH) @ 2023-01-24T13:16 (+31)

Hi Bob & team,

Really great work. Regardless of my specific disagreements, I do think calculating moral weights for animals is literally some of the highest value work the EA community can do, because without such weights we can't compare animal welfare causes to human-related global health/longtermism causes - and hence cannot identify and direct resources towards the most important problems. And I say this as someone who has always donated to human causes over animal ones, and who is not, in fact, vegan.

With respect to the post and the related discussion:

(1) Fundamentally, the quantitative proxy model seems conceptually sound to me.

(2) I do disagree with the idea that your results are robust to different theories of welfare. For example, I myself reject hedonism and accept a broader view of welfare (given that we care about a broad range of things beyond happiness, e.g. life/freedom/achievement/love/whatever). If (a) such broad welfarist views are correct, (b) you place a sufficiently high weight on the other elements of welfare (e.g. life per se, even if neutrally valenced), and (c) you don't believe animals can enjoy said elements of welfare (e.g. if most animals aren't cognitively sophisticated enough to have preferences over continued existence), then an additional healthy year of human life would plausibly be worth a lot more than an equivalent animal year even after accounting for similar degrees of suffering and the relevant moral weights as calculated.

(3) I would like to say, for the record, that a lot of the criticism you're getting (and I don't exempt myself here) is probably subject to a lot of motivated reasoning. I am personally uncertain as to the degree to which I should discount my own conclusions over this reason.

(4) My main concern, as someone who does human-related cause prioritization research, is the meat eater argument and whether helping to save human lives is net negative from an overall POV, given the adverse consequences for animal suffering. I am moderately optimistic that this is not so, and that saving human lives is net positive (as we want/need it to be). Having very roughly run the numbers myself using RP's unadjusted moral weights (i.e. not taking into account point 2 above) and inputting other relevant data (e.g. on per capita consumption rate of meat), my approximate sense is that in saving lives we're basically buying 1 full week of healthy human life for around 6 days of chicken suffering or about 2 days of equivalent human suffering - which is worth it.

Bob Fischer @ 2023-01-24T15:55 (+11)

Thanks for the kind words about the project, Joel! Thanks too for these thoughtful and gracious comments.

1. I hear you re: the quantitative proxy model. I commissioned the research for that one specially because I thought it would be valuable. However, it was just so difficult to find information. To even begin making the calculations work, we had to semi-arbitrarily fill in a lot of information. Ultimately, we decided that there just wasn't enough to go on.

2. My question about non-hedonist theories of welfare is always the same: just how much do non-hedonic goods and bads increase humans' welfare range relative to animals' welfare ranges? As you know, I think that even if hedonic goods and bads aren't all of welfare, they're a lot of it (as we argue here). But suppose you think that non-hedonic goods and bads increase humans' welfare range 100x over all other animals. In many cost-effectiveness calculations, that would still make corporate campaigns look really good.

3. I appreciate your saying this. I should acknowledge that I'm not above motivated reasoning either, having spent a lot of the last 12 years working on animal-related issues. In my own defense, I've often been an animal-friendly critic of pro-animal arguments, so I think I'm reasonably well-placed to do this work. Still, we all need to be aware of our biases.

4. This is a very interesting result; thanks for sharing it. I've heard of others reaching the same conclusion, though I haven't seen their models. If you're willing, I'd love to see the calculations. But no pressure at all.

Vasco Grilo @ 2023-04-03T21:32 (+6)

Hi Joel,

"I myself reject hedonism and accept a broader view of welfare (given that we care about a broad range of things beyond happiness, e.g. life/freedom/achievement/love/whatever)."

Hedonism is compatible with caring about "life/freedom/achievement/love/whatever", because all of those describe sets of conscious experiences, and hedonism is about valuing conscious experiences. I cannot think of something I value independently of conscious experiences, but I would welcome counterexamples.

Joel Tan (CEARCH) @ 2023-04-04T10:04 (+5)

There's the standard philosophical counterexample of the experience machine, including the reformulated Joshua Greene example that addressed status quo bias. But basically, the idea is this - would you rather that the world was real or just an illusion as you're trapped as a brain in a vat (with the subjective sensory experience itself otherwise identical)? Almost certainly, and most people will give this answer, you'll want the world to be real. That's because we don't just want to think that we're free/successful/in a loving relationship - we also actually want to be all those things.

In less philosophical terms, you can think about how you would not want your friends and family to actually hate you (even if you couldn't tell the difference). And that would also be why people care about having non-moral impact even after they're dead (e.g. authors hoping their posthumously published book is successful, or some athlete wanting their achievements to stand the test of time and not be bested at the next competition, or some mathematician wanting to prove some conjecture and not just think he did).

Vasco Grilo @ 2023-04-04T12:52 (+2)

Thanks for the reply, Joel!

But basically, the idea is this - would you rather that the world was real or just an illusion as you're trapped as a brain in a vat (with the subjective sensory experience itself otherwise identical)?

It depends on the specific properties of the real and simulated world, but my answer would certainly be guided by hedonic considerations:

  • My personal hedonic utility would be the same in the simulated and real worlds, so it would not be a deciding factor.
  • If I were the only (sentient) being in the simulated world, and there were lots of (sentient) beings in the real world, the absolute value of the total hedonic utility would be much larger for the real world.
  • As a result, I would prefer:
    • The real world if I expected the mean experience per being there to be positive (i.e. positive total hedonic utility).
    • The simulated world if I expected the mean experience per being in the real world to be negative (i.e. negative total hedonic utility), and I had positive experiences myself in the simulated world.

Hedonism says all that matters is conscious experiences, but that does not mean we should be indifferent between 2 worlds where our personal conscious experiences are the same. We still have to look into the experiences of other beings, unless we are perfectly egoistic, which I do not think we should be.

For me, a true counterexample to hedonism would have to present 2 worlds in which expected total (not personal) hedonistic utility (ETHU) were the same, and people still preferred one of them over the other. However, since we do not understand well how to calculate ETHU, we can only ensure 2 worlds have the same amount of it if they are exactly the same, in which case it does not make sense to prefer one over the other.

In less philosophical terms, you can think about how you would not want your friends and family to actually hate you (even if you couldn't tell the difference).

I agree. However, as I commented here, that is only an argument against egoistic hedonism, not altruistic hedonism (which is the one I support).

MichaelStJules @ 2023-04-04T13:41 (+4)

You can imagine a) everyone in their own experience machine isolated from everyone else, so that all the other "people" inside are not conscious (but the people believe the others are conscious, and there's no risk they'll find out they aren’t), or b) people genuinely interacting with each other (in the real world, or virtual reality), making real connections with other real people. I think most people would prefer the latter for themselves, even if it makes them somewhat worse off. An impartial hedonistic view would recommend disregarding these preferences and putting everyone in the isolated experience machines anyway.

Vasco Grilo @ 2023-04-04T15:34 (+2)

Thanks for the clarification! Some thoughts:

  • Not related to your point, but I would like to note it seems quite extreme to reject the application of hedonism in the context of welfare range estimates based on such a thought experiment.
  • It is unclear to me whether ETHU is greater in a) or b). It depends on whether it is more efficient to produce it via experience machines or genuine interactions (I suppose utility per being would be higher with experience machines, but maybe not utility per unit resources). So I do not think people preferring a) or b) is good evidence that there is something else which matters besides ETHU.
  • It does not seem possible to make a hard distinction between a) and b). I am only able to perceive reality via my own conscious experience, so there is a sense in which my body is in fact an experience machine.
  • I believe most people preferring b) over a) is very weak evidence that b) is better than a). Our intuitions are biased towards assessing the thought experiment based on how the words used to describe it make us feel. As a 1st approximation, I think people would be thinking about whether "genuine" and "real" sound better than "machine" and "isolated", and they do, so I am not surprised most people prefer b).

MichaelStJules @ 2023-04-04T00:19 (+4)

Being genuinely loved rather than just believing you are loved could matter to your welfare even if it doesn't affect your conscious experiences. So could knowing the truth, even if it makes no difference to your experiences, or actually achieving something rather than falsely believing you achieved it.

Vasco Grilo @ 2023-04-04T07:13 (+5)

Thanks for the examples, Michael!

I would say they work as counterexamples to egoistic hedonism, but not to altruistic hedonism (the one I support). In each pair of situations you described, my mental states (and therefore personal hedonic utility) would be the same, but the experiences of others around me would be quite different (and so would total hedonic utility):

  • Pretending to love should feel quite different from loving, and being fake generally leads to worse outcomes.
  • One is better positioned to improve the mental states of others if one knows what is true.
  • Actually achieving something means actually improving the mental states of others (to the extent one is altruistic), rather than only believing one did so.

For these reasons, rejecting wireheading is also compatible with hedonism. A priori, it does not seem like the best way to help others. One can specify in thought experiments that "everyone else['s hedonic utility] is taken care of", but I think it is quite hard to condition human answers on that, given that lots of our experiences go against the idea that having delusional experiences is both optimal for us and others.

Ula @ 2023-01-27T16:21 (+4)

Would love to see the draft calculations from point 4 as well.

Vasco Grilo @ 2023-02-05T09:17 (+4)

Hi Ula,

FYI, this and this could also be relevant for analysing the meat eater problem. The posts are not updated with RP's moral weight estimates, but the models should still be useful (and I am happy to update them with RP's estimates if you think it is useful).

Joel Tan (CEARCH) @ 2023-01-28T01:05 (+3)

Will DM on slack!

Vasco Grilo @ 2024-03-14T15:54 (+2)

Hi Joel,

(4) My main concern, as someone who does human-related cause prioritization research, is the meat eater argument and whether helping to save human lives is net negative from overall POV, given the adverse consequences for animal suffering. I am moderately optimistic that this is not so, and that saving human lives is net positive (as we want/need it to be) .

Great to know you are considering impacts on animals! Even if the meat eater problem is not a major concern according to your calculations, has CEARCH considered that the best animal welfare interventions may be orders of magnitude more cost-effective than GiveWell's top charities? CEARCH uses a cost-effectiveness bar of 10 times the cost-effectiveness of GiveWell's top charities, but I think this is very low. I estimated corporate campaigns for broiler welfare are 1.71 k times as cost-effective as the lowest cost to save a life among GW's top charities.

With respect to the meat eater problem, I think the conclusion depends on the country. This influences the consumption per capita of animals, how much of each animal species is consumed, and the conditions of the animals. High income countries will tend to have greater consumption per capita and worse conditions, given the greater prevalence of factory-farming. For reference:

  • I estimated the annual suffering of all farmed animals combined is 4.64 times the annual happiness of all humans combined, which goes against your conclusion. For simplicity, I set the welfare per time as a fraction of the welfare range of each farmed animal of any species to a value I got for broilers in a reformed scenario.
  • However, I estimated accounting for farmed animals only decreases the cost-effectiveness of GiveWell's top charities by 14.5 %, which is in line with your conclusion. Yet, I am underestimating the reduction in cost-effectiveness due to using current consumption, given it will tend to increase with economic growth.

I think considering impacts on animals may well affect CEARCH's prioritisation:

  • Interventions in different countries may have super different impacts on animals (as illustrated by the 2 distinct conclusions above). I guess this is more relevant for CEARCH than GiveWell because I have the impression you have been assessing interventions whose beneficiaries are from a set of less homogeneous countries, which means the impacts on animals will vary more, and therefore cannot be neglected so lightly.
  • Interventions to extend life have different implications from interventions to improve quality of life. In general, interventions which improve quality of life without affecting lifespan and income much will have smaller impacts on animals (at least nearterm, i.e. neglecting how population size changes economic growth, and hence the trajectory of the consumption of animals). This is relevant to CEARCH because you have looked not only into interventions mostly saving lives and increasing income, but also into mental health.

I also encourage you to publish your estimates regarding the meat eater problem. I am not aware of any evaluator or grantmaker (aligned with effective altruism or not) having ever published a cost-effectiveness analysis of an intervention to improve human welfare which explicitly considered the impacts on farmed animal welfare (although I am aware of another besides you which has an internal analysis). So CEARCH would be the 1st to do so. For the reasons above, I think it would also be great if you included impacts on animals as a standard feature of your cost-effectiveness analyses.

Aaron Bergman @ 2023-01-29T04:00 (+24)

At risk of jeopardizing EA's hard-won reputation of relentless internal criticism:

Even setting aside its object-level impact-relevant criteria (truth, importance, etc), this is just enormously impressive both in terms of magnitude and quality. The post itself gives us readers an anchor on which to latch critiques, questions, and comments, so it's easy to forget that each step or decision in the whole methodology had to be chosen from an enormous space of possibilities. And this looks, at least on a first read, like very many consecutive well-made steps and decisions.

Bob Fischer @ 2023-01-29T12:30 (+4)

Thanks for the kind words, Aaron!

Lizka @ 2023-01-24T20:08 (+21)

I'm curating the post. I should note that I think I agree with a big chunk of Joel's comment.

I notice I'm quite confused about the symmetry assumption. For example: suppose we have two animals — M and N — and they're both at the worst end of their welfare ranges (~0th percentile) and have equal lifespans (and there are no indirect effects). M has double the welfare range of N. If we assume that their welfare ranges are symmetric around the neutral point, then replacing one M with one N is similar to moving M from the 0th percentile of its welfare range to the 25th. If, however, their welfare ranges aren't symmetric — say M's is skewed very positive and N's is skewed very negative — then we could actually be making the situation worse. In the BOTEC spreadsheet you linked, you seem to resolve this by requiring people to state the specific endpoints of the welfare ranges relative to the neutral point. If that's the main solution, it seems very important to be clear about where the neutral point is for different animals, and that seems really hard — I'm curious if you have thoughts on how to approach that. (Maybe you assume that welfare ranges are generally close to symmetric, or asymmetric in similar ways? If so, I would like to understand why you think that.) It's also very possible that I misunderstood something; I was reading things fast and haven't read all the linked posts and documents.
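The arithmetic in the M/N example can be checked with a short sketch. The range endpoints below and the helper `welfare_at` are illustrative assumptions, not RP estimates; the only thing taken from the comment is that M's range is twice as wide as N's and that both animals start at the bottom of their ranges.

```python
def welfare_at(percentile, low, high):
    """Welfare level at a given percentile (0-100) of the range [low, high]."""
    return low + (high - low) * percentile / 100.0

# Hypothetical symmetric ranges: M's range is twice as wide as N's.
M_LOW, M_HIGH = -1.0, 1.0
N_LOW, N_HIGH = -0.5, 0.5

# Replacing one M at its 0th percentile with one N at its 0th percentile.
gain_symmetric = welfare_at(0, N_LOW, N_HIGH) - welfare_at(0, M_LOW, M_HIGH)
gain_as_share_of_m = gain_symmetric / (M_HIGH - M_LOW)  # share of M's full range

# Hypothetical asymmetric ranges with the same widths:
# M skewed positive, N skewed negative.
M2_LOW, M2_HIGH = -0.2, 1.8
N2_LOW, N2_HIGH = -0.9, 0.1
gain_asymmetric = welfare_at(0, N2_LOW, N2_HIGH) - welfare_at(0, M2_LOW, M2_HIGH)
```

Under the symmetric assumption the replacement is worth a quarter of M's range, matching the "0th to 25th percentile" description; with the made-up skewed endpoints the same replacement has a negative sign.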

To make sure that I understand (the broad strokes of the rest of the framework) correctly; suppose I want to use this framework and these welfare range estimates to help me decide between two (completely hypothetical, unrealistic) options — assuming that every animal's welfare range is symmetric around the neutral point: (A) getting someone to buy the equivalent of a cage-free chicken instead of a caged chicken vs (B) getting someone to buy a farmed salmon instead of a farmed carp. Is it right that I'd now need to incorporate (estimates for) the following additional information? 

  1. To understand the welfare impact on the animals in question
    1. Lifespans of the animals[1]
    2. Where exactly on their respective welfare ranges they are, on average (in the situations I'm considering)[2]
  2. The other stuff
    1. Indirect effects
      1. E.g. how (many) other animals are affected by the farming processes — feed (insects/fish), how many die in the farming process, etc.
    2. Costs of the interventions

(In particular, I worry a bit that people might not be tracking 1a and 1b — you seem to worry about this, too, given the sections on things like "so you're saying that one person =~ three chickens?" — and I'd like to make sure that I actually understand correctly (and that others do, too).)

  1. ^

    Broiler chickens live for 5-7 weeks, apparently. Farmed carp apparently live for around a year, and farmed salmon live for around 1-3 years. (These numbers are from quick Google searches; definitely don't trust them.)

  2. ^

    A highly technical diagram is below. Note that the diagram represents the ranges as if they're all symmetric — as if each animal can experience as much bad as good —  whereas that isn't necessarily true. The welfare impact of choice (A) and (B) is the highlighted interval (assuming completely made-up numbers), multiplying by the lifespans of the animals, and adjusting for indirect effects.

    Given the lifespans of the animals in question, switching to salmon seems harmful ((even) without accounting for indirect effects or costs). 

Bob Fischer @ 2023-01-24T22:09 (+11)

Fantastic questions, Lizka! And these images are great. I need to get much better at (literally) illustrating my thinking. I very much appreciate your taking the time!

Here are some replies:

Replacing an M with an N. This is a great observation. Of course, there may not be many real-life cases with the structure you’re describing. However, one possibility is in animal research. Many people think that you ought to use “simpler” animals over “more complex” animals for research purposes—e.g., you ought to experiment on fruit flies over pigs. Suppose that fruit flies have smaller welfare ranges than pigs and that both have symmetrical welfare ranges. Then, if you’re going to do awful things to one or the other, such that each would be at the bottom of their respective welfare range, then it would follow that it’s better to experiment on fruit flies. 

Assessing the neutral point. You’re right that this is important. It’s also really hard. However, we’re trying to tackle this problem now. Our strategy is multi-pronged, identifying various lines of evidence that might be relevant. For instance, we’re looking at the Welfare Footprint Data and trying to figure out what it might imply about whether layer hens have net negative lives. We’re looking at when vets recommend euthanasia for dogs and cats and applying those standards to farmed animals. We’re looking at tradeoff thought experiments and some of the survey data they’ve generated. And so on. Early days, but we hope to have something on the Forum about this over the summer.

Symmetry vs. asymmetry. This is another hard problem. In brief, though, we take symmetry to be the default simply because of our uncertainty. Ultimately, it’s a really hard empirical question that requires time we didn’t have. (Anyone want to fund more work on this!?) As we say in the post, though, it’s a relatively minor issue compared to lots of others. Some people probably think that we’re orders of magnitude off in our estimates, whereas symmetry vs. asymmetry will make, at most, a 2x difference to the amount of welfare at stake. That isn’t nothing, but it probably won’t swing the analysis.
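One way to read the "at most 2x" figure, as a sketch under assumed notation (none of these symbols appear in the post): let $R$ be an animal's total welfare range and $s \in [0, 1]$ the share of that range lying below the neutral point, so that symmetry corresponds to $s = 1/2$.

```latex
\[
\text{negative range}(s) = sR,
\qquad
\frac{\text{negative range}(s)}{\text{negative range}(1/2)}
  = \frac{sR}{R/2} = 2s \le 2 .
\]
```

The same algebra applied to the positive side gives $2(1 - s) \le 2$, so on this reading neither side of the range can exceed twice what the symmetric default assigns to it.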

The "caged vs. cage-free chicken / carp vs. salmon" examples. This is a great question. We’ve done a lot on this, though none of it’s publicly available yet. Basically, though, you’re correct about the information you’d want. Of course, as your note indicates, we don’t care about natural lifespan; we care about time to slaughter. And while it’s very difficult to know where an animal is in its welfare range, we don’t think it’s in principle inestimable. Basically, if you think that caged hens are living about the worst life a chicken can live, you say that they’re at the bottom end of their welfare range. And if you think cage-free hens have net negative lives, but they’re only about half as badly off as they could be, then you can infer that you’re getting a 50% gain relative to chickens’ negative welfare range in the switch from caged to cage-free. And so on. This is all imperfect, but at least it provides a coherent methodology for making these assessments. Moreover, it's a methodology that forces us to be explicit about disagreements re: the neutral point and the relative welfare levels of animals in different systems, which I regard as a good thing.
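The caged vs. cage-free reasoning can be written out as a small calculation. This is a minimal sketch, not RP's unpublished model: the function `welfare_change`, its parameter names, and every number below are placeholders chosen only to show how the pieces (welfare range, position within the negative part of the range, time to slaughter) combine.

```python
def welfare_change(welfare_range, frac_before, frac_after,
                   years_to_slaughter, negative_share=0.5):
    """Welfare gained, in units of (reference-species welfare range x years).

    frac_before / frac_after: how far down the negative part of its range the
    animal sits (1.0 = worst possible life, 0.0 = neutral point).
    negative_share: share of the total range below neutral (0.5 = symmetric).
    """
    negative_range = welfare_range * negative_share
    return (frac_before - frac_after) * negative_range * years_to_slaughter

# Placeholder inputs: caged hens at the bottom of the range, cage-free hens
# half as badly off, a made-up welfare range of 0.3 and 1.5 years to slaughter.
gain = welfare_change(welfare_range=0.3, frac_before=1.0, frac_after=0.5,
                      years_to_slaughter=1.5)
```

Making `negative_share` and the two position fractions explicit arguments keeps disagreements about the neutral point and about relative welfare levels visible as separate inputs.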

MichaelStJules @ 2023-03-15T17:04 (+19)

“I can’t believe that bees beat salmon!”

We also find it implausible that bees have larger welfare ranges than salmon. But (a) we’re also worried about pro-vertebrate bias; (b) bees are really impressive; (c) there's a great deal of overlap in the plausible welfare ranges for these two types of animals, so we aren't claiming that their welfare ranges are significantly different; and (d) we don’t know how to adjust the scores in a non-arbitrary way. So, we’ve let the result stand. (We’d make similar points in response to: “I can’t believe that octopuses beat carp!”)

 

(I could believe octopuses beat carps, because octopuses seem unusually cognitively sophisticated among animals.)

I'd guess the main explanation for this (at least sentience-adjusted, if that's what's meant here), which may have biased your results against salmons and carps, is that you used the prior probability for crab sentience (43% mean, 31% median from table 3 in the doc) as the prior probability for salmon and carp sentience, and your posterior probabilities of sentience are generally very similar to the priors (compare tables 3 and 4 in the doc). Honeybees, fruit flies, crabs, crayfish, salmons and carps all ended up with similar sentience probabilities, but I'd assign higher probabilities to salmons and carps than to the others. You estimated octopuses to be about 2x as likely to be sentient as salmons and carps, according to both your priors and posteriors, with means and medians roughly between 73% and 78% for octopuses. On the other hand, your sentience-conditioned welfare ranges didn't differ too much between the fish, octopuses and bees. It's worth pointing out that Luke Muehlhauser had significantly higher probabilities for rainbow trouts (70%, in the salmonid family like salmons) than Gazami crabs (20%) and fruit flies (25%), and you could use his prior for rainbow trouts for salmons and carps instead (or something in between). That being said, his probabilities were generated in different ways from yours, so that might introduce other biases. You could instead use your prior for octopuses (or something in between). Or, most consistent with your methodology, would be to have the authors of the original estimates for RP just estimate these probabilities directly, with or without the data you gathered for salmons and carps. Any of these would be relatively small fixes.

As an aside, should we interpret this sentience probability work as not primarily refining your old estimates (since the posteriors and priors are very similar), but as adding other species and further modelling your uncertainty?

 

There may be some other smaller potential sources of bias that contributed here, but I don't expect them to have been that important:

  1. I'm guessing salmon and carp (and apparently zebrafish, which seem to often have been used when direct evidence wasn't available, maybe more for carp) are less well-studied than bees, so your conservative assumptions of assigning 0 to "unknown" for both probabilities of sentience and welfare ranges conditional on sentience may count more against them. For example, there were some studies found for "cognitive sophistication" for honeybees but not for salmons or carps, and more found for "mood state behaviors" for honeybees than salmons and carps in your new Sentience Table. For your Welfare Range Table, bees had fewer "unknowns" for cognitive proxies than salmons and carps, but more for hedonic proxies and representational proxies.
    1. One possible quick-ish fix would be to use a prior for the presence/absence of proxies across animals based on the ones for which there are studies (possibly just those you collected evidence for), although this may worsen other biases, like publication bias, and it requires you to decide how to weigh different animal species (but uniformly across those you collected evidence for is one way, although somewhat arbitrary).
    2. Another quick-ish fix could be to make more assumptions between species you gathered evidence for, e.g. if a fruit fly has some capacity, I'd expect fish to, as well, and if some mammal is missing some capacity, I'd expect salmon and carp to not have it either. This may be too strong, but you did use the crab sentience prior for the fish.
    3. Longer fixes could use more sophisticated missing data methods.
  2. You may have underestimated salmon and carp neuron counts by around 100x.
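To illustrate how scoring "unknown" as 0 can penalize less-studied species, and what the first quick-ish fix might look like, here is a minimal sketch. The proxy names, the findings, and the pooled-prior rule are all hypothetical; this is not RP's scoring method, only one simple way to impute unknowns from the proxies that do have studies.

```python
# Hypothetical findings: True = present, False = absent, None = no studies.
less_studied = {"proxy_a": True, "proxy_b": None, "proxy_c": None, "proxy_d": False}
well_studied = {"proxy_a": True, "proxy_b": True, "proxy_c": True, "proxy_d": False}

def score(findings, unknown_value=0.0):
    """Sum of proxy scores, counting each unknown as `unknown_value`."""
    return sum(unknown_value if v is None else float(v) for v in findings.values())

def prior_from_known(all_findings):
    """Share of studied proxies found present, pooled across species."""
    known = [v for f in all_findings for v in f.values() if v is not None]
    return sum(known) / len(known)

prior = prior_from_known([less_studied, well_studied])       # 4 of 6 known are present
conservative = score(less_studied)                           # unknown counted as 0
imputed = score(less_studied, unknown_value=prior)           # unknown counted as prior
```

Pooling over individual findings rather than over species is one of several somewhat arbitrary weighting choices, and it inherits whatever publication bias the known findings carry.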

MichaelStJules @ 2023-03-15T19:33 (+6)

Also, among the proxies you've used, I'd be inclined to give almost all of my weight to a handful of hedonic proxies, namely panic-like behavior, hyperalgesia, PTSD-like behavior, prioritizes pain response in relevant context and motivational trade-off (a cognitive proxy) as indicating the extremes of welfare conditional on sentience, and roughly in that order by weight. The first three all came up "unknown" due to no studies for bees, but there were a few studies suggesting their presence (and none negative) for the fish. Giving almost all of your weight to these proxies would favor the fish over bees. That being said, I wouldn't be that surprised to find out that bees display those behaviors, too, because I also think bees are very impressive and behaviorally complex.

I might use joy-like behavior and play behavior for the other end of the welfare range, but I expect them to be overshadowed by the intense suffering indicators above, and I don't expect them to differ too much across the species. There was evidence of play behavior in all three, but only evidence for joy-like behavior in carps.

The next proxies that could make much difference that I think could matter on some models (although I don't assign them much weight) would be neuron counts and the number of just-noticeable differences, and neuron counts would also favor the fish.

Bob Fischer @ 2023-03-17T09:35 (+6)

Thanks for all this, Michael. Lots to say here, but I think the key point is that we don't place much weight on these particular numbers and, as you well know and have capably demonstrated, we could get different numbers (and ordinal rankings) with various small changes to the methodology. The main point to keep in mind (which I say not for your sake, but for others, as I know you realize this) is that we'd probably get even smaller differences between welfare ranges with many of those changes. One of the main reasons we get large differences between humans and many invertebrates is because of the sheer number of proxies and the focus on cognitive proxies. There's an argument to be given for that move, but it doesn't matter here. The point is just that if we were to focus on the hedonic proxies you mention, there would be smaller differences--and it would be more plausible that those would be narrowed further by further research.

If I had more time, I would love to build even more models to aggregate various sets of proxies. But only so many hours in the day!  

Ariel Simnegar @ 2023-01-24T15:47 (+17)

Hi Bob and RP team,

I've been working on a comparative analysis of the knock-on effects of bivalve aquaculture versus crop cultivation, to try to provide a more definitive answer to how eating oysters/mussels compares morally to eating plants. I was hoping I could describe how I'd currently apply the RP team's welfare range estimates, and would welcome your feedback and/or suggestions. Our dialogue could prove useful for others seeking to incorporate these estimates into their own projects.

For bivalve aquaculture, the knock-on moral patients include (but are not limited to) zooplankton, crustaceans, and fish. Crop cultivation affects some small mammals, birds, and amphibians, though its effect on insect suffering is likely to dominate.

RP's invertebrate sentience estimates give a <1% probability of zooplankton or plant sentience, so we can ignore them for simplicity (with apologies to Brian Tomasik). The sea hare is the organism most similar to the bivalve for which sentience estimates are given, and it is estimated that a sea hare is less likely to be sentient than an individual insect. Although the sign of crop cultivation's impact on insect suffering is unclear, the magnitude seems likely to dominate the effect of bivalve aquaculture on the bivalves themselves, so we can ignore them too for simplicity.

The next steps might be:

  1. Calculate welfare ranges:
    1. For bivalve aquaculture, use carp, salmon, crayfish, shrimp, and crabs to calculate a welfare range for the effect of bivalve aquaculture on marine populations.
    2. Use chickens as a model species to calculate a welfare range for the effect of crop cultivation on vertebrate populations.
    3. For the effect of crop cultivation on insect suffering, I might just toss this problem on to future researchers. I'm only doing this as a side project, and given the sheer complexity of the considerations at play, I'm worried I might publish something which inadvertently increases insect suffering instead of decreasing it.
  2. For several moral views (negative utilitarianism, symmetric utilitarianism) and several perspectives of the value of a typical wild animal's life (net negative, net neutral, net positive), extract relevant conclusions. (e.g. if bivalve aquaculture is robustly shown to increase marine populations, given Brian's arguments that crop cultivation likely reduces vertebrate populations, a negative utilitarian who views wild animal lives as net negative may want to oppose bivalve consumption.)

(Of course, I'd have to mention longtermist considerations. The effect of norms surrounding animal consumption on moral circle expansion could be crucial. So could the effect of these consumption practices on climate change or on food security.)

Bob Fischer @ 2023-01-26T11:10 (+7)

Thanks for your comment, Ariel, and sorry for the slow reply! What you've described sounds great as far as it goes. However, my basic view here--which I offer with sincere appreciation for the project you're describing and a genuine desire to see it completed--is that the uncertainties are so far-reaching that, while we can get clearer about the conditions under which, say, a negative utilitarian will condemn bivalve consumption, we basically have no idea which condition we're in. So, I think that the most valuable thing right now would be to write up specific empirical research questions and value-aligned ways of operationalizing the key concepts. Then, we should be hunting for graduate students and early-career researchers who might be willing to do the empirical work in exchange for relatively small amounts of funding. (Many academics are cheap dates.) From my perspective, EA has gone just about as far as it can already on these kinds of questions without more substantive collaborations with entomologists, aquatic biologists, ecologists, and so on.

All that said, I'll stress that I completely agree with you about the importance of getting answers here! I just think we're at the point where we can't make much more progress toward them from the armchair.

MHR @ 2023-09-27T12:31 (+13)

Question about uncertainty modeling (tagging @Laura Duffy here since she might be the best person to answer it): 

How do you think about the different models of welfare capacity that were averaged together to make the mixture model? Is your assumption that one of these models is really the true correct model in all species (and you don't yet know which one it is), or that the different constituent models might each be more or less true for describing the welfare capacity for each individual species? 

My context for asking this is in thinking about quantifying the uncertainty for a function that depends on the welfare ranges of two different species (e.g. y = f(welfare range of shrimp, welfare range of pigs)). It's tempting to just treat the welfare ranges of shrimp and pigs as independent variables and to then sample each of them from their respective mixture model distribution. But if we think there's one true model and the mixture model is just reflecting uncertainty as to what that is, the welfare ranges of shrimp and pigs should be treated as correlated variables. One might then obtain an estimate of the uncertainty in y by generating samples as follows:

  1. Randomly pick one of the 9 models in the mixture model as the true model 
  2. Sample the welfare range of both shrimp and pigs from their distributions for the selected constituent model
  3. Compute y = f(welfare range of shrimp, welfare range of pigs)
  4. Repeat steps 1-3 until the desired # of samples is obtained 
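The four steps above can be sketched as follows. This is an illustration under stated assumptions: the two constituent models, their uniform distributions, and the choice of `f` are made up (the actual mixture has 9 models with their own distributions), and `sample_independent` is included only to show the alternative being contrasted.

```python
import random

# Hypothetical constituent models: each maps a species to a sampler for its
# welfare range under that model. The distributions are made up.
MODELS = [
    {"shrimp": lambda rng: rng.uniform(0.0, 0.1), "pigs": lambda rng: rng.uniform(0.4, 0.6)},
    {"shrimp": lambda rng: rng.uniform(0.3, 0.5), "pigs": lambda rng: rng.uniform(0.8, 1.0)},
]

def f(shrimp_wr, pig_wr):
    """Placeholder for whatever quantity depends on both welfare ranges."""
    return pig_wr - shrimp_wr

def sample_correlated(models, n_samples, rng):
    samples = []
    for _ in range(n_samples):                 # step 4: repeat
        model = rng.choice(models)             # step 1: pick one model as "true"
        shrimp_wr = model["shrimp"](rng)       # step 2: both species, same model
        pig_wr = model["pigs"](rng)
        samples.append(f(shrimp_wr, pig_wr))   # step 3: compute y
    return samples

def sample_independent(models, n_samples, rng):
    """For comparison: each species draws its own model independently."""
    return [
        f(rng.choice(models)["shrimp"](rng), rng.choice(models)["pigs"](rng))
        for _ in range(n_samples)
    ]

rng = random.Random(0)
correlated = sample_correlated(MODELS, 10_000, rng)
independent = sample_independent(MODELS, 10_000, rng)
```

With these made-up inputs, the correlated samples stay within the values that some single model allows, while independent sampling also produces cross-model combinations such as a shrimp value from one model paired with a pig value from the other.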

I could also imagine computing the covariance of the different species' welfare ranges and directly generating samples as correlated random variables. 

Bob Fischer @ 2023-09-27T21:00 (+6)

Thanks a bunch for your question, Matt. I can speak to the philosophical side of this; Laura has some practical comments below. I do think you're right that---and in fact our team discussed the possibility that---we ought to be treating the welfare range estimates as correlated variables. However, we weren't totally sure that that's the best way forward, as it may treat the models with more deference than makes sense.

Here's the rough thought. We need to distinguish between (a) philosophical theories about the relationship between the proxies and welfare ranges and (b) models that attempt to express the relationship between proxies and welfare range estimates. We assume that there's some correct theory about the relationship between the proxies and welfare ranges, but while there might be a best model for expressing the relationship between proxies and welfare range estimates, we definitely don't assume that we've found it. In part, this is because of ordinary points about uncertainty. Additionally, it's because the philosophical theories underdetermine the models: lots of models are compatible with any given philosophical theory; so, we just had to choose representative possibilities. (The 1-point-per-proxy and aggregation-by-addition approaches, for instance, are basically justified by appeal to simplicity and ignorance. But, of course, the philosophical theory behind them is compatible with many other scoring and aggregation methods.) So, there's a worry that if we set things up the way you're describing, we're treating the models as though they were the philosophical theories, whereas it might make more sense not to do that and then make other adjustments for practical purposes in specific decision contexts if we're worried about this.

Laura's practical notes on this:

  1. A change like the one you're suggesting would likely decrease the variance in the estimates of f(), since if you assume the welfare ranges are independent variables, you'd get samples where the undiluted experiences model is dominating the welfare range for, say, shrimp, and the neuron count model is dominating the welfare range for pigs. I suggest a quick practical way of dealing with this would be to cut off values of f() below the 2.5th percentile and above the 97.5th percentile.
  2. Or, even better, I suggest sorting the welfare ranges from least to greatest, then using pairs of the ith-indexed welfare ranges for the ith estimate of f(). Since each welfare model is given the same weight, I predict this'll most accurately match up welfare range values from the same welfare model. (e.g. the first 11% will be neuron count welfare ranges, etc.)
  3. Ultimately, however, given all the uncertainty in whether our models are accurately tracking reality, it might not be advisable to reduce the variance as such.

MHR @ 2023-10-04T00:05 (+16)

Thanks, this is great information! The concern you raised regarding distinguishing between philosophical theories and models makes a lot of sense. With that said, I don't currently feel super satisfied with the practical steps you suggested. 

On the first note, the impact of the correlation depends on the structure of f(). Suppose I'm trying to estimate the total harms of eating chicken/pork, so f() is something like a sum of a chicken welfare range term and a pig welfare range term. In this case, treating the welfare ranges of chickens and pigs as correlated will increase the variance of f(). On the flip side, if we're trying to estimate the welfare impact of switching from eating chicken to eating pork, f() is something like a difference between those two terms. In that case, treating the welfare ranges of pigs and chickens as correlated will decrease the variance of f(). Trying to address this in an ad-hoc manner seems pretty challenging.
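The point about sums and differences follows from the identity Var(X ± Y) = Var(X) + Var(Y) ± 2 Cov(X, Y). The sketch below illustrates it with toy numbers only. The shared factor, the noise scales, and the names `simulate`, `chicken`, and `pig` are all hypothetical and are not taken from the Rethink Priorities code or estimates.

```python
import random
import statistics


def simulate(correlated: bool, n: int = 20000, seed: int = 0):
    """Return (variance of sum, variance of difference) for two toy quantities.

    The draws are Gaussian toy values, not actual welfare ranges.
    """
    rng = random.Random(seed)
    sums, diffs = [], []
    for _ in range(n):
        shared = rng.gauss(0.0, 1.0)  # hypothetical factor common to both species
        chicken = shared + rng.gauss(0.0, 0.5)
        # Correlated case: the pig value reuses the shared factor.
        # Independent case: the pig value gets a fresh factor of its own.
        pig_factor = shared if correlated else rng.gauss(0.0, 1.0)
        pig = pig_factor + rng.gauss(0.0, 0.5)
        sums.append(chicken + pig)
        diffs.append(chicken - pig)
    return statistics.variance(sums), statistics.variance(diffs)


var_sum_ind, var_diff_ind = simulate(correlated=False)
var_sum_cor, var_diff_cor = simulate(correlated=True)

print("sum:        independent", round(var_sum_ind, 2), "correlated", round(var_sum_cor, 2))
print("difference: independent", round(var_diff_ind, 2), "correlated", round(var_diff_cor, 2))
```

With positive correlation, the variance of the sum goes up and the variance of the difference goes down, which is why a single trimming rule cannot stand in for the correlation in both kinds of analysis.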

On the second note, I think that's basically treating the welfare capacities of e.g. pigs and chickens as perfectly correlated with one another. That seems extreme to me, since I think a substantial portion of the uncertainty in the welfare ranges is coming from uncertainty as to which traits each species has, not which philosophical theory of welfare is correct. 

I come away still thinking that the procedure I suggested seems like the most workable of the approaches mentioned so far. To put a little more rigor to things, here are some examples of plotting the welfare range estimates of chickens and pigs against one another with the different methods (uncorrelated sampling from the respective mixture distributions, sampling from the ordered distributions, and pair-wise sampling from the constituent models). In addition, there are some plots showing the impact of the different sampling methods on some toy analyses of the welfare impact of eating chicken/pork and the impact of switching from eating chicken to eating pork (note that the actual numbers are not intended to be very representative). You can see that the trimming approach only makes sense in the second case, and that the paired sampling from constituent models approach produces distributions in between those for the uncorrelated case and those for the ordered case. 
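For readers without the plots, here is a minimal sketch of the three sampling methods named above. It is not the code behind the plots or the Rethink Priorities model: the three constituent "models" and their uniform bounds in `MODELS` are invented purely for illustration, and the function names are hypothetical.

```python
import random

# Hypothetical constituent models: (low, high) bounds of a uniform
# distribution for each species under each model. All numbers are invented.
MODELS = [
    {"chicken": (0.0, 0.05), "pig": (0.0, 0.1)},
    {"chicken": (0.1, 0.5), "pig": (0.2, 0.7)},
    {"chicken": (0.5, 1.0), "pig": (0.6, 1.2)},
]


def draw(rng, species, model):
    low, high = model[species]
    return rng.uniform(low, high)


def independent(rng, n):
    """Each species picks its own model at random (uncorrelated mixture sampling)."""
    chicken = [draw(rng, "chicken", rng.choice(MODELS)) for _ in range(n)]
    pig = [draw(rng, "pig", rng.choice(MODELS)) for _ in range(n)]
    return chicken, pig


def ordered(rng, n):
    """Sort each species' samples and pair them by rank."""
    chicken, pig = independent(rng, n)
    return sorted(chicken), sorted(pig)


def paired_by_model(rng, n):
    """Pick one model per sample, then draw both species from that same model."""
    chicken, pig = [], []
    for _ in range(n):
        model = rng.choice(MODELS)
        chicken.append(draw(rng, "chicken", model))
        pig.append(draw(rng, "pig", model))
    return chicken, pig


def pearson(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)
n = 5000
r_independent = pearson(*independent(rng, n))
r_ordered = pearson(*ordered(rng, n))
r_paired = pearson(*paired_by_model(rng, n))

print("independent:", round(r_independent, 3))
print("ordered:    ", round(r_ordered, 3))
print("paired:     ", round(r_paired, 3))
```

In this toy setup the paired-by-model correlation lands between the independent case (near zero) and the ordered case (near one). How strong it is depends on how much of the spread comes from differences between models rather than from spread within each model.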

 

Note that when using the pair-wise sampling from constituent models approach, pigs and chickens are more strongly correlated with one another than many other pairs of species are. Here is what the correlation between chickens and shrimp looks like, for example: 

Laura Duffy @ 2023-10-04T00:30 (+7)

Hey, thanks for this detailed reply! 
When I said "practical", I more meant "simple things that people can do without needing to download and work directly with the code for the welfare ranges." In this sense, I don't entirely agree that your solution is the most workable of them (assuming independence probably would be). But I agree--pairwise sampling is the best method if you have the access and ability to manipulate the code! (I also think that the perfect correlation you graphed makes the second suggestion probably worse than just assuming perfect independence, so thanks!)

MHR @ 2023-10-04T00:47 (+4)

Yeah that makes complete sense, it was a pain to get the pairwise sampling working.

Tobias Häberli @ 2023-01-24T10:18 (+9)

Love this type of research, thank you very much for doing it!

I'm confused about the following statement:

While carp and salmon have lower scores than pigs and chickens, we suspect that’s largely due to a lack of research.

Is this a species-specific suspicion? Or does a lower amount of (high-quality) research on a species generally reduce your welfare range estimate? 
On average I'd have expected the welfare range estimate to stay the same with increasing evidence, but the level of certainty about the estimate to increase. 

If you have reason to believe that the existing research is systematically biased in a way that would lead to higher welfare range estimates with more research, do you account for this bias in your estimates?

Bob Fischer @ 2023-01-24T11:04 (+6)

Great question, Tobias. Yes, less research on a species generally reduces our welfare range estimate. I agree with you that it would be better, in some sense, to have our confidence increase in a fixed estimate rather than having the estimates themselves vary. However, we couldn't see how to do that without invoking either our priors (which we don't trust) or some other arbitrary starting point (e.g., neuron counts, which we don't trust either). In any case, that's why we frame the estimates as placeholders and give our overall judgments separately: vertebrates at 0.1 or better, the vertebrates themselves within 2x of one another, and the invertebrates within 2 OOMs of the vertebrates.

MHR @ 2023-01-23T15:28 (+9)

This is really valuable work, and I look forward to seeing the discussion that it generates and to digging into it more closely myself. I did have one immediate question about the neuron count model specifically, though I recognize that it's a small contributor to the overall weights. I'd be curious to understand how you arrived at 13 million neurons as your estimate for salmon. The reference in the spreadsheet is: 

The teleost brain is capable of adult neurogenesis, with neural proliferation zones in dozens of locations within the brain (e.g. Zupanc et al. 2005, Zupanc 2009). This makes a definitive count of total neurons within the brain difficult, since the number of neurons may be continuously in flux. For example, Zupanc (2009) summarizes: “the continuous production of new cells, together with the longterm persistence of a large portion of them, leads to a permanent growth of the brain and its individual structures... This growth by a net increase in the total number of brain cells is characteristic of at least some, but likely most, of the estimated 30,000 species of teleost fish.” Therefore, reports of total neuron counts for salmon and carp are rare, but Hinsch & Zupanc (2007) report that “By labeling S-phase cells with the thymidine analog 5-bromo-2-deoxyuridine (BrdU), quantitative analysis demonstrated that, on average, 6000 new cells were generated in the entire adult brain within any 30 min period. This corresponds to roughly 0.06% of the total number of brain cells” in an adult zebrafish (Danio rerio, a model cyprinid) brain. As part of their study, Hinsch & Zupanc (2007) report that, for adult zebrafish, the total number of brain cells varied between 0.8 x 10^7 and 1.3 x 10^7 (mean: 1.0 x 10^7 ± S.E.M. 8 x 10^5). They also report that “approximately 46% of the cells present at 10 days persisted in the adult zebrafish brain” meaning that “at least half of the cells generated in the adult zebrafish brain develop into neurons and are likely to persist for the rest of the fish’s life.” This pattern is reflected in other species of teleosts, for example in adult gymnotiform fish (Apteronotus leptorhynchus) who generate 100 000 new brain cells (corresponding to approximately 0.2% of the total population of cells in the brain) within a period of 2 hours (Zupanc & Horschke 1995). 
Thus the teleost brain is constantly growing and likely increasing in terms of total number of neurons, and counts are only representative of snapshots through time.
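As a quick consistency check on the quoted figures, the short sketch below backs out the total cell count implied by "6000 new cells" being "roughly 0.06%" of the total. The variable names are hypothetical. Note that these are total brain cells in adult zebrafish, not neurons and not salmon, so the sketch only checks the arithmetic of the quoted passage.

```python
# Figures quoted above from Hinsch & Zupanc (2007), adult zebrafish.
new_cells_per_30_min = 6_000
share_of_total = 0.06 / 100  # "roughly 0.06%" expressed as a fraction

# Total brain cells implied by those two figures.
implied_total_cells = new_cells_per_30_min / share_of_total

# Range of total brain cells reported in the same passage.
reported_low, reported_high = 0.8e7, 1.3e7
within_reported_range = reported_low <= implied_total_cells <= reported_high

print(f"implied total: {implied_total_cells:.3g} cells")
print("within reported range:", within_reported_range)
```

The implied total is about 10 million cells, which matches the reported mean, and the 13 million figure corresponds to the upper end of the reported zebrafish range.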

I don't easily see how that translates to 13 million neurons. When I previously looked at this issue myself, I came away thinking it was possible that salmon had substantially more neurons than you're estimating. 

Bob Fischer @ 2023-01-23T15:45 (+6)

Thanks, MHR. Quick reply to say: Good question, but I don't know the answer offhand, as I didn't come up with that number myself. Many different people helped with the literature reviews. I'll get in touch with the relevant person and get back to you.

Bob Fischer @ 2023-01-27T16:13 (+8)

Sorry for the delay, MHR! It took a bit to get to the bottom of this. In any case, the short version is that the 8-13M neuron count for both salmon and carp should be read as the lowest reasonable estimate, not our best guess. We got the number from the zebrafish literature--specifically, a study by Hinsch & Zupanc (2007) (cited in the table) who reported that the total number of brain cells for adult zebrafish varied between 8 and 13 million. In the notes associated with the Welfare Range Table, we had a caveat that neuron counts are very hard to come by in fish and, in any case, only represent a snapshot in time, because the teleost brain is constantly growing. Moreover, no one has done total neuron count estimates for salmon or carp, whereas zebrafish are often used as a model species and are well-studied; so, we simply used those values as a placeholder. Granted, then, the 8-13M number may well be an underestimate due to the size differences between zebrafish and salmon, and we do see the appeal of using Invincible Wellbeing's curve fits to come up with a higher number. However, we tried to stick as close to the empirical literature as possible. And truth be told, because neuron counts are just one of several models we include, using a higher number wouldn't make a major difference to our welfare range estimates for salmon or carp.

The upshot is that this is one of many cases where our methodology is more conservative than many EAs have been when doing related projects (e.g., we were more inclined to default to "unknown," we used lower-bound placeholder values in some cases, etc.). Advantages and disadvantages!

MHR @ 2023-02-03T10:46 (+5)

Thanks Bob, that makes sense!

Just to see the magnitude of the change, I tried rerunning the model with a neuron count estimate of 100 million for salmon. That led to salmon's 50th-percentile estimate increasing by 0.001 and its 95th-percentile estimate increasing by 0.002. So you're right that it's not really a noticeable impact. 

Arturo Macias @ 2023-01-23T14:31 (+9)

Hello to all,

Have you contacted the Integrated Information Theory group about this project? In my (dualistic naturalist) viewpoint their work is the most advanced in the area of consciousness detection. 

https://www.amazon.com/Sizing-Up-Consciousness-Objective-Experience/dp/0198728441

Of course, consciousness is absolutely noumenal, and the best part of their work focuses on the case where self-reported conscious experience is possible [humans], but they have tried to extrapolate to mathematical models applicable to any material system.

RogerAckroyd @ 2023-01-24T09:02 (+7)

The last I read about Integrated Information Theory was Scott Aaronson's criticism of it. Have his arguments been addressed? I found them very compelling. 

Arturo Macias @ 2023-01-24T09:13 (+3)

Regarding the neurological part (the consciousness detector based on brain information) that is described in "Sizing Up Consciousness", I think they are mostly right. The IIT mathematical model is beyond my understanding, as is Aaronson's criticism. But given my naturalistic dualist view of consciousness, unfortunately only an axiomatic and extrapolative approach to consciousness measurement is possible. 

Bob Fischer @ 2023-01-23T15:14 (+5)

Good suggestion, Arturo. We haven't reached out, but it's certainly worth having a conversation.

NunoSempere @ 2023-02-19T23:14 (+8)

Hey, I thought I'd make a Bayesian adjustment to the results of this post. To do this, I am basically ignoring all nuance. But I thought that it might still be interesting. You can see it here: https://nunosempere.com/blog/2023/02/19/bayesian-adjustment-to-rethink-priorities-welfare-range-estimates/

MichaelStJules @ 2023-02-20T06:46 (+9)

May be worth also updating on https://forum.effectivealtruism.org/posts/WfeWN2X4k8w8nTeaS/theories-of-welfare-and-welfare-range-estimates. Basically, you can roughly decompose the comparison as (currently achievable) peak human flourishing to the worst (currently achievable) human suffering (torture), and then that to the worst (currently achievable) chicken suffering. You could also rewrite your prior to be over each ratio (as well as the overall ratio), and update the joint distribution.

NunoSempere @ 2023-02-20T14:42 (+2)

Seems like a good idea, but also a fair bit of work, so I'd rather wait until RP releases their value ratios over actually existing humans and animals, and update on those. But if you want to do that, my code is open source.

Bob Fischer @ 2023-02-20T21:51 (+28)

Thanks for all this, Nuno. The upshot of Jason's post on what's wrong with the "holistic" approach to moral weight assignments, my post about theories of welfare, and my post about the appropriate response to animal-friendly results is something like this: you should basically ignore your priors re: animals' welfare ranges as they're probably (a) not really about welfare ranges, (b) uncalibrated, and (c) objectionably biased. 

You can see the posts above for material that's relevant to (b) and (c), but as evidence for (a), notice that your discussion of your prior isn't about the possible intensities of chickens' valenced experiences, but about how much you care about those experiences. I'm not criticizing you personally for this; it happens all the time. In EA, the moral weight of X relative to Y is often understood as an all-things-considered assessment of the relative importance of X relative to Y. I don't think people hear "relative importance" as "how valuable X is relative to Y conditional on a particular theory of value," which is still more than we offered, but is in the right ballpark. Instead, they hear it as something like "how valuable X is relative to Y," "the strength of my moral reasons to prioritize X in real-world situations relative to Y," and "the strength of my concern for X relative to Y" all rolled into one. But if that's what your prior's about, then it isn't particularly relevant to your prior about welfare-ranges-conditional-on-hedonism specifically.

Finally, note that if you do accept that your priors are vulnerable to these kinds of problems, then you either have to abandon or defend them. Otherwise, you don't have any response to the person who uses the same strategy to explain why they assign very low value to other humans, even in the face of evidence that these humans matter just as much as they do.

NunoSempere @ 2023-02-21T18:50 (+2)

I agree with a), and mention this somewhat prominently in the post, so that kind of sours my reaction to the rest of your comment, as it feels like you are answering to something I didn't say:

The second shortcut I am taking is to interpret Rethink Priorities’s estimates as estimates of the relative value of humans and each species of animal—that is, to take their estimates as saying “a human is X times more valuable than a pig/chicken/shrimp/etc”. But RP explicitly notes that they are not that, they are just estimates of the range that welfare can take, from the worst experience to the best experience. You’d still have to adjust according to what proportion of that range is experienced, e.g., according to how much suffering a chicken in a factory farm experiences as a proportion of its maximum suffering.

and then later:

Note that I am in fact abusing RP’s estimates, because they are welfare ranges, not relative values. So it should pop out that they are wrong, because I didn’t go to the trouble of interpreting them correctly.

In any case, thanks for the references re: b) and c)

Re: b), it would in fact surprise me if my prior was uncalibrated. I'd also say that I am fairly familiar with forecasting distributions. My sense is that if you wanted to make the argument that my estimates are uncalibrated, you can, but I'd expect it'd be tricky.

Re: c), this is if you take a moral realist stance. If you take a moral relativist stance, or if I am just trying to describe what I do value, you have surprisingly little surface to object to.

Otherwise, you don't have any response to the person who uses the same strategy to explain why they assign very low value to other humans, even in the face of evidence that these humans matter just as much as they do.

Yes, that is part of the downside of the moral relativist position. On the other hand, if you take a moral realist position my strong impression is that you still can't convince e.g., a white supremacist, or an egoist, that all lives are equal, so you still share that downside. I realize that this is a longer argument though.

Anyways, I didn't want to leave your comment unanswered but I will choose to end this conversation here (though feel free to reply on your end).

Stan Pinsent @ 2023-01-24T15:19 (+6)

I skimmed the piece on axiological asymmetries that you linked and am quite puzzled that you seem to start with the assumption of symmetry and look for evidence against it. I would expect asymmetry to be the more intuitive, therefore default, position. As the piece says

At just the first-order level, people tend to assume that (the worst) pain is worse than (the best) pleasure is pleasurable. The agonizing ends for non-human animals in factory farms and in the wild seem far worse than the best sort of life they could realize would be good. [...]  it’s hard to find any organisms that risk the worst pains for the greatest pleasures and vice versa.

I would expect that a difference in magnitude between the best possible pleasure and the worst possible pain is the most obvious explanation, but the piece concludes that these judgments are "far more plausibly explained by various cognitive biases".

As far as I can tell this would suggest that either:

On a slightly separate note, I played around with the BOTEC to check the claim that assuming symmetry doesn't change the numbers much and I was convinced. The extreme suffering-focused assumption (where perfect health is merely neutral) resulted in double the welfare gain of the symmetric assumption (when the increase in welfare as a percentage of the animals' negative welfare range is held constant). 

My main question on this last point is: why use "percentage of the animals' negative welfare range" when "percentage of the animals' total welfare range"  seems more relevant and would not vary at all across different (a)symmetry assumptions?
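The doubling reported above can be reproduced with a back-of-the-envelope sketch. This is not the BOTEC from the post: it assumes that "symmetric" means the negative portion is half of the total welfare range, that the extreme suffering-focused case makes the negative portion the whole range, and it uses made-up values for `total_range` and `improvement`.

```python
def gain_from_negative_share(total_range, negative_share, pct_of_negative_range):
    """Welfare gain when the improvement is a fraction of the negative welfare range."""
    negative_range = total_range * negative_share
    return pct_of_negative_range * negative_range


def gain_from_total_share(total_range, pct_of_total_range):
    """Welfare gain when the improvement is a fraction of the total welfare range."""
    return pct_of_total_range * total_range


total_range = 1.0   # hypothetical total welfare range
improvement = 0.10  # hypothetical: intervention covers 10% of the relevant range

# Symmetric: half of the range lies below neutral.
symmetric = gain_from_negative_share(total_range, 0.5, improvement)
# Extreme suffering-focused: the whole range lies below neutral.
suffering_focused = gain_from_negative_share(total_range, 1.0, improvement)
ratio = suffering_focused / symmetric

# Expressed as a share of the total range, the gain ignores the split entirely.
gain_total = gain_from_total_share(total_range, improvement)

print("ratio of gains:", ratio)
print("gain as share of total range:", gain_total)
```

Under these assumptions the suffering-focused gain is twice the symmetric one, while the total-range version takes no (a)symmetry input at all, which is the contrast the question is pointing at.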

Travis Timmerman @ 2023-01-24T20:34 (+4)

Thanks for reading that Stan! Good question. I realize now that my report and the post together are a bit confusing because there are two types of symmetry at issue that seem to get blended together. I could have been clearer about this in the report. Sorry about that! 

First, the post mentions the concept of welfare ranges being *symmetrical around the neutral point*. Assuming this means assuming that the best realizable welfare state is exactly as good as the worst realizable welfare state. That is assumed for simplicity, though the subsequent part of the post is meant to show that that assumption matters less than one might think. 

Second, in my linked report, I focus on the concept of *axiological symmetries* which concern whether every fundamental good-making feature of a life has a corresponding fundamental bad-making feature. If we assume this and, for instance, believe that knowledge is a fundamental good-making feature, then we'd have to think that there is a corresponding fundamental bad-making feature (unjustified false belief, perhaps). 

These concepts are closely related, as the existence of axiological asymmetries may provide reason to think that welfare is not symmetrical around the neutral point and vice versa. Nevertheless, and this is the crucial point, it could work out that there is complete axiological symmetry, yet welfare ranges are still not symmetrical around the neutral point. This could be because some beings are constituted in such a way that, at any moment in time, they can realize a greater quantity of fundamental bad-making features than fundamental good-making features (or vice versa).

Axiological asymmetries seem prima facie ad hoc. Without some argument for specific axiological asymmetries and without working out their axiological implications, I do think axiological symmetry should be the default assumption. There's some nice discussion of this kind of issue in the Teresa Bruno-Niño paper cited in the report. In fact, it seems to me that both (what she calls) continuity and unity are theoretical virtues. 

https://www.pdcnet.org/msp/content/msp_2022_0999_11_25_29

Now, even granting what I just wrote about axiological symmetry, perhaps the default assumption should be that welfare is not symmetrical around the neutral point for the reasons you gave. That seems totally reasonable! I personally don't have strong views on this. Though, I do think there is a good evolutionary debunking argument to give for why animals (including humans) might be more motivated to avoid pain than accrue pleasure and why humans might be disposed to be risk-averse in the roulette wheel example. I'm genuinely not sure how much these considerations suggest that the default is that welfare is not symmetrical around the neutral point. 

Whether welfare is symmetrical around the neutral point is largely an empirical question, though. I wouldn't be surprised if we discover that welfare is not symmetrical around the neutral point. That's a very realistic possibility. Though still a viable possibility, I would be somewhat surprised if we discover any axiological asymmetries. 

Bob Fischer @ 2023-01-24T16:08 (+3)

Thanks for your questions, Stan. Travis wrote the piece on axiological asymmetries and he can best respond on that front. FWIW, I'll just say that I'm not convinced that there's a difference of an order of magnitude between the best pleasure and the worst pain--or any difference at all--insofar as we're focused on intensity per se. I'm inclined to think it's just really hard to say and so I take symmetry as the default position. For all that, I'm open to the possibility that pleasures and pains of the same intensity have different impacts on  welfare, perhaps because some sort of desire satisfaction theory of welfare is true, we're risk-averse creatures, and we more strongly dislike signs of low fitness than the alternative. Point is: there may be other ways of accommodating your intuition than giving up the symmetry assumption.

To your main question, we distinguish the negative and positive portions of the welfare range because we want to sharply distinguish cases where an intervention flips a life from net negative to net positive. Imagine a case where an animal has a symmetrical welfare range and an intervention moves the animal either 60% of their negative welfare range or 60% of their total welfare range. In the former case, they're still net negative; in the latter case, they're now net positive. If you're a totalist, that really matters: the "logic of the larder" argument doesn't go through even post-intervention in the former case, whereas it does go through in the latter. 
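The 60% example can be made concrete with a small sketch. It assumes a symmetric welfare range normalized to [-1, 1] and, as an added assumption not stated in the comment, that the animal starts at the very bottom of its range. The variable names are hypothetical.

```python
# Hypothetical symmetric welfare range, normalized so neutral is 0.
lower, upper = -1.0, 1.0
start = lower  # assumption: the animal starts at its worst possible state

negative_range = 0.0 - lower   # size of the portion below neutral
total_range = upper - lower    # size of the whole range

# Intervention moves the animal 60% of its negative range.
after_negative_shift = start + 0.6 * negative_range
# Intervention moves the animal 60% of its total range.
after_total_shift = start + 0.6 * total_range

print("60% of negative range:", round(after_negative_shift, 2), "-> still net negative")
print("60% of total range:   ", round(after_total_shift, 2), "-> now net positive")
```

The first shift leaves the animal below the neutral point and the second carries it above, so the two ways of expressing "60%" give different answers to whether the life has become net positive.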

MichaelStJules @ 2023-07-18T17:02 (+5)

If these estimates will be used as multipliers for a hedonistic/suffering scale based on WFP's pain intensity levels (as was done here recently), then the undiluted experience model might contradict the definition of disabling pain, and probably contradicts the definition of excruciating pain, because these can't be ignored and they take up most or ~all of an animal's attention, by definition. Furthermore, I think what you'd want to do instead anyway, if using WFP's pain scale, is just use an equality model and assess more carefully where an animal is on WFP's pain scale, taking into account potential distractions. Dilution wouldn't change the badness of a given level of suffering (affective component of physical and psychological pain, which is what I think WFP's scale is supposed to capture); it would reduce the level of suffering, and so move the experience towards the milder end of WFP's pain scale. I'm confident that excruciating pain in humans is never or rarely significantly diluted (just through distraction by things other than similarly intense pain), and I doubt that disabling pain is significantly diluted, too.

WFP also has a post on the role of attention here, and, related to this, they wrote (bold mine):

Additionally, the potential for positive welfare may be also overestimated if factors other than attention are not considered. For example, pain caused by traumatic injury or pathological processes may lead to immobility, restricted movement or impaired behavioral responsiveness to potentially pleasurable opportunities [30]. Similarly, sickness, weakness, nausea, dizziness and other debilitating affects may demotivate animals from engaging in physically active, gregarious and positive behaviors [30].

Finally, positive and negative affective states may interact in complex ways other than those considered. For instance, evidence indicates that in environments where animals can engage in motivated behaviors the perceived intensity of pain is reduced. In chickens, experiments conducted by Mike Gentle two decades ago [21,31] have shown that the higher the motivation to engage in a behavior (hence attention diverted to it), the higher the degree of endogenous analgesia mediated by opioids. The possibility to express positive behaviors may therefore inhibit pain that would otherwise be felt as Hurtful or Annoying (pain of higher intensity cannot, by definition, be eliminated with distraction).

 

I also worry about most of the qualitative/non-quantitative models basically double-counting animals' responses, if used as multipliers for WFP's pain scale. Some animals may just not be capable of experiencing excruciating pain at all, but that should just be captured in the probability that they are in fact experiencing excruciating (or disabling) pain under given conditions, not as a multiplier for the badness of excruciating pain, except possibly for reasons that really do stack on top of excruciating pain. Maybe the number of JNDs or conscious subsystems stack on top, which are reflected in the quantitative models, but few if any of the qualitative indicators seem like they should stack on top.

 

I would personally shift the probabilities assigned to the qualitative models to the equality model, when you want to use the welfare ranges estimates as multipliers for a WFP pain intensity scale.[1]

  1. ^

    But then, this also makes a uniform prior across the original subset of models look weird/suspicious.

    Each of the eight models was assigned an equal probability of being correct.

    Should we instead use a uniform prior over the new subset of models for multiplying WFP scales? Your credences in the models shouldn't be sensitive to something like this.

MichaelStJules @ 2023-03-15T19:56 (+5)

Do the estimates for black soldier flies primarily reflect adults? If we wanted to use an estimate for BSF larvae or mealworms, should we use the BSF estimates, the silkworm estimates (which presumably reflect the larvae, or else you'd call them silkmoths?), something in-between (an average?) or something else?

Bob Fischer @ 2023-03-17T09:23 (+5)

Great question, Michael. It's probably fine to use the silkworm estimates for this purpose.

Anthony DiGiovanni @ 2024-12-04T09:55 (+4)

So, given our methodological commitment to letting the empirical evidence drive the results, we decided not to include this hypothesis in our calculations

I'm not sure I understand this reasoning. If our interpretation of the empirical evidence depends on whether we accept different philosophical hypotheses, it seems like the results should reflect our uncertainty over those hypotheses. What would it mean for claims about weights on potential conscious experiences to be driven purely by empirical evidence, if questions about consciousness are inherently philosophical?

Keyvan Mostafavi @ 2024-03-05T19:01 (+4)

@Laura Duffy @Bob Fischer 
A question about your methodology: if I understand correctly, your placeholders are probability-of-sentience-adjusted, but your key takeaways are not (since they are "conditional on sentience").
Why adjust for sentience in your placeholders but not in your key takeaways?

Bob Fischer @ 2024-03-06T15:37 (+4)

Good question, Keyvan. This was pragmatic: our main goal was to make a point about welfare ranges, not p(sentience), so we wanted to discuss things that way in the key takeaways. But knowing people would want a single number per species to play with in models, we figured we should give people placeholders that are already adjusted.

Keyvan Mostafavi @ 2024-03-06T15:52 (+1)

Thanks for your reply Bob :)

Vasco Grilo @ 2024-02-28T12:10 (+4)

Hi Bob,

Could you clarify how you aggregated the welfare range distributions from the 8 models you considered? I understand you gave the same weight to all of these 8 models, but I did not find the aggregation method here.

I would obtain the final cumulative distribution function (CDF) of the welfare range aggregating the CDFs of the 8 models with the geometric mean of odds, as Epoch did to aggregate judgement-based AI timelines. I think Jaime Sevilla would suggest using the mean in this case:

If you are not aggregating all-considered views of experts, but rather aggregating models with mutually exclusive assumptions, use the mean of probabilities.

However, I would say the 8 welfare range models are closer to the "all-considered views of experts" than to "models with mutually exclusive assumptions". In addition:

  1. ^

    For the question "What is the unconditional probability of London being hit with a nuclear weapon in October?", the 7 forecasts were 0.01, 0.00056, 0.001251, 10^-8, 0.000144, 0.0012, and 0.001. The largest of these is 1 M (= 0.01/10^-8) times the smallest.
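To show how much the choice of aggregation method can matter when inputs span many orders of magnitude, the sketch below applies both methods to the seven forecasts listed in the footnote. This is only an illustration on those numbers. Applying the same pooling to the welfare range CDFs would mean running it at each point of the CDFs, which is not shown here, and the function names are hypothetical.

```python
import math

# The seven forecasts quoted in the footnote above.
forecasts = [0.01, 0.00056, 0.001251, 1e-8, 0.000144, 0.0012, 0.001]


def mean_of_probabilities(probabilities):
    """Arithmetic mean of the probabilities."""
    return sum(probabilities) / len(probabilities)


def geometric_mean_of_odds(probabilities):
    """Pool by averaging log-odds, then convert the pooled odds back to a probability."""
    log_odds = [math.log(p / (1.0 - p)) for p in probabilities]
    pooled_odds = math.exp(sum(log_odds) / len(log_odds))
    return pooled_odds / (1.0 + pooled_odds)


arithmetic = mean_of_probabilities(forecasts)
geometric = geometric_mean_of_odds(forecasts)
spread = max(forecasts) / min(forecasts)

print(f"mean of probabilities:  {arithmetic:.2e}")
print(f"geometric mean of odds: {geometric:.2e}")
print(f"largest / smallest:     {spread:.1e}")
```

The mean of probabilities is pulled toward the largest forecast, while the geometric mean of odds is pulled down by the single very small one, so the two pooled values differ by roughly an order of magnitude on this input.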

Vasco Grilo @ 2023-06-15T13:34 (+4)

Hi Bob,

Do you have any thoughts on the feasibility of extending your framework to estimate the welfare range of non-biological systems, namely advanced AI models like GPT-4? It naively looks like some of the models you considered to estimate the welfare ranges could apply to AI systems. I wish discussions about artificial sentience moved from "is this AI system sentient" to "what is the expected welfare range of this AI system"...

Bob Fischer @ 2023-06-16T14:59 (+4)

Short version: strongly agree with you about the importance of shifting the conversation from sentience to welfare ranges, but I think that the issue is basically intractable given hedonism at this juncture, as we have no reason to think that any of the states that could be mental states in AI systems are type identical to any of the states in biological organisms. It isn't intractable given other theories of welfare, though, and depending on your views about what moral weights represent, a "moral weight" for AI systems might still be available. However, we'd need a different methodology for that than the one we outline here.

Moritz Stumpe @ 2023-11-17T10:52 (+3)

Thanks and congratulations to the RP team for your work on this. This is incredibly thorough and useful!

Having looked at the whole Moral Weight Project sequence in some detail, I have some uncertainties around the following question/objection that you list above:
“Your literature review didn’t turn up many negative results. However, there are lots of proxies such that it’s implausible that many animals have them. So, your welfare range estimates are probably high.”

In your response you write that this is a good objection.

However, as I understand it, whenever proxies were unknown, you assumed these to be zero (i.e. not present). For instance, in your methodology writeup, I read: "Assigning proxies labeled “Unknown” zero probability of being present is certainly leading to underestimates of the welfare ranges and probabilities of sentience."

Somehow I cannot square these two statements. Can you solve that seeming contradiction for me?

Bob Fischer @ 2023-11-17T16:11 (+5)

Thanks for your question, Moritz. We distinguish between negative results and unknowns: the former are those where there's evidence of the absence of a trait; the latter are those where there's no evidence. We penalized species where there was evidence of the absence of a trait; we gave zero when there was no evidence. So, not having many negative results does produce higher welfare range estimates (or, if you prefer, it just reduces the gaps between the welfare range estimates).

Moritz Stumpe @ 2023-11-23T03:05 (+1)

Thanks for the explanation Bob. That absolutely makes sense! I was somehow assuming that negative results would count as zeros as well.

Finngoeslong @ 2023-02-06T10:05 (+3)

Thanks for the writeup. Not an area I know much about. Interested to hear what you think the priorities are for further research in this area. 

I liked the common questions & responses section - very helpful for someone like me who is new to this topic.

What surprised me - perhaps it shouldn't have done - is that you think it's plausible that some animals have a welfare higher than humans... 

Bob Fischer @ 2023-02-06T12:38 (+5)

Appreciate the comment!

Re: further research priorities, there are "within paradigm" priorities and "beyond paradigm" priorities. As for the former, I think the most useful thing would be a more thorough investigation of theories of valence, as I think we could significantly improve the list of proxies and our scoring / aggregation methods if we had a better sense of which theories are most promising. As for the latter, my guess is that the most useful thing would be figuring out whether, given hierarchicalism, there are any limits at all on discounting animal welfare simply because it belongs to animals. My guess is "No," which is one of the problems with hierarchicalism, but it would be good to think this through more carefully.

Re: some animals having larger welfare ranges than humans, we don't want to rule out this possibility, but we don't actually believe it. And it's worth stressing, as we stress here, that this possibility doesn't have any radical implications on its own. It's when you combine it with other moral assumptions that you get those radical implications.

Sabs @ 2023-01-23T15:03 (+3)

so although I'm not worth only 3 chickens, the key takeaway is that I'm worth around 50 chickens, is that the deal?

Bob Fischer @ 2023-01-23T15:13 (+18)

Thanks for your question, Sabs. Short answer: if (a) you think of your value purely in terms of the amount of welfare you can generate, (b) you think about welfare in terms of the intensities of pleasures and pains, (c) you're fine with treating pleasures and pains symmetrically and aggregating them accordingly, and (d) you ignore indirect effects of benefitting humans vs. nonhumans, then you're right about the key takeaway. Of course, you might not want to make those assumptions! So it's really important to separate what should, in my view, be a fairly plausible empirical hypothesis--that the intensities of many animals' pleasures and pains are pretty similar to the intensities of humans' pleasures and pains--from all the philosophical assumptions that allow us to move from that fairly plausible empirical hypothesis to a highly controversial philosophical conclusion about how much you matter.

Nathan Young @ 2023-01-23T15:18 (+24)

I think you should put this in big letters on the graph and that Peter should write it in his tweet thread. Currently this is going to get misunderstood and since you can predict this, I suggest it's your responsibility to avoid it.

That graph and all tables need to be hard to share without the provisos you've given here.

Peter Wildeford @ 2023-01-23T16:00 (+20)

Added clarification to Twitter thread - thanks

Bob Fischer @ 2023-01-23T15:29 (+2)

Thanks, Nathan. This is a good point.

Nathan Young @ 2023-01-23T15:41 (+8)

In particular, Edouard of Our World In Data said that they really care about their graphs being understood well, and that when they see a graph being misread or carrying a bad legend, they change it.

I think this is the right approach to ensure that graphs are shared with context.

Bob Fischer @ 2023-01-23T16:56 (+6)

I've redone the summary image, Nathan. Thanks again for recommending this.

Lizka @ 2023-01-23T17:49 (+13)

Really appreciate this thread ^. I'm impressed that something misleading got pointed out by Nathan/Sabs and then was immediately improved. 

Minor comment: I'd maybe re-title the image to something like "For each species, an estimate of their welfare range" or "Estimated welfare ranges per year of life of different species"? I find "Placeholder Welfare Range Estimates (Life Years)" somewhat hard to parse. Although having written this, I'm not sure that my suggestions are better.

(And thanks for writing the post and working on this project!)

Bob Fischer @ 2023-01-23T17:57 (+4)

Good of you to say, Lizka. Thanks.

Re: the title of the image, that's a helpful suggestion. I'm genuinely unsure what's best. The most accurate title would be something like, "Welfare range estimates by species for welfare-to-DALYs-averted conversions," but that doesn't win any awards for accessibility.

MichaelStJules @ 2023-01-23T18:08 (+10)

It's also per period of time, and humans live much longer than chickens.

emre kaplan @ 2023-01-23T15:19 (+15)

I will respond with my interpretation of the report, so that the author might correct me to help me understand it better.

If you ask "If we have an option between preventing the birth of Sabs versus preventing the birth of an average chicken, how many chickens is Sabs worth?" then Sabs might be worth -10 chickens since chickens have net negative lives whereas you (hopefully) have a net positive life.

If you ask "Let's compare a maximally happy Sabs and maximally happy chickens, how many chickens is Sabs worth?", I don't think these estimates respond to that either. It might be the case that chickens have a very large welfare range, but this is mostly because they have a potential for feeling excruciating pain even though their best lives are not that good.

I think you need to complement this research with "how the badness of the average experiences of different animals compares" to answer your question. This report by Rethink Priorities seems to be based on the range between the worst and the best experiences for each species.

Bob Fischer @ 2023-01-23T15:28 (+27)

This is exactly right, Emre. We are not commenting on the average amount of value or disvalue that any particular kind of individual adds to the world. Instead, we're trying to estimate how much value different kinds of individuals could add to the world. You then need to go do the hard work of assessing individuals' actual welfare levels to make tradeoffs. But that's as it should be. There's already been a lot of work on welfare assessment; there's been much less work on how to interpret the significance of those welfare assessments in cross-species decision-making. We're trying to advance the latter conversation.

Chris Said @ 2024-12-02T14:26 (+1)

Bob, do you have any recommendations for where I could find estimates of the welfare of common farmed animals, ideally including chickens, pigs, cows, and shrimp? I found some "Life Quality" scores in the Supplementary Materials of Scherer (2018), but it often scores farmed cows as having a much lower life quality than farmed pigs, which seems implausible to me. 

https://link.springer.com/article/10.1007/s11367-017-1420-x

Bob Fischer @ 2024-12-03T11:43 (+2)

That's a tough one, Chris. I assume you're looking for something like, "On a -1 to 1 scale, the average welfare of broiler chickens is -0.7, the average welfare of pigs is -0.1, the average welfare of cattle is 0.2, etc." Is that right? The closest thing to that would be the scores that Norwood and Lusk give in Compassion by the Pound, though not for shrimp, and I also tend to think that their numbers skew high. For the most part, animal welfare scientists aren't interested in scoring welfare on a cardinal scale, so it's an oddity when they try. (Marc Bracke is one exception, though I don't think you're going to get what you want from his papers either.) I'm sorry that I can't be of more help!

Chris Said @ 2024-12-03T13:28 (+1)

Thanks Bob, much appreciated.

> For the most part, animal welfare scientists aren't interested in scoring welfare on a cardinal scale, so it's an oddity when they try.

Just to confirm, you and Rethink Priorities are using a cardinal scale for your welfare ranges, right? So when you say that a cow has a welfare range of 0.5, you implicitly mean that there is some universal scale where a cow's minimal welfare is -0.25 and maximum is +0.25 (or shifted if we don't assume symmetry).

I guess I’m confused on why there isn’t more work on estimating the average realized values of welfare, both from Rethink and from other animal welfare scientists. Those values are necessary for foundational claims like “eating 1000 calories of beef creates demand for X units of suffering”, or "moving cows to a pasture will increase welfare by Y units".

Bob Fischer @ 2024-12-03T15:27 (+4)

Yes, Chris: we're using a cardinal scale. To your point about estimating the average realized values of welfare, I agree that this would be highly valuable. Animal welfare scientists don't do it because they don't face decisions that require it. If you're primarily responsible for studying broiler welfare, you don't need to know how to compare broiler welfare with pig welfare. You just need to know what to recommend to improve broiler welfare. As for RP, we'd love to work on this and I've proposed such projects many times. However, this work has never been of sufficient interest to funders. If that changes, you can bet I'll devote a lot of time to it!

emre kaplan @ 2023-01-23T15:37 (+1)

Thank you for the prompt reply Bob. Just to be clear, I am happy about the scope of this project and am impressed by its quality. I do not intend to criticise the report for being mindful about its scope.

Bob Fischer @ 2023-01-23T15:51 (+4)

Didn't take it that way at all! I appreciate your taking the time to comment and help clarify what we've done.

Dawn Drescher @ 2024-05-07T19:55 (+2)

I love this research! Thank you so much for doing it!

My gut reaction to the results is that it's odd that humans are so high up in terms of their capacity for welfare. Just as an uninformative prior, I would've expected us to be somewhere in the middle. Less confidently, I would've expected a similar number of orders of magnitude deviation from the human baseline in either direction, within reason. E.g. +/- ~.5 OOM.

Plus, we are humans, so there's a risk that we're biased in our favor. It could be simply a bias from our ability to empathize with other humans. But it could also be the case that there are countless more markers of sentience that humans don't have (but many other sentient animals do) that we are prone to overlook.

Have you investigated what the sources of this effect might be? There might be any number of biases at work as I mentioned, but perhaps our lives have become so comfy most of the time that we perceive slight problems very strongly (e.g., a disapproving gaze). If then something really bad happens, it feels enormously bad?

(I've in the past explicitly assumed that most beings with a few (million) neurons have a roughly human capacity for welfare – not because I thought that was likely but because I couldn't tell in which direction it was off. Do you maybe already have a defense of the results for people like me?)

In any case, I'll probably just adopt your results into my thinking now. I don't expect them to change my priorities much given all the other factors.

Thank you again! <3

Update: When I mentioned this to a friend on a hike, I came up with two ways in which the criteria might be amended to include nonhuman ones: (1) In many cases, we probably have a theory for why a particular behavior or feature is likely to be indicative of conscious experience. Understanding this mechanism, we can look for other systems that might implement the same mechanism, much as the eyes of humans, eagles, and flies are very different but we infer that they are probably all for the purpose of vision. (2) Maybe a number of animals that show certain known criteria for consciousness also share, suspiciously consistently, some other features. One could then investigate whether these features are also indicative of consciousness and whether there are other animals that have these new features at the expense of the older, known ones. (The analysis could cluster features that usually co-occur, so as not to overweight causally related features in cases where many of them are observable.)

Vasco Grilo @ 2023-03-03T18:41 (+2)

Hi Bob,

Great work!

I think it would be nice to have all the estimates in the table here with 3 significant digits, in order not to propagate errors. I understand more digits may give a sense of false precision, but you provide the 5th and 95th percentiles in the same table, so I suppose the uncertainty is already being conveyed.

Why do you give estimates for the median moral weight, instead of the mean moral weight? Normally, we care about expectations...

Bob Fischer @ 2023-03-03T21:38 (+5)

Thanks, Vasco!

Short version: I want to discourage people from using these numbers in any context where that level of precision might be relevant. That is, if the sign of someone's analysis turns on three significant digits, then I doubt that their analysis is action-relevant. 

As for medians rather than means, our main concern there was just that means tend to be skewed toward extremes. But we can generate the means if it's important!

Finally, I should stress that I'm seeing people use these "moral weights" roughly as follows: "100 humans = ~33 chickens (100*.332= ~33)." This is not the way they're intended to be used. Minimally, they should be adjusted by lifespan and average welfare levels, as they are estimates of welfare ranges rather than all-things-considered estimates of the strength of our moral reasons to benefit members of one species rather than another. 

Vasco Grilo @ 2023-03-05T15:16 (+2)

Hi again,

Sorry, I forgot to touch on this point:

> As for medians rather than means, our main concern there was just that means tend to be skewed toward extremes. But we can generate the means if it's important!

Do you think the extremes of your moral weight distributions are reasonable? If so, even if the mean is skewed towards them, it would become more accurate. Anyway, I would say sharing the mean would be important, so that people could see how much influence the extremes have (i.e. how heavy-tailed the moral weight distribution is).

Bob Fischer @ 2023-03-09T14:04 (+4)

Sorry for the slow reply, Vasco. Here are the means you requested. My vote is that if people are looking for placeholder moral weights, they should use our 50th-pct numbers, but I don't have very strong feelings on that. And I know you know this, but I do want to stress for any other readers that these numbers are not "moral weights" as that term is often used in EA. Many EAs want one number per species that captures the overall strength of their moral reason to help members of that species relative to all others, accounting for moral uncertainty and a million other things. We aren't offering that. The right interpretation of these numbers is given in the main post as well as in our Intro to the MWP.

Vasco Grilo @ 2023-03-09T15:07 (+2)

Thanks for clarifying and sharing the means, Bob! There are some significant differences from the medians for some species, so it looks like it would be important to see whether the extremes of the distributions are being well represented.

Vasco Grilo @ 2023-03-04T17:42 (+2)

Thanks for clarifying!

> Short version: I want to discourage people from using these numbers in any context where that level of precision might be relevant.

I thought this would be the reason. That being said, I still think it makes sense to present the results with 2 or 3 significant digits whenever the uncertainty is already being conveyed. For example, if I say the mean moral weight is 1.00, and the 5th and 95th percentiles are 0.00100 and 1.00 k, it should be clear that the result is pretty uncertain, even though all numbers have 3 significant digits.

> That is, if the sign of someone's analysis turns on three significant digits, then I doubt that their analysis is action-relevant.

I agree in general, but wonder whether for some cases it may matter in a non-crucial way. For example, the ratio between 1.50 and 2.49 is 0.602 without rounding, but 1 if we round both numbers to 2. An error of a factor of 0.602 may not be crucial, but it will not necessarily be totally negligible either.
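
The arithmetic in this example can be checked with a short sketch. It uses Python's `decimal` module with explicit round-half-up so that 1.50 rounds to 2 without relying on floating-point rounding behaviour; the variable names are illustrative only.

```python
from decimal import Decimal, ROUND_HALF_UP

a = Decimal("1.50")
b = Decimal("2.49")

# Ratio of the unrounded values.
exact_ratio = a / b

# Round each value to the nearest integer first, then take the ratio.
a_rounded = a.quantize(Decimal("1"), rounding=ROUND_HALF_UP)
b_rounded = b.quantize(Decimal("1"), rounding=ROUND_HALF_UP)
rounded_ratio = a_rounded / b_rounded

print(f"exact ratio:   {exact_ratio:.3f}")
print(f"rounded ratio: {rounded_ratio}")
```

Both inputs round to 2, so the rounded ratio is 1 while the unrounded ratio is about 0.602.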

> Finally, I should stress that I'm seeing people use these "moral weights" roughly as follows: "100 humans = ~33 chickens (100*.332= ~33)." This is not the way they're intended to be used.

Ahah, I agree! They are supposed to be used as follows: "100 chickens = 100*0.332 humans = 33.2 humans". One should always be careful not to interpret the moral weight of chickens relative to humans as that of humans relative to chickens, and also present the final result with 3 significant digits instead of 2.

Jokes apart, when I read "[based on RP's median moral weights] 100 chickens = 33.2 humans", I assume that the duration and intensity of experience (relative to the moral weight) are the same for both humans and chickens, because that is what the moral weight alone tells us. However, if one says "saving x humans equals saving y chickens", I agree the moral weights have to be combined with other variables, because now we are describing the consequences of actions instead of just a direct comparison of experiences.

MichaelStJules @ 2023-01-28T21:02 (+2)

> What follows are some probability-of-sentience- and rate-of-subjective-experience-adjusted welfare range estimates.

The probability of sentience is multiplied through here, right? Some of these animals are assigned <50% probability of sentience but have nonzero probability-of-sentience-adjusted welfare ranges at the median. Another way to present this would be to construct the random variable that's 0 if they're not sentient, and otherwise equal to the random variable representing their moral weight conditional on sentience. This would be your actual distribution of welfare ranges for the animal, accounting for their probability of sentience. That being said, what you have now might be more useful to represent a range of expected moral weights for (approximately) risk-neutral EV-maximizing utilitarians, to represent deep uncertainty or credal fragility.
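
The two presentations contrasted here can be sketched with a small simulation. Everything numerical below is a hypothetical placeholder, not an RP estimate: a 0.3 probability of sentience and a uniform distribution for the welfare range conditional on sentience. The sketch only illustrates why the medians differ while the means agree.

```python
import random
import statistics

random.seed(0)

P_SENTIENCE = 0.3  # hypothetical probability of sentience (below 50%)
N = 20_001         # number of simulated draws


def conditional_welfare_range():
    """Hypothetical welfare range conditional on sentience."""
    return random.uniform(0.1, 0.9)


conditional = [conditional_welfare_range() for _ in range(N)]

# Presentation 1: multiply every conditional draw by the probability of sentience.
scaled = [P_SENTIENCE * w for w in conditional]

# Presentation 2: the variable is 0 when not sentient, else the conditional draw.
mixture = [w if random.random() < P_SENTIENCE else 0.0 for w in conditional]

print("median, scaled: ", statistics.median(scaled))
print("median, mixture:", statistics.median(mixture))
print("mean, scaled:   ", statistics.fmean(scaled))
print("mean, mixture:  ", statistics.fmean(mixture))
```

With a probability of sentience below 50%, more than half of the mixture draws are exactly 0, so its median is 0, whereas the scaled version has a positive median. The two means are close, which is why the scaled version can be read as a range of expected values rather than as the distribution of the welfare range itself.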

Henry Howard @ 2023-01-23T13:36 (+2)

The use of expected value doesn't seem useful here. Your confidence intervals are huge (95% confidence interval for pig suffering capacity relative to humans is between 0.005 and 1.031). Because the implications are so different across that spectrum (varying from basically "make the cages even smaller, who cares" at 0.005 to "I will push my nan down the stairs to save a pig" at 1.031) it really doesn't feel like I can draw any conclusions from this.

Bob Fischer @ 2023-01-23T13:43 (+10)

Fair enough, Henry. We have limited faith in the models too. But as we said:

  1. The numbers are placeholders.
  2. Our actual views are summarized in the key takeaways and again toward the end (e.g., within an order of magnitude of humans for vertebrates--0.1 or above--which certainly does make a practical difference).
  3. This work builds on everything else we've done and is not, all on its own, the complete case for relatively animal-friendly welfare range estimates.

Laura Duffy @ 2023-01-23T16:37 (+14)

To follow up on Bob's point, the ranges presented here are from a mixture model which combines the results from several models individually. You can see the results for each model here: https://docs.google.com/spreadsheets/d/1SpbrcfmBoC50PTxlizF5HzBIq4p-17m3JduYXZCH2Og/edit?usp=sharing 

For example, the 0.005 arises because we are including the neuron count model of welfare ranges in our overall estimates. If you don't include this model (as there are good reasons not to, see https://forum.effectivealtruism.org/posts/Mfq7KxQRvkeLnJvoB/why-neuron-counts-shouldn-t-be-used-as-proxies-for-moral) then the 5th percentile welfare range for pigs of all models combined is 0.20. 

The 1.031 comes from a model called the "Undiluted Experiences" model, which suggests that animals with lower cognitive abilities have greater welfare ranges because they are not as able to rationalize their feelings (e.g., pets being anxious when you're packing for a trip). A somewhat different model would be the "Higher-Lower Pleasures" model, which is built on the idea that higher cognitive capacities mean you can experience more welfare (akin to the JS Mill idea of higher-order pleasures). Under this model, we estimate that the range for pigs is 0.23 to 0.49--which is quite significant given how this model could be seen as having a pro-human bias!

In sum, the welfare ranges presented above reflect our high degree of uncertainty surrounding how to think about measuring welfare. As such, we invite you to take a closer look at each model (you'll find most of them converge on the overall conclusion that vertebrates are within an order of magnitude of humans in terms of their welfare ranges). 
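
To illustrate how a single low-valued component can set the bottom percentile of a mixture, here is a minimal sketch. The eight component distributions and the equal weighting are hypothetical stand-ins chosen for illustration; they are not the actual models or weights in the linked spreadsheet.

```python
import random

random.seed(0)

SAMPLES_PER_MODEL = 1_000  # equal weight per model (an assumption)


def low_model():
    """Hypothetical model that yields very small welfare ranges."""
    return random.uniform(0.001, 0.01)


def moderate_model():
    """Hypothetical model that yields moderate welfare ranges."""
    return random.uniform(0.2, 1.0)


# One low-valued model plus seven moderate ones, eight in total.
models = {"low": low_model}
models.update({f"moderate_{i}": moderate_model for i in range(1, 8)})


def pooled_samples(names):
    """Draw the same number of samples from each named model and pool them."""
    return [models[name]() for name in names for _ in range(SAMPLES_PER_MODEL)]


def percentile(samples, q):
    """Simple empirical percentile: the value at position q in the sorted samples."""
    ordered = sorted(samples)
    return ordered[int(q * (len(ordered) - 1))]


all_names = list(models)
without_low = [name for name in all_names if name != "low"]

p5_all = percentile(pooled_samples(all_names), 0.05)
p5_without_low = percentile(pooled_samples(without_low), 0.05)

print("5th percentile, all models:       ", p5_all)
print("5th percentile, low model dropped:", p5_without_low)
```

Because the low-valued model supplies one eighth of the pooled samples, it accounts for the entire bottom 5% of the mixture. Dropping it moves the 5th percentile up into the range of the remaining models, which mirrors the jump described above when the neuron count model is excluded.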

mvolz @ 2023-01-24T16:25 (+1)

I'm curious whether you've indicated parental care is "present" or "absent" in bees. I briefly checked the linked documents and couldn't find where that lives, but maybe I missed it. Can anyone link to that documentation?

(Bees provide care to young, but it's primarily done by siblings, not parents, so it's considered alloparental care, not parental care. I should think that probably counts, but wasn't sure.)

Bob Fischer @ 2023-01-24T19:55 (+5)

Sorry about the confusion, mvolz. The table with the models is tricky to navigate. Here's the one we shared originally, which is clearer. Short answer: yes, we said it was present.