Valuing Impacts Across Species: A Research Agenda

By Bob Fischer, Hayley Clatterbuck, arvomm, David_Moss, Derek Shiller @ 2024-11-04T11:55 (+45)

Rethink Priorities' Worldview Investigations Team is sharing research agendas that provide overviews of some key research areas as well as projects in those areas that we would be keen to pursue. This is the first of three such agendas to be posted in November 2024.

If we want to do the most good per dollar, then we need to measure the value of various outcomes on a common scale. In some contexts, however, those outcomes include benefits to members of different species, forcing a question about the relative value of human and nonhuman welfare. Unsurprisingly, any answer involves taking stances on a host of controversial claims about which there’s deep uncertainty.

As Holden Karnofsky once observed, this uncertainty matters:

Some people think that animals such as chickens have essentially no moral significance compared to that of humans; others think that they should be considered comparably important, or at least 1-10% as important. If you accept the latter view, farm animal welfare looks like an extraordinarily outstanding cause, potentially to the point of dominating other options: billions of chickens are treated incredibly cruelly each year on factory farms, and we estimate that corporate campaigns can spare over 200 hens from cage confinement for each dollar spent. But if you accept the former view, this work is arguably a poor use of money.

With the practical implications in mind, this research agenda identifies the main philosophical uncertainties associated with choosing between human- and animal-focused causes. (We bracket all the empirical uncertainties, important as they are.) Then, it identifies several projects that could reduce our uncertainty about the relative value of human and animal welfare.

Differences in Capacity for Welfare

In EA and EA-adjacent cause prioritization, people typically assume that:

The objective is to maximize the expected value (EV) gained per dollar spent.
The value of an action is equivalent to its net welfare impacts.

For now, let’s make these assumptions. Then, suppose we’re trying to decide between two actions:

(A) Donate $X to an organization that tries to prevent the deaths of human children from malaria by distributing bednets.
(B) Donate $X to an organization that tries to improve the lives of intensively farmed chickens via a corporate campaign.

Suppose we know all the empirical facts about the consequences of (A) and (B), such as the number of individuals who would be affected if the action achieved its intended effect, the probability of the action achieving its intended effect, and so on. Still, if we’re uncertain about how much to value a benefit to a human vs. a benefit to a chicken, we’ll be uncertain about the EV of (A) vs. (B).

Given that we’re focused on welfare, though, there’s a plausible way to begin valuing the effects:

(1) Treat the value of saving a child’s life as being equivalent to the welfare gain from saving that life.^[1]
(2) Estimate the welfare gain associated with improved farm conditions for a single chicken.
(3) Make the plausible assumption that all welfare gains count equally, regardless of whose welfare we’re considering.
(4) Multiply by the relevant numbers of individuals to get the total amounts of welfare at stake in benefitting humans vs. birds.

While each step in this process is controversial, we’ll focus on (2) for the time being. So, how do we estimate the welfare gain associated with improved farm conditions for a single chicken?

The standard strategy involves combining species-relative welfare assessment tools with estimates of chickens’ capacity for welfare. The welfare assessment tool tells you how things are going for a chicken. A capacity for welfare estimate tells you how things could be going for a chicken compared to a human. So, we can use the capacity for welfare estimate to convert the species-relative welfare assessment into a human equivalent. For example, if a chicken has 1/10th the capacity for welfare of a human and is half as well off as it can be, then that’s equivalent to a human’s being 1/20th as well off as they can be:

1 🐓 × 0.1 (i.e., chickens’ capacity for welfare relative to humans) × 50% as well off as possible for one year (i.e., an illustrative species-relative welfare assessment) = 0.05 human-equivalent welfare-adjusted life years
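As a minimal sketch of the arithmetic above, the conversion can be written as a single product. The function and parameter names are illustrative assumptions rather than anything defined in the original analysis, and the inputs are the illustrative figures from the example.

```python
def human_equivalent_walys(n_animals, capacity_ratio, welfare_fraction, years):
    """Convert a species-relative welfare assessment into human-equivalent
    welfare-adjusted life years (a hypothetical helper for illustration).

    capacity_ratio: the animal's capacity for welfare relative to a human's.
    welfare_fraction: how well off the animal is, as a fraction of its own maximum.
    """
    return n_animals * capacity_ratio * welfare_fraction * years


# One chicken, 1/10th of a human's capacity, 50% as well off as possible, one year.
example = human_equivalent_walys(1, 0.1, 0.5, 1.0)
print(round(example, 6))  # 0.05
```

Both inputs are uncertain in practice, which is why the two questions that follow matter: the output is only as good as the capacity ratio and the species-relative assessment that go into it.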

If we think that this basic approach is worth exploring, then we’re left with two main uncertainties:

How can we estimate chickens’ capacity for welfare?
Given an estimate of chickens’ capacity for welfare, how can we generate the species-relative welfare assessments we need to compute human welfare impact equivalents?

In the following subsections, we’re going to work our way through each uncertainty, flagging some of the key puzzles we need to solve to assess the relative cost-effectiveness of preventing malaria deaths vs. trying to improve the lives of intensively farmed chickens. First, though, we’ll see whether there’s an easy way around the question.

Can we dodge the question?

As we’ll see, it’s difficult to estimate differences in capacity for welfare. It’s reasonable, then, to want to avoid the problem entirely. Can we?

It depends on what a sensitivity test reveals. In principle, it could work out that helping chickens is less valuable than preventing malaria deaths on every capacity for welfare estimate that we take seriously. If so, then we don’t need to go any further; more granular information isn’t necessary. For instance, if we’re convinced that a chicken’s capacity for welfare is somewhere between a trillionth and a billionth of a human’s, then it would be fairly surprising if helping chickens were more cost-effective than saving children’s lives. Granted, it’s difficult to know exactly how much welfare we secure when we prevent a malaria death. Presumably, though, it’s quite a lot: we’re preventing the loss of many life years and, even if those life years are far from perfect, they’re probably worth living. Given as much, it’s implausible that benefits to chickens could outweigh the relevant benefits to humans, even given the difference in the number of individuals that each intervention would affect.

But we shouldn’t be convinced that a chicken’s capacity for welfare is somewhere between a trillionth and a billionth of a human’s.

People disagree extensively about the relative importance of human and nonhuman welfare, as Karnofsky flagged at the outset. Such disagreement should lower our confidence that we’ve landed on the correct range.
Differences in capacity for welfare rely on contentious philosophical assumptions (more on this below); so, insofar as we’re uncertain about the relevant philosophical issues, we ought to have some uncertainty about differences in capacity for welfare.
Differences in capacity for welfare are partly an empirical matter (more on this below); so, insofar as we’re uncertain about the relevant empirical issues, we ought to have some uncertainty about differences in capacity for welfare.

Given these points, we should be uncertain over a wide range of possible capacity for welfare estimates. And because of the sheer number of potential chicken beneficiaries, this means that our uncertainty is likely to be over estimates that support the conclusion that helping chickens is more cost-effective than helping humans. Thanks to our uncertainty, we probably won’t be able to dodge hard questions about chickens’ capacity for welfare.
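The kind of sensitivity test described in this subsection could be sketched as follows. This is only an illustration under loudly stated assumptions: the hens-per-dollar figure is taken from the quoted estimate, while the per-hen welfare gain and the human welfare-per-dollar figure are placeholders invented for the example, as are all function and constant names.

```python
# All three inputs are illustrative. Only the first comes from the quoted
# estimate ("over 200 hens ... for each dollar"); the others are placeholders.
HENS_PER_DOLLAR = 200.0
HEN_GAIN_FRACTION_YEARS = 0.1   # placeholder: species-relative gain per hen
HUMAN_WALYS_PER_DOLLAR = 0.01   # placeholder: welfare secured per dollar on bednets


def chicken_walys_per_dollar(capacity_ratio):
    """Human-equivalent welfare per dollar for the chicken intervention."""
    return HENS_PER_DOLLAR * HEN_GAIN_FRACTION_YEARS * capacity_ratio


def favors_chickens(capacity_ratio):
    """True if the chicken intervention secures more welfare per dollar."""
    return chicken_walys_per_dollar(capacity_ratio) > HUMAN_WALYS_PER_DOLLAR


def break_even_ratio():
    """Capacity ratio at which the two options are equally cost-effective."""
    return HUMAN_WALYS_PER_DOLLAR / (HENS_PER_DOLLAR * HEN_GAIN_FRACTION_YEARS)


for ratio in (1e-12, 1e-9, 1e-6, 1e-3, 1e-1):
    verdict = "chickens" if favors_chickens(ratio) else "humans"
    print(f"capacity ratio {ratio:g}: favors {verdict}")
print(f"break-even ratio under these placeholders: {break_even_ratio():g}")
```

The question can be dodged only if every capacity ratio we take seriously falls on the same side of the break-even point. If the plausible range straddles it, as the argument above suggests it will, the sweep returns mixed verdicts and a more granular estimate is needed.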

The challenge of estimating chickens’ capacity for welfare

We’re left, then, trying to reduce our uncertainty. Unfortunately, estimating animals’ capacity for welfare is difficult. There are four challenges.

Selecting a theory of welfare
Identifying what would count as evidence of variation in the ability to realize the determinants of welfare
Finding evidence of variation in the ability to realize the determinants of welfare
Producing overall estimates in light of that variation

In the rest of this section, we outline each challenge.

Selecting a theory of welfare

The first challenge is to provide a theory about what welfare is. Here are some standard theories:

Welfare is determined by positive and negative valenced states (i.e., hedonism).
Welfare is determined by desire satisfaction and frustration (i.e., desire satisfaction theory).
Welfare is determined by the attainment of or failure to attain various objective goods, such as doing meaningful work and participating in significant relationships (i.e., objective list theory).
Welfare is flourishing, where flourishing is developing and manifesting your species-specific abilities (i.e., Aristotelianism).

The choice between these theories is likely to matter, as theories of welfare differ in how friendly they are to there being large differences in capacity for welfare across species. Some theories of welfare, like Aristotelianism, won’t posit any differences at all. Other theories may posit some differences, but probably relatively small ones, as they prioritize features we generally share with animals (like pleasures and pains). Still other theories are likely to be more bullish on large differences, as they stress features that we generally don’t share with animals (like participating in certain kinds of significant relationships).

Of course, as we reflect on the various possible determinants of welfare—i.e., those things that make individuals better and worse off—we may find ourselves sympathetic to a pluralistic theory where several things can contribute to and detract from well-being. However, pluralism raises its own hard questions about the relationships among the determinants of value. For instance:

What’s the relative importance of avoiding negatively valenced experiences, such as physical pain, vs. satisfying your desires?^[2]
Are welfare goods fungible, such that the loss of any one can be compensated by an equal gain of any other?

Whatever we say about all this, it’s important to remember that we need a bit more precision in our theory of welfare than we’ve considered here; it isn’t enough to distinguish between hedonism, desire satisfaction theory, objective list theory, Aristotelianism, and pluralism. Our theory of welfare needs to be precise enough to allow us to determine what would count as variation in different individuals’ capacities for welfare. That is, we need our theory of welfare to give a sufficiently granular account of the determinants of welfare.

For example, we can know that welfare is about valenced states without knowing how welfare is affected by variation in the kind, quantity, duration, or intensity of those states. But those details matter a lot: humans and animals probably differ much more with respect to the kinds of experiences they can have than with respect to the intensities of those experiences. Likewise, if our theory is pluralistic, we can know that both desire satisfaction and valenced states matter for welfare without knowing about the relative contributions of these determinants.^[3] And again, that detail matters a lot: humans and animals might differ a lot more with respect to their desires than they do with respect to their valenced states.

Identifying what would count as evidence of variation

In any case, given a sufficiently precise theory of welfare, we come to the second challenge: identifying what would count as evidence of variation in how well or badly off an individual can be (i.e., their capacity for welfare). So, if the determinants of welfare are more and less intense valenced states, then we need to identify what would count as evidence that an individual experiences more or less intense valenced states. This proves to be quite difficult, as we have no agreed-upon metrics for comparing the intensities of valenced states across species, just as we have no agreed-upon metrics for comparing desire strength across species, or the significance of relationships, or anything else someone might propose as a determinant of welfare.

Moreover, there are immediate complications once we try to generate some metrics. Suppose we want to compare humans and chickens with respect to their capacity for living meaningful lives. Even if we’re just using an intuitive scale, what should it be? Two points, where we only distinguish between high and low meaningfulness? 100 points? 1,000,000? The answer will hugely influence our estimates—and the answer isn’t obvious.

Finding evidence of variation

The third challenge is finding evidence of variation in how well or badly off an individual can be. Unfortunately, we may not have that evidence, thus remaining uncertain about relative differences. This raises difficult questions about when, if ever, we can generalize from data about other species.

Suppose, for example, we think that some cognitive capacity, such as episodic memory, bears on differences in capacity for welfare. We may have evidence for it in some species but not in others. However, that may be largely because no one has bothered to look for this trait in those species. After all, research into the experiences of animals is underfunded, uncoordinated, and generally not motivated by the kinds of ethical considerations that would make it relevant to the problem of interest here. So, we’re often going to face hard questions about when, if ever, we can use phylogenetic relationships to infer the presence of a capacity in one species because it’s present in another.

Producing overall estimates

The fourth challenge is turning the evidence we possess into specific quantitative estimates of differences in capacity for welfare. This can be difficult for several reasons. For example, the evidence may support differences but not of any particular magnitude. Alternatively, there may be various lines of evidence, the collective import of which is unclear. Accordingly, our method for producing estimates needs to be sensitive to our uncertainties. One way to do this is to have the primary output of the process be probability distributions over wide ranges of possible estimates, from which we can extract means, medians, and other point estimates as needed.
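To illustrate what a distribution-first output might look like, here is a minimal Monte Carlo sketch using only the Python standard library. The log-uniform shape, the bounds, and the function names are assumptions chosen for illustration; they are not the method or the estimates of any existing project.

```python
import math
import random
import statistics


def sample_capacity_ratios(n, low=1e-6, high=1.0, seed=0):
    """Draw n capacity-for-welfare ratios from a log-uniform distribution.

    The log-uniform shape and the [low, high] bounds are placeholder
    assumptions chosen only to represent uncertainty spanning several
    orders of magnitude.
    """
    rng = random.Random(seed)
    log_low, log_high = math.log(low), math.log(high)
    return [math.exp(rng.uniform(log_low, log_high)) for _ in range(n)]


def summarize(samples):
    """Extract a few point estimates from a list of sampled ratios."""
    ordered = sorted(samples)
    last = len(ordered) - 1
    return {
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "p05": ordered[round(0.05 * last)],
        "p95": ordered[round(0.95 * last)],
    }


samples = sample_capacity_ratios(10_000)
for name, value in summarize(samples).items():
    print(f"{name}: {value:.4g}")
```

With uncertainty spread across orders of magnitude, the mean sits well above the median, so the choice of point estimate is itself a substantive decision rather than a reporting detail.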

How important is each challenge?

Again, there are four challenges associated with estimating animals’ capacity for welfare:

(1) Selecting a theory of welfare
(2) Identifying what would count as evidence of variation in the ability to realize the determinants of welfare
(3) Finding evidence of variation in the ability to realize the determinants of welfare
(4) Producing overall estimates in light of that variation

Rethink Priorities did its best to tackle challenges (2)–(4) in the Moral Weight Project.^[4] And while it offered an argument that (1) is less important than we might initially think, this is a new research area, so it’s unclear what our all-things-considered judgment should be about the choice of a theory of welfare. So, on the face of it, the highest-value research areas probably concern:

The relative differences in capacity for welfare supported by different theories of welfare
How we ought to set up scales to compare not-previously-quantified differences (e.g., in desire strength, meaningfulness, etc.)
Insofar as we’re uncertain about welfare and these scales, how we ought to navigate our uncertainty

The challenge of assessing welfare

Again, suppose we’re trying to decide between two actions:

(A) Donate $X to an organization that tries to prevent the deaths of human children from malaria by distributing bednets.
(B) Donate $X to an organization that tries to improve the lives of intensively farmed chickens via a corporate campaign.

Let’s assume that we’re able to generate an estimate of the differences in capacity for welfare between humans and chickens. The next task is to estimate the amount of welfare secured by a successful chicken intervention. This task relies on the results of welfare assessment instruments, each of which scores an animal’s condition on a set of metrics. For instance, a chicken might be scored on whether they have footpad dermatitis and, if so, the size of the lesion; their feather condition; the regularity of their gait, and so on.

The need for a cardinal scale

Developing and applying welfare assessment instruments is no small feat. It’s difficult to identify possible welfare indicators, to validate those indicators, to collect data about those indicators, and to interpret the results.^[5] More importantly for present purposes, though, most welfare assessment instruments don’t deliver what we’d like: results on a cardinal scale. We need results on a cardinal scale so that we can convert those welfare assessments, via capacity for welfare estimates, into human equivalents. Recall the illustrative calculation from earlier, in which one chicken with 1/10th of a human’s capacity for welfare, at 50% as well off as possible for one year, came to 0.05 human-equivalent welfare-adjusted life years.

Standard welfare assessment instruments don’t tell us that a chicken is “50% as well off as possible for one year.” Instead, they provide ordinal rankings of welfare, as in the widely-used Five Domains framework (see figure),^[6] which don’t allow us to multiply through as just suggested.

This framework identifies four “physical” domains—nutrition, environment, health, and behavior—that feed into the “mental” domain, understood in terms of positive and negative experiences. However, people usually grade the domains using ordinal scales (e.g., an A–E scale), making it unclear how to compare any two overall assessments.

Granted, there are frameworks that appear to use cardinal scales, such as the Welfare Quality framework, as illustrated in the figure below, which depicts hypothetical scores for four farms (and thus, indirectly, the welfare of the animals on those farms) based on domains analogous to the ones in the Five Domains framework.

However, it’s easy to misinterpret the significance of these scores, as (a) there’s no assumption that a score of 100 for feeding is five times better than a 20, as we’d have on a cardinal scale, and (b) these scores can’t be aggregated straightforwardly (“high scores in one principle do not offset low scores in another, so [overall assessments] cannot be based on average scores”^[7]). And without results on a cardinal scale, we can’t use capacity for welfare estimates to convert species-relative assessment into a human-equivalent.

Adapting existing welfare assessment frameworks

Of course, even if welfare assessment instruments aren’t designed to score welfare on a cardinal scale, perhaps the results of these instruments could be mapped to such a scale. This, however, raises its own difficulties. To make this concrete, consider the Welfare Footprint Project’s (WFP) approach. Instead of identifying several domains of welfare assessment, it focuses on the mental domain exclusively—and, more narrowly, on pain. The central ambition of the WFP is to quantify the amount of time that animals spend in four pain categories: (1) annoying, (2) hurtful, (3) disabling, and (4) excruciating. We’ll focus on the first and fourth pain categories. Roughly, annoying pains are “experiences of pain perceived as aversive, but not intense enough to disrupt routine in a way that alters adaptive functioning or affects the behaviors that individuals are motivated to perform,” whereas excruciating pain is pain at a level that is “not normally tolerated even if only for a few seconds” and marks “the threshold of pain under which many people choose to take their lives rather than endure the pain.”

Now, the WFP is silent, and intentionally so, on two questions of interest here. First, it doesn’t tell us how bad annoying pain is. So, it doesn’t tell us something like, “If a chicken is experiencing annoying pain, then it’s 1/10 as badly off as it can be.” Second and relatedly, it doesn’t tell us how much worse excruciating pain is than annoying pain. The WFP doesn’t say, for instance, that a minute in the fourth pain category is four times worse than a minute in the first pain category, or that any other specific relationship holds.

The upshot is that even if we know that chickens would spend less time in both annoying and excruciating pain after an intervention, and thus that the intervention would be a Pareto improvement, we wouldn’t therefore know the extent of the total welfare benefit. Suppose, for instance, that a chicken would spend 100 fewer hours in annoying pain and one fewer hour in excruciating pain. If we assume that excruciating pain is the worst possible pain and that it’s 4x worse than annoying pain, we can do the math: (100 hours × 25%) + (1 hour × 100%) = 26 hours of excruciating-pain-equivalents. However, if we assume that excruciating pain is the worst possible pain and that it’s 100x worse than annoying pain, we get a very different estimate: (100 hours × 1%) + (1 hour × 100%) = 2 hours of excruciating-pain-equivalents. This problem isn’t unique to the WFP: it’s an issue for all standard welfare assessment tools, none of which is designed to produce an estimate of total welfare impacts on a cardinal scale.^[8]
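The two calculations above can be reproduced, and extended to other tradeoff rates, with a short sketch. The function name and the dictionary layout are assumptions for illustration; the WFP itself does not publish these weights, which is exactly the gap being described.

```python
def excruciating_equivalent_hours(hours_averted, weights):
    """Sum hours in each pain category, weighted relative to excruciating pain.

    hours_averted: hours of pain avoided, keyed by pain category.
    weights: badness of each category as a fraction of excruciating pain.
    """
    return sum(hours * weights[category] for category, hours in hours_averted.items())


# The example from the text: 100 fewer hours of annoying pain, 1 fewer of excruciating.
hours_averted = {"annoying": 100.0, "excruciating": 1.0}

# Hypothetical tradeoff rates: how many times worse excruciating pain is than annoying pain.
for times_worse in (4, 100, 1000):
    weights = {"annoying": 1.0 / times_worse, "excruciating": 1.0}
    total = excruciating_equivalent_hours(hours_averted, weights)
    print(f"{times_worse}x worse: {total:.1f} excruciating-pain-equivalent hours")
```

The same observed reduction in pain hours yields 26, 2, or roughly 1.1 equivalent hours depending on the assumed rate, so the tradeoff rate can move the estimate by more than an order of magnitude.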

How important is it to refine our welfare assessment methods?

It clearly matters how we think about the tradeoff between pains of different intensities (and between welfare states generally). However, the question of how much it matters depends on our capacity for welfare estimates. If we posit very large differences in capacity for welfare between humans and chickens, then tradeoff rates between pains of different intensities may not be decision-relevant. That is, if chickens have 0.0000001x humans’ capacity for welfare, then whether excruciating pain is 4x or 1,000x worse than annoying pain may not matter: the difference in capacity for welfare alone will determine what’s best.^[9] However, if we posit very small differences in capacity for welfare between humans and chickens, then tradeoff rates between pains of different intensities may play a major role in our decision-making.

Questioning our two starting assumptions

Again, in EA and EA-adjacent cause prioritization, people typically assume that:

The objective is to maximize EV gained per dollar spent.
The value of an action is equivalent to its net welfare impacts.

Each assumption is controversial. Not everyone thinks you should maximize EV. Some people are risk-averse in one way or another. Others think there are important moral objectives that constrain or compete with maximizing EV, such as respecting people’s rights or reducing suffering.

Moreover, not everyone thinks that the value of an action is equivalent to its net welfare impacts. For instance, some people think that the value of an action is determined by its priority-adjusted welfare impacts—namely prioritarians. On their view, benefits to the less well-off matter more, morally, than benefits to the better off.

In this section, we discuss the significance of alternatives to the two assumptions above.

Uncertainty about EV maximization

Moral decision-making involving human-animal tradeoffs involves deep uncertainties. While we can make (and have made) some progress on the key issues, we still have to make decisions in the interim. EV maximization is one way to make those decisions. However, it isn’t the only way.

Instead, we could be risk-averse in one way or another. Typically, when people think about risk aversion, they think about wanting to avoid worst-case scenarios. All else equal, a decision procedure that is risk-averse about worst cases will probably recommend allocating more resources to corporate campaigns for chickens than to disease-averting interventions for humans, given the magnitude of the downside risk should chickens matter a great deal.

However, we can be risk-averse in other ways. If you’re ambiguity-averse, you avoid taking actions on unknown probabilities. Therefore, if you’re more uncertain about the value of animal projects than human ones, you may prioritize the latter. Alternatively, if you’re averse to failing to make a difference—i.e., having your actions come to nothing—then, if you think that humans are much more likely to matter than chickens, you might be more disposed to prioritizing human welfare over chicken welfare, as that would reduce your risk of throwing your money away on beings who aren’t morally important.

It’s an open question whether there are other important forms of risk aversion and, if so, what they favor relative to these options. It’s also important to consider the normative foundations for these forms of risk aversion. However, it’s clear that if we reject EV maximization, that will have significant implications for cause prioritization, perhaps as large as or larger than those associated with different capacity for welfare estimates.

Uncertainty about simple aggregation

Many moral frameworks imply that the value of an action is not equivalent to its net welfare impacts. For instance, according to some partially aggregative axiologies (PAAs), there’s no number of mild headaches that’s as morally weighty as a person being tortured. So, even if the net welfare impact of preventing a million mild headaches would be greater than the net welfare impact of preventing a person from being tortured, it doesn’t follow that the value of preventing the headaches would be greater than the value of preventing the person from being tortured. And if we think that chickens have much less capacity for welfare than humans, then we might view even severe harms to chickens as akin to mild headaches. So, even if the net welfare impact of helping chickens is greater than the net welfare impact of preventing childhood mortality, the value of helping chickens could be lower.

In other words, on some PAAs, given sufficiently large differences in capacity for welfare between humans and chickens, chicken welfare can drop out of the moral calculation, as the harms to each individual chicken are more like mild headaches compared to childhood deaths. By contrast, the simple utilitarian never thinks that small quantities of welfare drop out of the calculation, which is why utilitarians face standard objections about many trivial benefits summing to outweigh a single catastrophic harm.

Prioritarianism is another framework on which the value of an action isn’t equivalent to its net welfare impacts. Instead, prioritarianism says that the value of an action is equal to its net priority-adjusted welfare impacts. With that in mind, let’s suppose that many farmed chickens have net negative lives and that the humans who are vulnerable to malaria have net positive lives. In that case, the chickens are worse off than the humans. So, if prioritarianism is true, then alleviating a chicken’s suffering relative to helping a human is more important, morally, than the magnitudes of the respective welfare benefits would suggest: because the chicken is less well-off than the human, the welfare benefit to the chicken counts extra.
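One way to make the prioritarian point concrete is to pass welfare levels through a concave transform before taking differences. The sketch below is only an illustration: the exponential form, the parameter k, and the welfare levels are all assumptions invented for the example, and it presupposes that chicken and human welfare have already been placed on a common scale.

```python
import math


def priority_transform(welfare, k=1.0):
    """A strictly increasing, concave transform of welfare levels.

    The exponential form and the value of k are assumptions for
    illustration; prioritarians disagree about the right shape.
    """
    return (1.0 - math.exp(-k * welfare)) / k


def priority_adjusted_gain(before, after, k=1.0):
    """Priority-adjusted value of moving an individual from one level to another."""
    return priority_transform(after, k) - priority_transform(before, k)


# Equal raw welfare gains of 0.1, starting from a net negative vs. a net positive life.
chicken_gain = priority_adjusted_gain(before=-0.5, after=-0.4)
human_gain = priority_adjusted_gain(before=0.5, after=0.6)
print(chicken_gain > human_gain)  # True
```

Because the transform is concave, the same raw gain is worth more the lower the starting point, which is the sense in which the benefit to the worse-off individual counts extra.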

Finally, we might not think that the value of an action is equivalent to its net welfare impacts due to considerations in population ethics. Corporate campaigns for chickens may well have implications for the number of chickens who come into existence. Suppose the intervention causes more chickens to come into existence by causing producers to switch to slower-growing breeds. If, for instance, we endorse a critical level view in population ethics, we can’t assess whether this would be good without knowing whether these additional chickens would have lives that are above the critical welfare level. Various other positions in population ethics will have similar implications.

The significance of uncertainty about our two starting assumptions

It matters whether we’re uncertain about EV maximization or the claim that the value of an action is equivalent to its net welfare impacts. If these assumptions are in question, then the highest-value research may not be on interspecies welfare comparisons, as our choice of decision theory and/or aggregation method could drive our decision-making given many different capacity for welfare estimates.

Independently, we should be sensitive to these uncertainties, as it’s easy to mistake disagreements about these assumptions for disagreements about other matters. For instance, someone might feel skeptical about the view that human and animal welfare have roughly the same value just because they think it would follow that we ought to prioritize animal-focused interventions over the best interventions in global health. However, there are lots of reasons why it might not be true that you ought to prioritize animal-focused interventions even if human and animal welfare have roughly the same value. For instance, maybe we have duties of justice to the global poor that trump our impartial reasons to help animals. In other words, maybe we shouldn’t just maximize EV. Or, maybe there’s something especially valuable about helping conspecifics (literally, individuals of the same species). In other words, maybe the value of an action isn’t equivalent to its net welfare impacts. If we forget these background assumptions, then we’re forgetting major issues that could affect our intuitions elsewhere.

Project proposals

In what remains, we sketch a few projects that seem particularly valuable.

Beyond hedonism

The first round of the Moral Weight Project assumed hedonism, according to which pleasure and pain are the sole determinants of welfare. However, we might be skeptical of hedonism. How should we estimate differences in capacity for welfare given other theories of welfare?

This project develops methods for estimating differences in capacity for welfare based on a standard desire theory and a standard objective list theory. We consider several strategies for estimating differences.

First, we consider a range of philosophical arguments for and against thinking that particular abilities—such as having richer desires or being able to engage in artistic endeavors—bear on individuals’ capacity for welfare.

Second, we consider a thought experiment-based approach that allows people to consider how they would make tradeoffs between goods of different kinds, such as positive hedonic experiences and major personal achievements (which might matter for welfare on both views).

Third, we consider approaches that are structurally similar to the Moral Weight Project, in that they look for behavioral and neurophysiological proxies for the determinants of welfare according to a desire theory and an objective list theory.

Stated and revealed preferences

There’s good reason to try to generate moral weights from first principles, as that can reduce the risk that various biases drive our judgments. However, it may also be valuable to use revealed and stated preferences to calibrate our thinking about what common sense actually says.

A revealed preferences approach would involve examining a wide range of tradeoffs that people actually make between humans and animals, working out the moral weights to which their behavior commits them.

A stated preference approach could take two forms, as surveys can be holistic or decompositional. Holistic surveys solicit people’s views about tradeoffs between humans and animals, attempting to identify patterns in those tradeoffs that can be converted into moral weights. Decompositional surveys solicit people’s views about factors that are relevant conditional on certain moral assumptions. Then, these views can be combined with those moral assumptions to produce moral weights.

These surveys can focus on (ostensible) experts, stakeholders, or the general public.

It’s an open question whether there are moral experts (e.g., ethics professors, members of the clergy, Eliezer, etc.). If there are, then it’s possible to pursue the holistic strategy. If not, then there may still be experts regarding the various factors that emerge in decompositional surveys (e.g., asking comparative cognition researchers and consciousness scientists about variation in the possible intensities of pain states). Insofar as we think that some individuals have special insights into the relevant questions, we have reason to ask them what they think.
Alternatively, we might survey stakeholders. These could include those allocating money, those people whose money is being allocated, potential human beneficiaries, human representatives of potential animal beneficiaries, and perhaps a range of others. Insofar as we think that our decision procedure should be broadly democratic, we have reason to ask stakeholders what they think.
Alternatively, we might survey the general public. Insofar as we can trust the wisdom of crowds, we may have reason to ask the general public what it thinks.

Scalar Proxies

The Moral Weight Project explored traits that may bear on the intensities of animals’ valenced states. However, given the limits of current knowledge, we focused on binary rather than scalar assessments of traits (i.e., we assessed traits as possessed or not rather than on a continuous scale). There were practical reasons for this: we relied on an existing body of research that is somewhat unsystematic, and comparative cognition has yet to coordinate on scales to assess most traits. Still, it’s plausible that animals vary in the degree to which they possess the relevant traits; moreover, it’s plausible that this variation matters for estimating capacity for welfare.

So, this project does three things. First, it considers the possibilities for creating uniform scales for familiar traits that can be applied across the phylogenetic tree. Second, it explores whether there are other traits that may be useful for estimating capacity for welfare that already have scales associated with them, including a range of neurophysiological traits. Finally, it investigates whether, in the absence of suitable trait-specific scales, we can infer plausible degrees from overall patterns of other cognitive traits, under the assumption that cognitive sophistication overall correlates with the sophistication of specific capacities.

Pain Category Aggregation

The Welfare Footprint Project (WFP) tries to quantify the amount of time that animals spend in four pain categories: (1) annoying, (2) hurtful, (3) disabling, and (4) excruciating. It’s silent on the tradeoff rates between these categories. Different estimates of the tradeoff rates can produce radically different estimates of the total welfare burdens associated with given conditions. So, it matters that we reduce our uncertainty about the relationships between these pain categories. The aim of this project is to consider the implications of animals’ attentional capacities for these tradeoff rates. That is, suppose that the severity of pain is a linear function of the proportion of attention it consumes. Then, we can assess the lowest proportion of attention that’s plausibly attributed to annoying pain. Since no pain can consume more than 100% of attention, that lower bound sets an upper bound on the severity of excruciating pain relative to annoying pain. For instance, if it’s implausible that annoying pain consumes less than 1% of an animal’s attention, then it follows that excruciating pain is no more than 100x as bad as annoying pain.
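The attention-based bound can be sketched in a few lines of Python. This is only an illustration under the linearity assumption stated above: the function names, the hours in each category, and both sets of tradeoff weights are hypothetical and are not WFP figures.

```python
CATEGORIES = ("annoying", "hurtful", "disabling", "excruciating")


def max_severity_ratio(min_attention_share):
    """Upper bound on (worst pain severity / mildest pain severity).

    Assumes severity is linear in the share of attention consumed, so the
    worst pain uses at most 100% of attention and the mildest pain uses at
    least `min_attention_share`.
    """
    if not 0.0 < min_attention_share <= 1.0:
        raise ValueError("attention share must be in (0, 1]")
    return 1.0 / min_attention_share


def respects_bound(weights, bound):
    """True if the largest category weight is at most `bound` times the smallest."""
    return max(weights.values()) / min(weights.values()) <= bound


def welfare_burden(hours, weights):
    """Total burden: hours spent in each category times that category's weight."""
    return sum(hours[c] * weights[c] for c in CATEGORIES)


# Hypothetical inputs, for illustration only.
bound = max_severity_ratio(0.01)  # annoying pain uses at least 1% of attention
hours = {"annoying": 100.0, "hurtful": 50.0, "disabling": 10.0, "excruciating": 0.1}
modest = {"annoying": 1.0, "hurtful": 5.0, "disabling": 25.0, "excruciating": 50.0}
steep = {"annoying": 1.0, "hurtful": 10.0, "disabling": 1_000.0, "excruciating": 100_000.0}

for name, weights in (("modest", modest), ("steep", steep)):
    print(name, welfare_burden(hours, weights), respects_bound(weights, bound))
```

The two weight sets give very different total burdens for the same hours, which is why narrowing the tradeoff rates matters. Under the stated assumption, the bound would rule out the steeper set.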

The WFP is in the process of developing parallel pleasure categories. So, we could also run a parallel version of this project for positive affective states.

Animal Welfare in Benefit-Cost Analysis

Many governments select policies partially based on benefit-cost analysis (BCA)—i.e., the practice of assessing all costs and benefits of the regulatory options and selecting the one that maximizes net benefits. A key feature of BCA is that, insofar as possible, all benefits and costs are expressed in monetary terms, allowing them all to be compared on a common scale. Some benefits and costs, of course, are naturally expressed in dollars; others aren’t. However, economists have various methods for “monetizing” the value of cleaner air, reductions in vehicular deaths, and the preservation of old-growth forest, imperfect though they may be.

At present, there’s no agreed-upon way to monetize the value of animal welfare. The aim of this project is to explore ways of using capacity for welfare estimates to generate prices for welfare impacts on animals. We’ve already proposed one idealized methodology for doing this, but it may not be politically feasible. So, this project scopes nonideal options that balance the concern to give a principled account of animals’ importance while accommodating interest in consumer preferences.

Welfare Propensities

We’ve said that a capacity for welfare estimate can give us a principled way of converting species-relative welfare assessments into human welfare impact equivalents. However, this might be a simplification in an important respect: we can know an animal’s welfare range without knowing its propensity to “use” that range. That is, suppose that organisms A, B, C, and D have identical welfare ranges. Still, A could have a propensity to have valenced experiences at the extreme ends of its range, B could have a propensity never to experience valence at the extremes, and C and D could have propensities towards positive and negative welfare, respectively. But are there any reasons to believe that there are such propensities? If so, what might they be? What evolutionary pressures could have produced one over the other? And insofar as we’re unsure about the existence of such propensities, can we safely ignore them? Or are there ways of incorporating them into decision-making despite our various uncertainties? On the assumption that there is such a method, one output of this project would be “welfare propensity-adjusted” capacity for welfare estimates, which are better positioned to serve as moral weights in utilitarian decision-making.
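To make the A/B/C/D example concrete, here is a minimal Python sketch. It assumes that each organism's welfare range is normalized to the interval [-1, 1] and that a propensity can be represented by a sample of positions within that range. The sample values and the two summary statistics are hypothetical illustrations, not a proposed adjustment method.

```python
from statistics import mean

# Hypothetical positions within an identical, normalized welfare range [-1, 1].
positions = {
    "A": [-1.0, -0.9, 0.9, 1.0],    # tends towards both extremes
    "B": [-0.2, -0.1, 0.1, 0.2],    # never reaches the extremes
    "C": [0.2, 0.5, 0.7, 0.9],      # tends towards positive welfare
    "D": [-0.9, -0.7, -0.5, -0.2],  # tends towards negative welfare
}


def typical_position(samples):
    """Average signed position in the range (net positive or net negative)."""
    return mean(samples)


def typical_intensity(samples):
    """Average distance from the neutral point, ignoring valence."""
    return mean(abs(s) for s in samples)


for organism, samples in positions.items():
    print(organism, round(typical_position(samples), 3), round(typical_intensity(samples), 3))
```

A and B have the same typical position but very different typical intensities, while C and D differ in sign. A welfare range estimate alone would treat all four as identical.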

Alternatives to Welfare Ranges

The welfare range approach estimates an individual's welfare state by estimating the proportion of their total possible welfare range that they experience. One objection to this methodology is that it asks us to answer a very difficult question (What's the total range of well-being a chicken could possibly experience?) in order to answer seemingly simpler ones (How does the welfare impact of footpad dermatitis for a chicken compare to the welfare impact of extreme headache from malaria for humans?).^[10] In this project, we explore alternatives to the welfare range approach. Are there methods we can use to directly compare the value of particular experiences across species without knowing what those species’ overall welfare ranges are? We might use the methods mentioned above, such as measuring the attention costs of an experience or asking people about their hypothetical preferences between experiences (e.g., suffering from a malaria headache for an hour as a human versus suffering confinement in a cage for a day as a chicken). By comparing these alternatives to the welfare range methodology, we can get a better appreciation of the benefits and limitations of each approach.

States Worse than Death?

For many of the cost-effectiveness comparisons we want to make, the relative welfare differences between states are all that we need. But for other kinds of decisions, especially those involving population ethics, the absolute value of a state also matters. As we have noted, many actions to benefit chickens will change the number of chickens who come into existence. If there is a threshold amount of welfare an animal must achieve in order for that animal’s life to be worth living (whether the zero/neutral state or some other critical level), then an additional chicken whose welfare falls below that threshold wouldn’t add value by existing. While we’re unlikely to make progress on foundational issues in population ethics, we may be able to clarify whether chickens’ lives are below the relevant threshold. We would use the euthanasia guidelines that veterinarians have developed for companion animals to assess whether it would be appropriate to recommend euthanasia for chickens at various life stages. If so, that suggests that chickens’ lives are not worth living at that life stage. And if those life stages make up the bulk of the life, that’s some evidence that their lives are net negative on the whole.

Moral Weights Under Uncertainty

Something like Rethink Priorities’ Moral Parliament Tool could be used to generate moral weights given moral uncertainty. The steps could include:

Assign credences to a series of worldviews that include moral weights.
Select an aggregation method.
Take the resulting allocation and reverse-engineer the moral weight to which it would commit you if you were a straightforward utilitarian.
Repeat this process for a range of plausible credence assignments and aggregation methods.
Treat the geometric mean of the resulting reverse-engineered moral weights as the moral weight that ought to be used under moral uncertainty in standard utilitarian decision-making.
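
The last three steps can be sketched in Python. This is a hedged illustration, not the Moral Parliament Tool's method. It assumes a two-cause budget split and square-root diminishing returns, so that a utilitarian's value for cause i is w_i * k_i * sqrt(x_i). Under that assumption, the optimal split satisfies w_animal / w_human = (k_human / k_animal) * sqrt(x_animal / x_human). The function name, the scale parameter, and the scenario allocations are all hypothetical.

```python
import math
from statistics import geometric_mean


def implied_moral_weight(animal_share, human_to_animal_scale=1.0):
    """Animal:human moral weight implied by a budget split.

    Assumes value_i = w_i * k_i * sqrt(x_i) for each cause, where x_i is the
    budget share. `human_to_animal_scale` is the assumed ratio k_human / k_animal.
    """
    if not 0.0 < animal_share < 1.0:
        raise ValueError("animal_share must be strictly between 0 and 1")
    human_share = 1.0 - animal_share
    return human_to_animal_scale * math.sqrt(animal_share / human_share)


# Hypothetical animal-cause budget shares, one per combination of credence
# assignment and aggregation method.
scenario_animal_shares = [0.05, 0.20, 0.50, 0.80]

implied = [implied_moral_weight(share) for share in scenario_animal_shares]
weight_under_uncertainty = geometric_mean(implied)

print([round(w, 3) for w in implied])
print(round(weight_under_uncertainty, 3))
```

The geometric mean is less sensitive than the arithmetic mean to a few very large implied weights, which matters when the scenarios span several orders of magnitude. A different returns assumption would change the reverse-engineering formula.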

Acknowledgments

The post was written by Rethink Priorities' Worldview Investigations Team. This post is a project of Rethink Priorities, a global priority think-and-do tank, aiming to do good at scale. We research and implement pressing opportunities to make the world better. We act upon these opportunities by developing and implementing strategies, projects, and solutions to key issues. We do this work in close partnership with foundations and impact-focused non-profits or other entities. If you're interested in Rethink Priorities' work, please consider subscribing to our newsletter. You can explore our completed public work here.

^[1]
This phrasing is intentionally vague. The simplistic version of the calculus just focuses on the welfare gain for the child; a more sophisticated analysis would incorporate the indirect effects on family members, the community, future generations, etc.
^[2]
Depending on how we understand the pluralistic/non-pluralistic distinction, questions like this arise under non-pluralistic theories too. Even given hedonism, for instance, we might ask how we compare differences in the intensity and duration of pain.
^[3]
Pluralism takes different forms. For instance, we might be attracted to a form of hedonism where more than one feature of valenced states matters—e.g., kind and intensity. But in that case, we’ll again face hard questions about the relationships between these states. Valenced experiences come in many different types, such as physical pain, emotional pain from grief, frustration at the lack of free movement, etc. Are physical pains worse than emotional pains? Is the opposite true? Can less intense emotional pleasures outweigh more intense physical pains?
^[4]
For the original version of this work, see the Moral Weight Project Sequence.
^[5]
These challenges are even greater when we consider animals who are more dissimilar to us and about whom less research has been done. For example, compared to chickens, we are less confident about whether many invertebrates, such as shrimp, are sentient in the first place; what constitutes evidence of their welfare; and how best to study the factors that appear to be relevant. Reducing our uncertainty about these questions requires research from many fields, including consciousness science, animal welfare, veterinary science, and more.
^[6]
Mellor, D. J. (2017). Operational Details of the Five Domains Model and Its Key Applications to the Assessment and Management of Animal Welfare. Animals, 7(8), 60. https://doi.org/10.3390/ani7080060
^[7]
Welfare Quality (2009). Assessment protocol for poultry (broilers, laying hens). Welfare Quality Consortium, Lelystad, Netherlands, https://www.welfarequalitynetwork.net/media/1293/poultry-protocol-watermark-6-2-2020.pdf.
^[8]
A different way to appreciate this problem is to note that there are more general uncertainties about (a) the tradeoff rates between positive and negative welfare and (b) the tradeoff rate between welfare states with the same valence (positive or negative) but different magnitudes. For instance, we might not think that a minute of pain trades off cleanly with a minute of equally intense pleasure, instead taking the view that pains have larger negative welfare impacts than equally intense pleasures have positive welfare impacts. Likewise, we might think that some pains—e.g., the emotional pain associated with the loss of a child—are exponentially worse than even excruciating physical pains. (Or vice versa!)
^[9]
An additional (though smaller) factor is that capacity for welfare may be asymmetrical, in the sense that some animals may have greater capacity for negative welfare than positive welfare. However, unless we posit very small differences in capacity for welfare across species, such asymmetries are probably ignorable.
^[10]
Following Vapnik’s famous advice: "When solving a problem of interest, do not solve a more general problem as an intermediate step."

Vasco Grilo🔸 @ 2025-12-12T11:26 (+4)

Thanks for sharing!

What would you do to decrease the uncertainty about interspecies comparisons of expected hedonistic welfare as much as possible with $1k, $10k, $100k, $1M, and $10M? The picks should account not only for the outcomes of the research which was directly funded, but also for any additional research that is done to decrease the uncertainty further (supported by other funds).

I think Ambitious Impact (AIM), Animal Charity Evaluators (ACE), and the Animal Welfare Fund (AWF) use the welfare ranges initially presented by Rethink Priorities (RP), or the ones in Bob's book, as if they are within a factor of 10 of the right estimates (such that these could be 10% to 10 times as large). However, I believe the differences could be much larger. For example, the estimate in Bob's book for the welfare range of shrimps is 8.0% that of humans, but I would say it would be quite reasonable for someone to have a best guess of 10^-6, the ratio between the number of neurons of shrimps and humans.

SummaryBot @ 2024-11-04T14:49 (+1)

Executive summary: Estimating the relative moral value of human vs. animal welfare involves deep philosophical and empirical uncertainties, but is crucial for cause prioritization between human- and animal-focused interventions.

Key points:

Estimating animals' capacity for welfare is challenging, requiring choices about welfare theories, evidence of variation, and quantification methods.
Existing animal welfare assessment tools lack cardinal scales needed to compare across species, limiting their usefulness for prioritization.
Alternatives to expected value maximization and simple welfare aggregation could significantly impact cause prioritization decisions.
Key research priorities include: developing non-hedonic welfare estimates, eliciting preferences on moral weights, creating scalar trait measures, determining pain intensity tradeoffs, and incorporating animal welfare into policy analysis.
Resolving uncertainties about welfare propensities, alternatives to welfare ranges, and whether farm animal lives are net negative could refine prioritization methods.
Approaches for generating moral weights under moral uncertainty are needed to inform decision-making given remaining philosophical disagreements.

This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.