Problems and Solutions in Infinite Ethics

By Ben_West🔸 @ 2015-01-01T20:47 (+14)

Summary: The universe may very well be infinite, and hence contain an infinite amount of happiness and sadness. This causes several problems for altruists; for example: we can plausibly only affect a finite subset of the universe, and an infinite quantity of happiness is unchanged by the addition or subtraction of a finite amount of happiness. This would imply that all forms of altruism are equally ineffective.

Like everything in life, the canonical reference in philosophy about this problem was written by Nick Bostrom. However, I found that an area of economics known as "sustainable development" has actually made much further progress on this subject than the philosophy world. In this post I go over some of what I consider to be the most interesting results.

NB: This assumes a lot of mathematical literacy and familiarity with the subject matter, and hence isn't targeted to a general audience. Most people will probably prefer to read my other posts:

Ridiculous math things which ethics shouldn't depend on but does, which includes such interesting tidbits as: why standard calculus teaches children to be immoral and why the Banach-Tarski paradox implies we should means test Medicare and
Kill the young people, whose name speaks for itself

1. Summary of the most interesting results

There’s no ethical system which incorporates all the things we might want.
Even if we have pretty minimal requirements, satisfactory ethical systems might exist but we can’t prove their existence, much less actually construct them
Discounted utilitarianism, whereby we value people less just because they are further away in time, is actually a pretty reasonable thing despite philosophers considering it ridiculous.
1. (I consider this to be the first reasonable argument for locavorism I've ever heard)

2. Definitions

In general, we consider a population to consist of an infinite utility vector (u₀,u₁,…) where u_i is the aggregate utility of the generation alive at time i. Utility is a bounded real number (the fact that economists assume utility to be bounded confused me for a long time!). Our goal is to find a preference ordering over the set of all utility vectors which is in some sense “reasonable”. While philosophers have understood for a long time that finding such an ordering is difficult, I will present several theorems which show that it is in fact impossible.

Due to a lack of latex support I’m going to give English-language definitions and results instead of math-ey ones; interested people should look at the papers themselves anyway.

3. Impossibility Results

3.0 Specific defs

Strong Pareto: if you can make a generation better off, and none worse off, you should.
Weak Pareto: if you can make every generation better off, you should.
Intergenerational equity: utility vectors are unchanged in value by any permutation of their components.
- There is an important distinction here between allowing a finite number of elements to be permuted and an infinite number; I will refer to the former as “finite intergenerational equity” and the latter as just “intergenerational equity”
Ethical relation: one which obeys both weak Pareto and finite intergenerational equity
Social welfare function: an order-preserving function from the set of populations (utility vectors) to the real numbers

3.1 Diamond-Basu-Mitra Impossibility Result¹

There is no social welfare function which obeys Strong Pareto and finite intergenerational equity. This means that any sort of utilitarianism won’t work, unless we look outside the real numbers.

3.2 Zame's impossibility result²

If an ordering obeys finite intergenerational equity over [0,1]^N, then almost always we can’t tell which of two populations is better
1. (i.e. the set of populations {X,Y: neither X<Y nor X>Y} has outer measure one)
The existence of an ethical preference relation on [0,1]^N is independent of ZF plus the axiom of choice

4. Possibility Results

We’ve just shown that it’s impossible to construct or even prove the existence of any useful ethical system. But not all hope is lost!

The important idea here is that of a “subrelation”: < is a subrelation to <’ if x<y implies x<’y.

Our arguments will work like this:

Suppose we could extend utilitarianism to the infinite case. (We don't, of course, know that we can extend utilitarianism to the infinite case. But suppose we could.) Then A, B and C must follow.

Technically: suppose utilitarianism is a subrelation of <. Then < must have properties A, B and C.

Everything in this section comes from (3). This is a great review of the literature.

4.1 Definition

Utilitarianism: we extend the standard total utilitarianism ordering to infinite populations in the following way: suppose there is some time T after which every generation in X is at least as well off as every generation in Y, and that the total utility in X before T is at least as good as the total utility in Y before T. Then X is at least as good as Y.
- Note that this is not a complete ordering! In fact, as per Zame’s result above, the set of populations it can meaningfully speak about has measure zero.
Partial translation scale invariance: suppose after some time T, X and Y become the same. Then we can add any arbitrary utility vector A to both X and Y without changing the ordering. (I.e. X > Y ó X+A > Y+A)

4.2 Theorem

Utilitarianism is a subrelation of > if and only if > satisfies strong Pareto, finite intergenerational equity and partial translation scale invariance.
1. This means that if we want to extend utilitarianism to the infinite case, we can’t use a social welfare function, as per the above Basu-Mitra result

4.3 Definition

Overtaking utilitarianism: suppose there is some point T after which the total utility of the first N generations in X is always greater than the total utility of the first N generations in Y (given N > T). Then X is better than Y.
- Note that utilitarianism is a subrelation of overtaking utilitarianism
Weak limiting preference: suppose that for any time T, X truncated at time T is better than Y truncated at time T. Then X is better than Y.

4.4 Theorem

Overtaking utilitarianism is a subrelation of < if and only if < satisfies strong Pareto, finite intergenerational equity, partial translation scale invariance, and weak limiting preference

4.5 Definition

Discounted utilitarianism: the utility of a population is the sum of its components, discounted by how far away in time they are
Separability:
- Separable present: if you can improve the first T generations without affecting the rest, you should
- Separable future: if you can improve everything after the first T generations without affecting the rest, you should
Stationarity: preferences are time invariant
Weak sensitivity: for any utility vector, we can modify its first generation somehow to make it better or worse

4.6 Theorem

The only continuous, monotonic relation which obeys weak sensitivity, stationary, and separability is discounted utilitarianism

4.7 Definition

Dictatorship of the present: there’s some time T after which changing the utility of generations doesn’t matter

4.8 Theorem

Discounted utilitarianism results in a dictatorship of the present. (Remember that each generation’s utility is assumed to be bounded!)

4.9 Definition

Sustainable preference: a continuous ordering which doesn’t have a dictatorship of the present but follows strong Pareto and separability.

4.10 Theorem

The only ordering which is sustainable is to take discounted utilitarianism and add an “asymptotic” part which ensures that infinitely long changes in utility matter. (Of course, finite changes in utility still won't matter.)

5. Conclusion

I hope I've convinced you that there's a "there" there: infinite ethics is something that people can make progress on, and it seems that most of the progress is being made in the field of sustainable development.

Fun fact: the author of the last theorem (the one which defined "sustainable") was one of the lead economists on the Kyoto protocol. Who says infinite ethics is impractical?

6. References

Basu, Kaushik, and Tapan Mitra. "Aggregating infinite utility streams with intergenerational equity: the impossibility of being Paretian." Econometrica 71.5 (2003): 1557-1563. http://folk.uio.no/gasheim/zB%26M2003.pdf
Zame, William R. "Can intergenerational equity be operationalized?." (2007). https://tspace.library.utoronto.ca/bitstream/1807/9745/1/1204.pdf
Asheim, Geir B. "Intergenerational equity." Annu. Rev. Econ. 2.1 (2010): 197-222.http://folk.uio.no/gasheim/A-ARE10.pdf

undefined @ 2015-01-03T13:41 (+7)

"The universe may very well be infinite, and hence contain an infinite amount of happiness and sadness. This causes several problems for altruists; for example: we can plausibly only affect a finite subset of the universe, and an infinite quantity of happiness is unchanged by the addition or subtraction of a finite amount of happiness. This would imply that all forms of altruism are equally ineffective."

I have no particular objection to those, unlike me, interested in aggregative ethical dilemmas, but I think it at least preferable that effective altruism - a movement aspiring to ecumenical reach independent of any particular ethical presuppositions - not automatically presume some cognate of utilitarianism. The repeated posts on this forum about decidedly abstract issues of utilitarianism with little or no connection with the practice of charitable giving is, perhaps, not particularly helpful in this regard. Most basically however, I object to your equivalence of altruism and utilitarianism as a matter of form: that should not be assumed, but qualified.

undefined @ 2015-01-03T18:12 (+3)

The problems with extending standard total utilitarianism to the infinite case are the easiest to understand, which is why I put that in the summary, but I don't think most of the article was about that.

For example, the fact that you can't have intergenerational equity (Thm 3.2.1) seems pretty important no matter what your philosophical bent.

undefined @ 2015-01-04T01:48 (+1)

A minuscule proportion of political philosophy has concerned itself with aggregative ethics, and in my being a relatively deep hermeneutical contextualist, I take what is important to them to be what they thought to be important to them, and thus your statement - that intergenerational equity is perennially important - as patently wrong. Let alone people not formally trained in philosophy.

The fact I have to belabour that most of those interested in charitable giving are not by implication automatically interested in the 'infinity problem' is exactly demonstrative of my initial point, anyhow, i.e. of projecting highly controversial ethical theories, and obscure concerns internal to them, as obviously constitutive of, or setting the agenda for, effective altruism.

undefined @ 2015-01-02T16:13 (+6)

One thing I liked about this post is that it was written in English, instead of math symbols. I find it extremely hard to read a series of equations without someone explaining them verbally. Overall I thought the clarity was fairly good.

undefined @ 2015-01-02T21:28 (+5)

Thanks for the summary. :)

I don't understand why they're working on infinite vectors of future populations, since it looks very likely that life will end after a finite length of time into the future (except for Boltzmann brains). Maybe they're thinking of the infinity as extended in space rather than time? And of course, in that case it becomes arbitrary where the starting point is.

we can plausibly only affect a finite subset of the universe, and an infinite quantity of happiness is unchanged by the addition or subtraction of a finite amount of happiness.

Actually, every action we take makes an infinite difference. I was going to write more explanation here but then realized I should add it to my essay on infinity: here.

undefined @ 2015-01-03T13:52 (+2)

Thanks Brian – insightful as always.

It might be the case that life will end after time T. But that's different than saying it doesn't matter whether life ends after time T, which a truncated utility function would say.
(But of course see theorem 4.8.1 above)
Thanks for the insight about multiverses – I haven't thought much about it. Is what you say only true in a level one multiverse?

undefined @ 2015-01-03T16:01 (+1)

1) Fair enough. Also, there's some chance we can affect Boltzmann brains that will exist indefinitely far into the future. (more discussion)

3) I added a new final paragraph to this section about that. Short answer is that I think it works for any of Levels I to III, and even with Level IV it depends on your philosophy of mathematics.

(Let me know if you see errors with my facts or reasoning.)

undefined @ 2015-01-03T18:06 (+1)

1) interesting, thanks! 3) I don't think I know enough about physics to meaningfully comment. It sounds like you are disagreeing with the statement "we can plausibly only affect a finite subset of the universe"? And I guess more generally if physics predicts a multiverse of order w_i, you claim that we can affect w_i utils (because there are w_i copies of us)?

michaelchen @ 2021-05-06T03:23 (+4)

The blog post links "Ridiculous math things which ethics shouldn't depend on but does" and "Kill the young people" are dead. You can find archived versions of the posts at the following links:

MichaelStJules @ 2021-10-01T06:15 (+2)

(Of course, finite changes in utility still won't matter.)

The discounted utilitarianism term should still be sensitive to those, at least enough to break ties. Discounted utilitarianism satisfies strong Pareto, after all.

undefined @ 2016-08-29T23:34 (+1)

This is really interesting stuff, and thanks for the references.

A few comments:

It'd be nice to clarify what: "finite intergenerational equity over [0,1]^N" means (specifically, the "over [0,1]^N" bit).

Why isn't the sequence 1,1,1,... a counter-example to Thm4.8 (dictatorship of the present)? I'm imagining exponential discounting, e.g. of 1/2 so the welfare function of this should return 2 (but a different number if u_t is changed, for any t).

undefined @ 2016-08-30T20:09 (+1)

Thanks for the comments!

Regarding your second question: the idea is that if x is better than y, then there is a point in time after which improvements to y, no matter how great, will never make y better than x.

So in your example where there is a constant discount rate of one half: (1, 1, 1, (something)) will always be preferred to (0, 0, 0, (something else)), no matter what we put in for (something) and (something else). In this sense, the first three generations "dictate" the utility function.

As you point out, there is no single time at which dictatorship kicks in, it will depend on the two vectors you are comparing and the discount rate.

undefined @ 2015-01-13T22:11 (+1)

I am curious about your definitions: intergenerational equity and finite intergenerational equity. I am aware of that some literature suggests that finite permutations are not enough to ensure equity among an infinite number of generations. The quality of the argumentation in this literature is often not so good. Do you have a reference that gives a convincing argument for why your notion of intergenerational equity is appropriate and/or desirable? I hope this does not sound like I am questioning whether your definition is consistent with the literature: I am only asking out of interest.

undefined @ 2015-01-18T02:42 (+1)

Good question. It's easiest to imagine the one-dimensional spatial case like (...,L2, L1, me, R1, R2, ...) where {Li} are people to my left and {Ri} are those to my right. If I turn 180° this permutes the vector to (..., R1, me, L1, ...) Which is obviously an infinite number of permutations, but seems morally unobjectionable.

undefined @ 2015-01-18T20:17 (+1)

Thank you for the example. I have two initial comments and possibly more if you are interested. 1. In all of the literature on the problem, the sequences that we compare specify social states. When we compare x=(x_1,x_2,...) and y=(y_1,y_2,...) (or, as in your example, x=(....,x_0,x_1,x_2,...) and y=(...,y_0,y_1,y_2,...)), we are doing it with the interpretation x_t and y_t give the utility of the same individual/generation in the two possible social states. For the two sequences in your example, it does not seem to be the case that x_t and y_t give the utility of the same individual in two possible states. Rather, it seems that we are re-indexing the individuals. 2. I agree that moral preferences should generally be invariant to re-indexing, at least in a spatial context (as opposed to an intertermporal context). Let us therefore modify your example so that we have specified utilities x_t,y_t, where t ranges over the integers and x_t and y_t represent the utilities of people located at positions on a doubly infinite line. Now I agree that an ethical preference relation should be invariant under some (and possibly all) infinite permutations IF the permutation is performed to both sequences. But it is hard to give an argument for why we should have invariance under general permutations of only one stream.

The example is still unsatisfactory for two reasons. (i) since we are talking about intergenerational equity, the t in x_t should be time, not points in space where individuals live at the same time: it is not clear that the two cases are equivalent. (They may in fact be very different.) (ii) in almost all of the literature (in particular, in all three references in the original post), we consider one-sided sequences, indexed by time starting today and to the infinite future. Are you aware of example in this context?

undefined @ 2015-01-20T16:01 (+1)

Thank you for the thoughtful comment.

For the two sequences in your example, it does not seem to be the case that xt and yt give the utility of the same individual in two possible states. Rather, it seems that we are re-indexing the individuals.

This is true. I think an important unstated assumption is that you only need to know that someone has utility x, and you shouldn't care who that person is.

Now I agree that an ethical preference relation should be invariant under some (and possibly all) infinite permutations IF the permutation is performed to both sequences. But it is hard to give an argument for why we should have invariance under general permutations of only one stream.

I'm not sure what the two sequences you are referring to are. Anonymity constraints simply say that if y is a permutation of x, then x~y.

in almost all of the literature (in particular, in all three references in the original post), we consider one-sided sequences, indexed by time starting today and to the infinite future. Are you aware of example in this context?

It is a true and insightful remark that whether we consider vectors to be infinite or doubly infinite makes a difference.

To my mind, the use of vectors is misleading. What it means to not care about temporal location is really just that you treat populations as sets (not vectors) and so anonymity assumptions aren't really required.

I guess you could phrase that another way and say that if you don't believe in infinite anonymity, then you believe that temporal location matters. This disagrees with general utilitarian beliefs. Nick Bostrom talks about this more in section 2.2 of his paper linked above.

A more mathy way that's helpful for me is to just remember that the relation should be continuous. Say s_n(x) is a permutation of _n_ components. By finite anonymity we have that x~s_n(x) for any finite n. If lim {n -> infinity} s_n = y, yet y was morally different from x, the relation is discontinuous and this would be a very odd result.

undefined @ 2015-01-20T20:59 (+1)

I would not only say that "that you only need to know that someone has utility x, and you shouldn't care who that person is" is an unstated assumption. I would say that it is the very idea that anonymity intends to formalize. The question that I had and still have is whether you know of any arguments for why infinite anonymity is suitable to operationalize this idea.

Regarding the use of sequences: you can't just look at sets. If you do, all nontrivial examples with utilities that are either 0 or 1 become equivalent. You don't have to use sequences, but you need (in the notation of Vallentyne and Kagan (1997)), a set of "locations", a set of real numbers where utility takes values, and a map from the location set to the utility set.

Regarding permutations of one or two sequences. One form of anonymity says that x ~ y if there is a permutation, say pi, (in some specified class) that takes x to y. Another (sometimes called relative anonymity) says that if x is at least as good as y, then pi(x) is at least as good as pi(y). These two notions of anonymity are not generally the same. There are certainly settings where the fullblown version of the relative anonymity becomes a basic rationality requirement. This would be the case with people lined up on an infinite line (at the same point in time). But it is not hard to see its inappropropriateness in the intertemporal context: you would have to rank the following two sequences (periodic with period 1000) to be equivalent or non-comparable

x=(1,1,....,1,0,1,1,...,1,0,1,1,...,1,......) y=(0,0,....,0,1,0,0,...,0,1,0,0,...,0,......)

This connects to whether denying infinite anonymity implies that "temporal location matters". If x and y above are two possible futures for the same infinite-horizon society, then I think that any utilitarian should be able to rank x above y without having to be critisized for caring about temporal location. Do you agree? For those who do not, equity in the intertemporal setting is the same thing as equity in the spatial (fixed time) setting. What those people say is essentially that intergenerational equity is a trivial concept: that there is nothing special about time.

If you do not think that the sequences x and y above should be equivalent in the intergenerational context then I would be very interested to see another example of sequences (or whatever you replace them with) that are infinite permutations of each other, but not finite permutations of each other, and where you do think that equivalence should.

P.S

Regarding continuity arguments, I assume that the usefulness of such arguments depends on whether you can justify your notion of continuity by ethical principles rather than that they appear in the mathematical literature. Take x(n)=(0,0,....,1,0,0,...) with a 1 in the n:the coordinate. For every n we want x(n) to be equivalent to (1,0,0,....). In many topologies x(n) goes to (0,0,0,....), which would then give that (0,0,...) is just as good as (1,0,0,....).

undefined @ 2015-02-05T14:50 (+1)

The question that I had and still have is whether you know of any arguments for why infinite anonymity is suitable to operationalize this idea.

Maybe I am missing something, but it seems obvious to me. Here is my thought process; perhaps you can tell me what I am overlooking.

For simplicity, say that A is the assumption that we shouldn't care who people are, and IA is the infinite anonymity assumption. We wish to show A IA.

Suppose A. Observe that any permutation of people can't change the outcome, because it's not changing any information which is relevant to the decision (as per assumption A). Thus we have IA.
Suppose IA. Observe that it's impossible to care about who people are, because by assumption they are all considered equal. Thus we have A.
Hence A IA.

These seems so obviously similar in my mind that my "proof" isn't very insightful… But maybe you can point out to me where I am going wrong.

One form of anonymity says that x ~ y if there is a permutation, say pi, (in some specified class) that takes x to y. Another (sometimes called relative anonymity) says that if x is at least as good as y, then pi(x) is at least as good as pi(y). These two notions of anonymity are not generally the same.

I hadn't heard about this – thanks! Do you have a source? Google scholar didn't find much.

In your above example is the pi in pi(X) the same as the pi in pi(y)? I guess it must be because otherwise these two types of anonymity wouldn't be different, but that seems weird to me.

If x and y above are two possible futures for the same infinite-horizon society, then I think that any utilitarian should be able to rank x above y without having to be critisized for caring about temporal location. Do you agree?

I certainly understand the intuition, but I'm not sure I fully agree with it. The reason I think that x better than y is because it seems to me that x is a Pareto improvement. But it's really not – there is no generation in x who is better off than another generation in y (under a suitable relabeling of the generations).

I would be very interested to see another example of sequences (or whatever you replace them with) that are infinite permutations of each other, but not finite permutations of each other, and where you do think that equivalence should.

(0,1,0,1,0,1,...) and (1,0,1,0,1,0,...) come to mind.

undefined @ 2015-02-07T21:03 (+1)

The problem in your argument is the sentence "...any permutation of people can't change the outcome...". For example: what does "any permutation" mean? Should the stream be applied to both sequences? In a finite context, these questions would not matter. In the infinite-horizon context, you can make mistakes if you are not careful. People who write on the subject do make mistakes all the time. To illustrate, let us say that I think that a suitable notion of anonymity is FA: for any two people p1 and p2, p1's utility is worth just as much as p2's. Then I can "prove" that A FA by your method. The A -> FA direction is the same. For FA -> A, observe that if for any two people p1 and p2, p1's utility is worth just as much as p2's, then it is not possible to care about who people are.

This "proof" was not meant to illustrate anything besides the fact that if we are not careful, we will be wasting our time.

I did not get a clear answer to my question regarding the two (intergenerational) streams with period 1000: x=(1,1,...,1,0,1,1,,...) and y=(0,0,...,0,1,0,0,,...). Here x does not Pareto-dominate y.

Regarding (0,1,0,...) and (0,1,0,...): I am familiar with this example from some of the literature. Recall in the first post that I wrote that the argumentation in much of the literature is not so good? This is the literature that I meant. I was hoping for more.

undefined @ 2015-02-08T09:28 (+1)

I forgot the reference for relative anonymity: See the paper by Asheim, d'Aspremont and Banerjee (J. Math. Econ., 2010) and its references.

undefined @ 2015-01-05T07:25 (+1)

Some kind of nitpicky comments:

3.2: Note that the definition of intergenerational equity in Zame's paper is what you call finite intergenerational equity (and his definition of an ethical preference relation involves the same difference), so his results are actually more general than what you have here. Also, I don't think that "almost always we can’t tell which of two populations is better" is an accurate plain-English translation of "{X,Y: neither XY} has outer measure one", because we don't know anything about the inner measure. In fact, if the preference relation respects the weak Pareto ordering, then {X,Y: neither XY} has inner measure 0. So an ethical preference relation must be so wildly nonmeasurable that nothing at all can be said about the frequency with which we can't tell which of two populations is better.

4.1:

Partial translation scale invariance: suppose after some time T, X and Y become the same. Then we can add any arbitrary utility vector A to both X and Y without changing the ordering. (I.e. X > Y iff X+A > Y+A)

X+A and Y+A won't necessarily be valid utility vectors. I assume you also want to add the condition that they are.

4.3: What does "truncated at time T" mean? All utilities after time T replaced with some default value like 0?

4.5:

Weak sensitivity: for any utility vector, we can modify its first generation somehow to make it better

Since you defined utilities as being in the closed interval [0,1], if you have a utility vector starting with 1, you can't get anything better just by modifying the first generation, so weak sensitivity should never hold in any sensible preference relation. I'm guessing you mean that we can modify its first generation to make it either better or worse (not necessarily both, unless you switch to open-interval-valued utilities).

4.7: Your definition of dictatorship of the present naively sounded to me like it's saying "there's some time T after which changing utilities of generations cannot affect the ordering of any pairs of utility vectors." But from theorem 4.8, I take it you actually meant "for any pair of utility vectors X and Y such that X<Y, there exists a time T such that changing utilities of generations after T cannot reverse the preference to get X>=Y."