Moral strategies at different capability levels

By richard_ngo @ 2022-07-27T20:20 (+24)

This is a linkpost to http://thinkingcomplete.blogspot.com/2022/07/moral-strategies-at-different.html

Let’s consider three ways you can be altruistic towards another agent:

- Care-morality: you care about their welfare and try to promote it directly.
- Cooperation-morality: you cooperate with them for mutual benefit.
- Deference-morality: you defer to their judgment, values, or authority.

I think a lot of unresolved tensions in ethics come from seeing these types of morality as in opposition to each other, when they’re actually complementary: roughly speaking, care-morality is most appropriate towards agents much less capable than you, cooperation-morality towards agents of comparable capability, and deference-morality towards agents much more capable than you.

Cooperation-morality and deference-morality have the weakness that they can be exploited by the agents we hold those attitudes towards; and so we also have adaptations for deterring or punishing this (which I’ll call conflict-morality). I’ll mostly treat conflict-morality as an implicit part of cooperation-morality and deference-morality; but it’s worth noting that a crucial feature of morality is the coordination of coercion towards those who act immorally.

Morality as intrinsic preferences versus morality as instrumental preferences

I’ve mentioned that many moral principles are rational strategies in multi-agent environments, even for selfish agents. So when we’re modeling people as rational agents optimizing for some utility function, it’s not clear whether we should view those moral principles as part of their utility functions, or as part of their strategies. Some arguments for the former:

Some arguments for the latter:

The rough compromise which I use here is to:
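
To make the intrinsic-versus-instrumental distinction concrete, here’s a minimal sketch in Python (a toy of my own; the payoffs and the naive opponent model are invented for illustration). Two agents both cooperate in an iterated prisoner’s dilemma, one because its utility function includes the opponent’s payoff, one as a purely selfish strategy, and from behaviour alone they’re indistinguishable:

```python
# A toy illustration (hypothetical names/payoffs): two cooperating agents,
# one with morality in its utility function, one with morality in its strategy.

# Payoff to the row player for (my_move, their_move); C = cooperate, D = defect.
PAYOFFS = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def tit_for_tat(opponent_history):
    """Morality as strategy: a selfish agent cooperating instrumentally."""
    return opponent_history[-1] if opponent_history else "C"

def intrinsic_altruist(opponent_history, weight=1.0):
    """Morality as intrinsic preference: the opponent's payoff is part of
    my utility function. Naively assumes they repeat their last move."""
    predicted = opponent_history[-1] if opponent_history else "C"
    def utility(move):
        return PAYOFFS[(move, predicted)] + weight * PAYOFFS[(predicted, move)]
    return max(("C", "D"), key=utility)

# Against a cooperator, the two agents are behaviourally identical:
print(tit_for_tat(["C"]), intrinsic_altruist(["C"]))  # -> C C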

Rederiving morality from decision theory

I’ll finish by elaborating on how different decision theories endorse different instrumental strategies. Causal decision theories only endorse the same actions as our cooperation-morality intuitions in specific circumstances (e.g. iterated games with indefinite stopping points). By contrast, functional decision theories do so in a much wider range of circumstances (e.g. one-shot prisoner’s dilemmas) by accounting for logical connections between your choices and other agents’ choices. Functional decision theories follow through on commitments you previously made, and sometimes follow through on commitments that you would have made. However, the question of which hypothetical commitments they should follow through on depends on how updateless they are.
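
To see the causal-decision-theory case concretely, here’s a back-of-the-envelope sketch (standard textbook payoffs, not anything from the post): with continuation probability delta each round, even a purely selfish causal reasoner prefers cooperating against a grim-trigger opponent once delta is high enough:

```python
# A sketch of the "iterated games with indefinite stopping points" claim.
# T, R, P are the textbook prisoner's-dilemma payoffs (temptation, reward,
# punishment), not values from the post.

T, R, P = 5, 3, 1

def value_of_cooperating(delta):
    # Mutual cooperation forever: R per round, weighted by the chance
    # the game survives to each round (a geometric series).
    return R / (1 - delta)

def value_of_defecting(delta):
    # Grab T once, then a grim-trigger opponent punishes with mutual
    # defection (P per round) for the rest of the game.
    return T + delta * P / (1 - delta)

for delta in (0.3, 0.5, 0.7, 0.9):
    print(f"delta={delta}: cooperate={value_of_cooperating(delta):.2f}, "
          f"defect={value_of_defecting(delta):.2f}")
# Cooperation becomes causally rational once delta >= (T - R) / (T - P),
# which is 0.5 for these payoffs.
```

Below that threshold, defection strictly dominates for a causal reasoner, which is why CDT only recovers cooperation-morality in this narrow band of circumstances, while functional decision theories don’t need the repetition at all.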

Updatelessness can be very powerful - it’s essentially equivalent to making commitments behind a veil of ignorance, which provides an instrumental rationale for implementing cooperation-morality. But it’s very unclear how to reason about how justified different levels of updatelessness are. So although it’s tempting to think of updatelessness as a way of deriving care-morality as an instrumental goal, for now I think it’s mainly just an interesting pointer in that direction. (In particular, I feel confused about the relationship between single-agent updatelessness and multi-agent updatelessness like the original veil of ignorance thought experiment; I also don’t know what it looks like to make commitments “before” having values.)
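
As a gesture at why veil-of-ignorance commitments favour cooperation-morality, here’s a toy calculation (my own setup, with diminishing returns modelled as square-root utility): an agent commits to how a pot of 10 gets split before knowing which of two positions it will occupy:

```python
import math

POT = 10  # the amount to be divided between the two positions

def ex_ante_value(keep):
    """Expected utility of committing, before knowing your position, to the
    strong position keeping `keep`: 50% chance you get it, 50% chance you
    get the remainder. sqrt models diminishing returns."""
    return 0.5 * math.sqrt(keep) + 0.5 * math.sqrt(POT - keep)

print(max(range(POT + 1), key=ex_ante_value))  # -> 5: behind the veil, commit to an even split
print(max(range(POT + 1), key=math.sqrt))      # -> 10: after learning you're strong, keep it all
```

Note that with linear utility the veil is indifferent between splits; the diminishing-returns assumption is doing the real work here, which mirrors how sensitive these conclusions are to exactly how the updatelessness is set up.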

Lastly, I think deference-morality is the most straightforward to derive as an instrumentally-useful strategy, conditional on fully trusting the agent you’re deferring to - epistemic deference intuitions are pretty common-sense. If you don’t fully trust that agent, though, then it seems very tricky to reason about how much you should defer to them, because they may be manipulating you heavily. In such cases the approach that seems most robust is to diversify worldviews using a meta-rationality strategy which includes some strong principles.
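
The “fully trusting” condition can be quantified with a quick sketch (accuracies invented for illustration): deferring to a more reliable agent dominates while they’re honest, but degrades fast once there’s some chance they’re manipulating you:

```python
# Invented numbers for illustration: my own judgment vs. a more capable
# agent who might be manipulating me.

MY_ACCURACY = 0.6      # how often my own judgment is right
THEIR_ACCURACY = 0.9   # how often the capable agent is right, if honest

def accuracy_if_deferring(p_manipulative):
    # Worst case: a manipulative agent steers me wrong every time.
    return (1 - p_manipulative) * THEIR_ACCURACY

for p in (0.0, 0.2, 0.4):
    print(f"P(manipulative)={p}: defer={accuracy_if_deferring(p):.2f}, "
          f"think for myself={MY_ACCURACY:.2f}")
# With full trust, deferring dominates (0.90 vs 0.60); by P(manipulative)=0.4
# it is already worse (0.54), which is why partial trust makes this so tricky.
```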


Noah Scales @ 2022-07-28T07:12 (+1)

You wrote:

"I also don’t know what it looks like to make commitments “before” having values"

Well, in a world where an almighty god dishes out the Ten Commandments, there’s a list of commitments that might disagree with my values. In such a world, being created is enough to warrant those commitments. The commitments might not even represent God’s values for his human creatures, lol, but instead serve an instrumental purpose. As a matter of fact, I treat marriage contracts as commitments that others have made and that I respect, even though I am not personally bound by any such contract. I’m not religious, though.

You seem to be looking for a decision theory that demonstrates better returns from its utility functions in some unusual thought experiments that aim to challenge:

If you find a more practical thought experiment that demonstrates the difference between instrumental care-morality and intrinsic care-morality, you'll be closer to a solution for a decision theory that justifies care-morality on instrumental (rational) grounds. 

A classic evolution in human families is to go from care-morality (as young parents) to cooperation-morality (as older parents) to deference-morality (as aging parents). Your write-up has a lot of insights about parenting in it, I believe. Parenting might serve as a source of such thought experiments.

After browsing the references you gave, I wonder if you would be willing to post your take on the differences between:

Anyway, really interesting post you wrote, thank you.