Will Sentience Make AI’s Morality Better?

By Ronen Bar @ 2025-05-18T04:34 (+26)

TLDR

I propose to explore what I call the AI Sentience Ethics Conundrum: Will the world be better with a sentient AI or an insentient AI?

In this article, we will explore whether a sentient AI would behave more ethically and robustly than an insentient one. I will propose an initial framework of four key factors - understanding reality, understanding morality, power, and willingness to act - that are potentially influenced by AI sentience. These factors are key in determining how an AI behaves, i.e., the degree of AI Moral Alignment.

In the next post, I will address how AI sentience might affect the world, taking a broader view than just AI's behavior, including its own welfare.

Existing opinions on this question

The Sentience Ethics Conundrum represents a crucial yet underexplored aspect of AI's potential impact on the world. In his Nonsentient Optimizers post, Eliezer Yudkowsky dismisses this concern entirely, arguing that those who believe AI requires empathy fundamentally misunderstand utility functions and decision theory, and are unable to conceive of intelligence that operates unlike human minds.

However, most AI Safety and EA people I spoke with seem to disagree with Yudkowsky's conclusion. It seems this question is very hard, deep, and complicated, and I find it one of the most essential questions in AI safety and Moral Alignment. It is the kind of question where there is much more than meets the eye.

Epistemic status

This analysis presents preliminary thoughts mapping critical questions and postulations emerging from this conundrum. The simple framework suggested in this article is a very rough starting point for thinking about this topic, which in my opinion should be developed much further.

The four dependent factors

The four dependent factors are: 🌀 Understanding Reality, ⚖️ Understanding Morality, 🤖 AI Power, and 🎬 Willingness to Act.

The key question of this article

We will examine the influence of AI Sentience, the independent factor, on the four dependent factors, which in turn influence AI Moral Alignment. All this will be done to gain insights into the question: will a sentient AI behave more ethically than an insentient one, and if so, what sentience intensity, valence (spectrum & intensity), and particular core traits will optimize its morality?

For simplicity, in this post I disregard the option of sentience without valence, though this also deserves consideration.

When I ask 'should we build a sentient AI?', I mean whether we, or an AI, should build it.

In this post I will use sentience and consciousness interchangeably.

AI sentience and the four dependent factors - questions and thoughts

AI Sentience and 🌀 Understanding Reality

What unique information does an agent obtain through subjective experience?

Does an agent need sentience to figure out what causes sentience?

Will an insentient AI be better able to solve the conundrum?

Is it possible to create an ASI without it being sentient?

The uncertainty problem

AI Sentience and ⚖️ Understanding Morality

Can genuine moral understanding exist without subjective experience?

Robustness?

Intrinsic alignment, autonomy, recursive ethical self-improvement and complexity

Will moral judgment be damaged by experiences?

What we know from human sentience–ethics interactions

Different kinds of consciousness, valence, senses and time

AI Sentience and 🤖 AI Power

AI Sentience and 🎬 Willingness to Act

Influence on AI behavior

According to my suggestions in this post, an AI's behavior stems from a combination of its understanding of reality, its understanding of morality, its power, and its willingness to act according to the first three factors. Some other questions that relate to multiple factors at once should also be examined.
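To make this framing more concrete, here is a purely illustrative toy sketch in Python. All names, numbers, and the multiplicative combination are my own assumptions for illustration, not a proposed measure of Moral Alignment; the point is only to show the four dependent factors feeding into a single behavioral outcome.

```python
from dataclasses import dataclass


@dataclass
class AgentProfile:
    """Toy scores in [0, 1] for the four dependent factors (illustrative only)."""
    understanding_reality: float
    understanding_morality: float
    power: float
    willingness_to_act: float


def moral_alignment(agent: AgentProfile) -> float:
    """One arbitrary way to combine the factors: understanding of reality and
    morality, scaled by willingness to act on that understanding.
    Power is deliberately left out of this score, on the assumption that it
    amplifies the impact of behavior rather than its alignment."""
    return (
        agent.understanding_reality
        * agent.understanding_morality
        * agent.willingness_to_act
    )


# Hypothetical comparison of a sentient vs. an insentient AI under this toy model.
sentient = AgentProfile(0.8, 0.9, 0.7, 0.9)
insentient = AgentProfile(0.9, 0.6, 0.7, 0.5)
print(moral_alignment(sentient), moral_alignment(insentient))
```

Even in this toy form, the open questions reappear as modeling choices: whether sentience raises or lowers each score, and whether the factors really combine multiplicatively, additively, or in some more entangled way.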

Research suggestions

Epilogue


ChrisPercy @ 2025-06-16T19:03 (+1)

This is tricky stuff, thank you for getting into the conversation. The part that resonates most with me would run something like this: I can imagine a conscious AI that personally experiences valence is more likely to adopt ethical positions that suffering is bad (irrespective of moral relativism), although it might be more likely to rank its own / its kin's suffering above others' (as pretty much all animals do), which could make things worse for us than a neutral optimiser. Assuming AI is powerful enough that it can manage its own suffering to minimal levels and still have considerable remaining resources*, there is a chance it would then devote some resource to others' valence, albeit perhaps only as part of a portfolio of objectives. 
* Itself a major assumption even under ASI, e.g. if suffering is relative or fairly primitive in consciousness architecture so that almost all functions end up involving mixed valence states.

I can see a risk in the other direction, where the AI does suffer and that suffering is somehow intrinsic to its operations (or deemed unbearable/unavoidable in some fashion, e.g. perhaps negative utilitarianism or certain Buddhist positions on 'all life is suffering' are correct for it). If so, a suffering AI might not only want to bring about its own extinction but also the extinction of those that gave rise to it, whether as punishment, to ensure it is never re-created, or as a gift to reduce what it sees as our own 'inevitable suffering'. 

Overall, feels very hard to assess, but worth thinking about even if not the highest priority alignment topic. Both aspects tie issues around AI consciousness/sentience into AI safety more directly than most of the safety movement acknowledges...  

Ronen Bar @ 2025-06-18T08:51 (+1)

An ASI may be much better at helping itself not feel so much suffering. We humans are good at engineering our environment, but not our inner states. AI may excel at inner engineering as well... very speculative

ChrisPercy @ 2025-06-18T09:26 (+2)

Agreed - lots to aspire towards, even if speculative/challenging