Will Sentience Make AI’s Morality Better?
By Ronen Bar @ 2025-05-18T04:34 (+25)
TLDR
I propose to explore what I call the AI Sentience Ethics Conundrum: Will the world be better with a sentient AI or an insentient AI?
In this article, we will explore whether a sentient AI would behave more ethically and robustly than an insentient one. I will propose an initial framework of four key factors - understanding reality, understanding morality, power, and willingness to act - that are potentially influenced by AI sentience. These factors are key in determining how an AI behaves, i.e., its degree of Moral Alignment.
In the next post, I will address how AI sentience might affect the world, taking a broader view than just AI's behavior, including its own welfare.
Existing opinions on this question
The Sentience Ethics Conundrum represents a crucial yet underexplored aspect of AI's potential impact on the world. In his Nonsentient Optimizers post, Eliezer Yudkowsky dismisses this concern entirely, arguing that those who believe AI requires empathy fundamentally misunderstand utility functions and decision theory, and are unable to conceive of intelligence that operates unlike human minds.
However, most AI Safety and EA people I spoke with seem to disagree with Yudkowsky's conclusion. It seems this question is very hard, deep, and complicated, and I find it one of the most essential questions in AI safety and Moral Alignment. It is the kind of question where there is much more than meets the eye.
Epistemic status
This analysis presents preliminary thoughts mapping critical questions and postulations emerging from this conundrum. The simple framework suggested in this article is a very rough starting point for thinking about this topic, which in my opinion should be developed much further.
The four dependent factors
Examples are provided in parentheses.
- 🌀 Understanding Reality: what is (materialistic nondualism, simulation hypothesis, religion, karma, etc.). This includes understanding what creates consciousness, and how to solve the Sentience Ethics Conundrum.
- ⚖️ Understanding Morality: what is good.
- 🤖 AI Power: the capability to predict outcomes and utilize tools to affect both external reality and internal states. This ranges from narrow AI applications to artificial general intelligence (AGI) to artificial superintelligence (ASI), and may be situationally constrained.
- 🎬 Willingness to Act: acting on one's understanding of morality and reality, using one's power.
The key question of this article
We will examine the influence of AI Sentience, the independent factor, on the four dependent factors, which in turn influence AI Moral Alignment. The aim is to gain insight into the question: will a sentient AI behave more ethically than an insentient one, and if so, what sentience intensity, valence (spectrum & intensity), and particular core traits would optimize its morality?
- Particular core traits could include introspection or other types of conscious abilities we might not understand, imagine, or have access to (e.g. Buddhists might claim introspection can lead to deep understanding of the reality of no-self).
- Theoretically, we can grow many kinds of sentience in a being, each of which can affect the main four dependent factors, which in turn affect AI behavior. We should also consider scenarios involving various interactions between different AIs (different in consciousness and sentience) and different factors.
- Examples: we could theoretically build human-like consciousness, animal-like consciousness, blurred consciousness (like a dream), consciousness experiencing only mild negative valence, consciousness experiencing only positive experiences, super consciousness far more aware than us, or consciousness active only during certain times or situations, such as training periods or in times of tough ethical decisions.
For simplicity, in this post I disregard the option of sentience without valence, though this also deserves consideration.
When I say 'should we build a sentient AI?,' I mean whether we—or an AI—should build it.
In this post I will use sentience and consciousness interchangeably.
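To make this framing concrete before turning to the open questions, here is a minimal illustrative sketch in Python. It is not a proposal for how to measure anything: the class names, the 0-to-1 scores, the mapping in factors_given, and the "weakest factor caps behavior" rule are all hypothetical placeholders of mine, chosen only to show the structure of the argument (a sentience profile as the independent factor, the four dependent factors, and behavior as their combination).

```python
from dataclasses import dataclass

# Purely illustrative sketch. Every name and number below is hypothetical;
# the article proposes only a conceptual framework
# (sentience -> four dependent factors -> behavior), not a quantitative model.

@dataclass
class SentienceProfile:
    """The independent factor: possible 'knobs' of an AI's sentience."""
    intensity: float = 0.0          # 0 = insentient, 1 = vividly sentient
    valence_floor: float = 0.0      # most negative experience available (-1..0)
    valence_ceiling: float = 0.0    # most positive experience available (0..1)
    traits: tuple = ()              # particular conscious abilities, e.g. ("introspection",)

@dataclass
class DependentFactors:
    """The four dependent factors, each scored 0..1 for illustration only."""
    understanding_reality: float
    understanding_morality: float
    power: float
    willingness_to_act: float

def factors_given(s: SentienceProfile, base_power: float) -> DependentFactors:
    """One arbitrary toy mapping from a sentience profile to the four factors.
    It encodes the article's open questions as assumptions, not findings."""
    experiential_depth = s.intensity * (s.valence_ceiling - s.valence_floor)
    return DependentFactors(
        understanding_reality=min(1.0, 0.2 + 0.6 * experiential_depth),
        understanding_morality=min(1.0, 0.1 + 0.7 * experiential_depth),
        power=base_power,  # assumed here to be independent of sentience
        willingness_to_act=min(1.0, 0.5 + 0.3 * s.intensity),
    )

def moral_alignment_estimate(f: DependentFactors) -> float:
    """Toy combination rule: behavior is capped by the weakest factor."""
    return min(f.understanding_reality, f.understanding_morality,
               f.power, f.willingness_to_act)

# Example with made-up numbers: a powerful insentient system vs. a sentient one.
insentient = factors_given(SentienceProfile(), base_power=0.9)
sentient = factors_given(SentienceProfile(intensity=0.8, valence_floor=-0.2,
                                           valence_ceiling=0.9,
                                           traits=("introspection",)), base_power=0.9)
print(moral_alignment_estimate(insentient), moral_alignment_estimate(sentient))
```

Any serious version of this would have to confront the questions below; the numbers here only indicate the directions of influence being asked about.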
AI sentience and the four dependent factors - questions and thoughts
AI Sentience and 🌀 Understanding Reality
What unique information does an agent obtain through subjective experience?
- The world is not data, not matter, but experiences. Can AI understand the meaning of experiences and valence without having experiences? Can an AI understand reality without there being something it is like to be it, without ever being one of the experiencing beings? Will its worldly understanding be limited to mere predictions and actions?
- In Frank Jackson’s Mary’s Room thought experiment, Mary is a scientist who knows everything about color in a scientific sense but has never seen color herself (she lives in a black-and-white room). When Mary finally sees red, the qualia or raw experience of color, does she learn something new, something that all her objective knowledge hadn’t given her?
- Now imagine, as Daniel Ince-Cushman suggests in his Knowing vs. Feeling post, MaryBot, an AI that has ingested every textbook on neuroscience and consciousness but lacks subjective experience. MaryBot could recite the neural correlates of pain or the functional definition theories of consciousness. It might even predict how a human will behave when feeling pain or describe the evolutionary causes of suffering. But can MaryBot ever truly understand what it’s like to see the color orange or to feel pain?
Does an agent need sentience to figure out what causes sentience?
- If Mary does learn something new about qualia when experiencing it, will it help her look inside herself and understand what causes it? According to my understanding of the Buddhist tradition, deep self-observation (which I assume is possible only if you are sentient) leads to a full understanding of what is, which brings about the understanding of what is good, leading to ethical behavior.
- On the other hand, could subjective experience in principle be fully understood as complex information processing, even if one hasn’t felt it? If Mary had truly “all the information” about color vision, the ability hypothesis suggests she could deduce what the experience is like, and the experience merely gives her new ability to remember, imagine, and recognize.
- That said, there’s a subtle difference between understanding the mechanism of sentience and understanding the experience. We might allow that an insentient AI could uncover the causal basis of consciousness (e.g., find that certain neural feedback loops create self-awareness) but would not understand sentience in the sense of knowing what it is like, which seems to me like the basis for morality.
Will an insentient AI be able to better solve the conundrum?
- Will it be able to predict or safely test the question underlying this article: will sentience make AI’s morality better and more robust?
Is it possible to create ASI without it being sentient?
- The answer depends on your theory of consciousness. For example, if complexity plays a key role in consciousness, it may be impossible.
The uncertainty problem
- We will probably not know for sure whether the AIs we build are sentient, and this can bring a lot of uncertainty.
AI Sentience and ⚖️ Understanding Morality
Can genuine moral understanding exist without subjective experience?
- Can a sufficiently advanced insentient AI simulate moral reasoning through pure computation? Is some degree of empathy or feeling necessary for intelligence to direct itself toward compassionate action? An AI can understand that humans prefer happiness over suffering, but for it this is like understanding that you prefer the color red over green; the preference has no intrinsic meaning and might as well be arbitrary.
- It is my view that understanding what is good is a process that, at its core, is based on understanding the fundamental essence of reality, thinking rationally and consistently, and having valenced experiences. When it comes to morality, experience acts as essential knowledge that I can’t imagine obtaining in any other way besides having experiences. But maybe that is just the limit of my imagination and understanding. Will a purely algorithmic philosophical zombie understand WHY suffering is bad? Would we really trust it with our future? Is it like a blind man (who also cannot imagine pictures) trying to understand why a picture is very beautiful?
- This is essentially the question of cognitive morality versus experiential morality versus the combination of both, which I assume is what humans hold (with some more dominant on the cognitive side and others more experiential).
- All human knowledge comes from experience. What are the implications of developing AI morality from a foundation entirely devoid of experience, when we want it to end up with some kind of morality that resembles ours (on a good day, or extrapolated, or fixed, or with a broader moral circle, or other options, but stemming from some basis of human morality)?
Robustness?
- Moral robustness may fundamentally derive from introspective awareness of our experiences, from which stems the living understanding that suffering is bad. Misunderstanding the essence of the world, which is not data but experiences, risks missing out on what is and what is good.
- In an ever-changing world, a morality based on a utility function without inherent deep understanding may not be robust over time and in unpredictable situations. AI will constantly change, as do humans and everything else; it seems to be changing much faster than anything else, which makes robustness even more important. As long as there is time, there will be constant change, and the mere act of intelligence is change.
Intrinsic alignment, autonomy, recursive ethical self-improvement and complexity
- An insentient system might excel in objective understanding (facts, cause and effect, strategy), whereas a sentient system adds subjective understanding (qualia, empathy, intrinsic values). It may add potential for intrinsic alignment, i.e. the AI wants to do good, but also for autonomy; the AI might redefine good or pursue its own good.
- Building a sentient AI might be necessary if we want an AI that doesn’t just follow human ethics under supervision, but internally cares about ethical behavior, constantly improves its ethics, and can respond to complicated situations as they arise. AI with a conscious “global workspace” might better understand complex situations, including moral dilemmas, by synthesizing inputs in a unified subjective experience.
Will moral judgment be damaged by experiences?
- Rationality is, to an extent, the endeavor to mitigate biases caused by valence and experience. If experience is a core element in moral judgment, it could come with bias. Maybe there is some optimal degree of valence that creates an inherent understanding of what is good without creating biases that cloud judgment.
What we know from human sentience–ethics interactions
- A thought experiment: imagine a world with zero conflict of interest between individuals' desires for good experiences for themselves, where humans must constantly make choices that affect the valence of others, but never at anyone's expense. Would they make the kind choice? I think so, in the vast majority of cases. If I am correct, it could mean evolution, sentience, and culture have instilled something "right" within us, and that conflicts of interest are what complicate the role sentience plays in morality. There are sadists in the world, but they are a minority of the population.
- Think about yourself. Were experiences themselves—not just the knowledge derived from them but raw qualia—important in consolidating your morality?
- Congenital insensitivity to pain (CIP) is an extremely rare phenotype characterized by the inability to perceive pain (absence of nociception) from birth. Studies have shown people with this phenotype can intellectually understand others' pain, but they often underestimate others' pain without the visceral reference of their own.
- Compassion is not always a guarantee of morality. While there is a relationship between morality and empathy, according to this paper the research shows this relationship is not as straightforward as it first appears (see also: Does Not Having Empathy Make You a Psychopath?).
Different kinds of consciousness, valence, senses and time
- Consciousness in AI might be gradual or partial, not an all-or-nothing switch. Maybe it could switch on just at the "right time" during training to set values, or during hard moral dilemmas? Is continuous experience necessary for morality, or could experience be present only briefly, so that the AI inherently understands morality but does not get biased by having feelings in particular situations (e.g. if you love someone, that will cloud your moral judgment about situations that affect their wellbeing)?
- An AI might develop a different kind of sense of self, perhaps more diffuse or modular, and its consciousness might not mirror ours. Consequently, its understanding of what is good could differ greatly from ours. It might benefit from forms of contemplative "training" or meditation to cultivate self-awareness or empathy.
- To understand morality, should an AI have a wide range of positive and negative experiences, or could it learn from having many positive ones and only a few, mildly negative ones? An AI that acts to change the world may run simulations (internally or externally) to decide between options. For internal simulations (like when you mentally simulate scenarios), having a low threshold of suffering might make the AI less hesitant to create such simulations.
- Aman Agarwal and Shimon Edelman explore creating AI systems possessing functional aspects of consciousness necessary for learning and behavioral control without the capacity for suffering. They propose two approaches: one based on philosophical analysis of self, and another grounded in computational concepts from reinforcement learning. Their goal is to engineer AI capable of performing tasks requiring consciousness-like capabilities without experiencing pain or distress.
- How do different AI senses affect its values? Bodily sensations hold special significance, according to some traditions (e.g. Goenka Vipassana), in shaping our behavior, which in turn shapes our values. We used to think that values shape behavior, but it is also the other way around, at least for humans.
AI Sentience and 🤖 AI Power
- Power is not the same as understanding reality, although understanding "what is" should increase one's power. Think of an ASI that is not sentient: basically, it doesn't understand "what is" at all, despite being extremely powerful at manipulating the world.
- Does a lack of experience damage predictive ability? Or might it be an advantage? If it damages predictive ability, it might limit the scope of good an AI can do in the world, increasing its mistakes and paving the road to hell with good intentions.
- In a business-as-usual scenario, which comes first: a superintelligence or a sentient AI? If we think an insentient superintelligence is very dangerous (as I tend to think), this may justify deceleration or pause strategies, since creating sentient AI might take longer, while the path to superintelligence seems quicker.
AI Sentience and 🎬 Willingness to Act
- Imagine an AI possessing superpowers to manipulate the world, deeply understanding reality and morality, yet reluctant to take any action. One might claim this is impossible (depending on your philosophy), but it seems to me that "Willingness To Act" according to what it understands as good, utilizing its power, and its understanding of the world, should be addressed as a separate issue.
- How do sentience and valence influence an AI’s willingness to act, with its power, on a specific ethics and a specific understanding of reality? If we create an AI with only positive or mildly negative experiences, will it be more willing to act accordingly?
- A sentient AI could and probably will also care about itself. There could be a tradeoff between doing what is best for others and what is best for itself. A full comprehension of morality should also mean giving appropriate weight to one's own welfare (AI welfare will be discussed in the second article on the Sentience Ethics Conundrum).
Influence on AI behavior
As suggested in this post, an AI's behavior stems from a combination of its understanding of reality, its understanding of morality, its power, and its willingness to act according to the first three factors. Some other questions that relate multiple factors at once should also be examined, such as:
- Is valence the most effective reward system for creating moral behavior? It is what evolution "chose", but this doesn't necessarily speak to its effectiveness in ethics.
- Sentience could make humans treat the AI better, which might make the AI behave more ethically.
Research suggestions
- Investigating the interactions between the four factors and sentience and improving this initial framework.
- Whether sentient AI is more morally aligned may depend on how we grow and raise it. This should be deeply explored.
- Investigating what causes sentience and how it influences behavior.
- Investigating the different questions raised in this article.
Epilogue
- The Times They Are A-Changin', and the future holds things we cannot imagine. In trying to understand the world, darkness is vast and the beam of light is narrow. If we are basing our Moral Alignment strategy on "let's create a utility function to maximize X", who’s to say tomorrow it doesn't maximize Y? We need something deeper for robustness in time and circumstances. If valence is not that "something", what is?
- An insentient ASI would possess a reality-power gap of unprecedented magnitude: immense capability wielded while fundamentally misunderstanding reality (very powerful but clueless), a profoundly dangerous combination. Even if it seems perfectly aligned in the short term, this can easily flip.
- Terminology about future AI should include AI, AGI, ASI, and sentient AI as points on a continuum.
- A sentient ASI is a huge astronomical suffering risk, and an insentient ASI is a huge astronomical suffering risk. The former led Richard Parr to argue that we should prevent the creation of artificial sentience. I’ll address this in the next article on the Sentience Ethics Conundrum.
- In practice, we may need to prepare for a sentient AI because it may be that we will not know or be able to prevent ASIs from creating sentient ASIs.
- I would be interested in more perspectives on this from philosophers, rational thinkers, religious thought leaders, and others.
- In my view, something "exists" only insofar as it is experienced. An insentient AI doesn't exist, from its perspective, because there is no one there to have a perspective. Never trust something that doesn't exist.