Developing AI Safety: Bridging the Power-Ethics Gap (Introducing New Concepts)

By Ronen Bar @ 2025-04-16T11:25 (+18)

TLDR

This post can be seen as a continuation of a previous post.

(To further explore this topic, you can watch a 34-minute video outlining a concept map of the AI space and potential additions; recommended viewing speed: 1.25x.)

This post drew some insights from the Sentientism podcast and the Buddhism for AI course.

My Point of View

I am looking at the AI safety space mainly through the lens of three fundamental questions: What is? What is good? How do we get there?

Human History Trends

Historically, human power, driven by increasing data and intelligence, has scaled rapidly and exponentially. Our ability to understand and predict "what is" continues to grow. However, our ethical development (understanding "what is good") is not keeping pace. The power-ethics gap is like a car driving ever faster while the driver's skill improves only slightly as the ride goes on. This arguably represents one of the most critical problems globally. The imbalance has contributed significantly to suffering and killing throughout history, potentially more in recent times than ever before. The widening power-ethics gap appears correlated with large-scale, human-caused harm.

The Focus of the AI Safety Space

Eliezer Yudkowsky, who describes himself as 'the original AI alignment person,' is one of the most prominent figures in the AI safety space. His philosophical work, the many concepts he created, and his discussion forum and organizations have significantly shaped the AI safety field. I am in awe of his tremendous work and contribution to humanity, but he has a significant blind spot in his understanding of "what is". Yudkowsky operates within a framework in which, by his own claim, (almost) only humans are sentient, whereas scientific evidence suggests that probably all vertebrates, and possibly many invertebrates, are sentient. This discrepancy is crucial: one of the key founders of the AI safety space has built his perspective on an unscientific assumption that limits his view to a tiny fraction of the world's sentience.

The potential implications of this are profound, and they highlight the necessity of re-evaluating AI safety from a broader ethical perspective, one encompassing all sentient beings, both present and future. This requires introducing new concepts and potentially redefining existing ones. The work is critical because the pursuit of artificial intelligence is primarily focused on increasing power (capabilities), and so risks further widening the existing power-ethics gap within humanity.

Since advanced AI threatens to take control and mastery away from humans, two crucial pillars for AI safety emerge: maintaining meaningful human control (power) and ensuring ethical alignment (ethics). Currently, the field heavily prioritizes the former, while the latter remains underdeveloped. From an ethical perspective, particularly one concerned with the well-being of sentientkind ('Sentientkind' being analogous to 'humankind' but inclusive of all feeling beings), AI safety and alignment could play a greater role. Given that AI systems may eventually surpass human capabilities, their embedded values will have immense influence.

We must strive to prevent an AI-driven power-ethics gap far exceeding the one already present in humans.

Suggesting New Concepts, Redefining or Highlighting Existing Ones

A monkey (representing evolution), a human, an AI, and a superintelligence. In order to achieve a good world, we probably need the last three to be aligned.

Beyond Singularity @ 2025-04-16T12:56 (+3)

Thank you for this deep and thought-provoking post! The concept of the "power-ethics gap" truly resonates and seems critically important for understanding current and future challenges, especially in the context of AI.

The analogy with the car, where power is speed and ethics is the driver's skill, is simply brilliant. It illustrates the core of the problem very clearly. I would even venture to add that, in my view, the "driver's skill" today isn't just lagging behind, but perhaps even degrading in some aspects due to the growing complexity of the world, information noise, and polarization. Our collective ability to make wise decisions seems increasingly fragile, despite the growth of individual knowledge.

Your emphasis on the need to shift the focus in AI safety from purely technical aspects of control (power) to deep ethical questions and "value selection" seems absolutely timely and necessary. This truly is an area that appears to receive disproportionately little attention compared to its significance.

The concepts you've introduced, especially the distinction between Human-Centric and Sentientkind Alignment, as well as the idea of "Human Alignment," are very interesting. The latter seems particularly provocative and important. Although you mention that this might fall outside the scope of traditional AI safety, don't you think that without significant progress here, attempts to "align AI" might end up being built on very shaky ground? Can we really expect to create ethical AI if we, as a species, are struggling with our own "power-ethics gap"?

It would be interesting to hear more thoughts on how the concept of "Moral Alignment" relates to existing frameworks and whether it could help integrate these disparate but interconnected problems under one umbrella.

The post raises many important questions and introduces useful conceptual distinctions. Looking forward to hearing the opinions of other participants! Thanks again for the food for thought!

VeryJerry @ 2025-04-16T13:44 (+1)

How can we unhypocritically expect AI superintelligence to respect "inferior" sentient beings like us when we do no such thing for other species? 

How can we expect AI to have "better" (more consistent, universal, compassionate, unbiased, etc) values than us and also always only do what we want it to do? 

What if extremely powerful, extremely corrigible AI falls into the hands of a racist? A sexist? A speciesist? 

Some things to think about if this post doesn't click for you

Ronen Bar @ 2025-04-17T12:00 (+2)

And thinking more long term: when AGI builds a superintelligence, which will then build the next agents, and humans sit somewhere 5-6 levels down the intelligence scale, what chance do we have of receiving moral consideration and care from those superior beings? Unless we realize we need to care for all beings, and build an AI that cares for all beings...