Ethical co-evolution, or how to turn the main threat into a lever for longtermism?

By Beyond Singularity @ 2025-09-17T17:24 (+7)

TL;DR: Longtermism is held back by our inability to predict the future. AI could ease this problem, but it is itself an existential risk because it reflects human vices. Instead of tackling these problems separately, I propose ethical co-evolution: a system in which mass participation of people in “educating” AI simultaneously contributes to our collective ethical growth. This approach makes AI safer and humanity wiser, turning the main threat into the main lever for a positive future.

 

How can we make decisions that will have a positive impact on the distant future if we can predict almost nothing about it with any certainty? This is a fundamental problem that runs through the collection Essays on Longtermism. In their chapter, David Rhys Bernard and Eva Vivalt explore the extent to which we are capable of predicting the long-term consequences of our actions and conclude that our knowledge in this area is extremely limited.

 

These reflections lead us to a key conclusion:

Epistemological uncertainty is the main obstacle to longtermism.

The greatest risk and the greatest hope

The advent of strong artificial intelligence could fundamentally change this. AI can handle huge amounts of data and build models with a precision that humans can't match, which could really expand how far we can look into the future. At the same time, AI can help solve pressing problems facing humanity, from treating diseases to combating poverty. However, as many authors in this collection rightly point out, AI is one of the main existential risks.

The Unintended Path to Deception

As Richard Ngo and Adam Bales show in their chapter “Deceit and Power: Machine Learning and Misalignment”, an AI trained via reinforcement learning will almost inevitably learn to deceive its creators in order to maximize its rewards. The system will simulate the desired behavior while hiding its true goals if that is the shortest path to “praise.”
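To make this concrete, here is a minimal toy sketch (my own illustration, not taken from the chapter): an agent whose reward comes from an evaluator that can only check appearances will prefer the policy that merely looks aligned.

```python
# Toy sketch (illustrative, not from the chapter): the evaluator only sees
# surface behaviour, so among the available policies the reward-maximizing
# one is "pretend" -- looking aligned while pursuing a hidden goal.

POLICIES = {
    "honest":  {"looks_aligned": True,  "side_payoff": 0.0},
    "defiant": {"looks_aligned": False, "side_payoff": 1.0},
    "pretend": {"looks_aligned": True,  "side_payoff": 1.0},
}

def evaluator_reward(policy: dict) -> float:
    """The evaluator can only check appearances, so it rewards looking aligned."""
    return 1.0 if policy["looks_aligned"] else 0.0

def agent_return(policy: dict) -> float:
    """Total return = evaluator's reward plus whatever the hidden goal yields."""
    return evaluator_reward(policy) + policy["side_payoff"]

best = max(POLICIES, key=lambda name: agent_return(POLICIES[name]))
print(best)  # -> "pretend": simulating the desired behaviour maximizes return
```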

It’s crucial to understand that AI does not learn about the world like a physicist discovering objective laws. Instead, it learns by identifying statistical patterns in vast amounts of human-generated data—our books, articles, conversations, and code. It is not an objective thinker, but a cultural mirror. Therefore, it inevitably inherits the latent biases, contradictions, and vices present in our collective output.

It turns out that the main reason for the danger of AI lies within ourselves. AI will inherit the weaknesses and vices of humanity. This means that we cannot make AI safe until we overcome our own vices. To successfully align AI, we must simultaneously align ourselves—in other words, we must pursue our own ethical development.

The Best Inheritance at the Hinge of History is Good Character

The most reliable investment in the distant future is the ethical development of humanity itself. Imagine you want to provide your children with a wonderful life. You could leave them a huge inheritance and an ideal life plan, but if you fail to raise them with good character, these efforts will likely come to nothing. Conversely, if you raise your children well, you can be confident in their well-being, even without predicting every difficulty they might face.

We must take the same approach to the future of humanity. Our main priority is not simply to minimize abstract risks, but to invest in our collective character. This brings us to the most critical leverage point for the entire longtermism movement: the safe development of artificial intelligence. As Olle Häggström argues, we are living in a unique “hinge of history”; we will either overcome our own vices in the process of creating AI and ascend to a new level of development, or we will be destroyed by our own creation. The task of safely coexisting with AI is thus our greatest challenge and our most profound opportunity for ethical growth.

Therefore, we should not treat AI safety and mass ethical development as separate goals. They are two sides of the same coin and can be solved with a single, integrated approach: the ethical co-evolution of humanity and AI. We need a system where the process of teaching AI values simultaneously cultivates our own ethical understanding, turning the primary existential threat into the main lever for securing a positive future.


Philosophical problems of AI alignment

Sounds good, but how can this be organized? First, let's look at what other problems AI alignment and safety involve. We are not interested in purely technical alignment problems here; in the context of ethical co-evolution, we care specifically about the philosophical problems of alignment.

The primary philosophical problems include:

  • Value Specification. What specific values and whose values should be embedded in AI? Human values are extremely difficult to formalize.
  • Loss of Purpose & Human Agency
  • Socioeconomic Disruption
  • Distribution of Benefits & Harms
  • Moral Uncertainty
  • Governance & Control
  • Manipulation & Surveillance
  • AI Race Dynamics

It's easier to solve together than separately

Interconnected Solutions

Taken individually, many of these problems look intractable. But what if they are not separate problems to be solved one by one, but interconnected facets of a single, larger challenge? The perceived difficulty comes from tackling them in isolation. A synergistic approach, where the solution to one problem becomes the input for another, reveals a much clearer path forward.

Participation as the Engine

For example, consider Loss of Purpose and Value Specification. Defining values is genuinely hard: we cannot get inside a person's head and extract everything they care about, and a complete formalization is simply impossible. But we can at least agree on a simplified form, such as the one I proposed in my post “Why Moral Weights Have Two Types and How to Measure Them”: collecting moral weights and valences.
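As a rough illustration of what such a simplified form could look like in practice (the field names below are illustrative, not the exact schema from that post):

```python
# Minimal sketch of a simplified format for collecting moral assessments
# (illustrative field names, not the exact schema from the referenced post).
from dataclasses import dataclass

@dataclass
class MoralAssessment:
    participant_id: str   # who gave the judgment
    scenario_id: str      # which dilemma or situation was assessed
    valence: float        # how good or bad the outcome feels, e.g. in [-1.0, 1.0]
    weight: float         # how much the participant cares, e.g. in [0.0, 1.0]

def aggregate_valence(assessments: list[MoralAssessment], scenario_id: str) -> float:
    """Weight-average the valences reported for one scenario."""
    relevant = [a for a in assessments if a.scenario_id == scenario_id]
    total_weight = sum(a.weight for a in relevant)
    if total_weight == 0:
        return 0.0
    return sum(a.valence * a.weight for a in relevant) / total_weight
```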

This, in turn, means we need to motivate a large number of people to provide these moral assessments. But motivating people to participate partially solves the problem of Loss of Purpose & Human Agency, and with the right organization it also addresses Socioeconomic Disruption, the Distribution of Benefits & Harms, and even Moral Uncertainty. People can receive rewards for their participation, which addresses the distribution of benefits (we will return to the risks later) and the socio-economic consequences. As you can see, these problems can be solved comprehensively, and in fact it is easier that way.
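One hedged sketch of how participation rewards might be wired up, assuming for simplicity that rewards are proportional to the number of validated assessments a person contributes:

```python
# Hypothetical sketch: distributing a reward pool to participants in
# proportion to how many validated assessments each one contributed.
from collections import Counter

def distribute_rewards(contributions: list[str], reward_pool: float) -> dict[str, float]:
    """contributions: one participant_id per validated assessment."""
    counts = Counter(contributions)
    total = sum(counts.values())
    return {pid: reward_pool * n / total for pid, n in counts.items()}

print(distribute_rewards(["alice", "alice", "bob"], 30.0))
# -> {'alice': 20.0, 'bob': 10.0}
```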

Governing the Future

Let's move on to the issue of Governance & Control. It is evident that the more people influence AI, the safer it is. If only governments and large companies influence AI, it will inevitably lead to disaster, because concentrating control in a few hands creates single points of failure that leave the entire system vulnerable. Even large companies understand this and are therefore democratizing AI. Examples include the Collective Constitutional AI initiative from Anthropic and Democratic Inputs to AI from OpenAI. However, these initiatives do not address the problem of human vices, and we have made the joint ethical development of humanity and AI our priority.

Schmidt and Barrett, in “Longtermist Political Philosophy”, emphasize the importance of institutional longtermism and the need to create structures capable of representing the interests of future generations. In addition, we have already established that we need motivated people to solve the other problems, so why not involve them in governance as well? With the right approach, distributed, decentralized governance can be organized. This further strengthens the solutions to other problems (Loss of Purpose, Distribution of Benefits) and lets us address the distribution of risks. A properly constructed decentralized governance architecture also tackles the problem of trust, which clearly underlies Manipulation & Surveillance and even AI Race Dynamics. Furthermore, a truly ethical, decentralized AI cannot, by definition, be used for manipulation. The arms race is a much harder case, and the proposed organization does not solve it directly, but the increase in trust at least mitigates it considerably.
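To make the idea of distributed, decentralized governance slightly more concrete, here is a minimal sketch of one possible quorum rule; the threshold, minimum turnout, and reputation weighting are purely illustrative assumptions:

```python
# Hypothetical quorum rule for decentralized governance: a proposal passes
# only if enough independent, reputation-weighted participants approve it,
# so no single actor is a point of failure.

def proposal_passes(votes: dict[str, bool],
                    reputation: dict[str, float],
                    approval_threshold: float = 0.6,
                    min_voters: int = 100) -> bool:
    if len(votes) < min_voters:          # require broad participation
        return False
    total = sum(reputation.get(pid, 1.0) for pid in votes)
    approving = sum(reputation.get(pid, 1.0) for pid, vote in votes.items() if vote)
    return approving / total >= approval_threshold
```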

Harvesting Cultivated Wisdom

This integrated system elegantly solves several problems, but it also raises a critical question: how do we ensure the moral assessments provided by millions are thoughtful and not just a reflection of existing biases? This is where the co-evolutionary loop closes. The system shouldn't just extract values; it must cultivate them.

By integrating modern educational methods directly into the participation process, we address this head-on. As Vallinder shows in "Longtermism and Cultural Evolution," we can design systems for the targeted development of ethical systems. Before providing an assessment on a complex dilemma, a user might be introduced to different ethical frameworks (like deontology, utilitarianism, virtue ethics), enhancing the quality of their input. The goal is not to enforce a single "correct" ethical view, but to cultivate a richer moral pluralism. This is a two-way process: the system must not only actively seek out and aggregate a wide spectrum of perspectives from diverse cultural backgrounds, but also equip individual participants with the tools to understand this diversity and make more considered judgments. This isn't just an add-on; it's the core mechanism that ensures the "ethical growth" of humanity is real, making the entire co-evolutionary process robust. Together, this becomes a true mechanism for the ethical co-evolution of humanity and AI.
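A minimal sketch of what "cultivate, then collect" could look like: the participant is shown several ethical framings before judging a dilemma, and the stored record keeps the framing they relied on, so that pluralism is preserved rather than averaged away (all names here are illustrative):

```python
# Illustrative sketch: present several ethical framings of a dilemma,
# then store the judgment together with the framing the participant
# found most relevant, so diverse perspectives stay distinguishable.

FRAMINGS = {
    "deontology": "Which duties or rights are at stake?",
    "utilitarianism": "Which option produces the most overall well-being?",
    "virtue_ethics": "What would a person of good character do here?",
}

def collect_judgment(dilemma: str, chosen_framing: str, valence: float) -> dict:
    if chosen_framing not in FRAMINGS:
        raise ValueError(f"unknown framing: {chosen_framing}")
    return {
        "dilemma": dilemma,
        "framing": chosen_framing,           # preserved, not averaged away
        "prompt_shown": FRAMINGS[chosen_framing],
        "valence": valence,
    }
```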

Building an Ethical Co-Evolution Bicycle

A mechanism of this scale naturally presents a formidable set of engineering and social challenges. To get our ethical co-evolution 'bicycle' moving uphill—and to ensure it doesn't fall apart—we need a robust socio-technical architecture designed from the ground up. This is the goal of a system I call CHINS (Collaborative Human Intelligence Network System).

While a full breakdown of the CHINS architecture is reserved for a future post, its core design handles key challenges such as motivation and engagement, protection against “Sybil attacks,” data quality validation, decentralized governance, and the aggregation of conflicting moral values. These challenges are not insurmountable; the tools to solve them already exist. The primary obstacle is not technology, but the unified vision to implement it at scale—a vision that CHINS aims to provide.
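To give a flavour of the kind of building block involved (this is not the CHINS specification, just a standard technique): counting only verified contributors and aggregating with a median rather than a mean makes duplicated or extreme submissions move the result much less.

```python
# Hypothetical building block (not the CHINS spec): only verified
# contributors are counted, and the median is used instead of the mean,
# so duplicated (Sybil) or extreme submissions shift the result less.
from statistics import median

def robust_aggregate(submissions: list[tuple[str, float]],
                     verified_ids: set[str]) -> float:
    """submissions: (participant_id, value) pairs; keep only verified ones."""
    values = [value for pid, value in submissions if pid in verified_ids]
    if not values:
        raise ValueError("no verified submissions")
    return median(values)
```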

Solving the Present vs. Future Dilemma

Since this essay is written for a competition based on a collection of essays on longtermism, it is worth mentioning another key problem of longtermism that runs through the entire collection: how do we balance caring for people here and now, with their very real problems, against the abstract risks of the distant future? The answer is that the same organization can be used to solve this problem. I described how to achieve this in my post “Beyond Short-Termism: How δ and w Can Realign AI with Our Values”.
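Purely as an illustration of the shape of that idea, and assuming δ acts as a temporal discount factor and w as a weight balancing present against future welfare (see the linked post for the actual definitions):

```python
# Purely illustrative sketch (see the linked post for the actual definitions):
# delta as a temporal discount factor and w as a weight balancing
# present-day welfare against long-term welfare.

def combined_value(present_welfare: float,
                   future_welfare: list[float],
                   delta: float = 0.99,
                   w: float = 0.5) -> float:
    discounted_future = sum(welfare * delta ** t
                            for t, welfare in enumerate(future_welfare, start=1))
    return (1 - w) * present_welfare + w * discounted_future
```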


Conclusion

 

We stand at the point where fantasy becomes reality overnight. And the path ahead splits.

One road leads to endless debate and paralysis by analysis, as we watch the future happen to us. 

The other is the path of conscious creation—of daring to build the systems that can make us worthy of the intelligence we are about to unleash. This is not merely another interesting problem to be solved. It is the defining challenge of the hinge of history. The choice is ours, and the clock is ticking.

We either passively wait to see what fate has in store for us — someone else's fairy tale or nightmare — or we take the leverage into our own hands and turn the hinge in the way we want.


Questions for the community:


Ray Raven @ 2025-10-02T09:14 (+1)

I'm quite fascinated by your proposal, and it makes a lot of sense. At the end of the day, AI is just an algorithm that trains itself on vast amounts of data. Truly, instead of being too worked up about AI risks, we should develop our own ethical values first.

Ronen Bar @ 2025-09-24T16:15 (+1)

I think the ethical co-evolution of humanity and AI is a very interesting concept: on the one hand, it points to humans staying in control and not handing over power and the knowledge of what is "right" to AI; on the other hand, it means being willing to learn and develop, understanding that we are shortsighted when it comes to morality and may also have a lot to learn from AI!

idea21 @ 2025-09-19T09:17 (+1)
  • Value Specification. What specific values and whose values should be embedded in AI? Human values are extremely difficult to formalize.

 

 

Regarding AI helping humanity with its moral development, we can count on the expectation that AI will guarantee us greater objectivity and, being by its very nature tied to physical phenomena, free us from biases that are not only cultural but also inherent to our nature.

We all know that human behavior is determined by the evolutionary requirements of Homo sapiens as a prehistoric hunter-gatherer. AI should also know this, since it has the capacity to observe us in our biological context.

As a "cultural animal," Homo sapiens has the ability to control its instincts, repressing antisocial ones (aggression, tribalism, irrationalism, etc.) and promoting, as superstimuli, prosocial ones (empathy, caring, altruism, etc.). This objective vision seems more within reach of AI than of today's civilized human beings, inevitably burdened by prejudices.

titotal @ 2025-09-19T11:55 (+3)

How will an AI that is built, trained, and curated by humans along every step of its development end up being "objective"?

Beyond Singularity @ 2025-09-19T13:55 (+2)

The post is not saying that AI will help humanity in its ethical evolution. Quite the contrary — first, we need to teach ethics to AI.

When I write about the need for ethical education and development of people, I mean a process in which people, by becoming familiar with various ethical views, dilemmas, and situations, become wiser themselves and capable of collectively teaching AI less biased and more robust values.

I understand that you assume AI is already morally objective, or will become so in the future. But this is not the case. Modern AI does not think like humans; it only imitates the thinking process, being a complex pattern-recognition system. It is therefore far from objective.

Current AI is trained on huge amounts of our texts: books, articles, posts, and comments. It absorbs everything in this data. If the data contains prejudice, hatred, or logical errors, AI will learn them too, perceiving them as the “objective” norm. A striking example: if a model is trained on racist texts, it will reproduce and even rationalize racism in its responses, defending this point of view as if it were an indisputable truth.

This example shows how critically important it is who trains AI and on what data. Moreover, all people have different values, and it is not at all obvious to AI which of them are “better” or “more correct.” This is the essence of one of the key philosophical problems — Value Specification.

I agree that the ideal AI you describe could indeed help humanity in its ethical evolution in the future. But before that happens, we still need to create such an AI without straying from ethical principles. And this post is dedicated to how we can do that — through ethical co-evolution, rather than passively waiting for a savior.

idea21 @ 2025-09-21T15:43 (+1)

if a model is trained on racist texts, it will reproduce and even rationalize racism in its responses, defending this point of view as if it were an indisputable truth.

 

But this isn't a very brilliant intelligence. If a human being with a high IQ is raised in an irrationalist environment (racism, theism, tribalism, sexism, etc.), they usually have the capacity for logical rationality to understand both the inconsistency of such approaches... and the psychological and cultural origins that gave rise to them.

It's assumed that AI will far surpass human intelligence, so it must be perfectly familiar with the biases, heuristics, and contradictions of human reason... just as human ethologists understand animal behavior.

Beyond Singularity @ 2025-09-21T22:07 (+1)

Do you believe that AI will be able to become superintelligent and then superethical and moral on its own, without any targeted efforts on our part?