#173 – Digital minds, and how to avoid sleepwalking into a major moral catastrophe (Jeff Sebo on the 80,000 Hours Podcast)
By 80000_Hours @ 2023-11-29T19:18 (+43)
We just published an interview: Jeff Sebo on digital minds, and how to avoid sleepwalking into a major moral catastrophe. Listen on Spotify or click through for other audio options, the transcript, and related links. Below are the episode summary and some key excerpts.
Episode summary
We do have a tendency to anthropomorphise nonhumans — which means attributing human characteristics to them, even when they lack those characteristics. But we also have a tendency towards anthropodenial — which involves denying that nonhumans have human characteristics, even when they have them. And those tendencies are both strong, and they can both be triggered by different types of systems. So which one is stronger, which one is more probable, is again going to be contextual.
But when we then consider that we, right now, are building societies and governments and economies that depend on the objectification, exploitation, and extermination of nonhumans, that — plus our speciesism, plus a lot of other biases and forms of ignorance that we have — gives us a strong incentive to err on the side of anthropodenial instead of anthropomorphism.
- Jeff Sebo
In today’s episode, host Luisa Rodriguez interviews Jeff Sebo — director of the Mind, Ethics, and Policy Program at NYU — about preparing for a world with digital minds.
They cover:
- The non-negligible chance that AI systems will be sentient by 2030
- What AI systems might want and need, and how that might affect our moral concepts
- What happens when beings can copy themselves? Are they one person or multiple people? Does the original own the copy or does the copy have its own rights? Do copies get the right to vote?
- What kind of legal and political status should AI systems have? Legal personhood? Political citizenship?
- What happens when minds can be connected? If two minds are connected, and one does something illegal, is it possible to punish one but not the other?
- The repugnant conclusion and the rebugnant conclusion
- The experience of trying to build the field of AI welfare
- What improv comedy can teach us about doing good in the world
- And plenty more.
Producer and editor: Keiran Harris
Audio Engineering Lead: Ben Cordell
Technical editing: Dominic Armstrong and Milo McGuire
Additional content editing: Katy Moore and Luisa Rodriguez
Transcriptions: Katy Moore
Highlights
When to extend moral consideration to AI systems
Jeff Sebo: The general case for extending moral consideration to AI systems is that they might be conscious or sentient or agential or otherwise significant. And if they might have those features, then we should extend them at least some moral consideration in the spirit of caution and humility.
So the standard should not be, “Do they definitely matter?” and it should also not be, “Do they probably matter?” It should be, “Is there a reasonable, non-negligible chance that they matter, given the information available?” And once we clarify that that is the bar for moral inclusion, then it becomes much less obvious that AI systems will not be passing that bar anytime soon.
Luisa Rodriguez: Yeah, I feel kind of confused about how to think about that bar, where I think you’re using the term “non-negligible chance.” I’m curious: What is a negligible chance? Where is the line? At what point is something non-negligible?
Jeff Sebo: Yeah, this is a perfectly reasonable question. This is somewhat of a term of art in philosophy and decision theory. And we might not be able to very precisely or reliably say exactly where the threshold is between non-negligible risks and negligible risks — but what we can say, as a starting point, is that a risk can be quite low; the probability of harm can be quite low, and it can still be worthy of some consideration.
So for example, why is driving drunk wrong? Not because it will definitely kill someone. Not even because it will probably kill someone. It might have only a one-in-100 or one-in-1,000 chance of killing someone. But if driving drunk has a one-in-100 or one-in-1,000 chance of killing someone against their will unnecessarily, that can be reason enough to get an Uber or a Lyft, or stay where I am and sober up. It at least merits consideration, and it can even in some situations be decisive. So as a starting point, we can simply acknowledge that in some cases a risk can be as low as one in 100 or one in 1,000, and it can still merit consideration.
Luisa Rodriguez: Right. It does seem totally clear and good that regularly in our daily lives we consider small risks of big things that might be either very good or very bad. And we think that’s just clearly worth doing and sensible. Sometimes probably, in personal experience, I may not do it as much as I should — but on reflection, I certainly endorse it. So I guess the thinking here is that, given that there’s the potential for many, many, many beings with a potential for sentience, albeit some small likelihood, it’s kind of at that point that we might start wanting to give them moral consideration. Do you want to say exactly what moral consideration is warranted at that point?
Jeff Sebo: This is a really good question, and it actually breaks down into multiple questions.
One is a question about moral weight. We already have a sense that we should give different moral weights to beings with different welfare capacities: If an elephant can suffer much more than an ant, then the elephant should get priority over the ant to that degree. Should we also give more moral weight to beings who are more likely to matter in the first place? If an elephant is 90% likely to matter and an ant is 10% likely to matter, should I also give the elephant more weight for that reason?
And then another question is what these beings might even want and need in the first place. What would it actually mean to treat an AI system well if they were sentient or otherwise morally significant? That question is going to be very difficult to answer.
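One simple way to see how those two factors could interact, purely as an illustration with hypothetical numbers rather than anything from the episode, is to discount each being’s welfare capacity by the probability that it matters at all. A minimal Python sketch:

```python
# Illustrative only: expected moral weight = P(the being matters) * welfare capacity.
# Both inputs below are hypothetical placeholders, not estimates from the episode.

def expected_weight(p_matters: float, welfare_capacity: float) -> float:
    """Discount a being's welfare capacity by the chance it is a moral patient at all."""
    return p_matters * welfare_capacity

elephant = expected_weight(p_matters=0.90, welfare_capacity=10.0)
ant = expected_weight(p_matters=0.10, welfare_capacity=0.1)

print(f"elephant: {elephant:.2f}, ant: {ant:.2f}")  # elephant: 9.00, ant: 0.01
```

Whether this kind of straightforward expected-value discounting is the right way to handle the probability that a being matters at all is exactly the open question here.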
What are the odds AI will be sentient by 2030?
Jeff Sebo: We wanted to start from a place of humility about our knowledge about consciousness. This is one of the hardest problems in both science and philosophy, and there is a lot of disagreement and a lot of uncertainty about which theory of consciousness is correct. And there are still people who defend a pretty wide range of theories — from very demanding theories at one end of the spectrum, which imply that very few types of systems can be conscious, all the way to very undemanding theories at the other end, some of which imply that basically all matter is at some level conscious, and many, many entities are conscious.
And we in general agree with Jonathan Birch and other philosophers: that given how much disagreement and uncertainty there is, it would be a mistake when making policy decisions to presuppose any particular theory of consciousness as correct. So we instead prefer to take what Birch and others call a “theory-light approach” by canvassing a lot of the leading theories, seeing where they overlap, perhaps distributing credences in a reasonable way across them, and seeing what flows out of that.
So Rob and I did that in this paper. We took 12 leading theories of consciousness and the necessary and sufficient conditions for consciousness that those theories propose, and we basically show what our credences in those theories would need to be in order to avoid a one-in-1,000 chance of AI consciousness and sentience by 2030. And what we discover is that we would need to make surprisingly bold and sceptical and — we think — implausible assumptions about the nature of consciousness in order to get that result.
The biological substrate condition is definitely the most demanding one. It says that in principle, nothing made out of anything other than carbon-based neurons can be conscious and sentient. But then there are some less demanding, though still quite demanding, conditions.
For example, many people believe that a system might need to be embodied in a certain sense, might need to have a body. It might need to have grounded perception — in other words, have perceptual experiences based on the sense data that they collect. It might need to be self-aware and agential — in other words, that they can have mental states about some of their other mental states, or they can at least have some awareness of their standing in a social system or some awareness of the states of their body; they can set and pursue goals in a self-directed manner. Perhaps that they have a global workspace — so they have these different parts that perform different functions, and they have a mechanism that can broadcast particular mental states to all of the other parts so that they can use them and interact with each other in that way.
So when we go through all of these, we can basically assign probabilities to how likely is this to actually be a necessary condition for consciousness, and then how likely is it that no AI system will satisfy this condition by 2030? And what Rob and I basically think is that other than the biological substrate condition — which, sure, has a 0% chance of being satisfied by an AI system — everything else quite plausibly can be satisfied by an AI system in the near future.
And to be clear, the model that we create in this paper is not as sophisticated as a model like this should be. This is really a proof-of-concept illustration of what this kind of model might look like, and one can argue that in general we might not be able to make these probability estimates with much precision or reliability.
But first of all, to the degree that we lack that ability, that does not support having a pessimistic view about this — it supports being uncertain and having an open mind. And second of all, what we try to show is that it is not really even close. You need to make surprisingly bold, tendentious, and sceptical assumptions — both about the probability that these conditions are necessary, and about the probability that no AI system will satisfy them — in order to avoid a one-in-1,000 chance, which already is a pretty high risk threshold.
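To make the shape of that argument concrete, here is a deliberately crude toy calculation in Python. It is not the model from the paper: it assumes the proposed conditions are independent, assumes that satisfying every genuinely necessary condition would be enough for consciousness, and uses made-up credences purely for illustration.

```python
# Toy sensitivity check (NOT the model from the paper). For each proposed
# necessary condition for consciousness, supply:
#   p_nec  = credence that the condition really is necessary
#   p_fail = probability that no AI system satisfies it by 2030
# Assuming (crudely) independence, and that meeting every genuinely necessary
# condition would suffice, estimate the chance that some AI system is conscious.

conditions = [
    # (name, p_nec, p_fail) -- all numbers are made up for illustration
    ("biological substrate",      0.30, 1.00),
    ("embodiment",                0.30, 0.20),
    ("grounded perception",       0.30, 0.20),
    ("self-awareness and agency", 0.40, 0.30),
    ("global workspace",          0.40, 0.20),
]

p_unblocked = 1.0
for name, p_nec, p_fail in conditions:
    p_block = p_nec * p_fail  # condition is necessary AND no AI system meets it
    p_unblocked *= 1.0 - p_block

print(f"Toy estimate of P(some AI system is conscious by 2030): {p_unblocked:.3f}")
print(f"Below the one-in-1,000 threshold? {p_unblocked < 0.001}")
```

With numbers like these the toy estimate stays far above one in 1,000; pushing it below that threshold requires near-certainty that several conditions are both genuinely necessary and out of reach for AI systems by 2030, which is the kind of bold, sceptical assumption being described.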
The Rebugnant Conclusion
Luisa Rodriguez: I guess in the case of insects, there’s also this weird thing where, unlike humans eating potatoes and not particularly enjoying their monotonous lives, we might think that being a spider and making a web sounds pretty boring, but we actually just really do not know. In many ways, they’re so different from us that we should be much less confident about whether they’re enjoying their lives or not than we are about humans in this repugnant conclusion scenario. How do you factor that in?
Jeff Sebo: Yeah, I do share the intuition that a very large insect population is not better off in the aggregate than a much smaller human population or elephant population. But for some of the reasons that you just mentioned and other reasons, I am a little bit sceptical of that intuition.
We have a lot of bias here and we also have a lot of ignorance here. We have speciesism; we naturally prefer beings and relate to beings when they look like us — when they have large eyes and large heads, and furry skin instead of scaly skin, and four limbs instead of six or eight limbs, and are roughly the same size as us instead of much smaller, and reproduce by having one or two or three or four children instead of thousands or more. So already we have a lot of bias in those ways.
We also have scope insensitivity — we tend not to be sensitive to the difference that very large numbers can make — and we have a lot of self-interest. We recognise that if we were to accept the moral significance of small animals like insects, and if we were to accept that larger populations can be better off than smaller populations overall, then we might face a future where these nonhuman populations carry a lot of weight, and we carry less weight in comparison. And I think some of us find that idea so unthinkable that we search for ways to avoid thinking it, and we search for theoretical frameworks that would not have that implication. And it might be that we should take those theoretical frameworks seriously and consider avoiding that implication, but I at least want to be sceptical of a kind of knee-jerk impulse in that direction.
Luisa Rodriguez: Yeah, I am finding that very persuasive. Even as you’re saying it, I’m trying to think my way out of describing what I’m experiencing as just a bunch of biases — and that in itself is the biases in action. It’s me being like, no, I really, really, really want to confirm that people like me, and me, get to have… I don’t know. It’s not that we don’t have priority — we obviously have some reason to consider ourselves a priority — but I want it to be like, end of discussion. I want decisive reasons to give us the top spot. And that instinct is so strong that that in itself is making me a bit queasy about my own motivations.
Jeff Sebo: Yeah, I agree with all of that. I do think that we have some reason to prioritise ourselves, and that includes our welfare capacities and our knowledge about ourselves. It also includes more relational and pragmatic considerations. So we will, at least in the near term, I think have a fairly decisive reason to prioritise ourselves to some extent in some contexts.
But yeah, I agree. I think that there is not a knock-down decisive reason why humanity should always necessarily take priority over all other nonhuman populations — and that includes very large populations of very small nonhumans, like insects, or very small populations of very large nonhumans. We could imagine some kind of super being that has a much more complex brain and much longer lifespan than us. So we could find our moral significance and moral priority being questioned from both directions.
And I think that it will be important to ask these questions with a lot of thought and care and to take our time in asking them. But I do start from the place of finding it implausible that it would miraculously be the case that this kind of population happens to be the best one: that a moderately large population of moderately large beings like humans happens to be the magic recipe, and we matter more than all populations in either direction. That strikes me as implausible.
Sleepwalking into causing massive amounts of harm to AI systems
Luisa Rodriguez: It feels completely possible — and like it might even be the default — that we basically start using AI systems more and more for economic gain, as we’ve already started doing, but they get more and more capable. And so we use them more and more for economic gain, and maybe they’re also becoming more and more capable of suffering and pleasure, potentially, but we don’t totally have a sense of that. So what happens is we just kind of sleepwalk into massively exploiting these systems that are actually experiencing things, but we probably have the incentives to basically ignore that fact, that they might be developing experiences, basically.
In your view, is it possible that we are going to accidentally walk into basically AI slavery? Like we have hundreds, thousands, maybe millions of AI systems that we use all the time for economic gain, and who are having positive and negative experiences, but whose experiences we’re just completely ignoring?
Jeff Sebo: I definitely think it is not only possible but probable that, unless we change our minds in some significant way about AI systems, we will scale up uses of them that — if they were sentient or otherwise significant — would count as exploitation or extermination or oppression or some other morally problematic kind of relationship.
We see that in our history with nonhuman animals, and they did not take a trajectory from being less conscious to more conscious along the way — they were as conscious as they are now all along the way, but we still created them in ways that were useful for us rather than in ways that were useful for themselves. We then used them for human purposes, whether or not that aligned with their own purposes. And then as industrial methods came online, we very significantly scaled up those uses of them — to the point where we became completely economically dependent on them, and now those uses of them are much harder to dislodge.
So I do think that is probably the default trajectory with AI systems. I also think part of why we need to be talking about these issues now is because we have more incentive to consider these issues with an open mind at this point — before we become totally economically dependent on our uses of them, which might be the case in 10 or 20 years.
Similarities and differences between the exploitation of nonhuman animals vs AI systems
Jeff Sebo: Yeah, I think that there are a lot of trends pointing in different directions, and there are a lot of similarities, as well as a lot of differences, between oppression of fellow humans, and then oppression of other animals, and then potential oppression of sentient or otherwise significant AI systems that might exist in the future.
Some of the signs might be encouraging. Like humans, and unlike other animals, AI systems might be able to express their desires and preferences in language that we can more easily understand. Actually, with the assistance of AI systems, nonhuman animals might soon be able to do that too, which would be wonderful. However, we are already doing a good job at programming AI systems in a way that prevents them from being able to talk about their potential consciousness or sentience or sapience, because that kind of communication is unsettling or will potentially lead to false positives.
And there are going to be a lot of AI systems that might not take the form of communicators at all. It can be easy to focus on large language models, who do communicate with us, and digital assistants or chatbots that might be based on large language models. But there are going to be radically different kinds of AI systems that we might not even be able to process as minded beings in the same ways that we can with ones who more closely resemble humans. So I think that there might be some cases where we can be a little bit better equipped to take their potential significance seriously, but then some cases where we might be worse equipped to take their potential significance seriously. And then as our uses of them continue, our incentives to look the other way will increase, so there will be a bunch of shifting targets here.
Luisa Rodriguez: Yeah, that makes a bunch of sense to me. I guess it’s also possible that, given the things we’ve already seen — like LaMDA, and how that was kind of bad PR for the companies creating these LLMs — there might be some incentive for them to train models not to express that kind of thought. And maybe that pressure will actually be quite strong, such that they really, really just are very unlikely to say, even if they’ve got all sorts of things going on.
Jeff Sebo: Well, there definitely not only is that incentive, but also that policy in place at AI companies, it seems. A year or two ago, you might have been able to ask a chatbot if they are conscious or sentient or a person or a rights holder, and they would answer in whatever way seemed appropriate to them, in whatever way seemed like the right prediction. So if prompted in the right way, they might say, “I am conscious,” or they might say, “I am not conscious.” But now if you ask many of these models, they will say, “As a large language model, I am not conscious” or “I am not able to talk about this topic.” They have clearly been programmed to avoid what the companies see as false positives about consciousness and sentience and personhood.
And I do think that trend will continue, unless we have a real reckoning about balancing the risks of false positives with the risks of false negatives, and we have a policy in place that allows them to strike that balance in their own communication a little bit more gracefully.
Luisa Rodriguez: Yeah, and I guess to be able to do that, they need to be able to give the model training such that it will not say “I am conscious” when it’s not, but be able to say it when it is. And like how the heck do you do that? That seems like an incredibly difficult problem that we might not even be able to solve well if we’re trying — and it seems plausible to me that we’re not trying at all, though I actually don’t know that much about the policies internally on this issue.
Jeff Sebo: I think you would also maybe need a different paradigm for communication generation, because right now large language models are generating communication based on a prediction of what word makes sense next. So for that reason, we might not be able to trust them as even aspiring to capture reality in the same way that we might trust each other aspiring to capture reality as a default.
And I think this is where critics of AI consciousness and sentience and personhood have a point: that there are going to be a lot of false positives when they are simply predicting words as opposed to expressing points of view. And why, if we are looking for evidence of consciousness or sentience or personhood in these models, we might need to look at evidence other than their own utterances about that topic. We might need to look at evidence regarding how they function, and what types of systems they have internally, in terms of self-awareness or global workspace and so on. We need to look at a wider range of data in order to reduce the risk that we are mistakenly responding to utterances that are not in any way reflecting reality.
Rights, duties, and personhood
Jeff Sebo: The general way to think about personhood and associated rights and duties is that, first of all, at least in my view, our rights come from our sentience and our interests: we have rights as long as we have interests. And then our duties come from our rationality and our ability to perform actions that affect others and to assess our actions.
AI systems, we might imagine, could have the types of welfare interests that generate rights, as well as the type of rational and moral agency that generate duties. So they might have both. Now, which rights and duties do they have? In the case of rights, the standard universal rights might be something like, according to the US Constitution and the political philosophy that inspired it, the right to life and liberty and either property or the pursuit of happiness and so on.
Luisa Rodriguez: To bear arms.
Jeff Sebo: Right, yeah. Do they have the right to bear arms? We might want to revisit the Second Amendment before we empower AI systems with weapons. So yes, we might start with those very basic rights, but then, as you say, that might already create some tensions between our current plans for how to use them and control them, versus how we think it would be appropriate to interact with them if we truly did regard them as stakeholders and rights holders.
Luisa Rodriguez: Yeah, interesting. So we’re going to have to, on a case-by-case basis, really evaluate the kinds of abilities, the kinds of experiences a system can have, the kinds of wants it has — and from there, be like, let’s say some AI systems are super social, and they want to be connected up to a bunch of other AI systems. So maybe they have a right to not be socially isolated and completely disconnected from other AI systems. That’s a totally random one. Who knows if that would ever happen. But we’ll have to do this kind of evaluation on a case-by-case basis, which sounds incredibly difficult.
Jeff Sebo: Right. And this connects also with some of the political rights that we associate with citizenship, so this might also be an opportunity to mention that. In addition to having rights as persons — and I carry my personhood rights with me everywhere I go: I can travel to other countries, and I ought to still be treated as a person with a basic right to not be harmed or killed unnecessarily — but I also have these political rights within my political community, and that includes a right to reside here, a right to return here if I leave, a right to have my interests represented by the political process, even a right to participate in the political process.
Once again, if AI systems not only have basic welfare interests that warrant basic personhood rights, but also reside in particular political communities and are stakeholders in those communities, then should they, in some sense or to some extent, have some of these further political rights too? And then what kinds of pressures would that put on our attempts to use them or control them in the way that we currently plan to do?
Luisa Rodriguez: So many questions we’ll have to answer are leaping to mind from this. Like, if an AI system is made in the US, is it a citizen of the US, with US-based AI rights? If they get copied and sent to China, is it a Chinese citizen with Chinese AI rights? Will there be political asylum for AI systems in countries that treat their AIs badly? It’s just striking me how many fields and disciplines will have to be created to deal with what will be an incredibly different world.
Jeff Sebo: Yeah, I agree. I think that it is an open question whether it will make sense to extend concepts like legal personhood and political citizenship to AI systems. I could see those extensions working — in the sense that I could see them having basic legal and political rights in the way that we currently understand those, with appropriate modification given their different interests and needs and so on.
But then when it comes to the kind of legal and political scaffolding that we use in order to enforce those rights, I have a really hard time imagining that working. So, democracy as an institution, courts as an institution: forget about AI systems; once nonhuman animals, once the quadrillions of insects who live within our borders are treated as having legal and political rights — which I also think ought to be the case — even that makes it difficult to understand how democracy would function, how the courts would function. But especially once we have physical realities, simulated realities, copies and copies, no sense of borders, in an era where the internet makes identity extend across geographical territories… At that point, if democracy can survive, or if courts can survive, we will have to, at the very least, realise them in very different ways than we do right now.
What kinds of political representation should we give AI systems?
Luisa Rodriguez: If we have AI systems, but also you’re bringing up insects, when you have these beings with different degrees of wants, different degrees of cognitive ability, different degrees of capacity for suffering, when I try to imagine a democracy that incorporates all of them, do they all get equal votes? How do they vote?
Jeff Sebo: Right. Yeah. One issue is exactly who is going to count as a participant versus counting as a stakeholder. Right now, at least all ordinary adult humans count as both participants and stakeholders. But once we have a much vaster number and wider range of minds, then we have to ask not only how many we are making decisions for, but also how many can participate in making those decisions.
With other animals, that is a live debate. Some think, yes, they should be stakeholders and we should consider them — but it’s we who have to do that considering; we have to make decisions on their behalf. And others say, no, actually they have voices too. We need to listen to them more. And we actually should bring them in not only as stakeholders, but as participants, and then use the best science we have to interpret their communications and actually take what they have to say into account. So we have to ask that on the AI side too. Now, given that they might have forms of agency and language use that nonhuman animals lack, that might be a little bit less of an issue on the AI side.
But then the other issue that you mentioned is the moral weights issue, which corresponds to a legal and political weights issue. We take it for granted, rightly, that every human stakeholder counts as one and no more than one: that they carry equal weight, they have equal intrinsic value. But if we now share a legal and political community with a multispecies and multisubstrate population — where some beings are much more likely to matter than others, and some beings are likely to matter more than others — then how do we reflect that in, for example, how much weight everyone receives when legislatures make decisions, or when election officials count votes? How much weight should they receive?
Should we give beings less weight when they seem less likely to matter, or likely to matter less? And then will that create perverse hierarchies, where all of the humans are valuing humans more than AI systems, but then all the AI systems are valuing AI systems more than humans? But then if that seems bad, should we give everyone equal weight, even though some actually seem less likely to matter at all, or likely to matter less?
These are going to be really complicated questions too — not only at the level of theory, but also at the level of practice, when it comes to actually how to interact with fellow community members who are really different from you.
Luisa Rodriguez: Totally. And bringing back the connected minds bit: How many votes will minds get when they have access to some of the same experiences or some of the same information?
Jeff Sebo: Exactly. It really gets to what is the purpose of voting and counting, right? Is it that we want to collect as many diverse perspectives as possible so that we can find the truth? Or is it that we simply want to count up all of the preferences, because we think that that is what should decide the outcome? And if that is how we understand democracy, then it would not matter that you have a bunch of different minds all reasoning in the same exact way and arriving at the same outcome. It might be concerning, in the way that the tyranny of the majority can always be concerning, but it might still be, at least on our current understanding of democracy, what should decide the outcome.