Owen Cotton-Barratt: What does (and doesn't) AI mean for effective altruism?

By EA Global @ 2017-08-11T08:19 (+10)

This is a linkpost to https://www.youtube.com/watch?v=gATWIWiIy_8&list=PLwp9xeoX5p8NnWYsybl_ZRMtaK7uBr4sN&index=4&t=7s


In this 2017 talk, the Future of Humanity Institute's Owen Cotton-Barratt discusses what strategy effective altruists ought to adopt with regard to the development of advanced artificial intelligence. He argues that we ought to adopt a portfolio approach — i.e., that we ought to invest resources in strategies relevant to several different AI scenarios. At the very end you will find an added section on what you can do to help.

The below transcript is lightly edited for readability.

The Talk

Some of you may have noticed that a bunch of people in this community seem to think that AI is a big deal. I'm going to talk about that a little bit. I think that there are a few different ideas which feed into why this is something we should be paying a lot of attention to. One is that from a moral perspective, the biggest impacts of our actions - perhaps even overwhelmingly so - are the effects of our actions today on what happens in the long term future. Then there are some more empirical ideas. One is that artificial intelligence might be the most radically transformative technology that has ever been developed. Another is that artificial intelligence is something whose development we may be able to influence. Influencing that could be a major lever over the future: if we think that our effects on the long term future are important, this could be one of the important mechanisms. And finally, that this kind of radically transformative artificial intelligence could plausibly be developed in the next few decades.

I don't know what you think of all of these claims. I tend to think that they're actually pretty plausible. For the rest of this talk, I'm going to be treating these as assumptions, and I want to explore the question: if we take these seriously, where does that get us? If you already roughly agree with these, then you can just sit back and see how much you agree with the analysis, and maybe that's relevant for you. If you don't agree with one of those claims, then you can treat this as an exercise in understanding how other bits of the community might think. Maybe some of the ideas will actually be usefully transferable. Either way, if there are some of these that you haven't thought much about before, I encourage you to go and think about them - take some time afterwards. It seems to me that each of these ideas potentially has large implications for how we should be engaging with the world in this project of trying to help it. It seems like it's therefore the kind of thing which is worth having an opinion on.

Okay, so I'm going to be exploring where this gets us. I think a cartoon view people sometimes hold is if you believe in these ideas, then you think everybody should quit what they're working on, and drop everything, and go and work on the problem of AI safety. I think this is wrong. I think there are some related ideas in that vicinity where there's some truth. But it's a much more nuanced picture. I think for most people, it is not correct to just quit what they're doing, to work on something safety related instead. But I think it's worth understanding in what kind of circumstances it might be correct to do that, and also how the different pieces of the AI safety puzzle fit together.

I think that thinking about timelines is important for AI. It is very hard to have any high level of confidence in when AI might have different capabilities. Predicting technology is hard, so it's appropriate to have uncertainty. In fact, here's a graph.

You can see a bunch of faint lines showing individual estimates from people working in machine learning research of when they expect high-level AI to be developed. The bold red line is the median of those. That's quite a lot of uncertainty. If you take almost any individual's view, and certainly this aggregate view, it represents very significant uncertainty over when transformative AI might occur. So we should be thinking about that.

Really our uncertainty should follow some kind of smooth distribution. For this talk, I'm gonna talk about four different scenarios. I think that the advantage of anchoring the possibilities as particular scenarios and treating them as discrete rather than continuous is that it becomes easier to communicate about, and it becomes easier to visualize, and think, "Okay, well what would you actually do if the timeline has this type of length?"
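
To make that concrete, here is a minimal sketch - not from the talk, and with made-up placeholder probabilities - of how a continuous forecast over arrival years can be collapsed into four discrete buckets like the ones discussed below:

```python
# A minimal sketch with made-up numbers: collapsing a continuous forecast over
# AI arrival years into four discrete scenario buckets. Nothing here is a real
# forecast; the point is only the discretisation step.

arrival_year_probability = {
    2020: 0.02, 2025: 0.06, 2030: 0.10, 2040: 0.18,
    2050: 0.20, 2060: 0.16, 2080: 0.15, 2120: 0.13,
}  # hypothetical probability mass at representative years (sums to 1.0)

buckets = {
    "imminent (0-10 years)": (0, 10),
    "medium (10-25 years)":  (10, 25),
    "long (25-60 years)":    (25, 60),
    "very long (60+ years)": (60, float("inf")),
}

current_year = 2017
scenario_probability = {name: 0.0 for name in buckets}
for year, p in arrival_year_probability.items():
    horizon = year - current_year
    for name, (lo, hi) in buckets.items():
        if lo <= horizon < hi:
            scenario_probability[name] += p

for name, p in scenario_probability.items():
    print(f"{name}: {p:.2f}")
```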

The first scenario represents imminent AI, maybe something on the scale of 0 to 10 years away. In this case, it's more likely that we actually know or can make educated guesses already about who the important actors will be around the development of AI.

I want to explore a little bit what strategies we might pursue based on each of these different timelines. If you assume this first one, then there's no time for long processes. If your idea was, "Well, I'll do a degree in CS, and then I'll go and get a PhD in machine learning, and then I'll go into research," you're too late. On the other hand, if you are already in a position where you might be able to do something in the short term, then it could be worth paying attention to. But I feel that for a lot of people, even if you think there is some small chance of this first scenario happening (which in general you want to pay attention to), it may be that there isn't a meaningful way to engage.

The next possible scenario would be maybe between 10 and 25 years out. This is a timescale in which people can naturally build careers. They can go and they can learn things. They can develop networks. They can build institutions. They can also build academic fields. You can ask questions, get people motivated, and get them interested in the framing of the question that you think is important. You can also have time for some synthesis and development of relevant ideas. I think that building networks where we persuade other people who maybe aren't yet in a direct position of influence, but might be later, can be a good idea.

If we look a bit further to another possible scenario, maybe between 25 and 60 years out, that's a timescale at which people who are in the important fields today may be retiring. Paradigms in academic fields might have shifted multiple times. It becomes hard to take a zoomed-in view of what it is that we need, which means that it's more important to build things right than to build them quickly. We want to build solid foundations for whatever the important fields are. When I say the important fields here, I'm thinking significantly about technical fields: how we build systems which do what we actually want them to do. I'm also thinking about the governance, policy, and processes in our society around AI. Who should develop AI? How should that be structured? Who is going to end up with control over the things which are produced?

These scenarios are all cartoons. I'm presenting a couple of stylized facts about each kind of timeline. There will be a bit of overlap between these strategies, but this gives an idea of how the ideal strategy changes. Okay. The very distant scenario is maybe more than 60 years out - perhaps even hundreds of years. At this range, predictability gets extremely low. If it takes us this long to develop radically transformative AI, it is quite likely that something else radically transformative will have happened to our society in the meantime. We're less able to predict what the relevant problems will be. Instead, it makes sense to think of a strategy of building broad institutions which will equip the people of that time to better deal with the challenges they're facing then.

I think actually it's plausible that the effective altruism community, and the set of ideas around that community, might be one broad, useful institution for people of the far future. If we can empower people with tools to work out what is actually correct, and the motivation and support to act on their results, then I'd be like, "Yep, I think we can trust those future people to do that."

The very long term is the timescale at which other very transformative things are more likely to have happened in our society. This can happen on the shorter timescales as well. But if you think on a very long timescale, there is much more reason to put more resources toward other big potential transitions, rather than just AI. I think that AI could be a big deal, but it's definitely not the only thing that could be a big deal.

Okay. I've just talked us through different timelines. I think that most reasonable people I know put at least some nontrivial probability on each of these possible scenarios. I've also just outlined how we probably want to do different things for the different scenarios. Given all of this, what should we actually be doing? One approach is to say, "Well, let's not take a stance on the timelines. Let's just do things that we think are kind of generically good across all of the different timelines." I think that that's a bad strategy because it may miss the best opportunities. There may be some things which you only notice are good if you're thinking of something more concrete rather than just an abstract, "Oh, there's gonna be AI at some point in the future." Perhaps for the shorter timelines, that might involve going and talking to people who might be in a position to have an effect in the short term, and working out, "Can I help you with anything?"

Okay. The next kind of obvious thing to consider is, well, let's work out which of these scenarios is the most likely. But if you do this, I think you're missing something very important, which is that we might have different degrees of leverage over these different scenarios. The community might have different leverage available for each scenario. It can also vary by individual. For the short timelines, leverage is probably much more heterogeneous between different people. Some people might be in a position to have influence; in that case, it might be that they have the highest leverage there. By leverage, I mean roughly: conditional on that scenario actually obtaining, how much does you putting in a year of work, trying your best, affect the outcome? Something like that.

Okay. Maybe we should just be going for the highest likelihood multiplied by leverage. This of course is the place where we have the most expected impact. I think there's something to that. I think that if everybody properly does that analysis for themselves, and updates as people go and take more actions in the world, then in theory that should get you to the right things. But the leverage of different opportunities varies, both as people take more of them and also, even for a single individual, across scenarios. I've known people who think that they've had some opportunities they could take to help with short timelines and a bunch of other opportunities to help with long timelines. This is a reason not to naively go for the highest likelihood multiplied by leverage.
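
As a toy illustration of that last point - none of these numbers come from the talk - here is the naive "probability times leverage" rule, and how it already gives different answers to different people once leverage is heterogeneous:

```python
# Illustrative placeholder numbers only. Each scenario has a probability and a
# per-person leverage; the naive rule picks whatever maximises their product.

scenarios = {
    # scenario: (probability, leverage for person A, leverage for person B)
    "imminent":  (0.10, 5.0, 0.2),
    "medium":    (0.35, 1.0, 1.0),
    "long":      (0.40, 0.8, 1.2),
    "very long": (0.15, 0.5, 0.9),
}

def best_scenario(person_index):
    """Naive rule: maximise probability x leverage for one person."""
    return max(scenarios, key=lambda s: scenarios[s][0] * scenarios[s][person_index])

print(best_scenario(1))  # person A, well placed for short timelines -> "imminent"
print(best_scenario(2))  # person B, with no short-term access       -> "long"

# What the naive rule ignores: as more people pile onto the same scenario,
# the leverage of each additional person falls (diminishing returns).
```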

Okay. What else? Well, can we think about what portfolio of things we could do? I was really happy about the theme of this event, because thinking about portfolios and acting under uncertainty is something I've been researching for the past two or three years. On this approach, I think we want to collectively discuss the probabilities of the different scenarios, the amount of leverage we might have for each, and the diminishing returns on work aimed at each. We should also discuss what that ideal portfolio should look like. I say collectively because this is all information where, when we work things out for ourselves, we can help inform others as well, and we can probably do better using collective epistemology than we can individually.
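
Here is a minimal sketch of that kind of portfolio reasoning, again with illustrative placeholder numbers rather than anything from the talk. Once returns diminish, allocating effort unit by unit according to marginal value spreads the community's work across scenarios instead of betting everything on the single highest expected-impact one:

```python
# Illustrative numbers only: probability and community-level leverage per scenario.
scenarios = {
    "imminent":  (0.10, 3.0),
    "medium":    (0.35, 2.0),
    "long":      (0.40, 1.0),
    "very long": (0.15, 0.8),
}

def marginal_value(probability, leverage, units_already_allocated):
    # A simple diminishing-returns model: each extra unit of work is worth less.
    return probability * leverage / (1 + units_already_allocated)

allocation = {name: 0 for name in scenarios}
for _ in range(20):  # allocate 20 units of community effort, one at a time
    best = max(
        scenarios,
        key=lambda s: marginal_value(scenarios[s][0], scenarios[s][1], allocation[s]),
    )
    allocation[best] += 1

print(allocation)  # a mixed portfolio, not a single bet on the "best" scenario
```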

Then we can individually consider, "Okay, how do I think in fact the community is deviating from the ideal portfolio? What can I do to correct that?" Also, "What is my comparative advantage here?" Okay. I want to say a couple of words about comparative advantage. I think you know the basic idea. Here's the cartoon I think of it in terms of:

You've got Harry, Hermione, and Ron; they have three tasks to do, and they've gotta do one task each. Hermione is best at everything, but you can't just get Hermione to do all the things. You have to allocate them one to one. So it's a question of how you line the people up with the tasks so that everyone is doing something that they're pretty good at, and overall you get all of the important things done. I think that this is something that we can think of at the level of individuals choosing, "What am I going to work on? Well, I've got this kind of skillset." It's something that we can think of at the level of groups as well. We can ask, "What is my little local community going to work on?" or "What is this organization going to do, and how do we split up responsibility between different organizations?"
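
As a sketch of that cartoon in code - the skill scores are invented purely for illustration - even though Hermione is best at every task, the best one-to-one allocation gives her the task where her edge over the others is largest:

```python
# Made-up skill scores, purely to illustrate comparative advantage.
from itertools import permutations

people = ["Harry", "Hermione", "Ron"]
tasks = ["research", "operations", "outreach"]

skill = {
    "Harry":    {"research": 5, "operations": 6, "outreach": 7},
    "Hermione": {"research": 9, "operations": 8, "outreach": 8},
    "Ron":      {"research": 3, "operations": 4, "outreach": 6},
}

def total_output(assignment):
    return sum(skill[person][task] for person, task in assignment)

# Brute-force the one-to-one assignments (fine for three people).
best = max(
    (list(zip(people, task_order)) for task_order in permutations(tasks)),
    key=total_output,
)
print(best)
# [('Harry', 'operations'), ('Hermione', 'research'), ('Ron', 'outreach')]
# Hermione ends up on research, where her absolute edge is biggest, even though
# she would also beat the others at operations and outreach.
```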

Comparative advantage is also a concept you can apply over time. This is a little bit different, because people's actions in the past are fixed, so we can't affect those. But you can think of it like this: there are things that might want to be done, and we can do some of them. People in the past did some of them. People in the future might do some of them, and there's a coordination question about what we have a comparative advantage at relative to people in the future. This is why, when I was looking at the longer scenarios - the next-generation and distant cases - I often thought it was better to let people in the future solve the concrete problems. They're gonna be able to see more clearly what actually needs to be solved. Meanwhile, we have a comparative advantage at building the processes, the communities, and the institutions which compound over time, and where getting in early is really helpful.

If you're taking something like this portfolio approach, I think that most projects should normally have at least a main scenario in mind. This forces you to be a little bit more concrete and to check that the things you're thinking of doing actually line up well with the things which are needed in some possible world. I also think you want to be a bit careful about checking that you're not doing anything which would be bad for other scenarios. There's always an opportunity cost. If you're doing something where you're thinking, "I want to help with this short timeline scenario," then you're not doing something else you could've done to help with the next generation in a longer timeline scenario.

You could also have situations where, if AI were imminent, maybe I would think the right thing to do is to run around and say, "Everybody panic. AI is coming in five years. It's definitely coming in five years." If it definitely were coming in five years, maybe that would be the right thing to do. I actually don't think it is. Even if it were, I think that would be a terrible idea, because if it didn't occur in five years, and we were actually in a world where radically transformative AI was coming in 25 years, then in 15 years a lot of people are gonna go, "We've heard that before," and not want to pay attention to it. This is a reason to pay some attention to the whole portfolio that we as a community want to be covering, even if individually most projects should have a main scenario in mind. Maybe as an individual, your whole body of work has a main scenario in mind. It's still worth having an awareness of where other people are coming from, what they're working on, and what we're doing collectively.

I've mostly talked about timelines here. I think that there are some other significant uncertainties about AI. For instance, how much should we be focusing on trying to reduce the chances of catastrophic accidents from powerful AI? Or how much of the risk comes from people abusing powerful technologies? We hypothesized that this was gonna be a radically transformative technology with influence over the future. How much of that influence actually comes through things which are fairly tightly linked to the AI development process? And how much of it appears after AI is developed? If most of the influence comes from what people want in the world after an AI is developed, it might make sense to try to affect people's wants at that point.

In both of these cases, I think we might do something similar to portfolio thinking. We might say, "Well, we've put some weight on each of these possibilities," and then we think about our leverage again. Maybe for some of them we shouldn't split our efforts; for some of them we might. We can't do this with all of the uncertainties - there are a lot of uncertainties about AI.

Here's a slide from another talk. It just lists a lot of questions, a lot of them about how AI might develop. We can't have nuanced views about every one of these questions, and that's fine - we need to do some picking and choosing here. But I do think that we should strive for nuance. The reason is that there's a lot of uncertainty, and we could potentially have extremely nuanced views about a lot of different things. The world is complicated, and we have a moderately limited understanding of it. One of the things which may make us better equipped for the future is trying to reduce the limits on our understanding.

What can individuals do? I think you should consider personal comparative advantage. You can ask yourself, "Could I seriously be a professional researcher in this?" Check with others as well. People vary in their levels of self-confidence, so I actually think that others' opinions can often be a better grounding than our own for this. The skillset that's useful for doing technical safety research is pretty specialized. Most people in the community are not gonna end up with that skillset, and that's fine. They should not be quitting their jobs and going to try to work on safety research. They could be saying, "Well, I want to give money to support this," or they could be aiming at other parts of the portfolio. They could say, "Well, I want to help develop our institutions to build something where we're gonna be better placed to deal with some of the longer timeline scenarios."

You could also diversify around those original assumptions that I made. I think that each of them is pretty likely to be true. But I don't think we should assume that they are all definitely true. We can check whether in fact there are worlds where they're not true that we want to be putting some significant weight on. I think also that helping promote good community epistemics is something we can all play a part in. By this I mean paying attention to why we believe things and communicating our real reasons to people. Sometimes you believe a thing for a reason like: "Well, I read this in a blog post by Carl Shulman, and he's really smart." He might provide some reasons in that blog post, and I might be able to parrot those reasons a little bit. But if the reason I really believe it is that I read that, that's useful to communicate to other people, because then they know what the statements I'm making are grounded in, and it may help them to better see things for themselves and work things out. I also think we often want to pay attention to trying to see the underlying truth for ourselves. Good community epistemics is one of those institutions which I think is helpful for the longer timelines, but it's also helpful for our community over shorter periods. If we want to have a portfolio, we are going to have to coordinate and exchange views on what the important truths are.

What does AI mean for effective altruism? My view is that it isn't the one thing that everyone has to pay attention to, but it is very plausibly a big part of this uncertain world stretching out in front of us. I think that we collectively should be paying attention to that and working out what we can do, so we can help increase the likelihood of good outcomes for the long-term future.