Holden Karnofsky on the most important century

By 80000_Hours @ 2021-08-19T15:17 (+6)

This is a linkpost for #109 - Holden Karnofsky on the most important century. You can listen to the episode on that page, or by subscribing to the '80,000 Hours Podcast' wherever you get podcasts.

In this interview Holden and Rob discuss:

…it doesn’t look like, well, things have been normal for a long time and now all these people are saying it’s about to change.

It looks more like we just live on this rocket ship that took off five seconds ago, and nobody knows where it’s going.

–Holden Karnofsky

Key points

Why Holden wrote this series

Holden Karnofsky: A lot of it was just this feeling of gosh, we’re making really big decisions based on this strong belief that there’s this very important thing that very few people are paying attention to. And if I try to explain it to someone, I can’t point them to anywhere where it’s been clearly written down in one place, and it’s just driving me crazy. I think that some of the motivation was just this unstrategic “Gosh, that’s a weird situation. We should do something about that.” It’s definitely true that also I’m thinking a lot of what’s holding us back is that so few people see the world the way we do, or are looking at the thing we’re looking at when they think about what’s most important. And so maybe having something you could read to see what we think is most important would be really good, which there hasn’t been.

Holden Karnofsky: That’s in terms of personally why I’ve written these posts. I think it’s also good though to situate it in the context of the larger project Open Phil’s been engaging in over the last two years. The longtermist team has been thinking for a while that we are really making very large decisions about large amounts of money and talent on the basis of this hypothesis that different people would put different ways, but I would basically say the hypothesis is that we could be in the most important century of all time. And you could also say well, if there’s only a 0.1% chance that we’re in the most important century, then maybe a lot of the stuff still follows.

Holden Karnofsky: I’m not really sure it does, but I certainly… I think a lot of how we think about it is just, no, there’s a really good chance this is the most important century. Or at least it’s very high up there on the list of important centuries, because we could be looking at the development of some kind of technology, notably AI, that then causes a massive explosion in economic growth and scientific advancement and ends in the kind of civilization that’s able to achieve a very high level of stability and expansion across the galaxy. And so then you’ve got this enormous future civilization that’s much bigger than ours that, if and when that AI is developed that speeds things up, that’s going to be the crucial time for what kind of civilization that is, and what values it has, and who’s in charge of different parts of it.

Holden Karnofsky: That’s a premise that we believe is really likely enough that it’s driving a lot of our decisions, and it felt very unhealthy to be making that bet without basically… All of the reasoning for what we think was based on this informal reasoning, informal conversations, whiteboard-y stuff, Google Docs floating around. And it wasn’t really rigorous. And it wasn’t really in a form that a skeptic could look at and criticize. And so we started this project called worldview investigations, where we were just trying to take the most important aspects of this thing that we believed and write them up, even in a very technical long form, just so we could get a skeptic’s eyes on them and have the skeptic engage with them reasonably. Because it just wasn’t working to go to a random person, say what we believe, and try and work it out in conversation.

Holden Karnofsky: There’s just too much there. It was too hard to advance the hypothesis in the first place. And it was an enormous amount of work. And the worldview investigations team produced these technical reports that I think are phenomenal, and they’re public now… I would recommend that anyone who wants to read something really fascinating read them, but a lot of them are pretty dense, pretty technical, and it’s hard for an interested layperson to understand them. It’s also hard for someone to put all the pieces together. I just talked about a bunch of reports on different topics, and it’s not immediately obvious how they all fit together. That’s where it was just starting to drive me crazy. I was like, the picture is crystallizing in my head. I can point to all these reports, but there’s nowhere that’s just like, all right, here it all is, here’s the argument. And so that’s what the Most Important Century series is.

Key messages of the series

Holden Karnofsky: So there’s this diagram I use over and over again in the series that might be good to insert here, because that’s how I tried to make the whole thing follow-able.

So basically there’s a few key claims made in the series. So one is that eventually we could have this galaxy-spanning civilization that has this high degree of stability and this digital nature that is deeply unfamiliar from today’s perspective. So that’s claim number one. And I think claim number one, I mean, different people have different intuitions, but if you think we have 100,000 years to get to that kind of technology, I think a lot of people would find that pretty plausible. And that already is pretty wild, because that means that we’re among the earliest intelligent life in the galaxy. Claim two is that this could happen much more quickly than you might imagine, because we’re all used to constant growth, but a lot of our history is accelerating growth.

Holden Karnofsky: And if we changed from constant growth to accelerating growth via the ability to duplicate or automate or copy the things humans do to move the economy forward, then 100,000 years could become 10 years, could become 1 year. So that’s claim two. And then claim three is that there’s a specific way that that automation might take place via AI. And that when we try to estimate that and try to forecast it, it looks like all the estimation methods we have and all the best guesses we can make, and they’re so far from perfect, but they do point to this century, and they actually tend to point to a fair amount sooner than the end of the century. And so that’s claim three.

Holden Karnofsky: And so when you put those three together, we’re going to a crazy place. We can actually get there quickly if we have the right kind of tech, and the right kind of tech could be coming this century. And therefore it’s a crazy century. The final piece of the puzzle is just, gosh, that all sounds too crazy. And a lot of the series is just trying to point out that we live in a crazy time, and it’s not too hard to see it, just by looking at charts of economic growth, by looking at timelines of interesting events that have happened in the history of the galaxy and the planet.

All Possible Views About Humanity's Future Are Wild

Holden Karnofsky: One thing that I say in the series is that at the current rate of economic growth, it’s almost impossible to imagine that it could last more than another 8,200 years or something. Which sounds like a lot, but human civilization has only been around for thousands of years. And again, we’re talking about these timescales of billions of years. So is the idea that we’re going to stay at the current level of growth, that we’re going to stop, or we’re going to gradually slow down?

Holden Karnofsky: One way of putting it is if we’re slowing down now and we’re never going to speed up again, then we live at the tail end of the fastest economic growth that will ever be, that we will ever see in millions of years of human existence to date, and maybe billions of years of human existence going forward. There were a couple hundred years of a few percent per year economic growth. That was the craziest time of all time, and that’s the time we live in.
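As a rough back-of-the-envelope check on that “8,200 years” figure, here is a minimal sketch. It assumes roughly 2% annual growth and on the order of 10^70 atoms in our galaxy; both numbers are illustrative assumptions in the spirit of the series, not exact figures from the interview. The question it answers is how long it would take the world economy to grow by a factor of 10^70, i.e. until we would need roughly one of today’s entire world economies of value per atom in the galaxy.

```python
import math

# Illustrative assumptions (rough orders of magnitude, not exact figures
# from the interview): the world economy keeps growing at ~2% per year,
# and our galaxy contains on the order of 10**70 atoms.
growth_rate = 0.02
atoms_in_galaxy = 1e70

# Years until the economy has multiplied by a factor of 10**70 -- i.e. until
# we'd need roughly one of today's entire world economies of value per atom.
years = math.log(atoms_in_galaxy) / math.log(1 + growth_rate)
print(f"{years:,.0f} years")  # about 8,139 -- the same ballpark as the ~8,200 figure above
```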

Holden Karnofsky: So if you believe that we’re eventually going to build this technologically mature civilization… And this is something that does require a bit of explanation and defending, which I do talk about in the series, but the idea is that we could eventually have a civilization that spans the galaxy and that is very long lasting and is digital in nature. So the way we live our lives today could be simulated or put into digital form. That’s something that needs explanation and defense. But if you believe it’s possible eventually that we’ll have this robust digital civilization that’s able to exist in a stable form across the galaxy, if you believe that’ll happen eventually, and if eventually means in 10,000 years or 100,000 years, then yeah, if you make a timeline of the galaxy, it still looks like we’re in the most important pixel.

Holden Karnofsky: Or at least in the pixel where that all happened. In the pixel where we went from this tiny civilization on this one planet of this one star to a civilization that was capable of going across the whole galaxy. And then it’s like, do you think that’s actually possible? And we could talk about that, but one thing is that we are, for the first time in history, as far as we know, we are actually starting to do space travel now.

Holden Karnofsky: And then another thing that can happen is it could turn out that it’s actually just impossible, and we’ll literally never get there, and I’m making this stuff up about a galactic civilization. And in that case we just stay on Earth forever. But I basically think there’s two wild things about that. One is, again, there was this period of scientific and technological advancement and economic growth, and it was like… Maybe it was a few thousand years long, but it was really, really quite a tiny slice of our history, and we’re living in it.

Holden Karnofsky: And two is, I just think it’s like… I don’t know, to just rule out that we would ever have that galaxy-scale civilization to me feels a little weird in some sense. By galactic timeline standards, it’s a few seconds ago we built the first computers and the first spaceships, and you’re saying, “No, we’ll never build a civilization that could span the galaxy.” It just doesn’t… That to me is a weird view in its own way.

“Can’t we just focus on, you know, the real world?”

Holden Karnofsky: I mean, I’ve been there, for sure. My history is that I spent the first part of my career co-founding GiveWell, which is entirely focused on these straightforward ways of helping people in poor countries by improving health and wellbeing, distributing proven interventions like bed nets and deworming pills for children with intestinal parasites. I mean, that’s where I’m coming from. That’s where I started. That’s what motivates me. That’s what I’m interested in.

Holden Karnofsky: One of the things that I say in the description of my blog is that it’s about ‘avant-garde effective altruism.’ So, the analogy for me would be if you hear jazz, you might hear Louis Armstrong and you might think, “That sounds great. I want to get into jazz.” And then if you meet people who’ve spent their entire life listening to jazz and you hear a lot of their favorite music, you’re just going to be like, “What the hell is that? That’s not music. That’s not jazz. What is that? That’s just noise. That’s just someone kind of screeching into a horn or something.” Avant-garde effective altruism has a similar feel for me. I started by saying, “Hey, gosh, people are dying of malaria and a $5 bed net can prevent it.” And I was really interested in using my career to prevent that, but I was greedy about it. Over the years, I’d always be like, “But could we do even better? Is there a way we can help even more people?”

Holden Karnofsky: Well, maybe instead of helping more people, we could help more persons — things that aren’t people, but that we should still care about. Animals are having a terrible time in factory farms, and they’re being treated horribly. What if someday we decide that animals are like us, and we should care about them? Wouldn’t that be horrible? Wouldn’t it be great if we did something about it today? Just pushing and pushing and pushing, and thinking about it. And I think that that is a lot of what your audience likes to do. That’s a lot of what I like to do. A lot of what I am trying to do is bring people along that avant-garde effective altruism route, and say, “If you just keep pushing and pushing, where do you go?” And in my opinion, where you go is… Yeah, of course, it’s wild to talk about digital people living by other stars in weird virtual environments that are designed to do certain things. Of course, it’s weird.

Holden Karnofsky: But if it’s the kind of thing that we think will eventually happen, or could eventually happen, then most of the people we can help are just future people who are digital people. And if you say, “Well I don’t care about them because they’re future people,” I would say, “Gosh, that didn’t sound very good; you may regret saying that. History may not judge you kindly for saying, ‘I don’t care about people that are future people. I don’t care about people that are digital people. They’re digital. I’m made out of cells.’” There’s a lot of philosophical debates to be had here, but I’ve definitely reached the conclusion that it’s at least pretty dicey to say that kind of thing.

Holden Karnofsky: And so, I think you start from, “I want fewer people to die from malaria.” And I think it actually is logical that you get to, “Well, I care about all people. I care about future people. I care about digital people, and I really care what happens to them.” And there are just awful, awful, huge stakes for a huge, huge, huge number of digital people in this thing that could be happening in this century. And that is something that I need to get a grip on, because the stakes are enormous.

Process for Automating Scientific and Technological Advancement

Holden Karnofsky: The basic idea is that if you could imagine an automated digital scientist — or engineer, or entrepreneur — someone who could do all the things a person does to advance science and technology, and then you imagine that digital person could be copied and could just work in this digital sped-up advanced form. If you just imagine that, then you can pretty fairly easily get to a conclusion that you would see a massive, crazy explosion in the rate of scientific and technological advancement. And at that point you might start thinking something like anything that is scientifically and technologically possible, we will get fairly soon. A lot of my argument is that it’s not too hard to imagine that really, really wild stuff could happen in the next 100,000 years. Stuff about building stable digital-based civilizations that go across the galaxy. Not too hard to imagine that. The interesting thing is that if we get the right sort of meta technology or the right automated process, then 100,000 years, as you intuitively think of it, could become 10 years.

Holden Karnofsky: So fundamentally, as far as we can tell, it’s sure it looks like this is happening. The way that science and technological advancement is happening right now is that there are these people with brains and the brains are like… They’re pretty small. There’s a lot of them. They’re built out of not very expensive materials in some sort of sense. You could think of a brain as being made out of food or something. There’s no incredibly expensive process that needs to be done to create a brain. And these brains are doing the work. They’re doing it. So why can’t we build something, could be anything, but I would guess a computer, that could… Whatever it is our brain is doing, why couldn’t we build something that did that?

Holden Karnofsky: And of course that’s a lot harder than building something that plays chess. It raises new challenges with how do you train that thing? How do you define success? And probably it has to be a lot more powerful than these computers that play chess. Because something that some people don’t know is that actually today’s most powerful computers, based on estimates such as the one that Joe Carlsmith did for Open Philanthropy… It’s very rare to see a computer that’s even within range of having the computational power of a human brain right now. So it’s like, sure, to do these hard things that humans do, we’re going to need something that’s a lot more powerful than what we have now, does more computations probably, and we’re going to need creativity and ingenuity to figure out how to train it. But fundamentally, we have an existence proof, we have these brains, and there’s a lot of them, and why can’t we build something that fundamentally accomplishes the same thing they’re accomplishing?
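For a sense of the numbers behind this point, here is a minimal sketch. It assumes the range often quoted from Joe Carlsmith’s report, roughly 10^13 to 10^17 floating-point operations per second (FLOP/s) to match the human brain, with a central figure around 10^15 FLOP/s; those figures, and the example machine below, are illustrative assumptions rather than claims taken from the transcript.

```python
# Rough, illustrative numbers only: Joe Carlsmith's Open Phil report is often
# summarized as estimating that matching the human brain might take on the
# order of 1e13 to 1e17 FLOP/s, with a central figure around 1e15 FLOP/s.
BRAIN_FLOPS_LOW = 1e13
BRAIN_FLOPS_HIGH = 1e17

def compare_to_brain(machine_flops):
    """Describe where a machine's FLOP/s falls relative to the assumed brain range."""
    if machine_flops < BRAIN_FLOPS_LOW:
        return "below even the low-end brain estimate"
    if machine_flops > BRAIN_FLOPS_HIGH:
        return "above even the high-end brain estimate"
    return "within the estimated range for the human brain"

# Hypothetical machine at 1e14 FLOP/s -- substitute a real machine's spec to compare.
print(compare_to_brain(1e14))
```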

Holden Karnofsky: Humans somehow learn how to do science. It’s not something that we’ve been doing for most of our history, but somehow we learn it in the space of a human lifetime, learn it pretty quickly. So if we could build something else that’s able to learn how to do the same thing, whether it’s in the same way or not, you could imagine building an AI that’s able to watch a training video and learn as much from it as a human does, as measured by its answers to some test.

Holden Karnofsky: And then you start to ask, so what technologies could we develop? And it’s like… There’s two answers. One answer is like, “Oh my God, I have no idea.” And like wow, maybe that’s enough. Maybe we should just say if we could develop that kind of system this century, then we should think of this as the most important century, or one of the most important centuries. We should just be freaking out about this possibility, because I have lost the script. Once we’ve got the ability to automate science, to get to where we might be going in 100,000 years, but to get there in 10 years, 1 year, gosh, we should just really worry about that, and that should be what we’re spending our time and energy on, what could happen there.

Digital People Would Be An Even Bigger Deal

Holden Karnofsky: The basic idea of a digital person is like a digital simulation of a person. It’s really like if you just take one of these video games, like The Sims, or… I use the example of a football game because I was able to get these different pictures of this football player, Jerry Rice, because every year they put out a new Madden video game. So, Jerry Rice looks a little more realistic every year. You have these video game simulations of people, and if you just imagine it getting more and more realistic until you have a perfect simulation… Imagine a video game that has a character called Holden, and just does everything exactly how Holden would in response to whatever happens. That’s it. That’s what a digital person is. So, it’s a fairly simple idea. In some ways it’s a very far-out extrapolation of stuff we’re already doing, which is we’re already simulating these characters…

Holden Karnofsky: A lot of people have the intuition that well, even if digital people were able to act just like real people, they wouldn’t count morally the same way. They wouldn’t have feelings. They wouldn’t have experiences. They wouldn’t be conscious. We shouldn’t care about them. And that’s an intuition that I disagree with. It’s not a huge focus of the series, but I do write about it. My understanding from… I think basically if you dig all the way into philosophy of mind and think about what consciousness is, this is something we’re all very confused about. No one has the answer to that. But I think in general, there isn’t a great reason to think that whatever consciousness is, it crucially relies on being made out of neurons instead of being made out of microchips or whatever.

Holden Karnofsky: And one way of thinking about this is, I think I’m conscious. Why do I think that? Is the fact that I think I’m conscious, is that connected to the actual truth of me being conscious? Because the thing that makes me think I’m conscious has nothing to do with whether my brain is made out of neurons. If you made a digital copy of me and you said, “Hey, Holden, are you conscious?” That thing would say, “Yes, of course, I am,” for the same exact reason I’m doing it. It would be processing all the same information. It’d be considering all the same evidence, and it would say yes. There’s this intuition that whatever consciousness is, if we believe it’s what’s causing us to think we’re conscious, then it seems like it’s something about the software our brain is running, or the algorithm it’s doing, or the information it’s processing. It’s not something about the material the brain is made of. Because if you change that material, you wouldn’t get different answers. You wouldn’t get different beliefs.

Holden Karnofsky: That’s the intuition. There’s a thought experiment that’s interesting that I got from David Chalmers, where you imagine that if you took your brain and you just replaced one neuron with a digital signal transmitter that just fired in all the same exact ways, you wouldn’t notice anything changing. You couldn’t notice anything changing, because your brain would be doing all the same things, and you’d be reaching all the same conclusions. You’d be having all the same thoughts. Now, if you replaced another one, you wouldn’t notice anything, and if you replaced them all, you wouldn’t notice anything…

Holden Karnofsky: I think it is the better bet that if we had digital people that were acting just like us, and the digital brains were doing the same thing as our brains, that we should care about them. We should think of them as people, and we probably would. Even if they weren’t conscious — we’d be friends with them. We’d talk to them and we would relate to them. There are people I’ve never met, and they would just be like any other people I’ve never met, but I could have video calls with them and phone calls with them. And so, we probably will and should care about what happens to them. And even if we don’t, it only changes some of the conclusions. But I basically think that digital people would be people too.

Transformative AI Timelines

Holden Karnofsky: [A common intuition that people have is that] we have AI systems that can do the low-paying jobs. Then they can do the medium-paying jobs, then they can do the high-paying jobs. And it’s like, “Gosh, that would be a really polite way for AI to develop.” They can just get right into our economy’s valuations on people and our opinions of what kind of work is valuable. And I think when people talk about unemployment, they’re just assuming.

Holden Karnofsky: They’re just like, “Well, the people right now who aren’t paid very much, those are going to be all the people who are unemployed. And we’ll have to wait for the AI to catch up to the people who are paid a lot right now.” And a lot of what I wanted to point out is just, we don’t know how this is going to go, and how this goes could be a lot more sudden. So a lot of the ones you said, where it’s just going to be like, “Alright, now it’s an ant.” How would we even know that? In my opinion, it could already be at ant-level intelligence, because we don’t have the hardware.

Holden Karnofsky: We can’t build things that can do what ants do in the physical world. And we wouldn’t particularly want to, so it’s just hard to know if you’re looking at an ant brain-level AI or a honeybee brain-level AI or a mouse brain-level AI. We’ve tried a little bit to compare what AIs can do to what very simple animals can do. There’s a report by Guille Costa on trying to compare AIs to honeybees on learning from a few examples, but it’s really all inconclusive stuff. And that’s the whole point, is it might just happen in a way that’s surprising and quick and weird, where the jump from chimp brain to human brain could be a small jump, but could be a really big deal.

Holden Karnofsky: So anyway, if I had to guess one [thing to use as a reference point], I would go with we don’t yet have an AI that could probably do whatever a human could do in one second. But I would imagine that, once we’re training human-sized models, which we’re not yet, that’d be the thing you might expect to see us getting closer to. And then you might get closer to things that a human can do in 10 seconds, or 100 seconds. And I think that where that would put us now is we’re just not at human level yet. And so you just wouldn’t be able to make much of what you see yet, except maybe to make lower animal comparisons, or simpler animal comparisons.

Holden Karnofsky: Just to be clear, I’m definitely not saying it’s going to happen overnight. That’s not the point I’m trying to make. So I think before we have this super transformative AI that could automate science or whatever, we’ll probably be noticing that other crazy stuff is happening and that AIs are getting more and more capable and economically relevant. I don’t think there’s going to be no warning. I don’t think it’s going to be overnight, although I can’t totally rule that out either. But what I do think is that it might simultaneously be the case that it’s too early to really feel the trend today, and that a few decades could be plenty. And one way of thinking about that is that the whole field of AI is only a few decades old.

Holden Karnofsky: It’s only like 64 years old, as of this recording. And so if you imagine… And we’ve gone from these computers that could barely do anything to these image recognition models, these audio recognition models that can compete with humans in a lot of things, at least in an experimental laboratory-type setting. And so an analogy that I use at one point is the COVID pandemic. Where it’s like, it wasn’t like it happened completely overnight, but there was an early phase where you could start to see it coming. You could start to see the important trends, but there weren’t any school closures, there weren’t any full hospitals. And that’s, I think, maybe where we are right now with AI. Where you can start to think about the trends, and you can start to see where it’s all going. You haven’t started to feel it yet, but just because you haven’t started to feel it yet… I mean, a few decades is a long time.

Articles, books, and other media discussed in the show

Holden’s blog

Open positions

Open Phil reports

80,000 Hours Podcast episodes

Other links

Transcript

Rob’s intro [00:00:00]

Rob Wiblin: Hi listeners, this is the 80,000 Hours Podcast, where we have unusually in-depth conversations about the world’s most pressing problems, what you can do to solve them, and why meeting a billionaire is a great cover story for a date. I’m Rob Wiblin, Head of Research at 80,000 Hours.

For most of you Holden Karnofsky won’t need any introduction, because he’s a — or maybe the — driving force behind GiveWell and Open Philanthropy.

He was also a guest back in 2018 for episode 21, explaining one effective altruist approach to doing the most good through philanthropy.

He has recently been publishing a series of essays on his blog Cold Takes, which lay out the worldview that he and a significant part of Open Philanthropy are now operating from as they try to disburse billions of dollars.

It’s an outstanding series and naturally made an outstanding episode as well.

I’m passionate about this topic in part because it’s frustrating to keep hearing that for people who want to do a lot of good, longtermism or the things that longtermists are doing are weird or in some way violate common sense. It seems like many people repeat that without having properly thought it through.

To me, when you zoom out and look carefully at the broad situation that humanity is actually in, then the topics and projects that longtermists are excited about make a lot more sense — both common and otherwise — than the alternatives that are often proposed.

There’s simply no need to start out defending this work on the back foot, because the prima facie common-sense case for longtermism is as strong as what anyone else has.

I don’t think that’s Holden’s view, but I hope that if you listen to this episode or read his blog posts you’ll understand where I’m coming from when I say that.

One quick notice is that we’re currently hiring a new Head of Marketing to help more people who would be interested in this show, or our online articles, or our one-on-one advising, find out that they exist. If you’d like more people to think about improving the world using the mentality we demonstrate on this show, and you might want to work at 80,000 Hours, then click through the link to the job ad on the blog post associated with this episode.

Finally, in the interest of full disclosure, Open Philanthropy is one of 80,000 Hours’ biggest funders.

We’re releasing this episode in two parts, so without further ado, here’s the first half of my conversation with Holden Karnofsky.

The interview begins [00:02:22]

Rob Wiblin: Today, I’m speaking with Holden Karnofsky. In 2007, Holden co-founded the charity evaluator GiveWell. In 2014, he co-founded the foundation Open Philanthropy, where he now leads their longtermist grantmaking. Open Phil, as it’s often called, works to find the highest impact grant opportunities, and has so far recommended over $1 billion in grants. He also recently started a blog called Cold Takes, where he hopes to share his personal ideas about futurism, quantitative macrohistory, and applied epistemology, among many other topics. Thanks for returning to the podcast, Holden.

Holden Karnofsky: It’s great to be back.

Rob Wiblin: I hope we’re going to get to talk about your views on career choice and your take on how this century might be particularly exceptional. But first, what are a couple of important updates from Open Philanthropy over the last year or two?

Holden Karnofsky: Well, I’ll give you a big one, which is that very recently we announced that Alexander Berger was promoted to co-CEO, and so Open Philanthropy is now a two-headed organization. And the reason we did that is because there’s been this distinction that’s been growing in importance at Open Philanthropy, between two kinds of frameworks for the question of how do I do the most good possible with my money? And that’s generally the mission of Open Phil, is to take all the money that our donors are working with and spend it in a way that’s going to do as much good as possible, whatever that means. And it ends up meaning slightly different things. We’ve written about this before in the context of things like worldview investigation and dividing causes into categories, but it just increasingly feels like there’s two big ways of thinking about how to do good in the world.

Holden Karnofsky: And one of them is something that I think you’ve talked about a lot on your podcast, which is this idea of longtermism. And so you say, “We want to help the most persons possible, and there’s an overwhelmingly large number of persons in the future. And so whatever we can do that can make the whole future go better, that is the best way to do the most good possible.” And so what you end up doing is you’re measuring every grant and everything you’re doing not in terms of some measure like how many years of healthy life did we add for someone, you’re just measuring it in terms of the expected goodness of the whole future. Or more proximately, the probability that humanity gets through some difficult period of big risk and gets a good outcome from some sort of big technological transition.

Holden Karnofsky: And so it’s this kind of… I think it’s Nick Bostrom who called it “maximize the probability of things going okay.” And that just, that tends to be the metric you’re using. And the other mentality, and the other way of doing good, is just the more traditional common sense way of doing good, which is to say, “It’s not all rolled into one bet. It’s not all about the long-run future. We’re going to use an unusual quantitative, analytic mindset to do things that would be a little bit more recognizable as nice things to be doing for people.” And so global health and wellbeing tends to measure their work in things like disability-adjusted life years (DALYs). That’s like how many years of healthy life did we allow people to have more of? Did we avert unnecessary deaths? Did we cause the world to be richer? Did we reduce suffering? Did we increase happiness? That kind of thing, global health and wellbeing.

Holden Karnofsky: And they’re just two very different mentalities. And there’s a lot of things that are different between the two of them. And Alexander has been increasingly just taking charge of the global health and wellbeing side. I think he has a really great framework in his head. I think this podcast is coming shortly after a podcast with him, so I don’t need to talk about him a lot, but he’s been phenomenal. And there’s a couple reasons we made this change. One is that I just thought he was the right person to be officially in charge of global health and wellbeing. He’d been unofficially in charge; I wanted that to be recognized.

Holden Karnofsky: The other thing is that I wanted to be able to personally just really focus. I really believe it’s good when you can focus on one kind of problem. And I think longtermism is a very different kind of problem, and so I wanted to get my head all the way into the space of just this longtermist framework and spend my time that way. And so I’m excited about getting to do that. And that’s related to me sort of recently becoming a little more solid in my feeling that this century could be the most important century, and I really wanted to focus on the implications of that.

Rob Wiblin: I guess Open Phil has been growing in terms of its headcount and its aspirations for how much money to give away. Had it just become unmanageable in your head to keep track of this AI stuff, all the longtermist work, as well as the animal thing? I imagine a bunch of it was getting neglected perhaps.

Holden Karnofsky: I think you can always use an org chart. You can always make this stuff work. The funny thing is Open Phil is going to grow staff and is going to grow its giving, but in the near term at least a lot more of that growth is going to come on the global health and wellbeing side. I think Alexander is going to end up probably running a bigger team than the whole organization is now, and we’re going to have to make that work. But I think there is a big difference, which is that with global health and wellbeing, you’re doing work that’s a little bit more recognizable as charity work or as philanthropy. And so you have a lot of grantees and you can deploy a huge amount of capital and that creates a need to hire a lot of people.

Holden Karnofsky: And the longtermist side, I’ve just come to feel that money is not the biggest bottleneck right now, and a lot of what we need to do is we need to be active grantmakers. We need to have a really good vision of how to get that money used more and used better. And I think a lot of the blocker on longtermism right now is the headspace we’re living in. The view we’ve got of what’s going on in the world and what’s most important for the long run is just not held by a lot of people. And when you don’t really have your head in that space and when you don’t really have that picture, it’s hard, I think, to be a grantee that we’re very excited to fund. So they just feel like really different missions.

Holden Karnofsky: It actually feels like the longtermist team right now should be small, nimble, experimental, improvisational and just understanding and owning that we don’t really know what we’re doing. We’re this early stage organization that is figuring itself out because we don’t have a giant pile of giving opportunities to sort through and write the checks to. Global health and wellbeing is going to have to operate at scale, much bigger scale than we’ve been at.

Rob Wiblin: Interesting.

Holden Karnofsky: And I’m just a person who historically… I like to work on really early-stage things where we don’t know what we’re doing, and I like to build them into things that are more mature and that can scale. But I think there’s usually a better person to take it from there.

Rob Wiblin: It’s very interesting that you’re saying that. I guess you’ve spent years now trying to increase the amount of giving you’re doing within the longtermist portfolio, and just concluded that to some extent you have to drum up business a bit, or bring in new people who have what it takes to start projects rather than just give money to people who are there already. I guess it starts to seem almost like building a team, trying to find people who are good enough to hire, in a loose way, to do projects that you think are valuable.

Holden Karnofsky: I think that’s a reasonable analogy, and I think that’s basically where we are. We definitely give away tens of millions of dollars for longtermist causes. There’s a lot of really great people out there that we fund. But there’s a lot more capital available than that. I definitely am not saying we’re funding all the good people, because in order to fund all the good people, we’d have to have this incredible searching ability, and recognition and evaluation ability. It’s not that, oh, we got everyone. It’s more like, oh, we could spend a lot of time making sure we got every single person. But even then, I would be surprised… There’s no way that would double the amount of money we’re giving away every year. We just have to look in a different direction.

Holden’s background [00:09:47]

Rob Wiblin: Alright, we’ll come back to that topic later on. Just first off, I wanted to hear a little bit more about your background. One story that I haven’t heard is, you were at GiveWell for a number of years and then things changed quite a lot when Cari Tuna and Dustin Moskovitz, who had significant aspirations to give away billions — possibly I guess tens of billions — of dollars over the course of their lives, they got in touch and were interested to hear your thoughts on how they could do that more effectively. How did you feel personally when they emailed or called you? What’s the background there?

Holden Karnofsky: The story of how the first meeting happened is kinda funny. What happened was that we got an email, I think, from a mutual friend, connecting us to Dustin and Cari and he was just like, “Hey, do you want to meet and talk about giving?” I don’t remember exactly what he said. And I looked him up and I was like, huh, he’s the world’s youngest billionaire. And then somehow, I don’t know how, and I don’t have any defense of this, but Elie and I were just like, “This just doesn’t seem very high priority.” I don’t know why. I don’t know how we reached that conclusion. I’m trying to think through the logic in my head…

Rob Wiblin: You had a lot on that week.

Holden Karnofsky: Yeah, I don’t know what was going on. And we were just like, we should definitely take this meeting. Next time someone’s in California we should definitely take this meeting, but it’s like, this isn’t the kind of thing we would rush for, because it just… I don’t know, we had some vision of what kind of person would be interested in GiveWell, and we didn’t think it would be them, or that they would be a really good match with what we were doing. I don’t know. But then at the same time, I really wanted to go to San Francisco because there was this girl who had recently become single, and I wanted to go on a date with her. And I had a dilemma, which was I didn’t want to just show up in San Francisco being like, “Hey, I came out to go on a date.” I just, I don’t know. I just thought that would be weird.

Rob Wiblin: Little bit too keen.

Holden Karnofsky: And so Elie and I made an agreement, which was that I paid for the plane ticket to go out to San Francisco to meet Cari and Dustin, but then I got to officially say, if anyone asked me, that I was there to meet Cari and Dustin, not to go on a date with this girl. That was what Cari and Dustin meant to me at the time — just a cover story. And then we had this wonderful meeting. They were just, I almost never hear someone say this when they’re a big philanthropist, they were just like, “We just want to do the most good possible.” And I asked all the questions you’re supposed to ask, like “What do you care about?” And, “What do you want to do?” And they were like, “…the most good possible.”

Holden Karnofsky: It was this amazing meeting. And I shared a lot of my early opinions with them. And then after that, Cari followed up and said that she was really excited to work together closely, and they offered an initial donation. And it was… It really got off to a very fast start. And we were very excited, and we were talking about it and saying, “GiveWell’s cool, but I don’t know if it’s the best product for someone like them. Why don’t we try this thing GiveWell Labs.” It started as just trying to give away $1 million as well as we could with no special rules or criteria, like GiveWell had. And so it was awesome, but it just… The original thinking, I don’t know, it was a cover story. But the girl that I was going on a date with, now we’re married. And by the time this podcast comes out, we’ll have just had our first kid. So that also was good. It was a good trip.

Rob Wiblin: Huge weekend, wow. Nice. Did you find it intimidating at all? I suppose it sounds like you didn’t really appreciate how important of a meeting this could potentially be. You were going in unbothered.

Holden Karnofsky: Unbothered is a good word for it, yes. Unbothered is a good word for pretty much how I would interact with anyone in those days, and still now, I would say. And it’s not necessarily… I wouldn’t say that’s a good idea in every case, but I think it was good in that case.

Tips for managing high-stakes relationships [00:13:19]

Rob Wiblin: You might be the wrong person to ask this to, but over the years you’ve had to manage some pretty important relationships, where how well these relationships go influences a lot how your life goes, or at least how your career goes. And I know a bunch of people out there whose work has this property, that a big thing that they spend their time doing is figuring out how to manage this relationship with someone who’s much richer or much more significant or influential than them, and they’ve got to make sure they don’t annoy them. And I think many of them find that quite anxiety-inducing. And when I’ve had to do that in the past, I’ve found it incredibly anxiety-inducing as well. Do you have any tips for people who find themselves in that sort of position?

Holden Karnofsky: I have tips that’ll work for some people and not others, or something. Or maybe I’m just the wrong person to ask. For me, I’ve just always been, rightly or wrongly, and maybe wrongly in some sense, but just not very attentive to that stuff. And it makes me really unhappy when people treat me that way. People who, let’s say, report up to me, or who are grantees of Open Philanthropy, and they approach me as this very scary figure you have to be very careful with. That makes me really unhappy; I don’t like it. And I’ve never really treated people who I’m really trying to build a relationship with that way either. It’s just been a little bit more a peer attitude, or a friend-making attitude, which is this very naive, be yourself, alright well if this person doesn’t like me, then I should move on and find another person to work with.

Holden Karnofsky: I think there are definitely cases where that can work incredibly well, and there’s probably cases where it backfires hugely. It depends what kind of person you’re dealing with. An advantage of it in my case is it’s got a selection effect. Whatever you’re doing, if you have a style that’s very consistent, then you’re going to bounce people who don’t work with it and you’re going to attract people who do. And so that in many ways is better than having these relationships that are hanging by a thread.

Rob Wiblin: Where you constantly have to be somewhat insincere or put on an act. I guess it’s exhausting and it might well not work anyway.

Holden Karnofsky: Exactly. And look, there’s a lot of good to be done that way too, especially when the relationships are high stakes enough. But actually, there’s a fair number of billionaires out there. It’s not infinite, but it’d be different if you’re dealing with a head of state, where there’s only one. There’s a fair number of billionaires out there. I think it made some sense to just… I think if we had somehow found a way to give one exactly what they wanted, it would have been a much more fragile partnership. And the thing is that GiveWell always grew up in this world of charities and philanthropy advising that was all about the donor’s needs, and all about making the donor happy, and all about catering to the donor. And as donors, we didn’t like it. We were trying to give away just a few thousand dollars as hedge fund people in our early twenties. And we didn’t like the way that every conversation turned to what we wanted. We were trying to figure out how much good they were doing.

Holden Karnofsky: And so GiveWell was always just this zig when everyone else zags kind of thing. And so our sales pitch was like, “We don’t care about you.” This backward sales pitch that’s just… Every other philanthropic advisor, their whole mission is to say, “What do you care about?” And GiveWell would show up and be like, “We don’t care about what you care about. We’re going to help other people as well as we can. If you like that, you can give here. And if you don’t, that’s too bad, because we’re not doing custom research.” That was always the attitude. When it’s unique and when there’s nothing else like it, you can form a more robust relationship if there is someone who likes that, than by trying to conform yourself exactly to someone. That does not mean that’s the advice I’d give everyone. But I think it’s a shame to miss out on it when it would have worked, is the thing I would say.

Rob Wiblin: You mentioned that people sometimes have a deferential attitude now. The thing is Open Phil is kind of a big deal, and you’re kind of a big deal. Do you have any fond memories of the kind of things that you were able to do with GiveWell and Elie Hassenfeld just running a scrappy organization back in 2007 or 2011, the kind of thing that you couldn’t, or just wouldn’t, do now that things are more professionalized?

Holden Karnofsky: Most of the tangible stuff I can think of that’s different seems better now. The way that I used to spend my time, I think I spent a whole week trying to unravel our finances or something at one point. Most of the tangible stuff I think is just better now. We’re able to do so much more. There’s a lot of great people. The things that come to my head, well, we moved all of GiveWell to India for four months. That’s not a thing we would really be able to do today, although with all the remote work, maybe.

Holden Karnofsky: But that was a cool thing to be able to do. I liked having a blog where I was just able to rant. I would write a post in an hour and just put it up. That was fun for me. I don’t know if that was actually a good idea. And I think that the new blog I’m going to have is probably closer to that than other stuff I’ve done in a long time, but it’s still not going to be like that. That was just fun. But again, a lot of this stuff, it’s… I don’t know. It was fun, but was it actually a good idea? Not really. And I think most of the things that have changed have changed for the better, in terms of how good they are. I have said that I just don’t enjoy heavily hierarchical interactions. In terms of how it feels, I like to have peers more than those standard report relationships when I can. And generally in practice right now I can’t.

Rob Wiblin: That is a shame, but I suppose actually with colleagues you’ve had for a long time potentially, or people who you’ve been managing for many years, maybe it does become more of a peer relationship gradually? Or they begin to trust that you actually do really mean that?

Holden Karnofsky: Yeah. It’s more like that. It moves in that direction.

The most important century [00:18:41]

Rob Wiblin: Alright. Let’s push on to our first big topic for the day, which is this series of blog posts that you’ve been working on and are now gradually releasing, called the Most Important Century, which is going on your Cold Takes blog, which we’ll link to. I’ve been lucky enough to get a bit of a preview of them. Basically, it explains why you think the world could change really dramatically in the next 100 years in a way that would make it likely to be one of the most important times in the entire history of the accessible universe. And I guess I hadn’t quite connected that it sounds like one motivation for writing this is that you think in order to find more good grantmaking opportunities in longtermism, you need more people who share your worldview, and you realize that you haven’t actually explained your worldview in public for a very long time, so now you’re getting it down. Are there any other motivations for working on this series?

Holden Karnofsky: I’m not going to say it’s the most strategic thing in the world. I did an awful lot of it on personal time. A lot of it was just this feeling of gosh, we’re making really big decisions based on this strong belief that there’s this very important thing that very few people are paying attention to. And if I try to explain it to someone, I can’t point them to anywhere where it’s been clearly written down in one place, and it’s just driving me crazy. I think that some of the motivation was just this unstrategic “Gosh, that’s a weird situation. We should do something about that.” It’s definitely true that also I’m thinking a lot of what’s holding us back is that so few people see the world the way we do, or are looking at the thing we’re looking at when they think about what’s most important. And so maybe having something you could read to see what we think is most important would be really good, which there hasn’t been.

Holden Karnofsky: That’s in terms of personally why I’ve written these posts. I think it’s also good though to situate it in the context of the larger project Open Phil’s been engaging in over the last two years. The longtermist team has been thinking for a while that we are really making very large decisions about large amounts of money and talent on the basis of this hypothesis that different people would put different ways, but I would basically say the hypothesis is that we could be in the most important century of all time. And you could also say well, if there’s only a 0.1% chance that we’re in the most important century, then maybe a lot of the stuff still follows.

Holden Karnofsky: I’m not really sure it does, but I certainly… I think a lot of how we think about it is just, no, there’s a really good chance this is the most important century. Or at least it’s very high up there on the list of important centuries, because we could be looking at the development of some kind of technology, notably AI, that then causes a massive explosion in economic growth and scientific advancement and ends in the kind of civilization that’s able to achieve a very high level of stability and expansion across the galaxy. And so then you’ve got this enormous future civilization that’s much bigger than ours that, if and when that AI is developed that speeds things up, that’s going to be the crucial time for what kind of civilization that is, and what values it has, and who’s in charge of different parts of it.

Holden Karnofsky: That’s a premise that we believe is really likely enough that it’s driving a lot of our decisions, and it felt very unhealthy to be making that bet without basically… All of the reasoning for what we think was based on this informal reasoning, informal conversations, whiteboard-y stuff, Google Docs floating around. And it wasn’t really rigorous. And it wasn’t really in a form that a skeptic could look at and criticize. And so we started this project called worldview investigations, where we were just trying to take the most important aspects of this thing that we believed and write them up, even in a very technical long form, just so we could get a skeptic’s eyes on them and have the skeptic engage with them reasonably. Because it just wasn’t working to go to a random person, say what we believe, and try and work it out in conversation.

Holden Karnofsky: There’s just too much there. It was too hard to advance the hypothesis in the first place. And it was an enormous amount of work. And the worldview investigations team produced these technical reports that I think are phenomenal, and they’re public now. There’s a report by Joe Carlsmith about the amount of computation that you would need to match what the human brain does: New Report on How Much Computational Power it Takes to Match the Human Brain. About what’s a good guess at that. Because that’s an important part of the picture of when you expect to start getting in the ballpark of being able to build AI that’s able to massively speed up scientific and technological advancement. And then Ajeya Cotra, who you’ve talked to, wrote this report on biological anchors: Forecasting TAI with Biological Anchors. That one’s trying to estimate when we would get a very advanced AI that could speed up scientific and technological advancement. When we would get that based on these analogies to human and animal brains.

Holden Karnofsky: And then Tom Davidson wrote a couple of reports. One of them was Could Advanced AI Drive Explosive Economic Growth? Another was Semi-Informative Priors Over AI Timelines, about what should we believe, just based on how much effort has been put in and how much effort will be put in in the future? David Roodman wrote a really cool report called Modeling the Human Trajectory that’s also about explosive economic growth, but asks this question, if you just take all of human economic history and draw the line on a chart and try to project it out in a smart way, where does it go? And his answer was that it goes to infinity this century.
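As a toy illustration of how that kind of projection can “go to infinity” in finite time, here is a minimal sketch; it is not Roodman’s actual model or his fitted parameters. The idea: with ordinary exponential growth, every doubling of the economy takes the same number of years, but if the growth rate itself rises with the size of the economy, each successive doubling takes less time, and the total time to a singularity is finite.

```python
import math

# Toy illustration only -- not David Roodman's model or his fitted parameters.
# Growth law: dy/dt = a * y**(1 + eps). eps = 0 is ordinary exponential growth;
# eps > 0 means the growth rate itself rises as the economy gets bigger.

def doubling_times(y0, a, eps, n_doublings):
    """Analytic time taken by each successive doubling of y."""
    times = []
    y = y0
    for _ in range(n_doublings):
        if eps == 0:
            t = math.log(2) / a  # constant doubling time
        else:
            # time = integral of dy / (a * y**(1 + eps)) from y to 2*y
            t = (y ** (-eps) - (2 * y) ** (-eps)) / (a * eps)
        times.append(t)
        y *= 2
    return times

# Exponential growth: every doubling takes the same ~34.7 years.
print([round(t, 1) for t in doubling_times(1.0, 0.02, 0.0, 5)])
# Superexponential growth: doublings keep getting faster, and their total time
# converges -- output reaches a singularity after finitely many years.
print([round(t, 1) for t in doubling_times(1.0, 0.02, 0.2, 5)])
```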

Holden Karnofsky: Really cool reports. I would recommend that anyone who wants to read something really fascinating read them, but a lot of them are pretty dense, pretty technical, and it’s hard for an interested layperson to understand them. It’s also hard for someone to put all the pieces together. I just talked about a bunch of reports on different topics, and it’s not immediately obvious how they all fit together. That’s where it was just starting to drive me crazy. I was like, the picture is crystallizing in my head. I can point to all these reports, but there’s nowhere that’s just like, all right, here it all is, here’s the argument. And so that’s what the Most Important Century series is.

Rob Wiblin: Nice.

How this story is different [00:24:34]

Rob Wiblin: How is the general story that you’re laying out in this series different from stories that people might’ve heard before, in books like Superintelligence or other predictions about how the future might go?

Holden Karnofsky: Yeah sure. There’s definitely points of commonality. I think a huge amount of my series is… So a book like Superintelligence, it’s about what could happen eventually. And I think the treatment of when it could happen, and whether it could be soon, is shorter. And so it’s like we could build an AI that is misaligned. And if we do, we have a huge problem for humanity. And so a lot of my series is just arguing that this is the century to expect it, or at least the chances are quite high.

Holden Karnofsky: And a ton of it is about trying to lower the burden of proof, address all the feelings that it would be too crazy for it to happen soon. And then just do a lot of detailed analysis that, a lot of it is from Open Philanthropy technical reports, about just like, how do you actually estimate when that’s happening? And I think there are some quick arguments given in other sources, but I think it is more focused on that.

Holden Karnofsky: And then I think in terms of the overall vibe, I think in addition to the sort of urgency of like, “Well, this is now, and this is a special time.” I think there’s an additional thing, which is that a lot of other sources, and I think a lot of other EA arguments, are very focused on the question, do we go extinct or not? And certainly extinction, or rather existential risk, certainly that would be very bad. But I think when you just consider, this could be the most important century, this is when we could determine the shape of a future, stable, galaxy-spanning civilization, that does raise a lot of other questions too. And it starts making you think, “Gosh, we’re going to build something.” Or, “We could build something this century, and this is our chance to shape what it is, not just to determine if it’s okay or not.” And so that is an area where I think we could use more thought.

Holden Karnofsky: We could use more discussion of, if we’re going to build a big world sooner than we think, what kind of world do we want that to be? And how can we get that to happen? Those are some differences. But the biggest thing is really laying out a case and being very argumentative and very… Looking at all angles of the ‘when.’

All Possible Views About Humanity’s Future Are Wild [00:26:42]

Rob Wiblin: One of the first points that you make is that people often respond to the claim that this is the most important century and that lots of things can be transformed by thinking, “This is a wild claim, can’t I just believe — and shouldn’t I believe — something that’s not wild, something that’s more sensible and not weird?” But you think that no matter how you slice it, we almost necessarily find ourselves at a weird time in history. Why is that?

Holden Karnofsky: So first I just want to acknowledge that that’s exactly where I’ve been coming from for years. I would meet people and they would say, “Holden, why are you doing all this stuff that helps make low income people’s lives better? What you should really be doing is working on the AI alignment problem, because we could all go extinct if we have the wrong AI. And if we have the right AI, we could get this wonderful civilization that stretches across the galaxy.” And I would say, “I feel like there’s a burden of argumentation here. We’re funding organizations, giving out bed nets that have been shown in randomized controlled trials to improve people’s health, and you’re telling me it would be a better use of time to worry about this thing that would be the most important event in all of history, happening pretty much in my lifetime? Or maybe you think we should be preparing hundreds of years in advance, but that’s got its own issues.”

Holden Karnofsky: It was hard to articulate exactly what the problem was, but now having thought about it more, I would say that the problem was that this is, in some sense, a big claim that at first look seems to need a lot of evidence. And one way to put it — I think this has been said in a piece by William MacAskill, among other places — is, so now I’ve formulated this hypothesis. Different people would put it in different ways. I formulated it as the most important century. It might not be literally the most important, it might just be so high up there that you should really pay attention, but let’s say it’s the most important. And it’s, well, there are a lot of centuries. And especially if you think we’re going to get this galactic civilization, there are a lot of centuries. The initial position of saying this is the most important out of all of them, well, you need to argue for that. That’s a high burden of proof to overcome.

Holden Karnofsky: That’s one way of thinking about it, but there’s a lot of ways of thinking about it. Another thing I would think is just, well, for hundreds of years, before any of us were alive, the economy’s been growing at a few percent a year. And you’re talking about this explosion in growth that takes us to this super-advanced civilization. Anytime you talk about well, it’s always been this way, but now it’s going to be that way… You need to provide evidence. You need to argue that. This is not just a thing where you can be like, doesn’t it seem like this could happen?

Holden Karnofsky: And so that has always been a blocker for me. And it’s always been like, this stuff is really interesting, I’m really fascinated by it, but it’s hard for me to really buy into it and get my head there. And so a lot of what we’ve done with the worldview investigations is examine that intuition, examine different angles of it. And a lot of what I’m trying to do with this series is not only talk about the fact that Ajeya’s report estimates that you would get transformative AI this century, but also talk about all the ways in which that would be a strange event, and why we think it’s still reasonable to place a good probability on it.

Holden Karnofsky: And a lot of the series is about that. And so a lot of what I’ve learned, I guess, is that when you zoom out and look at the whole human story, or the whole galaxy story, it doesn’t look like, well, things have been normal for a long time and now all these people are saying it’s about to change. It looks more like, gosh, we just live on this rocket ship that took off five seconds ago, and nobody knows where it’s going. And it’s all a matter of the timescale and the timeline that you’re looking at. If you’re thinking about your lifetime, your parents’ lifetime, and their parents’ lifetime, then economic growth has been about 2% a year. The world has changed a fair amount every year, but not a huge amount in a year.

Holden Karnofsky: If you plot it on a chart, if you plot the world economy on a chart, on a log chart, it just looks like a line going up, just a very boring line. And so last year… Well last year was really crazy because of COVID. But most of the years of my life, it’s like, well, there’ve been some cool new computers, but we don’t seem to be on pace for anything really crazy to happen. And then when you zoom out and you just say, “What is the story of history to date?” There’s a couple ways you could look at it.

Holden Karnofsky: One way is you could zoom out and you could say, “What is human history?” And human history is the economy, in some sense. You could think of it as being a few thousand years old, and it’s been accelerating. And so it used to grow much slower than now, and it’s been accelerating. And this is actually an incredibly high rate of economic growth by historical standards. It’s the highest it’s ever been. And it’s at this rate that could only be sustained for so long. And so when you look at a chart of economic growth over the last 5,000 years, it looks really weird. And it looks like just this line that’s getting steeper and steeper.

Rob Wiblin: People have said that, a flat line and then a vertical line.

Holden Karnofsky: Yeah, exactly.

Rob Wiblin: If you zoom out enough.

Holden Karnofsky: Even on a log chart it looks that way. And I know every time I show this chart to someone they’re like, “Did you log it?” And I’m like, “Yes.” And they’re like, “Well, log it again.” It just looks weird, and there’s no way… You can’t plot this chart in a way that’s not going to make it look weird. There’s a way that makes it look a little less weird, but weird in another way. That’s one way of thinking about it, is just no, we’re in this strange time of economic acceleration, and the few percent per year growth is a few hundred years old out of thousands of years. And things are very unstable, and wacky stuff is happening, and this is a weird time in history.

Holden Karnofsky: And then if you zoom out even further, I would say well, the universe is about 13 billion or 14 billion years old. Life on Earth began a few billion years ago. If we build this galaxy-scale civilization, it should probably be around for tens of billions of years. When you’re thinking about billions of years and then you’re like, humanity is a few million years old, and computers are 70 years old… And then it’s just like oh, space travel. The first space travel was less than 100 years ago. Then it just looks… If you try and make a timeline, you get this thing that just looks totally busted because all of the interesting events happened in the same pixel of the timeline.

Holden Karnofsky: And again, you can try to log it, but it doesn’t help. It’s just… We live in a really weird time. And if you think we’re ever going to have this galaxy-scale civilization, if you think it’s going to happen, that we’re going to start building it anytime in the next 100,000 years, then you have to think that we are among the earliest intelligent life that’s ever existed in the galaxy. And that we just live in this very strange time where anything could happen. And so it’s that perspective that has made me think that we live in a really strange time. Weird stuff is happening. Anything could happen next. And as people trying to make the world go better, we should really be thinking, what is the next crazy thing that’s going to happen? Not only thinking about how we help people in the here and now. Although I’m glad Open Phil continues to do that as well.
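(A rough sense of the "same pixel" point, assuming a universe roughly 13.8 billion years old and a hypothetical timeline drawn 2,000 pixels wide; the event durations are round figures.)

```python
# Back-of-the-envelope: how much of a cosmic timeline recent events occupy.
universe_age_years = 13.8e9      # rough age of the universe
timeline_pixels = 2_000          # hypothetical width of a drawn timeline

years_per_pixel = universe_age_years / timeline_pixels
print(f"One pixel spans roughly {years_per_pixel:,.0f} years")

events_years = {
    "agriculture and recorded human history": 10_000,
    "industrial-era economic growth": 300,
    "electronic computers": 70,
    "space travel": 60,
}
for name, years in events_years.items():
    print(f"{name}: about {years / years_per_pixel:.5f} of one pixel")
# Everything from farming to spaceflight lands well inside a single pixel.
```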

Rob Wiblin: So I guess there’s various different ways in which things could play out. So one possibility would be that things go kind of as you’re laying out here. We get increasing economic growth, maybe even accelerating economic growth, and then we go to space, we have AI. The future looks extremely different and much bigger and there’s a lot more good stuff going on. It’s one path. In which case, we’re at the beginning of that out of a timeline of millions or billions of years. So that’s pretty wild. An alternative would be that we go extinct or we just disappear, and none of this stuff happens — in which case a collapse around about now or in the next 200 years makes this seem like an interesting time in history as well.

Rob Wiblin: Maybe the most boring one would be if economic growth continues, but at an ever slower rate. So the world changes a bit, but it’s still pretty identifiable in 100 or 500 years, which I think is the view that people by default have — the future will be like now, but a bit richer. Or they’ll have more interesting phones. Why is that still wild?

Holden Karnofsky: So how long is that supposed to go on? I mean, how long would you say that happens? There’s different possibilities. One thing that I say in the series is that at the current rate of economic growth, it’s almost impossible to imagine that it could last more than another 8,200 years or something. Which sounds like a lot, but human civilization has only been around for thousands of years. And again, we’re talking about these timescales of billions of years. So what is the idea? So is the idea that we’re going to stay at the current level of growth, that we’re going to stop, or we’re going to gradually slow down?

Holden Karnofsky: One way of putting it is if we’re slowing down now and we’re never going to speed up again, then we live at the tail end of the fastest economic growth that will ever be, that we will ever see in millions of years of human existence to date, and maybe billions of years of human existence going forward. There were a couple hundred years of a few percent per year economic growth. That was the craziest time of all time, and that’s the time we live in.

Rob Wiblin: That would be pretty interesting if for some reason we’re at the point where it just stagnates for reasons that we can’t really anticipate right now. I guess another argument is just that even if we continue growing slowly, even if these changes take a long time, in the broader scope of history, this is still the wildest millennium, even if it’s not the wildest century.

Holden Karnofsky: Yeah, and that’s assuming that… So if you believe that we’re eventually going to build this technologically mature civilization… So the vision here, the idea is that… And this is something that does require a bit of explanation and defending, which I do talk about in the series, but the idea is that we could eventually have a civilization that spans the galaxy and that is very long lasting and is digital in nature. So the way we live our lives today could be simulated or put into digital form. That’s something that needs explanation and defense. But if you believe it’s possible eventually that we’ll have this robust digital civilization that’s able to exist in a stable form across the galaxy, if you believe that’ll happen eventually, and if eventually means in 10,000 years or 100,000 years, then yeah, if you make a timeline of the galaxy, it still looks like we’re in the most important pixel.

Holden Karnofsky: Or at least in the pixel where that all happened. In the pixel where we went from this tiny civilization on this one planet of this one star to a civilization that was capable of going across the whole galaxy. And then it’s like, do you think that’s actually possible? And we could talk about that, but one thing is that we are, for the first time in history, as far as we know, we are actually starting to do space travel now. So that’s the intuition pump there.

Rob Wiblin: Yeah, so not only are we beginning to do space travel and clearly making significant advances in AI, even if we’re quite a long way away, but we’re also talking about and speculating about all of these ways in which the future could be radically different. So at least we will be part of the era that first anticipated all of these dramatic changes, even if it takes 1,000 years for us to get there.

Holden Karnofsky: Yeah, exactly. And then another thing that can happen is it could turn out that it’s actually just impossible, and we’ll literally never get there, and I’m making this stuff up about a galactic civilization. And in that case we just stay on Earth forever. But I basically think there’s two wild things about that. One is, again, there was this period of scientific and technological advancement and economic growth, and it was like… Maybe it was a few thousand years long, but it was really, really quite a tiny slice of our history, and we’re living in it.

Holden Karnofsky: And two is, I just think it’s like… I don’t know, to just rule out that we would ever have that galaxy-scale civilization to me feels a little weird in some sense. By galactic timeline standards, it’s a few seconds ago we built the first computers and the first spaceships, and you’re saying, “No, we’ll never build a civilization that could span the galaxy.” It just doesn’t… That to me is a weird view in its own way.

Rob Wiblin: It starts to feel like a very strong claim, or very specific claim that would require a deep understanding.

Holden Karnofsky: Yeah, exactly.

This Can’t Go On [00:38:14]

Rob Wiblin: One thing we slightly skipped over is, just briefly, how long can present growth rates continue without us having to imagine the laws of physics being totally upended?

Holden Karnofsky: I mean, this is total back-of-the-envelope math, but I estimated (based on an analysis from Overcoming Bias) that 8,200 years at the current level of economic growth would be way, way plenty. At that point you would need to be supporting multiple economies the size of the world economy for every atom in the galaxy. And then if the rate continued, it wouldn’t take long for the numbers to get even wilder. It’s so hard to imagine 10,000 years of this level of growth… I mean, it’s very hard to imagine how that could be. And if it could be, it would be a very, very strange civilization that was able to pack that much value in per atom. It’s a little hard to imagine that we’d still be walking around in the same bodies that we’re in now, and things like that.

Rob Wiblin: And that’s a bit unintuitive, but I guess that’s just the magic of exponential growth. Where you keep adding… Well, what were you assuming, 3%? 4%? 5%?

Holden Karnofsky: I think I was assuming 2%. I have a tendency in these posts to just make a bunch of assumptions that are ridiculous — against the direction I’m arguing — and then say, “Well, it’s still pretty crazy, even if you make these very ridiculous assumptions.” I think when I talked about how long it would take to spread across the galaxy, I literally took a spaceship today and just took its speed, and I was like, “Well, if we move at that speed, how long will it take?” That’s a silly way to… We’re going to go faster than that, obviously, if we ever go across the galaxy. But still, even then, it only takes 1.5 billion years to reach the outer reaches of the galaxy, and that’s just, again, a really tiny amount of time if you think that we’re going to be able to last for amounts of time that make sense in galactic context.

Rob Wiblin: Yeah, it’s a nice example of exponentials being unintuitive to humans. The idea that humanity could continue economic growth, at least in terms of the value that’s produced, by 2% a year for 10,000 years doesn’t sound particularly strange, especially if we’re already saying, “Well, we’re going to spread to the entire galaxy.” But it turns out no, you just can’t, because that would require just an Earth-sized economy for every single atom.
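(For anyone who wants to check the arithmetic, here is a rough back-of-the-envelope version. The 2% rate, the ~75,000 light-year distance, and the ~17 km/s spacecraft speed are round, illustrative figures, not the series' exact inputs.)

```python
# Rough reproduction of the two back-of-the-envelope claims above.
import math

# 1) How much does the economy grow after 8,200 more years at 2% per year?
growth_rate, years = 0.02, 8_200
log10_factor = years * math.log10(1 + growth_rate)
print(f"2% growth for {years:,} years multiplies the economy by ~10^{log10_factor:.1f}")
# ~10^70 -- roughly comparable to common estimates of the number of atoms in
# the Milky Way, hence "a world economy's worth of output per atom, or more."

# 2) How long to reach the outer galaxy at an existing spacecraft's speed?
km_per_light_year = 9.46e12
distance_km = 75_000 * km_per_light_year   # rough distance to the galaxy's far edge
speed_km_per_s = 17                        # roughly Voyager 1's speed
seconds_per_year = 3.15e7
travel_years = distance_km / speed_km_per_s / seconds_per_year
print(f"At ~17 km/s, reaching the outer galaxy takes ~{travel_years / 1e9:.1f} billion years")
# Same ballpark as the ~1.5 billion years mentioned above.
```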

Holden Karnofsky: This is a good time to just emphasize really hard that this is a series where I’m pulling together a lot of ideas that are out there in a lot of different places. A lot of them are from the technical reports of the worldview investigations team. A lot of them have been floating around in the EA community forever. So I don’t want anyone to interpret this as Holden’s theory that he came up with. I’m trying to pull together stuff that people have been talking about for a long time, and put it where people can read it.

How Holden changed his mind [00:40:48]

Rob Wiblin: So you mentioned earlier that initially you were not so keen on this worldview. What’s been the timeline over which you’ve shifted your mind on this, to the point where most of your work on a day-to-day basis is to some extent guided by the possibility of this previously quite outlandish-sounding expectation?

Holden Karnofsky: It’s true I’ve been known in the community as a skeptic of a lot of this longtermist stuff, that I was for a long time. I think it’s important not to overstate that too much though. I was a skeptic in the sense that I was running an organization trying to help people give away their money as well as possible, and I was like, “Okay, we’ve got a methodology, we’ve got a project, we’re working on it. No, we’re not pivoting the whole thing to analyze AI. I don’t even know how we’d do that.” If I had been a guy who just went to parties and talked about what seemed cool to me, I probably would have said, “This seems really cool, and I care about it.” And maybe I would have personally donated, I don’t know.

Holden Karnofsky: But that’s different. I was not ready to make big professional decisions around this stuff, and that is definitely true. But I went to the Singularity Summit a year after GiveWell started. I was interested in it. It was an interesting hypothesis and I wanted to learn more about it and I always wanted to learn more. But yeah, I started from a place of, “This is very interesting, but before making really high-stakes decisions around it, I need more, I need to understand it better, and I don’t buy it right now.” So in terms of the trajectory of that, a lot of stuff has changed. Maybe I’ll just start with the timeline. We started GiveWell in 2007. Very shortly after, I started meeting all the people who were the very fastest to notice GiveWell and say, “This is cool.” So I really quickly met people who are into this stuff.

Holden Karnofsky: So I was thinking about this stuff like a month after I left my job to start GiveWell, or something like that. And that was 2007. We started Open Philanthropy under the name GiveWell Labs in late 2012, and I think it was probably 2015 or 2016 that we started saying, “Hey, we want to spend money on this stuff. We think this stuff is really serious.” And then it’s probably just in the last year that I’ve moved toward, “I want to spend all my professional time on it. I want to focus on it. This is where I want to put the energy that I’ve got.”

Rob Wiblin: So that’s the timeline. What do you think are the key driving factors that caused you to have a significant change in opinion?

Holden Karnofsky: It was a lot of different stuff. I mean, one that I’ve written about is that for a long time I assumed it’s just this particular group of people talking about this stuff, while the people you would expect to be the experts are not talking about it. So getting to better understand what was going on there, and feel that the experts weren’t talking about it because they hadn’t thought about it, not because they had thought about it and they had some detailed objection. This is going to be a whole post in the series, but I think that someone who studies AI is not studying when super advanced AI is coming, they’re studying AI — what it does today, what it can do. That’s certainly relevant, but it’s not the same thing as forecasting. It’s similar to how if you’re working on energy technology to reduce carbon emissions, you’re not necessarily who I want to ask about what you think of the IPCC projection on climate change. Obviously there is no IPCC in this case, but the point is they’re different expertises.

Holden Karnofsky: So that was a big factor, was learning that. That was a hard thing to learn. Because originally I just talked to my friends who were smart technical people, and they would have the same naive objections I had and that I hadn’t heard good responses to, in my opinion. I wanted to just see what the experts actually had. So that was a big factor. Other factors… I mentioned the thing about just coming to believe that we’re in a wild time and really struggling with the burden of proof argument, which is something that I really focus on a lot. Another really big factor for me was a shift in the nature of the argument. So I think traditionally there’s a very similar argument to what I’ve been making that motivates a lot of effective altruists, but it is different in an important way.

Holden Karnofsky: And the argument goes… It’s much more of a philosophy argument and much less of an empirical argument. The argument goes, the best thing we could do is reduce the risk of an existential catastrophe, or the best thing we could do is increase the probability that humanity builds a really nice civilization that spans the galaxy. And then now let’s look for something that could affect that. And the thing that could is AI. So now if there’s any reasonable probability, that’s what we’re going to work on. And that’s an argument where a lot of the work is getting done by this philosophical point about valuing… Having this astronomical valuation on future generations, because you’re not making an argument that it’s going to happen or that it could happen with reasonable probability, you’re just saying, “Well, if it did.” And then of course you can get to there, you can say—

Rob Wiblin: You’re not starting there. You’re starting the argument with looking for something else, rather than saying, “It seems like if we look out into the world that this thing could possibly happen quite soon, just on its own face.”

Holden Karnofsky: Exactly. You’re not starting there. And then you’re also not ending there either. It’s like… The way the argument goes is someone says to me, “Well this would be a really big deal.” And I go, “Okay, would be. So what? Is that going to happen?” And they’d be like, “Well I don’t know. It seems like it could be happening.” I don’t think that’s a terrible argument, I really don’t, but it does feel less compelling to me than going the other direction and saying, “Let’s not worry about exactly how big a deal this would be. Let’s not worry about if causing the whole future to go well is 100 times or 1,000 times or 1 kajillion times as good as saving as many lives as are alive today. Let’s not worry about that. Let’s just sit here and contemplate that we could be in the most important century of all time, and say, ‘Shouldn’t we focus on that?’ Shouldn’t you have a starting point in saying, ‘Shouldn’t we focus on that?'”

Holden Karnofsky: I’ve had internal discussions at Open Philanthropy about why we work on what we work on, and I call them the ‘track one’ and ‘track two’ arguments. Track one is the empirical, this is a huge event that is coming and I don’t know exactly how to value it, but even if you value it pretty conservatively, even if you just say, like, I don’t know, preventing an existential catastrophe from AI is 100 times as good as saving all the lives of people on Earth today, then it looks like something we should work on. And then the other direction is saying, no, this is just so astronomically consequential that even a very tiny probability makes it something we should work on. And they’re different arguments, but I’ve never been a person who’s incredibly comfortable making a big contrarian bet that is based on that reasoning. The kind of philosophical, “This is how much value there is.”

Holden Karnofsky: And I do feel way more conviction and way more buy-in from just saying I don’t know if I’ve got the values right. I don’t know if I’ve got the population ethics right. But there is something really, really, really big that could be happening, and we have to zoom out, look at our place in history, see how weird it is, and ask what’s going to be coming next. That’s the most responsible thing for us to do as people trying to help the world. And then we can run the numbers and make sure it pencils, but that’s a better starting point for me.

Holden Karnofsky: I also finally want to acknowledge there is just structural and incentive and personal stuff going into this too. So there was a point in time when I was trying to help create an organization that could tell individual donors who didn’t know anything where to give, and it was going to be much more helpful to have something really evidence backed, explainable, transparent, and legible. And that was GiveWell, and that was our mission. That’s what I was focused on.

Holden Karnofsky: And there came a later date when I had a lot more freedom to engage in wild arguments and do really risky stuff, and that was more my professional duty. So a lot of this affects how much time I put into it, and what I’m thinking of as my job, and what I want to focus on. And I think that’s also an underrated thing in these kinds of debates is like… It really does matter what your position is in life, and what it makes sense for you to be contemplating and doing. And that has changed for me, and I want to acknowledge that’s been part of this.

Process for Automating Scientific and Technological Advancement [00:48:46]

Rob Wiblin: Makes sense. Let’s talk about some of the specific technologies that you bring up in this series, as things that might be feasible or might be more feasible than people appreciate, and also might have more significant and revolutionary social consequences than people already understand. One that you raised is the idea that we might be able to create an artificial intelligence system that is specialized at generating scientific and technological progress. You refer to the targeted AI, somewhat amusingly, as PASTA, which stands for Process for Automating Scientific and Technological Advancement. What exactly do you imagine this PASTA being able to do?

Holden Karnofsky: Yeah, the basic idea is that if you could imagine an automated digital scientist — or engineer, or entrepreneur — someone who could do all the things a person does to advance science and technology, and then you imagine that digital person could be copied and could just work in this digital sped-up advanced form. If you just imagine that, then you can pretty fairly easily get to a conclusion that you would see a massive, crazy explosion in the rate of scientific and technological advancement. And at that point you might start thinking something like anything that is scientifically and technologically possible, we will get fairly soon. A lot of my argument is that it’s not too hard to imagine that really, really wild stuff could happen in the next 100,000 years. Stuff about building stable digital-based civilizations that go across the galaxy. Not too hard to imagine that. The interesting thing is that if we get the right sort of meta technology or the right automated process, then 100,000 years, as you intuitively think of it, could become 10 years.

Rob Wiblin: I suppose many people might have the reaction that they understand that machine learning systems can learn to play chess, because it’s a very legible game where you can see the result, you can see the outcome. They understand why it could learn to play computer games, or even this game that GPT-3 plays of guessing what the next word should be. But isn’t scientific and technological advancement too complicated? How would you even tell whether you’re succeeding or what the different moves in the game are? Do you have anything to say on whether this is more feasible than people understand?

Holden Karnofsky: I mean, I would think of it as a difference in degree rather than kind. So fundamentally, as far as we can tell, it sure looks like this is happening. The way that science and technological advancement is happening right now is that there are these people with brains and the brains are like… They’re pretty small. There’s a lot of them. They’re built out of not very expensive materials in some sort of sense. You could think of a brain as being made out of food or something. There’s no incredibly expensive process that needs to be done to create a brain. And these brains are doing the work. They’re doing it. So why can’t we build something, could be anything, but I would guess a computer, that could… Whatever it is our brain is doing, why couldn’t we build something that did that?

Holden Karnofsky: And of course that’s a lot harder than building something that plays chess. It raises new challenges with how do you train that thing? How do you define success? And probably it has to be a lot more powerful than the computers that play chess. Because something that some people don’t know is that, based on estimates such as the one that Joe Carlsmith did for Open Philanthropy, it’s very rare to see a computer that’s even within range of having the computational power of a human brain right now. So it’s like, sure, to do these hard things that humans do, we’re going to need something that’s a lot more powerful than what we have now, does more computations probably, and we’re going to need creativity and ingenuity to figure out how to train it. But fundamentally, we have an existence proof, we have these brains, and there’s a lot of them, and why can’t we build something that fundamentally accomplishes the same thing they’re accomplishing?

Rob Wiblin: I guess part of the idea is that humans over the course of their development from being children, they learn these skills through experience, so can’t we reverse engineer the process by which humans, as they grow, learn how to do science and then put it into a different system?

Holden Karnofsky: Yeah, exactly. I don’t know if it’s specifically reverse engineering. A lot of what AI research looks like today is just trial and error. You just have these systems where if you can show them what success looks like, you don’t really need to know anything else. You don’t need to think about… You can build an AI that’s world champion at a game that you barely understand yourself. You barely understand the game. You just have it play the game and you have a method for it knowing whether it’s doing well or poorly, and it’s able to figure that out. I don’t know if it’s reverse engineering, but the point is there’s a lot of different ways to build an AI, and the question is just…

Holden Karnofsky: Humans somehow learn how to do science. It’s not something that we’ve been doing for most of our history, but somehow we learn it in the space of a human lifetime, learn it pretty quickly. So if we could build something else that’s able to learn how to do the same thing, whether it’s in the same way or not, you could imagine building an AI that’s able to watch a training video and learn as much from it as a human does, as measured by its answers to some test. And that’s a measurable thing, so that’s an example of what I’m talking about.

Rob Wiblin: Alright. Let’s set aside the feasibility for a minute and think about what effects this would have on society if a PASTA-like system actually came into being, and how long it might take to have those effects. Do you want to elaborate on that?

Holden Karnofsky: Sure. It’s this basic idea that we’re doing this thing today, which is that every year people have more ideas, and they create more technologies, and they scale up and they make cheaper the technologies we already have. Just imagine if you could automate that. That’s the basic intuition. And I explained it a bit more in the series, but that’s the basic intuition, that we’re doing this thing, we’re doing it manually in a sense. There’s humans doing it, having ideas and making technologies cheaper and better. And if you could automate that, then you would just expect it to speed up an awful lot. And then the question becomes where could science get in 100,000 years? Because maybe that’s where it’ll get in 10 years, or 1 year, or something like that. So then where could science get?

Rob Wiblin: That vision relies on the idea that once we’ve developed an appropriate system, once we’ve trained it, the amount of computational power required to run it isn’t so great — you could run the equivalent of hundreds of thousands of actual human scientists on these systems. Do you think that’s very likely to be the case, or is that a question mark?

Holden Karnofsky: Yeah, you definitely have to get off the ground. So if you had one sort of thing that was automated that could do what a human could do, but it costs the entire world economy to run that thing for a year, then yeah, the things I’m talking about would not happen. And a lot of the analysis that is done in the biological anchors report is about cost, and it’s about when we get to the point where it’s an imaginable amount of money, where a government might actually do this, and it might actually be affordable to create these things. It might be economic. It might have returns. And the way I think you need to get off the ground… One thing that I would say is that when you’re using these deep learning systems that learn by trial and error, it takes… In some sense, making the first one, or training it, costs an enormous amount of money and computation compared to using it after you have it.

Holden Karnofsky: So AlphaGo, which is the system that plays this board game Go, it played an enormous, I don’t know how many, but an enormous number of Go games with itself to get from being really terrible at Go to being really great at Go. And that took an enormous amount of money and computation compared to using it once it was done: you could make as many copies as you want. I mean, you could make a bunch of copies and they would all be able to play Go, and for the amount of money and computation you spent just creating that first one and training it, you could then run a very large number of them.

Holden Karnofsky: So the basic idea is you look at this biological anchors report, and you say, “When will it be affordable in some sense?” Affordable might mean $100 billion. It might be something that a government would stretch to do, but when would it be affordable in some sense to train a single automated scientist? Because that data… Once that is done, it is now also affordable to run a large number of automated scientists. Now how large a number? Well, I mean, at that point I would think a large enough number that those automated scientists are going to be able to further bring the cost down. And that’s where you get into this feedback loop dynamic that is one of the focuses of the series, and has also been discussed in the reports by Tom Davidson and David Roodman on economic growth and how it could become explosive.
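(A toy version of that train-once, copy-cheaply point. Every number below is hypothetical; the biological anchors report does not use these figures.)

```python
# Toy illustration: the one-time training cost dwarfs the per-copy running
# cost, so once training is affordable, running many copies is too.
# All figures are made up for illustration.

training_cost = 100e9          # hypothetical: $100 billion to train one automated scientist
run_cost_per_copy_year = 1e6   # hypothetical: $1 million to run one copy for a year
annual_run_budget = 10e9       # hypothetical: $10 billion/year then spent on running copies

print(f"Copy-years buyable for the price of one training run: "
      f"{training_cost / run_cost_per_copy_year:,.0f}")
print(f"Copies running in parallel on the annual budget: "
      f"{annual_run_budget / run_cost_per_copy_year:,.0f}")
# With these made-up numbers, one training run's cost buys ~100,000 copy-years,
# and a $10B/year budget keeps ~10,000 automated researchers running at once --
# and anything they discover that lowers hardware costs feeds back into running
# even more copies, which is the feedback loop discussed above.
```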

Rob Wiblin: I slightly interrupted you on the social effects that this might have. I suppose in broad strokes, you’re compressing the discovery of lots of different natural laws or technologies or engineering issues into a very fast time period, such that us living our lives, we expect that year after year there’s going to be some more interesting products available on the market, but even over the course of an entire human lifetime, things are still recognizable. But here you would just have this sudden flood of insights that would allow you to potentially do new things, and I guess maybe there’ll be a bit of a delay to having them flow through to products or actually get applied in the world, but it would be an extremely bizarre and out-of-equilibrium situation.

Holden Karnofsky: Yeah, that’s how I feel. And then you start to ask, so what technologies could we develop? And it’s like… There’s two answers. One answer is like, “Oh my God, I have no idea.” And like wow, maybe that’s enough. Maybe we should just say if we could develop that kind of system this century, then we should think of this as the most important century, or one of the most important centuries. We should just be freaking out about this possibility, because I have lost the script. Once we’ve got the ability to automate science, to get to where we might be going in 100,000 years, but to get there in 10 years, 1 year, gosh, we should just really worry about that, and that should be what we’re spending our time and energy on, what could happen there.

Digital People Would Be An Even Bigger Deal [00:58:15]

Holden Karnofsky: But if you want to get specific about it, what I like to do is I like to have one or two concrete examples of specific stuff, so that you could see just how crazy it could get. And then think to yourself it could probably get even crazier than that if there’s other stuff we haven’t thought of. So a particular technology that I focus on a lot in this series is this idea of digital people, and I focused on it because it’s a very simple idea. My guess is it will eventually be feasible. It’s the kind of thing that science would get us to in 100,000 years. And it would be just so radical. It would just change everything about the world and it would get us to the… Maybe get us to the world that I’ve been talking about, this very stable, digital, galaxy-scale civilization.

Rob Wiblin: Let’s talk about these digital people, which is this potentially next revolutionary follow-up option. First, what do you mean by digital people?

Holden Karnofsky: So the basic idea of a digital person is like a digital simulation of a person. It’s really like if you just take one of these video games, like The Sims, or… I use the example of a football game because I was able to get these different pictures of this football player, Jerry Rice, because every year they put out a new Madden video game. So, Jerry Rice looks a little more realistic every year. You have these video game simulations of people, and if you just imagine it getting more and more realistic until you have a perfect simulation… Imagine a video game that has a character called Holden, and just does everything exactly how Holden would in response to whatever happens. That’s it. That’s what a digital person is. So, it’s a fairly simple idea. In some ways it’s a very far-out extrapolation of stuff we’re already doing, which is we’re already simulating these characters.

Rob Wiblin: I guess that’d be one way to look at it. I guess the way I’ve usually heard it discussed or introduced is the idea that, well, we have these brains and they’re doing calculations, and couldn’t we eventually figure out how to basically do all of the same calculations that the brain is doing in a simulation of the brain moving around?

Holden Karnofsky: Yeah, exactly. Yeah, you would have a simulated brain in a simulated environment. Yeah, absolutely, that’s another way to think of it.

Rob Wiblin: This is a fairly out-there technology, the idea that we would be able to reproduce a full human being, or at least the most important parts of a human being, running on a server. Why think that’s likely to be possible?

Holden Karnofsky: I mean, I think it’s similar to what I said before. We have this existence proof. We have these brains. There’s lots of them, and all we’re trying to do is build a computer program that can process information just how a brain would. A really expensive and dumb way of doing it would be to just simulate the brain in all its detail, just simulate everything that’s going on in the brain. But there may just be smarter and easier ways to do it, where you capture the level of abstraction that matters. So, maybe it doesn’t matter what every single molecule in the brain is doing. Maybe a lot of that stuff is random, and what really is going on that’s interesting, or important, or doing computational work in the brain is maybe the neurons firing and some other stuff, and you could simulate that.

Holden Karnofsky: But basically, there’s this process going on. It’s going on in a pretty small physical space. We have tons of examples of it. We can literally study animal brains. We do. I mean, neuroscientists just pick them apart and study them and try to see what’s going on inside them. And so, I’m not saying we’re close to being able to do this, but when I try to think about why would it be impossible, why would it be impossible to build an artifact, to build a digital artifact or a computer that’s processing information just how a brain would… I guess I just come up empty. But I can’t prove that this is possible.

Holden Karnofsky: I think this is one of the things that Open Phil hasn’t looked into because we generally do find it intuitive and haven’t had a ton of pushback on it, but that might have something to do with which particular skeptics we’re encountering, and it’s something we could certainly do a more in-depth investigation of in the future. But yeah, the basic argument is just, it’s here, it’s all around us. Why wouldn’t we be able to simulate it at some point in time?

Rob Wiblin: Are you envisaging these digital people as being conscious like you and me, or is it more like an automaton situation?

Holden Karnofsky: So, I think one of the things that’s come up is when I describe this idea of a world of digital people, a lot of people have the intuition that well, even if digital people were able to act just like real people, they wouldn’t count morally the same way. They wouldn’t have feelings. They wouldn’t have experiences. They wouldn’t be conscious. We shouldn’t care about them. And that’s an intuition that I disagree with. It’s not a huge focus of the series, but I do write about it. My understanding from… I think basically if you dig all the way into philosophy of mind and think about what consciousness is, this is something we’re all very confused about. No one has the answer to that. But I think in general, there isn’t a great reason to think that whatever consciousness is, it crucially relies on being made out of neurons instead of being made out of microchips or whatever.

Holden Karnofsky: And one way of thinking about this is, I think I’m conscious. Why do I think that? Is the fact that I think I’m conscious, is that connected to the actual truth of me being conscious? Because the thing that makes me think I’m conscious has nothing to do with whether my brain is made out of neurons. If you made a digital copy of me and you said, “Hey, Holden, are you conscious?” That thing would say, “Yes, of course, I am,” for the same exact reason I’m doing it. It would be processing all the same information. It’d be considering all the same evidence, and it would say yes. There’s this intuition that whatever consciousness is, if we believe it’s what’s causing us to think we’re conscious, then it seems like it’s something about the software our brain is running, or the algorithm it’s doing, or the information it’s processing. It’s not something about the material the brain is made of. Because if you change that material, you wouldn’t get different answers. You wouldn’t get different beliefs.

Holden Karnofsky: That’s the intuition. I’m not going to go into it a ton more than that. There’s a thought experiment that’s interesting that I got from David Chalmers, where you imagine that if you took your brain and you just replaced one neuron with a digital signal transmitter that just fired in all the same exact ways, you wouldn’t notice anything changing. You couldn’t notice anything changing, because your brain would be doing all the same things, and you’d be reaching all the same conclusions. You’d be having all the same thoughts. Now, if you replaced another one, you wouldn’t notice anything, and if you replaced them all, you wouldn’t notice anything.

Holden Karnofsky: Anyway, so I think there’s some arguments out there. It’s not a huge focus of the series, and I don’t have a lot more to say about it for now, but I think it is the better bet that if we had digital people that were acting just like us, and the digital brains were doing the same thing as our brains, that we should care about them. And we should think of them as… We should think of them as people, and we probably would. Even if they weren’t conscious—

Rob Wiblin: Because they would act, yeah.

Holden Karnofsky: Yeah, well, we’d be friends with them.

Rob Wiblin: They would complain the same way.

Holden Karnofsky: We’d talk to them and we would relate to them. There are people I’ve never met, and they would just be like any other people I’ve never met, but I could have video calls with them and phone calls with them. And so, we probably will and should care about what happens to them. And even if we don’t, it only changes some of the conclusions. But I basically think that digital people would be people too.

Rob Wiblin: Yeah, I mean, the argument that jumps to mind for me is if you’re saying, “Well, to be people, to be conscious, to have value, it has to be run on meat. It has to be run on these cells with these electrical charges going back and forth.” It’d be like, “Did evolution just happen to stumble on the one material that could do this? Evolution presumably didn’t choose to use this particular design because you’d be conscious. So, why would there be this coincidence that we have the one material out of all of the different materials that can do these calculations that produces moral value?”

Holden Karnofsky: That’s an interesting way of thinking of it. I mean, if I were to play devil’s advocate, I would be like, “Well, maybe every material has its own kind of consciousness, and we only care about the kind that’s like our kind,” or something. But then that would be an interesting question why we only care about our kind.

Novel properties of running human minds as software [01:05:49]

Rob Wiblin: Setting aside the consciousness issue for now, this idea of digital people appears regularly in movies and computer games and so on, but the designers of this fiction don’t take it to what I would think is a natural conclusion, which is that society would be just massively, completely upended by this. What are the most important novel properties of running human minds as software, rather than as these difficult physical systems that we don’t know how to intervene in?

Holden Karnofsky: Yeah, so a lot of times when people talk about digital people, or often the term used is ‘mind uploads’ (I chose to go with digital people for reasons I go into, that are not that interesting). You know, a lot of times people just focus on the immortality bit. They’re just like, “Well, now you’re digital, so you can live forever. Because there’s no reason you have to age, because you’re virtual.” And I mean, I think that’s interesting. I think it’s really not… I mean, I like thinking about ways the whole world could be just radically transformed, and that I think is smaller than a lot of the things that could come of digital people.

Holden Karnofsky: And so, when you just imagine… Just start with a very simple idea. Just imagine that we were virtual. Just imagine that we could do all the things with us that we can do with software. What are some things that you can do with virtual people today, like these video game characters, that you can’t do with real people? One, you can run them at different speeds. Software tends to have that property. You can copy them, make perfect copies of them, starting from any state. Software tends to have that property. You can put them in whatever virtual environment you can code up. So, if you have a virtual football player, you could put them in a virtual tennis match, or whatever. You have this complete control over what environment they’re in. It’s just whatever you can code.

Holden Karnofsky: Those are key properties. And I think you just take those three properties and the implications I’ve listed… Especially the copying. I think that would mean that there’s this enormous amount of productivity that becomes possible. This self-accelerating feedback loop of more people have more ideas, and then the more ideas lead to more resources, then the more resources lead to more people. And so, just being able to copy people… Instead of an organization being like, “Well, we need to do this project, so we need to hire a bunch of people and train them,” they might just take whoever’s already best at doing whatever they need done, make as many copies of them as they need, make each copy start from a state where they’re just coming back from vacation and they’re really excited to get to work… A lot of this stuff is laid out in a lot more detail in the book The Age of Em by Robin Hanson.
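(A minimal sketch of that self-accelerating loop, with made-up parameters: copies produce output, output funds more copies and more ideas, and ideas make each copy more productive. The point is the qualitative shape, not a forecast.)

```python
# Toy feedback loop: population of digital workers -> output -> more copies
# and more ideas -> higher productivity per copy. Parameters are arbitrary.

population = 1_000.0    # digital workers (copies)
productivity = 1.0      # output per worker, boosted by accumulated ideas
for year in range(1, 11):
    output = population * productivity
    population += 0.1 * output             # some output is spent on new copies
    productivity *= 1 + 1e-5 * population  # more workers generate more ideas
    growth_rate = 0.1 * productivity       # next year's population growth rate
    print(f"year {year:2d}: ~{population:8,.0f} copies, "
          f"growth rate now ~{growth_rate:.1%}")
# Unlike ordinary fixed-percentage growth, the growth rate itself keeps rising,
# because the inputs to growth (copies and ideas) are outputs of the process.
```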

Holden Karnofsky: So, there’s the productivity aspects. There’s this idea of having an easier ability to do reflection and learning about human nature and behavior. I’ve always been very interested in social science, in reading social science papers. That’s my background from GiveWell, is saying, “Do bed nets really help people? Does nutrition really help people?” And it’s really hard to answer questions like that, because you have some people who got good nutrition and some people who got bad nutrition, and you’re trying to see how well they’re doing, and the problem is that there’s lots of things that are different between people who get good nutrition and people who get bad nutrition. Generally, people who get good nutrition tend to be wealthier and more educated, and all kinds of things.

Holden Karnofsky: And so, what do you always wish you could learn, the study you wish you could read is one where there were a bunch of copies of the same person, and some of them got the intervention and some didn’t, and now we see how that changes.

Holden Karnofsky: But you can’t do that, until you can. Until you have digital people. The closest thing is a randomized controlled trial. Those are very expensive. They take a long time. You need a large number of people, because you’re not actually using copies of the same person. So, you’re just hoping that the randomization washes out with the large numbers. So, yeah, the ability to learn from the experiences of actual copies of people as they try different things and do different things, that could be big.

Holden Karnofsky: And another thing you could do, there’s all this stuff I wish I had time to read and think about and do. I wish I could try just spending an entire three years of my life meditating, and seeing how that changes me. But I just have to make all these tough choices about how to use my time. But you could have people making copies of themselves to try all these different things, learn all these different things, study these different things, and explain them back to themselves, essentially. And so, I think there’s this whole set of ways in which I would say over the last few hundred years we’ve made much more impressive progress at understanding the world than we have at understanding ourselves, understanding human nature and behavior. I think that could change with digital people.

Holden Karnofsky: Then there’s this virtual reality aspect, where you could have digital people experiencing whatever. That could be a really good or really bad thing. If you didn’t have human rights for digital people and they were at the mercy of whoever’s running their environment, then they could be manipulated.

Rob Wiblin: It’s a disaster.

Holden Karnofsky: They could be tortured. It could be really, really dark. If it was well done, there’s no need to have aging, or death, or disease, or violence, or any kind of force. You could have people just changing their appearance, appearing however they wanted to, experiencing whatever they wanted to.

Holden Karnofsky: And then, I think that the big ones, for the purpose of the series, have to do with the stable galaxy-scale civilizations. So, I think if you are a digital person, then you can run anywhere you can run a computer. And you could probably run a computer anywhere that you can find metal, energy, and some other stuff. You need something to cool it. But everything you need, you can find just all over space. You don’t need to find this wonderful planet that’s just like Earth that has the right temperature, has the right amount of water. Anywhere you can get a bunch of metal and a bunch of solar panels and build these computers, you could have digital people there. So, you could have them at every star, every solar system.

Holden Karnofsky: And then you could have this, if you wanted, which I think in many ways is a scary idea rather than a good idea, if you wanted to send out a space probe that creates this virtual civilization in another solar system, you can have that civilization following whatever rules you wanted. You could have this virtual environment that if you’re the president and you want your digital copy to be the president, then every time your digital copy loses an election, it could just reset. It can be programmed to do that, or it can be programmed to respond to any unexpected error with just rolling back a step and rerunning the simulation.

Holden Karnofsky: So, it’s again, just imagining people as software, we have a lot more control over software than we have over our physical environment, and that’s really the crux of it. When you imagine technology reaching its limit, you just imagine us getting more and more control over our environment, and in the limit, or even just literally, if we were software, that means being able to have human lives going on basically anywhere in the galaxy, and in a way that is just, it could be stable. It could be set up so that there’s certain values and certain things about the world that persist.

Rob Wiblin: Going back to some of the first comments, you could have people experience anything, which means that these digital people, they could just have the best meal of their life, and then just do it again and again and again, because there aren’t any limits on the amount that they can eat, for example, or the things they can do. Then you don’t have to simulate the food, just see what nerves are stimulated by their favorite food, and just push a button, and they can feel that. You would have gotten rid of disease, basically, because presumably in this simulation you’ll be able to edit out anything that was problematic, or just revert back to the pre-disease state. I guess potentially you could have new health issues that relate to the computers or something, but you would have removed a lot of the problems in the world that people deal with because they predate us, and we just don’t know how to change the world in order to fix it. You could also just make the environments incredibly pleasant.

Holden Karnofsky: I do want to cut in for one second there.

Rob Wiblin: Sure.

Holden Karnofsky: I think I agree with everything you said. I do want to caution a little bit that I think a lot of people who talk about these topics, they’re transhumanists, and they’re very enthusiastic, and it’s all with an air of enthusiasm, and it’s all under this assumption that technology is great, and more of it is great, and everything’s going to be great, and let’s talk about all the ways things would be great. And I just, I actually just waffle a lot on this, but I think it’s important to recognize that rather than take a position on whether it’ll be great or whether it’ll be terrible — because I think a lot of non-transhumanists just hear this stuff and they think it’s the most horrifying thing they’ve ever heard — rather than take a position either way on this, I just want to be like, “It could be either.”

Holden Karnofsky: And I think that’s one of the things I’ve had trouble with with this series. Everyone wants to read everything as “This is going to happen. It’s going to be awesome.” Or, “This is going to happen. It’s going to be terrible.” And I’m really, really, really… I’ve had trouble convincing people of this, and I have to say this over and over again in this series, I’m really not saying either one. It really could be either. When I think it could be the most important century, I’m not thinking, “Woo-hoo, it could be the most important century. We’ll never have disease.” And I’m also not thinking, “Oh no, it’ll be the most important century. Everything’s going to be bad. It’s going to be worse than it was.”

Holden Karnofsky: I’m just thinking, “Oh my god, oh my god, I don’t know what to do. Oh geez. Anything could happen. It could be the best. It could be the worst. Geez. We’re not ready for this. And we really need to think about this. And we really need to think about how this could go. It could go really well. It could go really poorly. We need to think about how to start to get a grip on this.” So, I think that’s important as the way I’m trying to approach this stuff. I think it’s hard for a lot of people to not be thinking, “Is this going to be good or bad?” Well, I think it is good to think it could be good or bad. It’s good to recognize it could be either.

Rob Wiblin: Yeah, definitely. Another important effect that it could have, which would affect human beings especially, is that at the moment, we only have one Beyoncé, as you point out in one of these blog posts. But in this world, we can just create lots of copies of Beyoncé, and these different copies of Beyoncé can produce many different albums, and potentially go into all of these different styles of music, and then potentially dominate some non-trivial fraction of the entire music industry in a way that a single person can’t. So, the fact that you can create copies of people who are exceptional in particular respects, I guess it would affect the employment opportunities for, I suppose, flesh-and-blood humans who don’t have these abilities and can’t go through these very long training processes and then continue working after that. Do you want to talk about the economic effects that this might have?

Holden Karnofsky: I mean, the economic effects seem really hard to say much meaningful about, other than that the economy itself would explode, or it might not explode if it already exploded from the AI, or something. There would be a lot of economy, is the main thing I would say. It’s not necessarily the case that flesh-and-blood humans… They might have trouble finding a job, but it might turn out that if you have $100 in your bank account, you’re now just unimaginably wealthy by digital people’s standards, because there’s so much productivity, so much of everything. Say it’s $100 in today’s dollars, and you’ve had it somewhere that it’s keeping up with inflation or getting returns, keeping up with the stock market. It might turn out that the flesh-and-blood humans are just unbelievably rich, because they have these existing claims and then everything gets cheaper. So, it’s really hard to say what happens. The thing that happens economically is just that there’s just so much.

Rob Wiblin: Another implication this has is that so much will be going on that if there’s a way for things to go very badly, then that could come quite quickly by human timescales. Everything could just be running so much more quickly on these computers that eventually, as the technology advances, all of history is occurring in a single human lifetime, in a sense.

Holden Karnofsky: Yeah, exactly, yeah.

Rob Wiblin: And I guess in terms of thinking very big picture about how this could affect the galaxy-scale civilization, the main issue there is, I suppose, the error correction. Or the way that you can just prevent significant drift with resetting, or just having particular digital people who are known to have very specific values that never change, and so on.

Holden Karnofsky: I think the thing that you imagine, or that I imagine, is if today we develop digital people, and all of a sudden, we have the ability to… I mean, we would start by having these people, interacting with them, coming to see them as people, caring about what they do, and the population would grow. And at some point, let’s say if they want to have their own kids or their own friends or their own copies, and they want to go into space… The crazy thing is that whoever’s making decisions about what kinds of computers to build and send into space, that person could be deciding, “Hey, we’re just going to send off something that starts from today’s civilization. It goes wherever it goes.” Or they could be saying, “We want to make sure it preserves human rights no matter what, and we’re going to make sure the code is consistent with that. Or we want to make sure it preserves this person as president no matter what, and the code will be consistent with that.”

Holden Karnofsky: Whoever is building these things and programming these things could be deciding that that decision lasts for billions of years, because it’s affecting these space probes that go out into space and then build their own computers that build their own computers. And that’s the scary thing. That’s the thing where you could imagine just a huge range of variation, where if that was done really thoughtfully and reflectively and in the best way possible, versus if that was done in this sort of, I don’t know, just random, thoughtless way, like people saying, “Well, I want to be president”… There are just massively different futures for the galaxy that last a very long time and affect a very large number of people. That’s what I find so breathtaking and scary about the whole thing.

Rob Wiblin: So, I could imagine that there are some listeners out there who are thinking, “I subscribe to this podcast because I want to improve the world and learn how to have more impact with my career. Who the hell are these two people that are talking about putting people into computers and sending them into space and turning asteroids into solar panels?” What’s your response to people who are like, “This is a cool story, mate, but can’t we get back to the real world?”

Holden Karnofsky: I mean, I’ve been there, for sure. My history is that I spent the first part of my career co-founding GiveWell, which is entirely focused on these straightforward ways of helping people in poor countries by improving health and wellbeing, distributing proven interventions like bed nets and deworming pills for children with intestinal parasites. I mean, that’s where I’m coming from. That’s where I started. That’s what motivates me. That’s what I’m interested in.

Holden Karnofsky: One of the things that I say in the description of my blog is that it’s about ‘avant-garde effective altruism.’ So, the analogy for me would be if you hear jazz, you might hear Louis Armstrong and you might think, “That sounds great. I want to get into jazz.” And then if you meet people who’ve spent their entire life listening to jazz and hear a lot of their favorite music, you’re just going to be like, “What the hell is that? That’s not music. That’s not jazz. What is that? That’s just noise. That’s just someone kind of screeching into a horn or something.” Avant-garde effective altruism has a similar feel for me. I started by saying, “Hey, gosh, people are dying of malaria and a $5 bed net can prevent it.” And I was really interested in using my career to prevent that, but I was greedy about it. Over the years, I’d always be like, “But could we do even better? Is there a way we can help even more people?”

Holden Karnofsky: Well, maybe instead of helping more people, we could help more persons — things that aren’t people, but that we should still care about. Animals are having a terrible time in factory farms, and they’re being treated horribly. What if someday we decide that animals are like us, and we should have cared about them? Wouldn’t that be horrible? Wouldn’t it be great if we did something about it today? Just pushing and pushing and pushing, and thinking about it. And I think that that is a lot of what your audience likes to do. That’s a lot of what I like to do. A lot of what I am trying to do is bring people along that avant-garde effective altruism route, and say, “If you just keep pushing and pushing, where do you go?” And in my opinion, where you go is… Yeah, of course, it’s wild to talk about digital people living by other stars in weird virtual environments that are designed to do certain things. Of course, it’s weird.

Holden Karnofsky: But if it’s the kind of thing that we think will eventually happen, or could eventually happen, then most of the people we can help are just future people who are digital people. And if you say, “Well I don’t care about them because they’re future people,” I would say, “Gosh, that didn’t sound very good; you may regret saying that. History may not judge you kindly for saying, ‘I don’t care about people that are future people. I don’t care about people that are digital people. They’re digital. I’m made out of cells.’” There’s a lot of philosophical debates to be had here, but I’ve definitely reached the conclusion that it’s at least pretty dicey to say that kind of thing.

Holden Karnofsky: And so, I think you start from, “I want fewer people to die from malaria.” And I think it actually is logical that you get to, “Well, I care about all people. I care about future people. I care about digital people, and I really care what happens to them.” And there are just awful, awful, huge stakes for a huge, huge, huge number of digital people in this thing that could be happening in this century. And that is something that I need to get a grip on, because the stakes are enormous.

Rob Wiblin: Or at least that someone should be getting a grip on.

Holden Karnofsky: Yeah.

Rob Wiblin: Cool. We’ll come back to some other objections and ways that this whole view could be totally misguided later on, but for now, we’ll carry on inside this worldview.

Transformative AI timelines [01:22:05]

Rob Wiblin: So next in the series you have a quite lengthy discussion, a four-part discussion, about when we might expect different levels of progress in artificial intelligence, or different kinds of artificial intelligence-like capabilities. We’ve talked about that a number of times on the show before, most recently with Ajeya Cotra back in January, your colleague who wrote this report that influenced your thinking quite a bit. So we won’t rehash the full thing here. But one point you made which I was excited to hear a bit more about was about where our intuitions typically come from, regarding what we expect to see as AI gets closer and closer to different milestones.

Rob Wiblin: There’s lots of different comparisons or benchmarks that we might intuitively use as reference points. We could potentially imagine AI increasing in ability like species do, say from ants to mice to ravens and then to primates and so on. Or the thing we might have in mind is that we would expect an AI to act like a one-year-old human and then like a two-year-old human and then a three-year-old human.

Rob Wiblin: Or possibly we might imagine that AI will become capable of doing what a human could do in 1 second, and then what a human can do in 10 seconds, and so on. And interestingly, all of these have some intuitive plausibility to them. But they potentially have very, very different implications. And it’s possible that all of them are wrong. So, which, if any, of these do you think we should actually use as a guide to our future expectations?

Holden Karnofsky: Well, I don’t really know which one we should use as a guide, but that’s a lot of my point, I think a lot of people just have one in their head, and they’re very confident in it. And I think a particularly common one, because I think it’s the most anthropomorphized and the most, I don’t know… It’s like, we have AI systems that can do the low-paying jobs. Then they can do the medium-paying jobs, then they can do the high-paying jobs. And it’s like, “Gosh, that would be a really polite way for AI to develop.” They can just get right into our economy’s valuations on people and our opinions of what kind of work is valuable. And I think when people talk about unemployment, they’re just assuming.

Holden Karnofsky: They’re just like, “Well, the people right now who aren’t paid very much, those are going to be all the people who are unemployed. And we’ll have to wait for the AI to catch up to the people who are paid a lot right now.” I do think that’s a common intuition that people have. And a lot of what I wanted to point out is just, we don’t know how this is going to go, and how this goes could be a lot more sudden. So a lot of the ones you said, where it’s just going to be like, “Alright, now it’s an ant.” How would we even know that? In my opinion, it could already be at ant-level intelligence and we wouldn’t know, because we don’t have the hardware.

Holden Karnofsky: We can’t build things that can do what ants do in the physical world. And we wouldn’t particularly want to, so it’s just hard to know if you’re looking at an ant brain-level AI or a honeybee brain-level AI or a mouse brain-level AI. We’ve tried a little bit to compare what AIs can do to what very simple animals can do. There’s a report by Guille Costa on trying to compare AIs to honeybees on learning from a few examples, but it’s really all inconclusive stuff. And that’s the whole point, is it might just happen in a way that’s surprising and quick and weird, where the jump from chimp brain to human brain could be a small jump, but could be a really big deal.

Holden Karnofsky: So anyway, if I had to guess one, I would go with: we don’t yet have an AI that could probably do whatever a human could do in one second. But I would imagine that once we’re training human-sized models, which we’re not yet, that’s the thing you might expect to see us getting closer to. And then you might get closer to things that a human can do in 10 seconds, or 100 seconds. And I think that where that would put us now is we’re just not at human level yet. And so you just wouldn’t be able to make much of what you see yet, except maybe to make comparisons to simpler animals.

Rob Wiblin: I thought you might be even harsher on all these comparisons, because our experience seems to be that ML systems can do some things that we find incredibly hard, and that no species of animal can really do. They just completely master it. And then there’s other stuff that we think is trivial that they can’t do at all. So it seems like we just can’t really predict very well what capabilities might come at what stage, and what would be precursors to other things. Because it’s so different from the stuff that we’re used to.

Holden Karnofsky: Just to be clear, I’m definitely not saying it’s going to happen overnight. That’s not the point I’m trying to make. So I think before we have this super transformative AI that could automate science or whatever, we’ll probably be noticing that other crazy stuff is happening and that AIs are getting more and more capable and economically relevant. I don’t think there’s going to be no warning. I don’t think it’s going to be overnight, although I can’t totally rule that out either. But what I do think is that it might simultaneously be the case that it’s too early to really feel the trend today, and that a few decades could be plenty. And one way of thinking about that is that the whole field of AI is only a few decades old.

Holden Karnofsky: It’s only like 64 years old, as of this recording. And so if you imagine… And we’ve gone from these computers that could barely do anything to these image recognition models, these audio recognition models that can compete with humans in a lot of things, at least in an experimental laboratory-type setting. And so an analogy that I use at one point is the COVID pandemic. Where it’s like, it wasn’t like it happened completely overnight, but there was an early phase where you could start to see it coming. You could start to see the important trends, but there weren’t any school closures, there weren’t any full hospitals. And that’s, I think, maybe where we are right now with AI. Where you can start to think about the trends, and you can start to see where it’s all going. You haven’t started to feel it yet, but just because you haven’t started to feel it yet… I mean, a few decades is a long time.

Rob Wiblin: I think people, including me, have this intuitive sense that things happen gradually. And so you would have a technology that initially has small social impacts, and then it will have a bit more social impact than that, a bit more than that… It’ll be quite linear, in a sense. But what’s the case that we should expect AI to have quite a non-linear impact on society, where it goes from like nothing to a lot, faster than you would expect?

Holden Karnofsky: Not overnight, but maybe over the course of decades. The case would just be that that’s what accelerating economic growth looks like. There’s this report by David Roodman, the Modeling the Human Trajectory report. A lot of what I was interested in at the time was this Robin Hanson analysis, a sum of exponentials, where he’s trying to just look at all of economic history and extrapolate where it’s going next. And I wasn’t really convinced it had been done in the best way. And David’s a person who I think is really smart about this sort of thing. And so I just said, “David, if I just handed you this economy…” And I was just like, “I don’t want you to think about AI.”

Holden Karnofsky: “I’m just handing you this economy and saying like, where’s this going? Where would you say this goes next?” And he just did this, in some ways, I mean, there’s a lot of math in there, but it’s a conceptually pretty simple extrapolation that says, “This thing just accelerates. It gets faster and faster and goes to infinity by the end of the century.” It can’t literally go to infinity, but it goes to wherever the limits are, or hits some other bottleneck. And we’re in just this temporary period where it’s slowed down for what’s a heartbeat, in context, if you look at the whole history. We’re on a temporary slowdown and it could accelerate again. So the case is that we are used to what’s called constant exponential economic growth.

Holden Karnofsky: And that feels like it feels. And accelerating economic growth feels more explosive, and chaotic, and would be faster. And all it would take is getting back on trend. I’m not saying that I hope we get back on trend; I’m just saying, if we get back on trend, then things will just move in a really crazy, fast way. And I have a chart in the second piece in the series called The Duplicator that just has the projection of what accelerating versus constant growth looks like from here. And it’s just a line going straight and a line just going to the moon.
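To make that contrast concrete, here is a minimal sketch of the two dynamics being described. It is not Roodman’s actual model (his is a carefully fitted stochastic version of this idea), and every parameter below is an illustrative assumption; the point is just the shape of the curves: steady exponential growth compounds calmly, while growth whose rate feeds back on the size of the economy blows up in finite time.

```python
# A toy illustration (not Roodman's actual model): constant exponential growth
# vs. growth whose rate increases with the size of the economy. All parameter
# values are made up for illustration.

def constant_growth(y0, rate, years):
    """Economy growing at a fixed percentage rate (e.g. ~2% per year)."""
    y = y0
    for _ in range(years):
        y *= 1 + rate
    return y

def accelerating_growth(y0, rate0, feedback, years, step=0.01):
    """Toy accelerating growth: dY/dt = rate0 * Y**(1 + feedback).
    With feedback > 0, the growth rate rises as the economy grows,
    so output heads toward a finite-time blowup instead of a steady climb."""
    y, t = y0, 0.0
    while t < years:
        y += rate0 * y ** (1 + feedback) * step
        t += step
        if y > 1e12:  # treat this as "off to the moon" and stop
            return float("inf")
    return y

print("Constant 2% growth over 80 years:", round(constant_growth(1.0, 0.02, 80), 1), "x")
print("Accelerating growth over 80 years:", accelerating_growth(1.0, 0.02, 1.0, 80), "x")
```

Under these made-up numbers the first line compounds to roughly 5x, while the second goes off the chart well before the 80 years are up, which is the “line going straight versus line going to the moon” picture.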

Rob Wiblin: Looking at it another slightly more concrete way, when we look at things that are changing the world… So some things do just change linearly, but then there’s others, I guess most clearly with COVID, it was spreading at an exponential rate. So you get this very sudden shift from the rates are very low to the rates are very high. And so we have to change all our behavior. I guess, in the economy, you can get other cases where something is more expensive than the alternative. And then very quickly it becomes cheaper than the alternative because it’s gone… I guess with solar panels you see this, that it used to be 100 times more expensive than the alternative, then it’s competitive, and then quite quickly it falls below.

Rob Wiblin: And then in other systems, you have this example of water boiling, where there’s this underlying factor, the temperature, that keeps rising. But if what you’re thinking about, what you’re focusing on, is the water bubbling, then as the water gets hotter and hotter and hotter, there’s no bubbling at all, until very suddenly it starts bubbling a lot. And that’s because you’re tracking this surface property of it, I guess, rather than the underlying trend that will actually drive and cause the effect. Do you want to discuss how those analogies might apply to the artificial intelligence case?

Holden Karnofsky: Well, I think the thing you said about the cost comparison seems right. I mean the question is, when does an AI version of something become better than hiring someone to do it? And I don’t expect that to happen all at once for every task, but just because humans are the underlying thing powering the whole economy and powering innovation, you can get an especially explosive dynamic, and you can get back to accelerating growth at the point where certain key things that humans are doing, particularly innovation, can be done by AIs. And so the picture is: at one point, there are certain things humans do that no AI can do, no matter how you go about it. At another point maybe you could get one AI to do it, but it would take all the money you have.

Holden Karnofsky: At another point it actually becomes, I’d rather just build more servers than hire more people. And it’s not instant, it’s not overnight. As you get closer to the threshold, you start to see signs. That’s true of water boiling and that’s true of COVID, but the claim is not that this is going to happen overnight. The claim is that this is going to happen faster than it feels like it’s going to happen.

Holden Karnofsky: I think it’s also good to talk a little bit about exponential growth for a second, because I think this is actually a bit of a misconception. It’s true that exponential growth is hard to imagine, and it often moves faster than you think. And that is important when we talk about how the rate of economic growth today would not be able to be sustained for more than another few thousand years. However, today’s rate of exponential growth is not going anywhere that crazy this century, in my opinion, if it just stays where it is. 2% growth for the rest of the century, what is that?

Holden Karnofsky: That would mean our economy would be something like 4x the size at the end of the century. It’s not going anywhere that crazy. And I think some people will make the argument, and I think you can even kind of get this vibe from, I don’t know if it’s from Kurzweil directly, but from people quoting him, that, “Oh, well the growth we’re on right now, it doesn’t feel like it, but it’s going to infinity.” Well, eventually it is, but not this century. The growth rate we’re on right now is actually just not that scary. It’s not.

Rob Wiblin: I just calculated it, and it would be seven times bigger by the end of the century.

Holden Karnofsky: Okay, yeah. So it’s definitely going somewhere, but it’s not that crazy. And so the claim is different. The claim is that there is a different dynamic called accelerating growth, and we may or may not get back on it. And we don’t know if we will, and this is not something I’m confident in. And if we’re not able as an economy to create the things that do what humans do on a larger scale, then we just keep on this pace for longer. And if we get on the accelerating growth trajectory, then all of a sudden everything moves way faster. Not overnight, but way faster.
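For readers who want to check the compounding themselves, here is a quick back-of-the-envelope version of this exchange. The exact multiple depends on what growth rate and how many remaining years you assume, which is why the two figures quoted above differ; both assumptions below are only illustrative.

```python
# Compound growth to the end of the century under a few assumed rates.
# The number of remaining years is an assumption (roughly 2021 to 2100).

years_left = 2100 - 2021  # about 79 years

for rate in (0.02, 0.025, 0.03):
    factor = (1 + rate) ** years_left
    print(f"{rate:.1%} annual growth for {years_left} years -> ~{factor:.1f}x today's economy")

# Roughly: 2% compounds to ~5x, 2.5% to ~7x, 3% to ~10x. Noticeable, but
# nothing like the accelerating-growth trajectory discussed above.
```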

Rob Wiblin: So it seems like this might be a really recurring crucial consideration, whether we should expect growth to stay at roughly the same rate or even slow down, or whether we should expect it to accelerate in the way that it has over longer timescales. I might link to a couple of different pieces on that. You talk about it in the second piece in this series, about the duplication of people. There’s also a couple of other introductions, one on Slate Star Codex that you’ve referred to. And I guess of course, for people who really want to go deep into it, there’s the David Roodman long report for Open Phil on this macro history economic growth topic. So is there anything else you’d want to refer people to?

Holden Karnofsky: The Duplicator piece was my best shot at explaining it, with help from Maria on your team with some great illustrations, and it links to three other things. Two of them were Open Phil reports (1, 2), and one of them is a Slate Star Codex piece. So that was where I was getting it. It’s a very standard economic theory thing, but in terms of accessible explanations, that’s what I got.

Rob Wiblin: The Slate Star Codex piece is The Year the Singularity Was Canceled, right?

Holden Karnofsky: Yeah.

Empirical evidence [01:34:14]

Rob Wiblin: So setting aside this uninformed prior issue, what are the best pieces of empirical real-world evidence that we can get on how we should expect AI to progress, in your view?

Holden Karnofsky: Frankly, a lot of my series is about anticipating ways you might think we’d be able to forecast AI, or ways you might think we’d be confident it’s not coming, and trying to undermine that. Because I don’t believe that stuff really holds up too great. So a lot of it is about priors and the burden of proof. And then a lot of it is about, well it doesn’t really feel like it’s coming, so what does that mean? There’s not a ton to go on in forecasting AI. I think that itself is very scary. And the big point I would cite is that we’ve barely started trying, in some sense. You look at civilization on a thousands-of-years scale or a billions-of-years scale.

Holden Karnofsky: And the first computer that was a real computer was from 1945 or something; the first computer ever was from the 19th century. Which is also not that long ago, but it couldn’t really do anything. And yeah, I think the Semi-informative priors over AI timelines report by Tom Davidson is one way of looking at it — we just haven’t been trying that long. And so it really could be soon. And once you get into that headspace of “we really have nothing to go on,” then what do you have to go on? You have expert opinions. So you have the survey by Katja Grace and others, which is basically pointing to similar timelines to the ones I am, although there are various signs that the experts didn’t seem to be thinking very hard about the questions, such as giving much longer timelines in response to what seems to be a very similar, differently worded prompt.

Holden Karnofsky: And then there’s Ajeya’s report on biological anchors. I’m not going to go into a lot of detail on that, but I will really quickly say a couple of things, because you just had her on the podcast. One, I think her report’s amazing. It’s over 150 pages and it’s pretty technical. One of the pieces in the series is just summarizing it, and that’s all it is: a layperson’s, 15-page-or-so summary of Ajeya’s report. You could go into all the frameworks and all the ways that it’s too aggressive, too conservative, and all the considerations, and I think you did that with Ajeya. The high-level point I would make is that we have never trained an AI model, a deep learning model or something like that, that has computational power comparable to the human brain, or even 1% of the human brain; it’s never been done yet.

Holden Karnofsky: I think it’s like really, really recently that we saw the first model that is even getting within range of a mouse brain, but in the coming century, according to the compute projections that Ajeya’s using, which I think could be improved for sure, we will not only see the economic ability to train a human brain-size model, we’ll see the economic ability to train much, much more than a human brain-size model. To train a human brain-size model doing tasks that take 1,000 times or more as much compute and as much effort as the tasks that the language models are doing. And you can add quite a lot of slop, and you can even start saying when will it be affordable to do as much computation as all of the history of evolution did with all the brains that have ever existed.

Holden Karnofsky: And that also tentatively looks like it’s late this century, by the very rough estimate. So you’re running these estimates, and we’ve never come within range of the compute we need. But in this century, we will see way more compute than you would come up with on most frameworks of what it might take to train a human brain-size model. And that doesn’t close the argument. You then have to talk about, well, is it enough to have a lot of compute? Do you need to have the right software, the right algorithms, the right training environments, the right goals? But a century is also a long time to figure all that out. And so that’s the basic intuition. And you get into the details in the technical report, but the basic intuition is, this could be the biggest thing that ever happens. And it really seems like this century is a very good candidate for it to happen.
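To give a rough sense of the scale comparisons the biological anchors framework is making, here is a back-of-the-envelope sketch. The figures are round orders of magnitude of commonly cited public estimates (GPT-3’s reported training compute, a common brain FLOP/s estimate, and a rough “all of evolution” figure), not the report’s actual numbers, so treat them as placeholders.

```python
# Rough orders of magnitude only; see Ajeya Cotra's report for the real estimates.

GPT3_TRAINING_FLOP    = 3e23   # reported training compute for GPT-3
BRAIN_FLOP_PER_SEC    = 1e15   # a commonly cited rough estimate for the human brain
LIFETIME_SECONDS      = 1e9    # roughly 30 years of experience
EVOLUTION_ANCHOR_FLOP = 1e41   # rough "all neural computation over evolutionary history"

lifetime_anchor = BRAIN_FLOP_PER_SEC * LIFETIME_SECONDS  # ~1e24 FLOP

print(f"GPT-3 training run:          ~{GPT3_TRAINING_FLOP:.0e} FLOP")
print(f"'Human lifetime' anchor:     ~{lifetime_anchor:.0e} FLOP")
print(f"'Evolution' anchor:          ~{EVOLUTION_ANCHOR_FLOP:.0e} FLOP")
print(f"Gap from GPT-3 to evolution: ~{EVOLUTION_ANCHOR_FLOP / GPT3_TRAINING_FLOP:.0e}x")
```

The point of the exercise is not the precise values; it is that today’s largest training runs sit many orders of magnitude below the larger anchors, and the report’s question is when compute trends could plausibly close that gap.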

Rob Wiblin: Just on the point about how we would have the ability to do more computation than all of evolution has ever done, is it possible that we just would never get there, because we’ll run up against physical limits of what chips can do? Do you know where those estimates are coming from?

Holden Karnofsky: The computation projections are one of the areas of the model that most needs improvement. I would hope that sometime in the next year or two we’ll see just a much better version of that, because it’s a nice well-defined problem that someone could work on. And I hope someone does. So right now it’s just this very simple mathematical extrapolation. It’s not assuming Moore’s law goes on forever, but it’s just a simple mathematical function that’s extrapolating it out. And what you’d really want to do is, you’d really want to say, “Okay, how long can Moore’s law continue in its current form?” Probably not that much longer. Moore’s law is the number of transistors on a chip going up over time.

Holden Karnofsky: And then it’s like, what else could happen that could speed up compute? And there’s a bunch of stuff that’s about deep learning specifically, and optimizing the chips better for the things that today’s AI systems are doing. And then there’s a bunch of wildcard possibilities, like optical computing and quantum computing that could drastically increase compute. So the picture does get pretty complicated. It’s one of these things… My informal sense is that these simple mathematical curves are not going to look wildly off after the projections have been done right. But yeah, I totally admit that the projections are not what they could be.

Holden Karnofsky: But I will say something important, which is that I don’t think anyone has done the projections the right way. So I think when people are saying, “Well, I think this is all too crazy,” they’re not saying, “I did the projections, and they’re not happening.” That’s not what’s happening there. So I think someone needs to figure this out.
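For what it’s worth, here is a minimal sketch of the kind of simple extrapolation being described: project price-performance forward at some assumed improvement rate, multiply by an assumed spending budget, and ask when a hypothetical compute target becomes affordable. Every number below is an assumption invented for illustration (the real projections do not assume a single constant doubling rate), so the output only shows the shape of the exercise.

```python
# Illustrative extrapolation; all parameters are made-up assumptions.

FLOP_PER_DOLLAR_START = 1e17   # assumed price-performance at the start year
DOUBLING_TIME_YEARS   = 2.5    # assumed doubling time for FLOP per dollar
MAX_SPEND_DOLLARS     = 1e9    # assumed budget for one training run
TARGET_TRAINING_FLOP  = 1e32   # hypothetical compute needed for transformative AI

for year in range(2021, 2101):
    flop_per_dollar = FLOP_PER_DOLLAR_START * 2 ** ((year - 2021) / DOUBLING_TIME_YEARS)
    affordable_flop = flop_per_dollar * MAX_SPEND_DOLLARS
    if affordable_flop >= TARGET_TRAINING_FLOP:
        print(f"Under these assumptions, the target becomes affordable around {year}.")
        break
else:
    print("Under these assumptions, the target is not affordable this century.")
```

Changing any of these assumptions, for example a slower doubling time once Moore’s law in its current form runs out, moves the crossover year substantially, which is exactly why the projections need to be done more carefully.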

Rob Wiblin: I get the impression that people in broader society, who have jobs to do and lives to live, aren’t paying a ton of attention to all of the news on actual AI capabilities. They’re really underestimating the degree to which we’re getting clarity on what AI might be able to do and when, because so many of these things aren’t science fiction anymore. There actually are applications that would have amazed people 10 or 20 years ago.

Rob Wiblin: And that means we might be able to begin to expect a bunch of stuff in the coming decades that’s just natural extensions of what already exists, even if we just do a more linear projection-forward approach. I guess that relies to some degree on my subjective impression of how impressive the things these new language models and image models can do are, which is necessarily subjective. But nonetheless, I find that what seems to me to be a disconnect between the reality of what AI can do and what people perceive it as being capable of doing can be quite frustrating. Do you share this general perception?

Holden Karnofsky: Sort of. I think you’re right to point out that it’s your subjective thing. This is actually a topic that I’ve found surprisingly difficult to talk with people about, because two people will look at an AI system and one will be like, “Oh my God, that thing is amazing. I didn’t know we were making that kind of progress.” And another person would be like, “What is this? This is nothing. This is trivial. This AI, this is not really reasoning. It’s just using this pattern.” Which… That’s a thing that people say a lot. But I mean, I think you could say that about humans as well. I think most of the time, even when I feel like I really understand something, I’m usually doing some pattern recognition, and I realize I don’t understand it as soon as it changes a little bit.

Holden Karnofsky: So I think one of the things that I also talk about in this series is this idea of looking at how impressive the system seems to you, and projecting that forward. And I think it’s just dicey. I watch AI experts talk about this stuff, and it feels like there’s no rhyme or reason to it. I remember one case of a person who was like, “Oh, we’ll never get there. AIs can’t do this. They can’t do that. They can’t do this. They can’t do that.” And then all of a sudden there was some paper that came out and this person was like, “Oh my God, oh my God, no, no, no, no, it’s coming soon.” I looked at the paper and I was like, “I don’t even know what you’re talking… What is new here??”

Holden Karnofsky: And then other people would look at it and be unimpressed. And it’s very hard to pin this stuff down. You can look at what GPT-3 is doing… I think in some ways it’s very impressive. In some ways it’s not. It’s able to continue stories that people start, it’s able to imitate their tone. It’s able to answer questions. It’s able to pass a lot of these language tests that people used to think would be the really hard ones. That seems to be a lot for one model trained in a simple way, but you could break it all day with all kinds of weaknesses and weird stuff that it does.

Holden Karnofsky: And I’ve tried to not make the argument rely on that. I think you can step back from that stuff. And you can say, whether or not things seem impressive to me right now, the amount of computation that’s going to be available is rising dramatically, investment in the field is rising dramatically, research is rising dramatically. It’s very hard to know what’s going to happen, but we’re passing a lot of key milestones this century, and who knows where that’s going.

Rob Wiblin: Is there some kind of deep, underlying philosophical reason driving the fact that we just can’t evaluate how impressive different capabilities are? It’s quite surprising, right? Because when you have different physical systems, you’re like, “That’s bigger, and that’s capable of doing more impressive stuff.” But with this, people just completely disagree about how difficult a task is. I mean, maybe that just says something about the human perspective more than any reality about information processing. But I don’t know.

Holden Karnofsky: I mean, one way to think about it is we don’t know how to describe the goal, because the goal is to do everything we can do. And it’s hard to describe the core of what you’re able to do. And so I don’t know. I actually have no idea. I don’t know what’s going on, but I think it’s a very hard and confusing topic to talk about. So is consciousness. And we just have to do our best.

Key messages of the series [01:43:16]

Rob Wiblin: Having sampled some of the key points from the series, there is a substantial amount of work in there. And if people want to dive in, we haven’t justified everything we’ve said here. It has just been a sample platter. You can find the rest at cold-takes.com. But yeah, just to help us with the next section, could you maybe recap the key messages that you’d want people to remember after having read it?

Holden Karnofsky: So there’s this diagram I use over and over again in the series that might be good to insert here, because that’s how I tried to make the whole thing follow-able.

Holden Karnofsky: So basically there’s a few key claims made in the series. So one is that eventually we could have this galaxy-spanning civilization that has this high degree of stability and this digital nature that is deeply unfamiliar from today’s perspective. So that’s claim number one. And I think claim number one, I mean, different people have different intuitions, but if you think we have 100,000 years to get to that kind of technology, I think a lot of people would find that pretty plausible. And that already is pretty wild, because that means that we’re among the earliest intelligent life in the galaxy. Claim two is that this could happen much more quickly than you might imagine, because we’re all used to constant growth, but a lot of our history is accelerating growth.

Holden Karnofsky: And if we changed from constant growth to accelerating growth via the ability to duplicate or automate or copy the things humans do to move the economy forward, then 100,000 years could become 10 years, could become 1 year. So that’s claim two. And then claim three is that there’s a specific way that that automation might take place via AI. And that when we try to estimate that and try to forecast it, it looks like all the estimation methods we have and all the best guesses we can make, and they’re so far from perfect, but they do point to this century, and they actually tend to point to a fair amount sooner than the end of the century. And so that’s claim three.

Holden Karnofsky: And so when you put those three together, we’re going to a crazy place. We can actually get there quickly if we have the right kind of tech, and the right kind of tech could be coming this century. And therefore it’s a crazy century. The final piece of the puzzle is just, gosh, that all sounds too crazy. And a lot of the series is just trying to point out that we live in a crazy time, and it’s not too hard to see it, just by looking at charts of economic growth, by looking at timelines of interesting events that have happened in the history of the galaxy and the planet.

Could we be living in a simulation? [01:45:42]

Rob Wiblin: So a response that a few people have had to this idea that we’re living in this extremely bizarrely uniquely important time in history would be, this has to be an illusion somehow. And that somehow could be that we’re in a computer simulation where people are studying this particular interesting part of history, given that you’re clearly a believer in the idea that we could run simulations, could have digital people. Does this seem like a live possibility to you?

Holden Karnofsky: The simulation thing, I think, is a place where I do seem to have different intuitions from other people. I think a lot of people hear, “Maybe we’re in a simulation” and they’re just like, “That’s too crazy. That’s just too crazy.” And I don’t know, I mean, to me, it doesn’t necessarily seem that crazy. If it’s true, it doesn’t necessarily change anything about our actions. And I think a lot of times, the underlying reality we’re living in is a lot weirder and crazier than we were imagining. Or at least it doesn’t speak the same language we thought it was speaking. But the implications are just all the same stuff we see, and nothing really changes. Quantum mechanics is a good example of that, where it’s just the way that physics actually behaves is just very alien to us.

Holden Karnofsky: And it’s very hard to describe in our language. And every time you try to slap your concepts on it, it slips away. And every weird thing you can think of actually happens in quantum mechanics, but “It all adds up to normality,” is a phrase Eliezer Yudkowsky has used. It’s like, “Well, we’re here. And none of this means that a flying spaghetti monster is about to attack you. It just means that the root of it, underneath everything you see, is weirder than you think.” So I just don’t find it that weird to think we’re in a simulation, because it doesn’t necessarily change anything. I don’t think it’s that wild. In some ways I’m thinking, “Well, there’s two ways the world could be. One is that there’s these laws of physics and they operate randomly and they generated us, exactly us.”

Holden Karnofsky: “And another way is that there’s something else going on, and someone or something else had some reason to create something like us.” And the second one, most things fitting that description would be a simulation. And I’m just like, “I don’t really see why the first one is so much more likely than the second.” So I don’t find it that wacky. And I think in some ways, I had a little trouble taking the possibility that we’re in the most important century seriously when people wouldn’t engage with that possibility and acknowledge it and talk about what it means. So I think I had a lot of conversations where someone would be like, “Well, I think we’re going to build this AI that does this and that. And then we’re going to end up spanning the galaxy. And so we need to act now, or we won’t span the galaxy. Or we will, but it’ll be bad.”

Holden Karnofsky: And I’d be like, “Okay, but this is all very dramatic. This is all a lot. Does this make you think that maybe we’re in a simulation somehow? That somehow there’s a reason we’re in this high-leverage position?” And they’d be like, “Simulation? That’s so crazy.” And I don’t understand that. I don’t understand that kind of thinking. And so early in the process of grappling with these ideas, it was very important to me to at least do a little bit of thinking about the simulation stuff. And Ajeya talked about that in her podcast and stuff she did, and I won’t repeat it, but I will say it was just important to me to think about it a little bit and be like, “Is this a possibility? Would it radically change anything?”

Holden Karnofsky: And our conclusion was like, “Look, you could make the argument.” You could make the argument that if in fact the way that the universe goes is that there’s intelligent life that builds a galaxy-spanning civilization out of digital parts, then the moment — by which I mean century or millennium — in which it created that technology might be very interesting to all the creatures that live after that point. They might want to study it. They might want to see how it went. They might want to learn things about what happened. They may want to, I don’t know, relive it. And so it’s totally possible that actually this makes a lot more sense when you think, what’s more likely? That we just randomly ended up in this very weird, important, early time, or that this weird, important, early time already happened and there’s some reason that beings are very interested in rerunning it?

Holden Karnofsky: So I think that’s fundamentally reasonable, but we concluded that it doesn’t really change much. It doesn’t change nothing. It changes some of the numbers, the numbers you see in the essay Astronomical Waste, those numbers change, but it doesn’t really change the bottom line. It doesn’t change what we should be doing, or what matters, in any way that we were easily able to see. And I don’t really think this is a topic that is worth a lot of deep investigation, or needs a big deep dive, but it was something I wanted to look at. I don’t think it should just be tossed out. I don’t think it ends up mattering very much, but it was important to me to think that piece of it through.

Rob Wiblin: I think you’re in a very small group of people who were driven to take all of this a whole lot more seriously by the simulation argument. That’s where you got on board, rather than that’s where you got off board.

Holden Karnofsky: For me it was more like I had trouble taking it seriously until that thing was acknowledged and grappled with a little bit. But yeah that’s right.

Ways this could end up not being the most important century [01:50:24]

Rob Wiblin: Let’s turn now to what you think are the biggest weaknesses or ways that this whole worldview — that seems so plausible on its face, at least to you and me — how it could end up being wrong. What do you have to say to that?

Holden Karnofsky: Yeah, so the most important century hypothesis is kind of me saying something is likely enough that we should take it super seriously. And so there’s a bunch of different things you could mean by, “How could you be wrong?”

Holden Karnofsky: One thing you could mean is, how could it turn out that this isn’t the most important century, and none of this stuff happens? And for that one, I don’t know, there’s a zillion ways, and all I’m even saying is that it’s reasonably likely. There’s another thing you might mean, which is, are there investigations you could do today that when you are done, might change your mind and make you think that this is not a good bet? That this is not likely enough? Then another thing you could mean is, well, without doing further investigation, what are the best things people say that are just objections, that might be right? And then, a final thing — which I actually find maybe the most interesting — is just like, how could it be that this whole idea, and this whole vibe, is just the wrong thing to be thinking about? Even if the predictions are true, maybe it’s just the wrong thing to be thinking about, or just, “Holden is leading people off the wrong cliff here.” Or off some cliff, probably most cliffs that you’d go off are the wrong cliff. How could that be? Those are different things, so which one do you want to tackle?

Rob Wiblin: I love it. Let’s do all of them. Maybe we can do them in order.

Holden Karnofsky: Alright.

Rob Wiblin: So the first one was just ways that this could end up not being the most important century. I suppose let’s maybe modify that to not the most important 1,000 or 10,000 years, because otherwise the answer might be boring that it just happens in 200 years. What’s the way that this could not be the most important millennium?

Holden Karnofsky: We’ve made all of these guesses and projections about what kind of computational power we think is roughly equivalent to a human brain, and how much it would take to train an AI that size, and whether an AI that size would be able to do certain things, and whether those things would have certain implications. It’s all guesswork. Any piece of that could be wrong. You’d have to have a piece be really wrong, and then not have other pieces erring in the other direction to offset it. I don’t think this is a super conjunctive hypothesis. I don’t think you have to believe 10 independent things to believe this. We’ve got our best-guess estimates. Every number in there could be too high or too low, but we could have missed overall in either direction.

Holden Karnofsky: Just to rattle some stuff off: the human brain could be a lot more complex than we think, and it could just require an unattainable amount of compute to automate it. Or maybe it’s roughly what we think, but we’re only guessing that it’s going to become affordable to get that much compute, so maybe it just never will. We’ll hit the physical limits of making chips faster the normal way, and we won’t come up with a new way of making them faster.

Holden Karnofsky: We could choose never to do all this stuff. We could kind of, as a society, say, “We’re regulating this stuff to death. We don’t ever want it.” I think a lot of people would think that is the right thing to do, and that is certainly something that could happen. Though, I think that itself would be a… Grappling with the possibility of the most important century would be an important way to think about whether you do want that to happen. Those are all things.

Holden Karnofsky: And then of course, you could automate science and technology, but it could turn out that the scientific breakthroughs that… There’s not that many really transformative ones left, and the digital people thing doesn’t work, or it does work, but it doesn’t have the consequences I say.

Rob Wiblin: I see. You’re saying maybe the reason why it does just seem so plausible that we are living in the most important century, even though any of these specific claims is very dubious, is just that there are so many different channels by which you could get a massive transformation. You aren’t banking on any particular claim about some technology, like this one thing is going to arrive at this one point in time. Do you agree with that? Or maybe I guess it is somewhat centered on the idea of, I guess being able to digitize minds.

Holden Karnofsky: I think it’s somewhat contingent. I somewhat agree with you. It’s certainly true that there’s a lot of stuff that might not be just what I’m describing, but might be next to it that happens instead. And that does bring up the probability. I’ve described a lot of specific ideas, digital people, PASTA, which is this process for automating science and technological advancement. I’ve described those to help people think about them, help them be concrete.

Holden Karnofsky: And I might miss by a little bit, or even a moderate amount, but then there’s something else that is like that. It’s not true that there’s like a kazillion ways I could… There are many ways I haven’t thought of, but if you were to say to me, “Holden, there’s not going to be anything like PASTA this century. We’re not going to have AI that can automate this stuff.” I’d be like, “Well, I can still think of ways this could be the most important century, but it’s way, way, way less likely now.” You know?

Rob Wiblin: Yeah.

Holden Karnofsky: And I think that… Yeah, so I think there are specific things. And then if you said digital people will never be possible, I think I would then be like, “Well, I would still guess that the future is about as crazy as it would be with digital people.” But now I’m down to like 50/50 or something.

Rob Wiblin: I suppose if you’d take digital minds off the table as a possibility completely, maybe you could fall back on something like brain-computer interfaces, where we enhance human ability to think or reason, using machines in some way that could then be a significant step towards accelerating economic growth again.

Holden Karnofsky: Exactly. Yeah, brain-computer interfaces is another way it can happen. And that one I just think is less likely, because I think that requires a lot of progress in neuroscience, which I think is moving a lot more slowly than AI. But is it possible this century? Yes. Would I have the same fire behind, “Gosh, everyone wake up. It’s the most important century.” Yeah, probably not.

Investigations Holden would love to do [01:55:48]

Rob Wiblin: Okay, let’s move on to the second category of ways this could be wrong, which is what investigations you would love to do. I suppose, if you had a duplicator machine and could make many more copies of yourself and your whole research team, what uncertainties would you like to resolve?

Holden Karnofsky: So things I could imagine doing… Maybe if you did like a really great study of animal brains, or animal behavior, rather, and then compared it to ML systems, you would just be like, “Wow, these animals, they actually are basically doing these tasks that ML researchers have been trying really hard to do. And they really should be able to do them if the frameworks we’re using for brain equivalents and things like that are right. And they’re really not. And so there’s something really big we’re missing here. It’s like maybe brains are doing a lot more computation than we’ve estimated, or they’re doing something really sophisticated that we can’t match with the amount of computation we thought it would take.” Yeah, if you found out that there’s some big challenge in AI that the big models of today are nowhere near and honeybees are nailing it, that would definitely shift me a lot.

Holden Karnofsky: A thing that would really change my mind a lot is if we did the better compute projections — I’m expecting that to come out to similar conclusions to what we have at the moment; different, but similar — but maybe we did them and it was just like, “Nope, 10 more years of the current rate of compute getting cheaper, and then we’re done.” And there’s nothing else coming down the pike, nothing else is going to work out. Everything else is a crazy long shot, even for the rest of the century, quantum computing, very unlikely. And then we’re kind of like, “Okay, well that’s not even going to get us to be able to train a human-sized model at all.” Or maybe it is. I don’t know. But it’s certainly not leaving a lot of space.

Holden Karnofsky: Again, would I then say there’s no way this is happening? No, but I think there is a certain kind of reality to it. There’s a certain kind of, “This is our best guess. This is the best we could do, is that it’s now, it’s this century.” And that would go away. And then you could fall back on the philosophical case and say, “Well, even a very small chance.” And I find that less compelling, but still somewhat compelling, so I probably wouldn’t do nothing on it, but I would have a lot less energy for it, and a lot less like, “Hey, wake up. It’s the most important century” vibe at least.

Holden Karnofsky: Those are investigations that I could imagine. I could sort of imagine just like a sophisticated enough argument between AI researchers resulting in, actually these tasks that humans do are just impossible to build a training environment for, and they’ve never been approached by AI systems and never will be, or something. So I can imagine that stuff.

Best objections [01:58:10]

Rob Wiblin: What about the third category, which is the best objections that people raise when presented with this worldview?

Holden Karnofsky: I think there’s a lot in the series that reasonable people can disagree on, obviously. The biological anchors framework, which is doing most of the work of the specific estimates of when PASTA, the advanced AI, is coming, that has a ton of guesswork in it, and there’s plenty of places it acknowledges that, and the report itself discusses it.

Holden Karnofsky: And this is a thing we always do at Open Philanthropy: our reports are always discussing all the ways they could be wrong, and the report has a nice big list of ways it could be overaggressive. The big one is that it’s really focused on compute. It’s really saying, “When you can afford to build and train a really big AI model, and you’ve had a few decades to work out a lot of the missing software and a lot of the missing training environments, then we’re getting into high probabilities.” And maybe that’s false. Maybe compute is not the bottleneck; maybe we really need fundamentally different kinds of algorithms, and there’s a way to know that now, if you were more sophisticated about it. I think it’s a big point of disagreement among AI researchers. And the further thing you have to believe is that a few decades won’t be enough to resolve that, even with a greatly increased amount of talent and money going into AI.

Holden Karnofsky: There’s these ways that it’s making a lot of assumptions. It’s making guesses about what’s going to happen to compute and what’s… And you can have different guesses. I will say here I think that you could definitely have different probabilities than me, but I start to have trouble with the probabilities going too low. Based on what we know, you could have different guesses from me, but I think the guesses would still leave more than a 10% chance or something of this being this super important century.

Holden Karnofsky: And then another thing you could do to disagree with me is you could just have a very different interpretation of the social epistemological environment, the fact that there’s no robust expert consensus. I have a whole piece about this where I talk at length about the idea that climate change… Climate change has some things in common with this hypothesis in the sense of like, you’ve got these very complex guesswork-laden models that are projecting that something terrible is going to happen. Bad things are starting to happen now, but they’re not really anywhere in range of the terrible things people are saying are going to happen. It’s many years out. But a lot of people trust the climate change worries and don’t trust these worries. And why is that? And I think a lot of that is because there’s a really well-developed field of climatology. It’s a really well-developed field, and there’s a lot of consensus in that field. And so it feels like there are real experts and they really believe in this.

Holden Karnofsky: And that is, I will just own it, that is not the case for this AI stuff. There is no field of AI-tology. Or to the extent there’s a field, it’s a few people writing reports. It hasn’t got the maturity of climate science.

Rob Wiblin: It’s not the UN convening a huge meeting and bringing every country together.

Holden Karnofsky: Yeah, exactly. It’s not in the same place. It’s like a lot of people just look around themselves and they’re like, “I don’t see all the experts worrying about this. I don’t see a big consensus, and I need that to believe something crazy like this.” And that is something that you could… I don’t think it’s crazy to have that attitude. I disagree with it.

Holden Karnofsky: And I especially disagree with it if you are an ambitious effective altruist who wants to be the kind of person who can be early, be the kind of person who can see things before others see them, and act before others act, and take your biggest opportunities to have an outsized impact. I think it’s going to be… You can’t really run that at the same time as you’re running “Well, I don’t really get out of bed until it’s known, until the experts are all there and there’s a mature field, and they’re all in agreement.” But I don’t think it’s crazy. I don’t think it’s crazy to take that view.

Rob Wiblin: Because by then it’s too late?

Holden Karnofsky: It’s not. It’s not too late for climate change, so it’s not that it’s too late to do anything, but I think you’ve missed a lot of the… You’ve missed maybe your best opportunities to make a difference by being super early. And so yeah, I think that’s something you don’t want to do. Something I don’t want to do, anyway.

Rob Wiblin: Do you have a theory for why it is that there isn’t more of a consensus around these broad concerns? I guess my reaction would be that it takes time for people to change their mind, especially about something that’s as peculiar as this, like making these kinds of forecasts. Even if they’re sound, you wouldn’t expect people to change their mind overnight.

Holden Karnofsky: Yeah, that’s right.

Rob Wiblin: And it’s important to look a bit at the trajectory of how many people take this worldview very seriously. And the trajectory seems to be robustly upwards. Yeah, not everyone finds that persuasive.

Holden Karnofsky: I do feel that way. I think if you just look at climate science versus AI forecasting, it’s just like…one of them is a field. People have PhDs in climate science, or climatology. There are universities running programs. Just the whole process of getting to the point where there could be a field of forecasting stuff like this, that just like… People started talking about this stuff not that long ago. And it would take a long time. I think it will happen, may happen… Well, it may not have time to happen before all the crazy stuff happens, but by default, I think it will. It’s on track to happen, but it hasn’t happened yet.

Holden Karnofsky: And so I think just for people to… The first people who start talking about something, they have to get to the point where it could be a study, a program of study, and other people know how to study it. And then people who are coming up in their career can get degrees in it. And just all that stuff takes a really long time to play out.

Rob Wiblin: There are smart people at Open Phil and elsewhere who have heard most of these arguments, but they don’t find themselves…they don’t find that their actions end up being very much guided by this worldview. What are they thinking other than just, different people want to make different bets? What do people who are skeptical of this think?

Holden Karnofsky: Well, I think different people want to make different bets is a lot of it, especially at Open Phil. And I think that’s a really key part of it for me, is I really… I think there’s many people who would be totally reasonable to ignore this, because they are in a position where it would be really hard and costly — and not in tune with what they’re good at — to be thinking about the most important century. And they’re doing something else, and they’re amazing at it, and they love it. And it’s always hard to know what benefits you might get from just being really good at something else.

Holden Karnofsky: I’m really into worldview diversification, which I know you talked about with Alexander, and division of labor, and having the EA community not all be all in on one bet, having Open Philanthropy not be all in on one bet. That’s a big part of it for me.

Holden Karnofsky: The other thing I would say is, what is the case against spending your life worrying about the most important century, or spending your time on it? And I think there is a pretty good case against, which is that the ‘so what’ is under-baked right now, it’s underdeveloped. It’s not zero. There is stuff that I think people should be doing and should be doing more of.

Holden Karnofsky: The closest thing to something really robustly good might be technical research on the AI alignment problem. How do you build an AI that is doing what you thought it was going to do, instead of doing some other random thing, and maybe building a galaxy-scale civilization around some random set of values you didn’t even mean to give it? That seems like a good thing to be doing.

Holden Karnofsky: I have to be honest, we support it. We fund a lot of it. I’m really into it. I’m really pro it. But I cannot look you in the eye about any of it and say, “This really feels like it’s making us noticeably safer.” It feels like very early-stage stuff. A lot of it’s super theoretical. A lot of it’s on systems that are just not that smart yet. It’s hard for me to… The best thing I can say about my very favorite AI alignment research work is like, “Yeah, that could help, I guess. That could help. It’s good that you’re doing it.” And that makes me nervous.

Holden Karnofsky: And then there’s other stuff where it’s like, people will say, “Oh, it’s the most important century. We’ve got to build the AI as soon as possible so it happens in the U.S. instead of China.” And I’m like, “I don’t know. Hold up. Maybe that’s right. Absolutely. It’s also possible that racing to build AI, building it as fast as you can… That’s exactly how we let this whole thing spin out of control and create this galactic civilization before we’re ready, before anyone has really had a chance to have a conversation about what we want it to look like, or about how to get the AI to do what we want, or to even just not do totally random, weird stuff that ends up being very powerful stuff.”

Holden Karnofsky: I think a lot of people jump into this with, “Let’s do this, let’s do that.” And the truth is, the biggest hesitation for me in writing this series and putting it out there is just that I think we’re not at the phase where we have a lot of clarity about what to do. And that makes me nervous, because I think the proper mood for the most important century is not being excited, it’s not being afraid, it’s just being like, “Oh.” Just being stunned, and just feeling like dang, something big could be happening. It’s bigger than us. It’s above our pay grade as an entire species. It doesn’t feel like we’re ready.

Rob Wiblin: …But I don’t see how to help.

Holden Karnofsky: Yeah, exactly. I don’t see how to help. And so it’s just—

Rob Wiblin: It’s demoralizing.

Holden Karnofsky: —that’s a hard mood to convey in a series of blog posts. And that is the thing that gave me pause, and I’m trying to convey it. I’m trying to just get people to think, this thing is big, and I have not had a chance to process it. And we as a species have not had a chance to process it. And we are going to think about things we can do that could help.

Holden Karnofsky: And there’s a lot of candidates. And there’s a lot of really interesting open questions for people who are really great, self-directed researchers that I think can get us that clarity on how to make this go well. And there are a lot of ideas that I think are promising.

Holden Karnofsky: For example, before we build the galactic-scale civilization, it would be nice if we really took our time to debate, and discuss, and reflect, and think about what we want it to look like. There are things that seem to be better than other things, factors that seem to be correlated with things going well instead of badly.

Holden Karnofsky: And if we keep reasoning it through, and carefully discussing it and thinking about it, I think we will, over time, get more and more understanding of what we can do to help. But yeah, I think it’s not crazy for someone to say, “I don’t know what to do with that. And I have this other super exciting thing I’m working on that’s going to lead me to all kinds of great places. And when you come back in 10 years knowing what you want, I’ll be in a better position to help you then.” I think that’s a completely reasonable thing. And in another world I might do that.

Holden Karnofsky: I think in the world we’re in, it really matters that Alexander works at Open Philanthropy. It really matters that I think the global health and wellbeing work is going to be in great shape without needing my help. And I’m trying to be where it makes the most sense for me to be.

Not rejecting weird ideas [02:08:05]

Rob Wiblin: A general property of the way that I think, that has always been the case across the board, is just that I’m much less inclined than average to reject ideas because they seem weird. And I think that might come in part from just a contrarian instinct, or the fact that I enjoy playing around with ideas and having ideas that are different than other people’s. But I do also think, separately, that it can be justified. Because people want to say, “I don’t accept that. I don’t want to take that very seriously because it’s weird.” But people’s calibration of what is weird is just based on what they’ve already observed. And if you look at history, and at how people’s views of physics, and economics, and all these other things… And religion, how all of this has changed. People in the past would think that the things that we believe and are doing now are absolutely bizarre. And if you went and got a hunter-gatherer and brought them to the modern world, they would just be completely astonished, to the same degree probably that we would be by a future involving digital people and artificial intelligence.

Rob Wiblin: I just want to make the claim that it’s not safe to reject ideas that are weird, in physics, or in social science, or in predicting the future. Do you have any comments on that?

Holden Karnofsky: Well, I think what you’re saying is true, although I also think a lot of people really are better off doing exactly what you’re criticizing. A lot of people are going to have much better lives if they reject things that are weird, even if they seem true. There’s a lot of bad ideas out there.

Holden Karnofsky: And let’s say you’re a person, and you grow up and it’s like, you don’t want to go to school, but everyone tells you to go to school, because it’d be weird if you didn’t. You go to school. Then you don’t want to get a job, but you see everyone getting a job, so you get a job. And then a bunch of cults come knocking on your door, and what they’re saying sounds really exciting, but it’s weird, so you don’t do it. And I think you just won. I think that went really well for you.

Holden Karnofsky: And it’s true that, okay, the job you took was a weird job that people in the year 200 would not have imagined. But you did good. You did good and that worked out well for you.

Holden Karnofsky: And so I don’t think that it’s crazy to have this anti-weirdness heuristic. I really don’t. I think what is good is to evaluate it, and poke it, and think about it over time. What I’ve been trying to do with my life is think about: what are the weird things that turn out to actually be pretty reasonable? And what are the weird things that I can just dismiss out of hand? And what are the patterns in that?

Holden Karnofsky: It’s like, what kinds of people should I be taking more seriously? What kinds of introductions, and language, and styles are correlated with someone who’s about to change my mind for real, instead of someone who’s about to lead me on a wild goose chase? And I think that stuff’s really important. Honestly, I can’t say I’ve ever been a person who’s very sympathetic to, “Here’s a four-part logical argument that I delivered in five minutes, now go change what you’re doing with your life.” I’ve really never been a fan of that. And you can see the way that I’ve done things in my career is I’ve always wanted to do my homework before I make a big bet.

Holden Karnofsky: And maybe you’re one of those people. I don’t think I am. Maybe you’re one of those people who’s just like, you just have awesome intuitions. And you’re blessed. And when things make sense to you in five minutes, that’s just because they make sense. And if something is stupid, you’ll see the problem with it immediately. And you just have this gift, and that’s great. But I don’t think I have that gift. And I also wouldn’t know initially if I did.

Holden Karnofsky: I’m into watching as you go through life. Certain kinds of weird are worth looking into further, and may just be worth betting on. And there are certain kinds of things that it has not been helpful to me to dismiss because they’re weird. And I want to do less dismissing of those things. And then you get this sort of accelerated self-improvement in your beliefs when you have that attitude, because you’re learning.

Holden Karnofsky: When I was at GiveWell, we were learning about bed nets and deworming, but we were also learning about what kinds of studies to trust, and what kinds of people to trust, and who turns out to be right when you learn more, and who turns out to be wrong when you learn more, and what weird stuff turns out to be crazy, and what weird stuff turns out to be exactly the sort of thing you should have been doing.

Holden Karnofsky: And so it’s like I’m trying to do this meta-learning at the same time as I’m doing this learning. And now I’ve got a pretty well-developed sense of what kind of weird do I want to ignore, or what kind of weird do I want to get into? That would be more of the way I’d put it.

Holden Karnofsky: I do agree you shouldn’t… A life-long rule of dismissing everything just because it’s weird seems like it has to leave you short of your potential to do amazing things. But it may also stop you from doing really stupid things.

Rob Wiblin: I guess there is a pretty big distinction between how willing you are to do things that are completely non-conformist in your lifestyle or life choices, where people can learn a lot from experience, and how much you should do that in what is effectively a research project where you’re trying to get ahead of the crowd. Or a style of entrepreneurship where you’re trying to get ahead of the crowd and figure things out so you can either make more money or do more good. And in that case, rejecting things that are weird almost certainly means that you’re going to fail, because—

Holden Karnofsky: Oh yeah.

Rob Wiblin: —the only way to… Yeah.

Holden Karnofsky: Once you get into the business where your whole thing is upside, you’re essentially at a startup, and you just want to do something amazing and that’s your goal, that’s… For a lot of people, that’s not their professional goal. Once it is, then it becomes extremely costly to just be dismissive of weird stuff. Although, at the same time, it’s like you have to preserve your time to be able to look into the most important weird stuff. So you can’t go chasing down every weird thing that sounds like it might be true. You have to have some way of distinguishing.

Most promising ways of positively steering the development of AI [02:13:03]

Rob Wiblin: So do some of the details of the worldview that you laid out in the blog post series give you any insights into what would be the most promising ways of steering the development of AI in a positive direction? One that jumped out at me is that maybe we should spend more time thinking, and talking, and advocating about rules that might govern how digital people are treated, or potentially used, because that could be very significant.

Rob Wiblin: And another one might be that we should be especially attentive to progress and use of AI systems that might advance science and technology. Do you have any other possible lessons from the worldview?

Holden Karnofsky: Yeah, those both seem like really important topics that have gotten extremely little attention. I’m still working on the last piece of it, the implications, what does this all mean? But it’s like when you think about different possible ways this could go, there’s a wide variation. And one way it could go is you could get this misaligned AI that I imagine you’ve talked about before a zillion times on this podcast, this AI that’s off running the world in its own way, doing what it wants, that has nothing to do with what we were trying to do. That’s one possibility.

Holden Karnofsky: Another possibility is you have people just desperately racing each other to control this digital future civilization, as much of it as they can. And another is that you have people negotiating with each other and trying to reach a peaceful agreement. And another is that you have people not only negotiating, but doing it in really good faith, and taking their time, and being patient, and really trying to reflect on it. I think it’s called the ‘long reflection’ in Toby Ord’s book The Precipice.

Holden Karnofsky: And it’s like… I named them in ascending order, so you want to make the later ones more likely relative to the earlier ones. That’s why I like AI alignment research: it just seems like the outcome that’s hardest to be excited about is the one where we just have a civilization built by some messed-up AI that didn’t do what we were even… We didn’t have any idea of what it would do. I like the idea of doing AI alignment research.

Holden Karnofsky: And then I like the idea of just thinking about, how can we get to a point where the key decision makers are going to understand the stakes, and take them seriously, and not approach them through a lens of, “Well, my job is to grab all the power for me/my country,” but instead, “My job is to get the best outcome for the world. How can we get to that place?” That could be a matter for international relations, international cooperation. It could also be a matter for just trying to get to a place where the people who are making decisions are people who are good people, who understand things, who are sane.

Holden Karnofsky: I think those are levers that you could try to push on, and then you could get into more specific stuff. You could try to implement all the things I just said with specific policies. Maybe it would be great for us to get to a place where no one is allowed, or at least it’s stigmatized, to build really huge AI models until you have certain assurances, certain things you can say about why you think they won’t lead to a terrible place. And until you’ve got some plan for how you’re going to roll them out, and how we’re going to govern them, and who’s going to make decisions, and who’s going to be included in those decisions… Maybe you shouldn’t be allowed, or it shouldn’t be encouraged, to build really big AI models until you can provide some of that, until we have some clarity around that. That’s an idea. That’s an example of a specific policy that I think needs a lot more work before it’s tangible enough to really make sense.

Holden Karnofsky: But there’s all kinds of specific ideas people are wrestling with and thinking about. What if we had a policy that said this, or a regulation that said that, or a technology that could do this? And some of them seem like they would help. Some of them seem like they would hurt, and we should stop them.

Holden Karnofsky: I think we have a ways to go before we have a lot of clarity on this. For some people that might mean you should ignore it and wait for creative, intellectual people to get more clarity. But I think for that to mean everyone ignores it would be a massive mistake for humanity.

Rob Wiblin: Alright, we should probably wrap up on the Most Important Century blog series, because we’ve been chatting about it for a little while. I guess despite the fact that we’ve been talking about it for so long, there’s a lot more material in the series if people want to go check it out. That’s at cold-takes.com.

Rob Wiblin: My guest today has been Holden Karnofsky. Thanks so much for coming on the 80,000 Hours podcast, Holden.

Holden Karnofsky: Thanks for having me on.

Rob’s outro [02:17:09]

Rob Wiblin: OK that’s the first half of this interview with Holden, covering the idea that we might be living in the most important century in humanity’s history.

We spoke for so long we thought we’d do you the favour of breaking this episode in two, especially as there’s a nice natural place for a break here in the middle.

The second half is about how longtermism is coming along as a movement, what Holden specifically recommends people do to guide the world in the right direction, and then a fun final section where we cover all sorts of blog posts Holden is writing and various exciting ideas he wants to get out there. That should be out in a week or two so subscribe if you haven’t already.

If you’ve made it to the end of this episode, I just want to draw your attention to the Open Philanthropy Technology Policy Fellowship.

Open Philanthropy is looking for applicants for a U.S. policy fellowship program focused on high-priority emerging technologies, especially AI and biotechnology. The program will run for 6-12 months and offer training, mentorship, and support in matching with a host organization for a full-time position in Washington, DC.

You’ve got until September 15th to apply, and can find out more on the Open Philanthropy website, or by clicking through the link on the blog post associated with this episode.

As I mentioned in the intro we’re also currently hiring a new Head of Marketing to spread the word about this podcast and all the other services 80,000 Hours offers.

As always, you can stay on top of those opportunities and hundreds of others by regularly checking our job board, at 80000hours.org/jobs.

If you go there and join our job board newsletter, you’ll get an email every two weeks when it’s updated, with a selection of some of the most interesting options.

The 80,000 Hours podcast is produced by Keiran Harris.

Audio mastering is by Ben Cordell.

Full transcripts are available on our website and produced by Sofia Davis-Fogel.

Thanks for joining, talk to you again soon.

Learn more

Career review: Foundation grantmaker

Future generations and their moral significance

The case for reducing existential risks

Global priorities research


Michael_2358 @ 2021-10-04T01:14 (+1)

Was I the only one who found it coincidental that there was this conversation about a more technologically advanced intelligence colonizing the galaxy just a few months after the US government released a report acknowledging we have no explanations for all the UFO sightings over the past decades? It seems like perhaps life on another planet beat Holden's most important century to the punch. Maybe we are their simulation? I never thought much about UFOs before 2021, but given the evidence that has come to light, it seems like Occam's razor suggests a vastly more technologically sophisticated intelligence is already on earth.