#204 – Making sense of SBF, and his biggest critiques of effective altruism (Nate Silver on The 80,000 Hours Podcast)

By 80000_Hours @ 2024-10-17T20:41 (+22)

We just published an interview: Nate Silver on making sense of SBF, and his biggest critiques of effective altruism. Listen on Spotify, watch on YouTube, or click through for other audio options, the transcript, and related links. Below are the episode summary and some key excerpts.

Episode summary

I’m now more convinced to divide effective altruism into the orange, blue, yellow, green, and purple teams. Maybe the purple team is very concerned about maximising philanthropy and also very PR concerned. The red team is a little bit more rationalist influenced and takes up free speech as a core cause and things like that.

I think it’s hard to have a movement that actually has these six or seven intellectual influences that get smushed together, because of all people getting coffee together or growing up on the internet (in a more freewheeling era of the internet) 10 or 15 years ago. I think there are similarities, but to have this all under one umbrella is beginning to stretch it a little bit.

— Nate Silver

In today’s episode, Rob Wiblin speaks with FiveThirtyEight election forecaster and author Nate Silver about his new book: On the Edge: The Art of Risking Everything.

On the Edge explores a cultural grouping Nate dubs “the River” — made up of people who are analytical, competitive, quantitatively minded, risk-taking, and willing to be contrarian. It’s a tendency he considers himself a part of, and the River has been doing well for itself in recent decades — gaining cultural influence through success in finance, technology, gambling, philanthropy, and politics, among other pursuits.

But on Nate’s telling, it’s a group particularly vulnerable to oversimplification and hubris. Where Riverians’ ability to calculate the “expected value” of actions isn’t as good as they believe, their poorly calculated bets can leave a trail of destruction — aptly demonstrated by Nate’s discussion of the extended time he spent with FTX CEO Sam Bankman-Fried before and after his downfall.

Given this show’s focus on the world’s most pressing problems and how to solve them, we narrow in on Nate’s discussion of effective altruism (EA), which has been little covered elsewhere. Nate met many leaders and members of the EA community in researching the book and has watched its evolution online for many years.

Effective altruism is the River style of doing good, because of its willingness to buck both fashion and common sense — making its giving decisions based on mathematical calculations and analytical arguments with the goal of maximising an outcome.

Nate sees a lot to admire in this, but the book paints a mixed picture in which effective altruism is arguably too trusting, too utilitarian, too selfless, and too reckless at some times, while too image-conscious at others.

But while everything has arguable weaknesses, could Nate actually do any better in practice? We put that question to him directly.

Rob and Nate also discuss a range of other topics.

Producer and editor: Keiran Harris
Audio engineering by Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong
Video engineering: Simon Monsour
Transcriptions: Katy Moore

Highlights

Is anyone doing better at "doing good better"?

Rob Wiblin: So you talk about various virtues and vices that effective altruism might have, or various weaknesses it might have. I’m curious to know, all things considered, in your view are there other groups doing philanthropy better than Open Phil, or competing with Open Phil in terms of their impact per dollar? Or are there other groups that give better advice, or similarly good advice to 80,000 Hours on how you can do good with your life or your career?

Nate Silver: Not that I’m… Yeah, I think it’s probably a pretty big gap. Although, again, I think the Gates Foundation. I went to some dinner that Bill Gates hosted. It was not on the record so I can’t share specific comments, but enough to say that they are quite rigorous about what they do. And he is very knowledgeable about the different programmes and how effective they are, relatively speaking.

But yeah, I think this is a pretty huge win. Even if you assume that maybe there’s some mean reversion versus what the true value of a malaria net is versus the estimated value, and maybe you’re making slightly favourable assumptions, the delta between malaria nets in Africa and giving to the Harvard endowment has to be like 100,000x or something. In fact, I think actually even the Harvard endowment is probably disutilitarian: I think it probably actually could be bad for society.

So yeah, I think there are a lot of low-hanging fruit and easy wins. But one property of being a gambler, or doing adjacent things for a long time, is that I think people don’t quite realise the shape of the curve: you only have so much low-hanging fruit, and then it gets harder and harder, and/or there are more competitors in the market. Then you kind of quickly get to the point where you’re going for smaller and smaller advantages — and then small errors in your models might mean that you’re into negative expected value territory.

Is effective altruism too big to succeed?

Rob Wiblin: Do you think EA culture should be more freewheeling, and more willing to just say stuff that pisses people off and makes enemies, even if it’s not maybe on a central topic? It seems sometimes in the book that you think: maybe!

Nate Silver: Directionally speaking, yes. I think to say things that are unpopular is actually often an act of altruism. And let’s assume it’s not dangerous. I don’t know what counts as dangerous or whatnot, but to express an unpopular idea. Or maybe it’s actually popular, but there is a cascade where people are unwilling to say this thing that actually is quite popular. I find it admirable when people are willing to stick their necks out and say something which other people aren’t.

Rob Wiblin: I think the reason that EA culture usually leans against that (though definitely not always) is just the desire to focus on the most pressing problems. We say the stuff that really matters is AI, regulation of emerging technologies, poverty, and the treatment of factory-farmed animals.

And these other things that are very controversial and might annoy people in public, I think EAs would be more likely to say, “Those are kind of distractions that are going to cost us credibility. What are we really gaining from that if it’s not a controversial belief about a core, super pressing problem?” Are you sympathetic to that?

Nate Silver: This is why I’m now more convinced to divide EA into the orange, blue, yellow, green, and purple teams. Maybe the purple team is very concerned about maximising philanthropy and also very PR concerned. The red team is a little bit more rationalist influenced and takes up free speech as a core cause and things like that. I think it’s hard to have a movement that actually has these six or seven intellectual influences that get smushed together, because of all people getting coffee together or growing up on the internet (in a more freewheeling era of the internet) 10 or 15 years ago. I think there are similarities, but to have this all under one umbrella is beginning to stretch it a little bit.

Rob Wiblin: Yeah, I think that was a view that some people had 15 years ago, maybe: that this is too big a tent, this is too much to try to fit into one term of “effective altruism.” Maybe I do wish that they had been divided up into more different camps. That might have been more robust, and would have been less confusing to the public as well. Because as it is, so many things are getting crammed into these labels of effective altruism or rationality that it can be super confusing externally, because you’re like, “Are these the poverty people or are these the AI people? These are so different.”

Nate Silver: Yeah. I think in general, smaller and more differentiated is better. I don’t know if it’s a kind of long-term equilibrium, but you see actually, over the long run, more countries in the world being created, and not fewer, for example.

And there was originally going to be more stuff on COVID in the book, but no one wants to talk about COVID all the time, four years later. During COVID, though, all the big multiethnic democracies — especially the US, the UK, India, and Brazil — really struggled. Whereas the Swedens or the New Zealands or the Israels or Taiwan were able to be more fleet and had higher social trust, and that seemed to work quite a bit better.

So maybe we’re in a universe where medium-sized is bad. Either be really big or be really small.

The stark tradeoffs we faced with COVID

Nate Silver: Yeah, the middle-ground solutions were actually the worst, which is where the multiparty democracies wound up a lot of the time. In poker, you call it a raise-or-fold strategy: often, in the game theory equilibrium in poker, you either want to raise or fold and not call.

So either you want to do like Sweden and be like, “We’re never going to get R below 1, so let’s do more things outdoors and protect old people. But a lot of people are going to die.” Or you do like New Zealand: “Fortunately, we’re an island country in the South Pacific, and there are no cases yet. Just shut down the border for two years.” And those extreme strategies are more effective than the muddling through, I think.

Rob Wiblin: So you would say we suffered a tonne of costs socially. People’s wellbeing was much reduced. And at the same time, by the time the vaccines arrived, half of people had been exposed anyway — so we’d already borne half the costs, roughly. Maybe not quite as much, because we managed to spread out the curve.

Nate Silver: I mean, the R=1 model becomes complicated when you have reinfection. You start to introduce more parameters when you have some duration of immunity from disease, although clearly severe outcomes are diminished. There are going to be long COVID people getting mad at me. Clearly the overall disease burden from COVID goes down, and probably people are infected with COVID all the time and don’t even realise it right now.
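To see why reinfection complicates the R=1 picture, here’s a minimal sketch of an SIRS-style model (the standard textbook extension of SIR in which immunity wanes), with all rate parameters invented purely for illustration:

```python
# Minimal discrete-time SIRS sketch: like SIR, but immunity wanes at rate `omega`,
# returning recovered people to the susceptible pool -- the extra "reinfection"
# parameter Nate alludes to. All numbers are illustrative, not fitted to COVID.
beta, gamma, omega = 0.25, 0.10, 0.01   # transmission, recovery, waning-immunity rates
s, i, r = 0.99, 0.01, 0.0               # susceptible / infected / recovered fractions

for day in range(1000):
    new_infections = beta * s * i
    new_recoveries = gamma * i
    newly_susceptible = omega * r
    s += newly_susceptible - new_infections
    i += new_infections - new_recoveries
    r += new_recoveries - newly_susceptible

# With waning immunity the epidemic doesn't simply burn out: the system heads
# toward an endemic equilibrium where some level of infection persists.
print(f"infected fraction after ~3 years: {i:.4f}")
```

The point is just that once immunity has a finite duration, “getting R below 1” stops being a one-off goal and becomes a question about the long-run endemic level.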

There’s a long history here: it’s thought that some flus actually were maybe COVID-like conditions that are now just in the background and aren’t a particularly big deal. And the fact that discussion of “herd immunity” got so stigmatised was one of a lot of things that disturbed me about discussion of the pandemic.

The 13 Keys to the White House

Nate Silver: So the 13 Keys are a system by Allan Lichtman, who is a professor of government (emeritus or retired now, I believe) at American University in Washington, DC — and I think the Keys are an example of the replication crisis and junk science.

One problem you have in election forecasting that’s unavoidable is that you have a small sample of elections: Americans have chosen presidents by popular vote in every state only since 1860. Before that, you would have some state legislatures appoint electors directly. It’s a sample size of a few dozen, which is not all that large. And for modern election forecasting, the first kind of scientific polling was done in roughly 1936 — and was very bad, by the way, at first. One election every four years, so you have a sample size of 22 or something like that.

So when you have a small sample size and a lot of plausible outcomes, you have a potential problem that people in this world might know called “overfitting” — which is that you don’t have enough data to fit a multi-parameter model. And there are different ways around this; I don’t know if we want to get into modelling technique per se. But the Keys to the White House is a system that claims to perfectly predict every presidential election dating back to the 19th century based on 13 variables.

There are a couple of problems when you try to apply this, forward-looking. One is that a lot of the variables are subjective. So: Is there a significant foreign policy accomplishment by the president? Is the opponent charismatic? These are things that, if you know the answer already, you can overfit and kind of p-hack your way to saying, “Now we can predict every election perfectly” — when we already know the answer. It’s not that hard to “predict” correctly, when the outcome is already known.
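To make the overfitting point concrete, here’s a minimal sketch (not from the book; the 13 “keys” and the outcomes below are pure random noise) of how easily a flexible model can “retrodict” a small sample perfectly:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)

# ~40 historical elections, 13 binary "keys" that are pure noise,
# and coin-flip outcomes that the keys carry no information about.
X = rng.integers(0, 2, size=(40, 13))
y = rng.integers(0, 2, size=40)

model = DecisionTreeClassifier().fit(X, y)
print("in-sample accuracy:", model.score(X, y))  # ~1.0: a "perfect" record on noise
```

A near-perfect in-sample record on a few dozen data points with 13 free variables is close to guaranteed by construction, which is why it says nothing about forward-looking accuracy.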

So when the election is actually moving forward, Allan Lichtman will issue his prediction. But it’s not obvious in advance: you have to wait for him to come on stage, or come on YouTube now, and say, “Here’s what I predict here, based on my judgement.” So it’s a judgement call on a lot of these factors.

Also, he’s kind of lied in the past about whether he was trying to predict the Electoral College or the popular vote, and shifted back and forth based on which was right and which was wrong. But he’s a good marketer, taking a system that’s just kind of punditry with some minimal qualitative or quantitative edge, and trying to make it seem like it’s something more rigorous than it is.

Rob Wiblin: So it’s got 13 different factors in it. There are so many things that are crazy about this. You don’t even need to look at the empirics to tell that this is just junk science and totally mad. So he’s got 13 factors that I guess he squeezed out of… I mean, in the modern era, there’s only like a dozen, at most two dozen, elections you could think about — and we’re really going to be saying that it’s all the same now as it was in the 19th century? That seems nuts.

So he’s got 13 different factors. Almost all of these come on a continuum — a candidate can be more or less charismatic; it’s not just one or zero — but he squeezes each one into a binary: the candidate is charismatic or not; the economy is good or bad. So he’s throwing out almost all of this information. And he’s got so many factors, despite the fact that he’s got almost no data to tell which of these should go in. He hasn’t changed it, I think, since 1980 or so, when they came up with it.

Nate Silver: Yeah. And he says, for example, that Donald Trump is not charismatic. By the way, Lichtman is a liberal Democrat. And like, I’m not a fan of Donald Trump, but he literally hosted a reality TV show. He was a game show host. I think there’s a certain type of charisma that comes through with that. And that’s the one thing Trump probably does have: he’s charismatic. Maybe not in a way that a Democrat might like, but he’s a funny guy. He’s an entertainer, literally. So I don’t know how you wouldn’t give him that key, for example.

Can we tell whose election predictions are better?

Rob Wiblin: Yeah. So I asked Claude to give a brief summary of the paper and some of the points that it pulled out were:

Presidential elections are rare events. They occur only every four years. This provides very few data points to assess forecasting methods. The authors demonstrate through simulations that it would take 24 election cycles, or 96 years, to show with 95% confidence that a forecaster with 75% accuracy outperformed random guessing, and that comparing the performance of competing forecasters with similar accuracy levels could take thousands of years.
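The flavour of that power calculation is easy to reproduce. Here’s a minimal sketch, assuming a one-sided binomial test against a coin-flipping null forecaster (the 75% accuracy figure comes from the summary above; everything else is illustrative):

```python
from scipy.stats import binom

def power(n_elections, p_true=0.75, alpha=0.05):
    """Chance that a forecaster with true accuracy p_true is declared
    significantly better than coin-flipping (p = 0.5) after n_elections calls."""
    # Smallest number of correct calls k with P(X >= k | p = 0.5) <= alpha
    k = next(k for k in range(n_elections + 1)
             if binom.sf(k - 1, n_elections, 0.5) <= alpha)
    # Probability the skilled forecaster actually reaches that threshold
    return binom.sf(k - 1, n_elections, p_true)

for n in (6, 12, 24, 48):
    print(f"{n} elections -> power {power(n):.2f}")
```

Even a couple of dozen elections gives a test like this only moderate power, which is the paper’s basic point.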

What do you make of that?

Nate Silver: So I challenge these guys to a bet. If they think that it’s no better than random, then I’m happy. I mean, right now, our model — as we’re recording this in early September, full disclosure — is close to 50/50. So yeah, if they think that that’s no better than a coin flip, then I’m happy to make a substantial bet with these academics. Because, look… Am I allowed to swear on this show?

Rob Wiblin: Of course.

Nate Silver: It’s like, OK, you have an event when it’s every four years. To get a statistically significant sample will take a long time. Yeah, no shit. You don’t have to waste a slot in an academic journal with this incredibly banal and obvious observation.

But I’d say a couple of things. One is that when you actually have a sample size which is not just the presidential elections, but presidential primaries and midterm elections: in midterm elections, there are roughly 500 different races for Congress every year. Of course, they’re correlated, which makes this quite complicated structurally, but there’s a little bit more robustness in the data than they might say.

But also, they’re kind of caught in this… I consider it the replication crisis paradigm of, like, you hit some magic number when it’s 95%, and then it’s true instead of false. And that’s just not… I mean, I’m a Bayesian, right? I don’t think that way.

One of the authors of the paper was saying that, based on one election, you can’t tell whether… So in 2016, models had Trump with anywhere from a 29% chance — that was a then-FiveThirtyEight model — to a less than 1% chance; 0.1%, let’s call it. And they said that you can’t really tell from one election which model is right and which isn’t. But actually, that’s not true if you apply Bayes’ theorem: if a 0.1% chance happens on a model that’s never actually been published before, the odds are overwhelming, even on a sample size of one, that that model is inferior to the 29% model.
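The arithmetic behind that Bayes’ theorem point is worth spelling out. A minimal sketch, assuming we start with equal credence in the two models (the 29% and 0.1% figures are from Nate’s example):

```python
# Two models assigned P(Trump wins 2016): model A said 29%, model B said 0.1%.
p_a, p_b = 0.29, 0.001

# Trump won, so each model's likelihood for the observed outcome
# is simply the probability it assigned to that outcome.
bayes_factor = p_a / p_b                       # 290: evidence favours A 290-to-1
prior_odds = 1.0                               # assumed equal prior credence in A and B
posterior_odds = prior_odds * bayes_factor
posterior_p_a = posterior_odds / (1 + posterior_odds)

print(f"Bayes factor: {bayes_factor:.0f}")                 # 290
print(f"P(A is the better model): {posterior_p_a:.4f}")    # ~0.9966
```

A single observation moves you from 50/50 to better than 99% confidence; “one election tells you nothing” only holds when the competing models’ stated probabilities are close together.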

So to me, it kind of indicates a certain type of rigid academic thinking, which is not fast enough to deal with the modern world. In the modern world, by the time you prove something to an academic standard, then the market’s priced it in. The advantage that you might milk from that has already been realised.

It’s interesting to see effective altruism, which comes out of academia but understands that debates occur quickly in the public sphere, on the EA Forum, for example. And they’re big believers in being in the media. That part I like: the velocity of academia is not really fit for today’s world.

Rob Wiblin: Yeah. I think presumably the authors of this paper wouldn’t really want to say that your model is no better than a coin flip. I guess what they’re saying is, imagine that there were two models that were similarly good — your model, and one that was a bit different, that gave a bit more weight to the fundamentals versus polling or something like that — and say it gave Trump a 27% chance when you gave it a 29% chance. It’s actually quite difficult to distinguish which of these is better empirically, and so you might have to turn to theory, and then that’s not really going to be decisive. What do you make of that sort of idea?

Nate Silver: I get a little perturbed, because we are the only… So the legacy FiveThirtyEight models, and now the Silver Bulletin models: this is a pretty rare case of having forecasts in the public domain where there is a complete track record of every forecast we’ve made, both in politics and sports, since 2008. And they’re very well calibrated: our 20% chances happen 20% of the time. You get a much larger sample size through sports than through elections.
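Calibration like that is straightforward to check wherever probabilistic forecasts and outcomes are both published. A minimal sketch of the idea (the data below is synthetic and generated to be perfectly calibrated, not FiveThirtyEight’s actual record):

```python
import numpy as np

def calibration_table(probs, outcomes, n_bins=10):
    """Bucket forecasts by stated probability and compare with realised frequency."""
    probs = np.asarray(probs)
    outcomes = np.asarray(outcomes, dtype=float)
    bins = np.clip((probs * n_bins).astype(int), 0, n_bins - 1)
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            print(f"forecast {b / n_bins:.0%}-{(b + 1) / n_bins:.0%}: "
                  f"{mask.sum():6d} calls, happened {outcomes[mask].mean():.0%} of the time")

# Synthetic demo: a well-calibrated forecaster's ~20% calls land ~20% of the time.
rng = np.random.default_rng(1)
p = rng.uniform(0, 1, 50_000)
calibration_table(p, rng.uniform(0, 1, 50_000) < p)
```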

But yeah, it’s this abstract claim that basically no other model in this space has a track record over more than one election. And we also have presidential primaries and things like that; there’s quite a long track record now.

And I would think academics who are interested in public discourse would be more appreciative of how it’s much harder to make actual forecasts where you put yourself out there under conditions of uncertainty, and publish them so they can be vetted and observed, than to back-test a model.

And look, I think there’s probably some degree of jealousy, where… I mean, there is, right? You take these ideas and you popularise them and there’s a pretty big audience for them. But also, I’m taking risks every time I forecast. I mean, we’ve had 70/30 calls where we’re perceived as being wrong, and you’re taking reputational risk. So I don’t know.

Do venture capitalists really take risks?

Rob Wiblin: Something you say in the book that surprised me is that venture capitalists talk a big game about taking risks and revolutionising the world and changing everything and being willing to upend it all, but you actually think they don’t take that much risk. Why is that?

Nate Silver: Are you a basketball fan, or a sports fan?

Rob Wiblin: Soccer sometimes.

Nate Silver: In American sports, we have the draft, which is a mechanism for giving the worst team more competitive equality over the long run. If you’re the worst team, you get the first pick. You therefore get the next good player.

For the top Silicon Valley firms, it’s almost the reverse of this, right? Where if you’re Andreessen Horowitz or Founders Fund or Sequoia, and you’re very successful, then you get the first draft pick: the next founder that comes over from Ireland or Sri Lanka or across the country or whatever else will want to then work with this firm that has bigger network effects.

I mean, Marc Andreessen even told me that it’s kind of a self-fulfilling prophecy, their success: they have access to the top founder talent all over the world; the expected value of any given bet is quite high. And yes, there’s high variance, but he actually gave me some data — and if you do the math, almost every fund they make is going to make some money. The risk of ruin is actually very, very low.

Rob Wiblin: Because they’re diversified across hundreds of different…?

Nate Silver: Diversified. You have a fund that has 20 companies a year. And by the way, it’s not true that it’s totally hit or miss. There are plenty of 1xs and 2xs, or even getting half your money back. That helps actually quite a bit over the long run. So it’s a very robust business where they’re guaranteed to make a really nice profit every year. Look, I think a lot of them are risk-taking personality types, but they kind of have a business that’s too good to fail almost.
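That diversification argument is easy to sanity-check with a toy Monte Carlo. A minimal sketch; the return distribution below is entirely made up for illustration, not Andreessen’s data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-company return multiples, including the "1xs and 2xs" and
# partial recoveries Nate mentions, plus a fat right tail of big winners.
multiples = np.array([0.0, 0.5, 1.0, 2.0, 10.0, 50.0])
weights   = np.array([0.25, 0.25, 0.20, 0.20, 0.08, 0.02])

def fund_multiple(n_companies=20):
    """Equal-weighted fund: the average multiple across its portfolio."""
    return rng.choice(multiples, size=n_companies, p=weights).mean()

sims = np.array([fund_multiple() for _ in range(100_000)])
print("mean fund multiple: ", round(sims.mean(), 2))
print("P(fund loses money):", round((sims < 1.0).mean(), 3))
```

Under assumptions like these, the expected fund multiple is high, while a 20-company fund ends below 1x mainly when it draws essentially no big winners, so the loss probability stays modest even though individual bets fail constantly.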

Whereas a founder may be making a bet that is plus-EV in principle, but the founder’s life may not be that great much or most of the time. To commit to an idea with a 10-year time horizon, one that results in complete failure some percentage of the time, a moderate exit after a lot of sweat equity some larger percentage of the time, and a maybe 1-in-1,000 or 1-in-10,000 chance that you’re now the next Mark Zuckerberg: that’s less obviously a good deal, depending on the degree of risk aversion you have. And you have to have some risk-takingness — or risk ignorance, I call it — in order to found a company in an area that has not achieved market success so far and has this very long time horizon.


Linch @ 2024-10-17T22:44 (+12)

This is why I’m now more convinced to divide EA into the orange, blue, yellow, green, and purple teams. Maybe the purple team is very concerned about maximising philanthropy and also very PR concerned. The red team is a little bit more rationalist influenced and takes up free speech as a core cause and things like that.

I'd love to know what he thinks the other colors should be!

david_reinstein @ 2024-10-18T00:00 (+3)

Not sure what Nate means when he says “game theory”. He is not using concepts from formal game theory correctly, at least.