levin's Quick takes

By tlevin @ 2023-08-25T19:57 (+6)

tlevin @ 2024-04-30T21:51 (+65)

I think some of the AI safety policy community has over-indexed on the visual model of the "Overton Window" and under-indexed on alternatives like the "ratchet effect," "poisoning the well," "clown attacks," and other models where proposing radical changes can make you, your allies, and your ideas look unreasonable.

I'm not familiar with much systematic empirical evidence on either side, but it seems to me that the more effective actors in the DC establishment are generally much more in the habit of looking for small wins that are both good in themselves and shrink the size of the ask for their ideal policy, rather than pushing for their ideal vision and then making concessions. Possibly an ideal ecosystem has both strategies. But it seems plausible that at least some "Overton Window-moving" strategies, as executed in practice, do more harm than good: the cost of associating their "side" with unreasonable-sounding ideas in the minds of very bandwidth-constrained policymakers, who lean heavily on signals of credibility and consensus when quickly evaluating policy options, can outweigh the benefits of raising the odds of the ideal policy and improving the framing for non-ideal but pretty good policies.

In theory, the Overton Window model is just a description of what ideas are taken seriously, so it can indeed accommodate backfire effects where you argue for an idea "outside the window" and this actually makes the window narrower. But I think the visual imagery of "windows" actually struggles to accommodate this -- when was the last time you tried to open a window and accidentally closed it instead? -- and as a result, people who rely on this model are more likely to underrate these kinds of consequences.

Would be interested in empirical evidence on this question (ideally actual studies from the psych, political science, sociology, econ, etc. literatures, rather than specific case studies, due to reference-class-tennis-type issues).

Tyler Johnston @ 2024-05-01T03:35 (+28)

I broadly want to +1 this. A lot of the evidence you are asking for probably just doesn’t exist, and in light of that, most people should have a lot of uncertainty about the true effects of any overton-window-pushing behavior.

That being said, I think there’s some non-anecdotal social science research that might make us more likely to support it. In the case of policy work:

  • Anchoring effects, one of the classic Kahneman/Tversky biases, have been studied quite a bit, and at least one article calls it “the best-replicated finding in social psychology.” To the extent there’s controversy about it, it’s often related to “incidental” or “subliminal” anchoring, which isn’t relevant here. The market also seems to favor a lot of anchoring strategies (like how basically everything on Amazon is “on sale” from an inflated MSRP), which should be a point of evidence that this genuinely just works.
  • In cases where there is widespread “preference falsification,” overton-shifting behavior might increase people’s willingness to publicly adopt views that were previously outside the window. Cass Sunstein has a good argument that being a “norm entrepreneur,” that is, proposing something that is controversial, might create chain-reaction social cascades. A lot of the evidence for this is historical, but there are also polling techniques that can reveal preference falsification, and a lot of experimental research that shows a (sometimes comically strong) bias toward social conformity, so I suspect something like this is true. Could there be preference falsification among lawmakers surrounding AI issues? Seems possible.

Also, in the case of public advocacy, there's some empirical research (summarized here) that suggests a "radical flank effect" whereby Overton-window-shifting activism increases popular support for moderate demands. There's also some evidence pointing in the other direction. Still, I think the supporting evidence is stronger right now.

P.S. Matt Yglesias (as usual) has a good piece that touches on your point. His takeaway is something like: don’t engage in sloppy Overton-window-pushing for its own sake — especially not in place of rigorously argued, robustly good ideas.

tlevin @ 2024-05-02T01:30 (+3)

Yeah, this is all pretty compelling, thanks!

Cullen @ 2024-05-02T01:52 (+3)

Do you have specific examples of proposals you think have been too far outside the window?

freedomandutility @ 2024-05-05T11:31 (+3)

I think Yudkowsky's public discussion of nuking data centres has "poisoned the well" and had backlash effects.

freedomandutility @ 2024-05-05T11:36 (+2)

I'd also like to add "backlash effects" to this, and specifically effects where advocacy for AI Safety policy ideas that are far outside the Overton Window has the inadvertent effect of mobilising coalitions that are already opposed to AI Safety policies.

tlevin @ 2025-03-05T20:09 (+41)

I sometimes say, in a provocative/hyperbolic sense, that the concept of "neglectedness" has been a disaster for EA. I do think the concept is significantly over-used (ironically, it's not neglected!), and people should just look directly at the importance and tractability of a cause at current margins.

Maybe neglectedness is useful as a heuristic for scanning thousands of potential cause areas. But ultimately, it's just a heuristic for tractability: how many resources are going towards something is evidence about whether additional resources are likely to be impactful at the margin, because more resources mean it's more likely that the most cost-effective solutions have already been tried or implemented. But these resources are often deployed ineffectively, such that it's often easier to just directly assess the impact of resources at the margin than to do what the formal ITN framework suggests, which is to break this hard question into two hard ones: you have to assess something like the abstract overall solvability of a cause (namely, "percent of the problem solved for each percent increase in resources," as if this is likely to be a constant!) and the neglectedness of the cause.
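For concreteness, here's a sketch of the standard decomposition I have in mind (roughly the 80,000 Hours-style formulation; exact wording varies by source):

```latex
% ITN decomposition: marginal cost-effectiveness factored into three terms
\frac{\text{good done}}{\text{extra resources}}
  = \underbrace{\frac{\text{good done}}{\text{\% of problem solved}}}_{\text{importance}}
  \times \underbrace{\frac{\text{\% of problem solved}}{\text{\% increase in resources}}}_{\text{tractability}}
  \times \underbrace{\frac{\text{\% increase in resources}}{\text{extra resources}}}_{\text{neglectedness}}
```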

That brings me to another problem: assessing neglectedness might sound easier than abstract tractability, but how do you weigh up the resources in question, especially if many of them are going to inefficient solutions? I think EAs have indeed found lots of surprisingly neglected (and important, and tractable) sub-areas within extremely crowded overall fields when they've gone looking. Open Phil has an entire program area for scientific research, on which the world spends >$2 trillion, and that program has supported Nobel Prize-winning work on computational design of proteins. US politics is a frequently cited example of a non-neglected cause area, and yet EAs have been able to start or fund work in polling and message-testing that has outcompeted incumbent orgs by looking for the highest-value work that wasn't already being done within that cause. And so on.

What I mean by "disaster for EA" (despite the wins/exceptions in the previous paragraph) is that I often encounter "but that's not neglected" as a reason not to do something, whether at a personal or organizational or movement-strategy level, and it seems again like a decent initial heuristic but easily overridden by taking a closer look. Sure, maybe other people are doing that thing, and fewer or zero people are doing your alternative. But can't you just look at the existing projects and ask whether you might be able to improve on their work, or whether there still seems to be low-hanging fruit that they're not taking, or whether you could be a force multiplier rather than just an input with diminishing returns? (Plus, the fact that a bunch of other people/orgs/etc are working on that thing is also some evidence, albeit noisy evidence, that the thing is tractable/important.) It seems like the neglectedness heuristic often leads to more confusion than clarity on decisions like these, and people should basically just use importance * tractability (call it "the IT framework") instead.

MichaelDickens @ 2025-03-06T00:06 (+9)

Upvoted and disagree-voted. I still think neglectedness is a strong heuristic. I cannot think of any good (in my evaluation) interventions that aren't neglected.

"Open Phil has an entire program area for scientific research, on which the world spends >$2 trillion"

I wouldn't think about it that way because "scientific research" is so broad. That feels kind of like saying shrimp welfare isn't neglected because a lot of money goes to animal shelters, and those both fall under the "animals" umbrella.

"US politics is a frequently cited example of a non-neglected cause area, and yet EAs have been able to start or fund work in polling and message-testing that has outcompeted incumbent orgs by looking for the highest-value work that wasn't already being done within that cause."

If you're talking about polling on AI safety, that wasn't being done at all IIRC, so it was indeed highly neglected.

NickLaing @ 2025-03-06T05:51 (+6)

I love this take and I think you make a good point, but on balance I still think we should keep neglectedness under "ITN". It's just a framework; it ain't clean and perfect. You're right that an issue doesn't have to be neglected to be a potentially high-impact cause area. I like the way you put it here.

"Maybe neglectedness useful as a heuristic for scanning thousands of potential cause areas. But ultimately, it's just a heuristic for tractability'

That's good enough for me though.

I would also say that, especially in global development, relative "importance" might become a less "necessary" part of the framework as well. If we can spend small amounts of money solving relatively smallish issues cost-effectively, then why not?

Your examples are exceptions too; most of the big EA causes were highly neglected before EA got involved.

When explaining EA to people who haven't heard of it, neglectedness might be the part which makes the most intuitive sense, and what helps people click. When I explain the outsized impact EA has had on factory farming, or lead elimination, or AI Safety because "those issues didn't have so much attention before", I sometimes see a lightbulb moment.

Jakob_J @ 2025-03-06T10:38 (+1)

I agree and made a similar claim previously. While I believe that many currently effective interventions are neglected, I worry that there are many potential interventions that could be highly effective but are overlooked because they are in cause areas not seen as neglected.

BenjaminTereick @ 2025-03-06T08:27 (+1)

Disagree-voted. I think there are issues with the Neglectedness heuristic, but I don’t think the N in ITN is fully captured by I and T. 

For example, one possible rephrasing of ITN (certainly not covering all the ways in which it is used) is:

  1. Would it be good to solve problem P?
  2. Can I solve P?
  3. How many other people are trying to solve P?

I think this is a great way to decompose some decision problems. For instance, it seems very useful for thinking about prioritizing research, because (3) helps you answer the important question "If I don’t solve P, will someone else?" (even if this is also affected by 2).

(edited. Originally, I put the question "If I don’t solve P, will someone else?" under 3., which was a bit sloppy)

tlevin @ 2025-02-25T04:43 (+35)

Biggest disagreement between the average worldview of people I met with at EAG and my own is something like "cluster thinking vs sequence thinking," where people at EAG are like "but even if we get this specific policy/technical win, doesn't it not matter unless you also have this other, harder thing?" and I'm more like, "Well, very possibly we won't get that other, harder thing, but still seems really useful to get that specific policy/technical win, here's a story where we totally fail on that first thing and the second thing turns out to matter a ton!"

Karthik Tadepalli @ 2025-02-26T07:36 (+8)

Cluster thinking vs sequence thinking remains unbeaten as a way to typecast EA disagreements. It's been a while since I saw it discussed on the forum. Maybe lots of newer EAs don't even know about it!

levin @ 2023-08-25T19:57 (+10)

A technique I've found useful in making complex decisions where you gather lots of evidence over time -- for example, deciding what to do after your graduation, or whether to change jobs, etc., where you talk to lots of different people and weigh lots of considerations -- is to make a spreadsheet of all the arguments you hear, each with a score for how much it supports each decision.

For example, this summer, I was considering the options of "take the Open Phil job," "go to law school," and "finish the master's." I put each of these options in columns. Then, I'd hear an argument like "being in school delays your ability to take a full-time job, which is where most of your impact will happen"; I'd add a row for this argument. I thought this was a very strong consideration, so I gave the Open Phil job 10 points, law school 0, and the master's 3 (since it was one more year of school instead of 3 years). Later, I'd hear an argument like "legal knowledge is actually pretty useful for policy work," which I thought was a medium-strength consideration, and I'd give these options 0, 5, and 0.
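For illustration, here's a minimal sketch of that scoring sheet as code (the two argument rows and their scores are just the examples above; the rest of the structure is hypothetical, and the point is the bookkeeping rather than the sum):

```python
# Minimal sketch of the decision spreadsheet: options as columns,
# arguments as rows, each cell a 0-10 score for how strongly that
# argument supports that option.

options = ["Open Phil job", "Law school", "Finish master's"]

# (argument, score per option) -- these two rows mirror the examples above;
# in practice you keep appending rows as you hear new arguments.
arguments = [
    ("Being in school delays full-time impact", [10, 0, 3]),
    ("Legal knowledge is useful for policy work", [0, 5, 0]),
]

totals = {opt: 0 for opt in options}
for _name, scores in arguments:
    for opt, score in zip(options, scores):
        totals[opt] += score

# The sum is an input to reflection, not the final answer.
for opt, total in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{opt}: {total}")
```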

I wouldn't take the sum of these as a final answer, but it was useful for a few reasons: