Excerpts from "Doing EA Better" on x-risk methodology
By Eevee🔹 @ 2023-01-26T01:04 (+22)
This is a linkpost to https://forum.effectivealtruism.org/posts/54vAiSFkYszTWWWv4/doing-ea-better-1
The post "Doing EA Better" contains some critiques of the EA movement's approach to studying and ranking x-risks. These criticisms resonated with me and I wish we paid more attention to them. There were concerns about the original post being quite long and mixing a lot of different topics together, so I decided to extract some relevant sections into a separate post to enable focused discussion.
The original post is, per site policy, available under a Creative Commons BY 4.0 license, so I am excerpting it as permitted by this license.
We need to stop reinventing the wheel
Summary: EA ignores disciplines highly relevant to its main area of focus, notably Disaster Risk Reduction, Futures Studies, and Science & Technology Studies, and instead attempts to derive methodological frameworks from first principles. As a result, many orthodox EA positions would be considered decades out of date by domain experts, and important decisions are being made using unsuitable tools.
EA is known for reinventing the wheel even within the EA community. This poses a significant problem given the stakes and urgency of problems like existential risk.
There are entire disciplines, such as Disaster Risk Reduction, Futures Studies, and Science and Technology Studies, that are profoundly relevant to existential risk reduction yet have been almost entirely ignored by the EA community. The consequences of this are unsurprising: we have started near the beginning of the history of each discipline and are slowly learning each of their lessons the hard way.
For instance, the approach to existential risk most prominent in EA, what Cremer and Kemp call the “Techno-Utopian Approach” (TUA), focuses on categorising individual hazards (called “risks” in the TUA),[41] attempting to estimate the likelihood that they will cause an existential catastrophe within a given timeframe, and trying to work on each risk separately by default, with a homogeneous category of underlying “risk factors” given secondary importance.
However, such a hazard-centric approach was abandoned within Disaster Risk Reduction decades ago and replaced with one that places a heavy emphasis on the vulnerability of humans to potentially hazardous phenomena.[42] Indeed, differentiating between “risk” (the potential for harm), “hazards” (specific potential causes of harm), and “vulnerabilities” (aspects of humans and human systems that render them susceptible to the impacts of hazards) is one of the first points made on any disaster risk course. Reducing human vulnerability and exposure is generally a far more effective method of reducing the risk posed by a wide variety of hazards, and it far better accounts for “unknown unknowns” or “Black Swans”.[43]
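One common schematic in the disaster-risk literature (a simplification, and one whose exact form varies by author) expresses this decomposition as

$$\text{Risk} \approx \text{Hazard} \times \text{Exposure} \times \text{Vulnerability},$$

which makes explicit that risk can be reduced by acting on exposure or vulnerability, not only on the hazard itself.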
Disaster risk scholarship is also revealing the growing importance of complex patterns of causation, the interactions between threats, and the potential for cascading failures. This area is largely ignored by EA existential risk work, and has been dismissed out of hand by prominent EAs.
As another example, Futures & Foresight scholars noted the deep limitations of numerical/probabilistic forecasting of specific trends/events in the 1960s-70s, especially with respect to long timescales as well as domains of high complexity and deep uncertainty[44], and low-probability high-impact events (i.e. characteristics of existential risk). Practitioners now combine or replace forecasts with qualitative foresight methods like scenario planning, wargaming, and Causal Layered Analysis, which explore the shape of possible futures rather than making hard-and-fast predictions. Yet, EA’s existential risk work places a massive emphasis on forecasting and pays little attention to foresight. Few EAs seem aware that “Futures Studies” as a discipline exists at all, and EA discussions of the (long-term) future often imply that little of note has been said on the topic outside of EA.[45]
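To make the forecasting/foresight contrast concrete, here is a minimal sketch of the difference in spirit; the “critical uncertainties” and labels below are hypothetical, not taken from any actual foresight exercise. A forecast attaches a probability to one specified outcome, while a simple scenario-matrix exercise enumerates combinations of key uncertainties to map out futures worth stress-testing against:

```python
from itertools import product

# Hypothetical "critical uncertainties", each with two contrasting resolutions.
# A real foresight exercise would derive these from horizon scanning and
# stakeholder workshops; the labels here are purely illustrative.
uncertainties = {
    "AI capability growth": ["gradual", "discontinuous"],
    "global governance": ["coordinated", "fragmented"],
    "biotech access": ["restricted", "widely diffused"],
}

# Forecasting-style output: one probability attached to one narrowly specified event.
point_forecast = {"event": "existential catastrophe by 2100", "p": 0.01}

# Foresight-style output: every scenario skeleton implied by the uncertainties,
# each to be fleshed out into a narrative and used to stress-test strategies,
# not to predict which future will occur.
scenarios = [dict(zip(uncertainties, combo)) for combo in product(*uncertainties.values())]

print("Point forecast:", point_forecast)
print(f"{len(scenarios)} scenario skeletons to explore, e.g.:")
for scenario in scenarios[:3]:
    print("  ", scenario)
```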
These are just two brief examples.[46] There is a wealth of valuable insights and data available to us if we would only go out and read about them: this should be a cause for celebration!
But why have they been so neglected? Regrettably, it is not because EAs read these literatures and provided robust arguments against them; we simply never engaged with them in the first place. We tried to create the field of existential risk almost from first principles using the methods and assumptions that were already popular within our movement, regardless of whether they were suitable for the task.[47]
We believe there could be several disciplines or theoretical perspectives that EA, had it developed a little differently earlier on, would recognise as fellow travellers or allies. Instead, we threw ourselves wholeheartedly into the Founder Effect, and in our over-dependence on a few early canonical thinkers (e.g. MacAskill, Ord, Bostrom, Yudkowsky), we have thus far lost out on all that these disciplines have to offer.
This raises a broader question: if we were to reinvent (EA approaches to) the field of Existential Risk Studies from the ground up, how confident are we that we would settle on our current way of doing things?
The above is not to say that all views within EA ought to always reflect mainstream academic views; there are genuine shortcomings to traditional academia. However, the sometimes hostile attitude EA has to academia has hurt our ability to listen to its contributions as well as those of experts in general.
On the hasty prioritization of AI risk and biorisk
OpenPhil’s global catastrophic risk/longtermism funding stream is dominated by two hazard-clusters – artificial intelligence and engineered pandemics[56] – with little attention given to other aspects of the risk landscape. Even within this pairing, AI seems to be seen as “the main issue” by a wide margin, both within OpenPhil and throughout the EA community.
This is a problematic practice, given that, for instance:
- The prioritisation relies on questionable forecasting practices, which themselves sometimes take contestable positions as assumptions and inputs
- There is significant second-order uncertainty around the relevant risk estimates
- The ITN framework has major issues, especially when applied to existential risk
  - It is extremely sensitive to how a problem is framed, and often relies on rough and/or subjective estimates of ambiguous and variable quantities (illustrated in the sketch below this list)
    - This poses serious issues when working under conditions of deep uncertainty, and can allow implicit assumptions and subconscious biases to pre-determine the result
    - Climate change, for example, is typically considered low-neglectedness within EA, but extreme/existential risk-related climate work is surprisingly neglected
    - What exactly makes a problem “tractable”, and how do you rigorously put a number on it?
  - It ignores co-benefits, response risks, and tipping points
  - It penalises projects that seek to challenge concentrations of power, since this appears “intractable” until social tipping points are reached[57]
  - It is extremely difficult and often impossible to meaningfully estimate the relevant quantities in complex, uncertain, changing, and low-information environments
  - It focuses on evaluating actions as they are presented, and struggles to sufficiently value exploring the potential action space and increasing future optionality
- Creativity can be limited by the need to appeal to a narrow range of grantmaker views[58]
- The current model neglects areas that do not fit [neatly] into the two main “cause areas”, and indeed it is arguable whether global catastrophic risk can be meaningfully chopped up into individual “cause areas” at all
- A large proportion (plausibly a sizeable majority, depending on where you draw the line) of catastrophic risk researchers would (and, if you ask them, do) reject[59]:
  - The particular prioritisations made
  - The methods used to arrive at those prioritisations, and/or
  - The very conceptualisation of individual “risks” itself
- It is the product of a small homogeneous group of people with very similar views
  - This is both a scientific (cf. collective intelligence/social epistemics) and a moral issue
There are important efforts to mitigate some of these issues, e.g. cause area exploration prizes, but the central issue remains.
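As a minimal illustration of the framing-sensitivity point flagged in the list above (the numbers are invented, and the multiplicative score is a toy stand-in rather than any grantmaker's actual model):

```python
from itertools import product

def itn_score(importance, tractability, neglectedness):
    # Toy multiplicative ITN-style score; not any organisation's actual model.
    return importance * tractability * neglectedness

# Invented ranges that different (reasonable-sounding) framings of the same
# problem might produce for each input.
importance_range    = [1e6, 1e7, 1e8]     # how big is the problem?
tractability_range  = [0.001, 0.01, 0.1]  # what fraction is solvable per unit effort?
neglectedness_range = [0.1, 0.5, 0.9]     # how under-resourced is it?

scores = [itn_score(i, t, n) for i, t, n in
          product(importance_range, tractability_range, neglectedness_range)]

print(f"min score: {min(scores):,.0f}")                 # 100
print(f"max score: {max(scores):,.0f}")                 # 9,000,000
print(f"spread:    {max(scores) / min(scores):,.0f}x")  # 90,000x
# Even when every input stays within a range a reasonable person could defend,
# the product swings by several orders of magnitude, so the framing and the
# subjective estimates largely pre-determine the resulting ranking.
```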
The core of the problem here seems to be one of objectives: optimality vs robustness. Some quick definitions (in terms of funding allocation):
- Optimality = the best possible allocation of funds
  - In EA this is usually synonymous with “the allocation with the highest possible expected value”
  - This typically has an unstated second component: “assuming that our information and our assumptions are accurate”
- Robustness = the capacity of an allocation to maintain near-optimality under conditions of uncertainty and change
In seeking to do the most good possible, EAs naturally seek optimality, and have developed grantmaking tools to this end. We identify potential strategies, gather data, predict outcomes, and take the actions that our models tell us will work best.[60] This works great when you’re dealing with relatively stable and predictable phenomena, for instance endemic malaria, as well as most of the other cause areas EA started out with.
However, now that much of EA’s focus has turned to global catastrophic risk, existential risk, and the long-term future, we have entered areas where optimality becomes fragility. We don’t want most of our eggs in one or two of the most speculative baskets, especially when those eggs contain billions of people. We should also probably adjust for the fact that we may over-rate the importance of things like AI, for reasons discussed in other sections.
Given the fragility of optimality, robustness is extremely important. Existential risk is a domain of high complexity and deep uncertainty, dealing with poorly-defined low-probability high-impact phenomena, sometimes covering extremely long timescales, with a huge amount of disagreement among both experts and stakeholders along theoretical, empirical, and normative lines. Ask any risk analyst, disaster researcher, foresight practitioner, or policy strategist: this is not where you optimise; this is where you maintain epistemic humility and cover all your bases. Innumerable people have learned this the hard way so we don’t have to.
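To make the contrast concrete, here is a minimal toy sketch; the payoff numbers and world states are invented purely for illustration and are not real cause-area estimates. The allocation that maximises expected value under the best-guess model concentrates everything on one bet and collapses if that model is wrong, while a robustness-oriented allocation gives up some expected value to keep the worst case tolerable:

```python
# Toy comparison of "optimal" vs "robust" funding allocations.
# Payoffs and scenarios are invented purely to illustrate the fragility point;
# they do not correspond to any real cause-area estimates.

causes = ["AI", "bio", "other"]

# Value per dollar of each cause under three possible world states.
# Our best-guess model says state_0 is overwhelmingly likely.
payoffs = {
    "state_0": {"AI": 10.0, "bio": 3.0, "other": 1.0},  # our model is right
    "state_1": {"AI": 0.0,  "bio": 3.0, "other": 1.0},  # AI work turns out useless
    "state_2": {"AI": 0.0,  "bio": 0.0, "other": 1.0},  # both big bets turn out useless
}
best_guess_probs = {"state_0": 0.9, "state_1": 0.07, "state_2": 0.03}

def value(allocation, state):
    return sum(allocation[c] * payoffs[state][c] for c in causes)

def expected_value(allocation, probs):
    return sum(p * value(allocation, s) for s, p in probs.items())

# "Optimal" allocation: maximise expected value under the best-guess model.
# With linear payoffs this simply piles everything onto the top cause.
ev_optimal = {"AI": 1.0, "bio": 0.0, "other": 0.0}

# "Robust" allocation: spread funds so that value stays tolerable even in the
# states where the best-guess model is wrong (a maximin-flavoured choice).
robust = {"AI": 0.4, "bio": 0.3, "other": 0.3}

for name, alloc in [("EV-optimal", ev_optimal), ("robust", robust)]:
    worst = min(value(alloc, s) for s in payoffs)
    print(f"{name:10s}  expected value: {expected_value(alloc, best_guess_probs):5.2f}"
          f"  worst case: {worst:5.2f}")
# EV-optimal  expected value:  9.00  worst case:  0.00
# robust      expected value:  4.77  worst case:  0.30
```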
Thus, we argue that even if you strongly agree with the current prioritisations and methods, it is still rational for you to support a more pluralist and robustness-focused approach, given the uncertainty, expert disagreement, and risk-management best practices involved.
harfe @ 2023-01-26T05:23 (+9)
Thank you for extracting these things!
Ironically, this comment will not be an object-level criticism, and is more a meta-rant.
As someone who believes that the existential risk from AI is significant, and more significant than other existential risks, I am becoming more annoyed that a lot of the arguments for taking AI xrisk less seriously are not object-level arguments, but indirect arguments.
If you are worried that EA prioritizes AI xrisk too much, maybe you should provide clear arguments for why the chance that advanced AI will kill all of humanity this century is extremely small (e.g. below 2%), or provide other arguments like "actually the risk is 10% but there is nothing you can do to improve it".
The following are not object-level arguments: "You are biased", "You are a homogeneous group", "You are taking this author of Harry Potter fanfiction too seriously, please listen to the people we denote as experts™ instead", "Don't you think that, as a math/CS person, it aligns suspiciously well with your own self-interest to read a 100-page google doc on eliciting latent knowledge in a rundown hotel in northern England instead of working for Google and virtuously donating 10% of your income to malaria bednets?"
Maybe I am biased, but that does not mean I should completely dismiss my object-level beliefs such as my opinion on deceptive Mesaoptimization.
Argue that the Neural Networks that Google will build in 2039 do not contain any Mesaoptimization at all!
Relay the arguments by domain experts from academic disciplines such as "Futures Studies" and "Science & Technology Studies", so that new EAs can decide for themselves whether they believe the common arguments about the orthogonality thesis and instrumental convergence!
Argue that the big tech companies will solve corrigibility on their own, and don't need any help from a homogeneous group of EA nerds!
To be fair, I have seen arguments that say something like "AGI is very unlikely to be built this century". But the fact that some readers will have doubts about whether these very lines were produced by ChatGPT or a human should make you doubt the position that reaching human-level intelligence with trillions of parameters is impossible in the next 70 years.
JoshuaBlake @ 2023-01-26T10:16 (+12)
I think the problem here is that you are requiring your critics to essentially stay within the EA framework of quantitative thinking and splitting up risks, but that framework is exactly what is being criticized.
Gideon Futerman @ 2023-01-26T11:06 (+4)
I think something that's important here is that indirect arguments can show that, given other approaches, you may come to different conclusions; not just on prioritisation of 'risks' (I hate using that word!), but also on techniques to reduce them. For instance, I still think that AI and biorisk are extremely significant contributors to risk, but I would probably take pretty different approaches to how we deal with them, based on trying to consider these more indirect criticisms of the methodologies etc. used
BrownHairedEevee @ 2023-01-27T06:32 (+3)
Exactly. For example, by looking at vulnerabilities in addition to hazards like AGI and engineered pandemics, we might find a vulnerability that is more pressing to work on than AI risk.
That said, the EA x-risk community has discussed vulnerabilities before: Bostrom's paper "The Vulnerable World Hypothesis" proposes the semi-anarchic default condition as a societal vulnerability to a broad class of hazards.
harfe @ 2023-01-27T07:16 (+1)
To be clear, if you make arguments of the form "X is a more pressing problem than AI risk" or "here is a huge vulnerability X, we should try to fix that" then I would consider that an object-level argument, if you actually name X.