A Case for Empirical Cause Prioritization
By Peter Wildeford @ 2016-06-06T17:32 (+25)
Cause prioritization is likely quite important. EAs know that some causes can be immensely more effective than others, with the best cause being many times more effective than the average cause. If you believe Michael Dickens’s calculations, AI safety work could be 10^51 times more important than work on global poverty[1] if the estimate provided is literally correct, and 15,000 times more important once you adjust for the prior belief and the strength of the evidence for the estimate. This would suggest that $1K put toward AI safety could potentially accomplish more than the $9.8M that Good Ventures gave GiveDirectly in December 2015. Similarly, using Dickens’s numbers, the OpenPhil grants of $3M to promoting cage-free egg campaigns would be worth an equivalent of $3B to GiveDirectly (not taking into account room for more funding or diminishing marginal returns).
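To make the arithmetic behind those comparisons explicit, here is a minimal sketch; the multipliers are Dickens’s estimates as quoted above, not independent figures:

```python
# Sanity check on the multipliers quoted above; all figures come from
# the post itself, and the multipliers are Dickens's estimates, not mine.
ai_safety_multiplier = 15_000        # prior-adjusted estimate vs. global poverty

# $1K to AI safety vs. Good Ventures's $9.8M GiveDirectly grant:
ai_equivalent = 1_000 * ai_safety_multiplier
print(ai_equivalent)                 # 15000000, i.e. $15M GiveDirectly-equivalent
print(ai_equivalent > 9_800_000)     # True

# Valuing $3M of cage-free grants at $3B to GiveDirectly implies a
# multiplier of about 1,000x for that cause:
print(3_000_000_000 / 3_000_000)     # 1000.0
```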
If this is true[1], that would mean establishing and promoting AI safety or cage-free egg campaigns over global poverty donations would be of immense value. And if this isn’t true, then it would be good to figure out which cause is the best and what relative returns we can expect, or whether there is a currently undiscovered cause that would be higher leverage than all the known causes. Figuring out which cause is the best (and by how much) is key to the work of cause prioritization.
Cause prioritization work can come in a bunch of different forms[2]:
- Empirical work -- IPA and JPAL run RCTs that give us a reasonable degree of confidence about the causal effects of interventions in global health.
- Theoretical work -- the Intergovernmental Panel on Climate Change puts together models to help understand the risk posed by climate change.
- Philosophical work -- the Foundational Research Institute attempts to uncover crucial considerations that can change our views on what matters and what works. For another example, philosophical work in population ethics can have significant effects on how you view the benefits of AMF.
- Synthesis work -- GiveWell and the Copenhagen Consensus both aggregate considerations surfaced by other forms of cause prioritization and turn them into decisions about what should be prioritized.
- Forecasting work -- Philip Tetlock works to help improve our ability to accurately predict the future.
For example, consider the question “How valuable is marginal investment in campaigns for cage-free eggs?”
- Empirical: Partner with a lobbying org like The Humane League to randomly select some organizations to go after and some organizations to leave alone. See what happens to the organizations you leave alone relative to the ones you target. Do they go cage free on their own? Do other organizations target them instead? Do the organizations you target pledge to go cage free but then never follow through? (A sketch of this design appears after this list.)
- Theoretical: Create a Monte Carlo simulation of cage-free campaigning based on a variety of guesses at reasonable parameters, such as a model on Guesstimate. (A sketch of this, too, appears after this list.)
- Philosophical: Think about whether nonhuman animals are morally relevant and in what ways nonhuman animals can be relevantly harmed. Think about what it means for animals to have welfare relative to humans. Consider whether cage-free campaigns are important, neglected, and tractable.
- Synthesis: Aggregate a bunch of pre-existing information into a report on cage-free campaigns (like ACE or OpenPhil[3]).
- Forecasting: Figure out who has the best track record at forecasting the impact of animal welfare campaigns and ask how they would predict the outcome of cage-free corporate campaigns.
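To make the empirical option concrete, here is a minimal, hypothetical sketch of the randomized design; the sample size, outcome measure, and pledge rates below are invented placeholders, not estimates:

```python
import random

random.seed(0)

# Hypothetical pool of target companies; in practice this would come
# from the partner org's target list.
companies = [f"company_{i}" for i in range(200)]
random.shuffle(companies)
treatment, control = companies[:100], companies[100:]  # campaign vs. leave alone

# Placeholder outcome data: whether each company pledged to go cage-free
# within the study window (simulated here with made-up rates).
pledged = {c: random.random() < (0.4 if c in treatment else 0.1)
           for c in companies}

# Simple difference-in-means estimate of the campaign's effect:
def pledge_rate(group):
    return sum(pledged[c] for c in group) / len(group)

effect = pledge_rate(treatment) - pledge_rate(control)
print(f"estimated effect on pledge rate: {effect:+.2f}")
```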
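And a minimal sketch of the theoretical option: a Monte Carlo estimate of cost-effectiveness, where every parameter range below is a made-up placeholder you would replace with researched distributions:

```python
import random

random.seed(0)

def simulate_cost_per_hen_year():
    # All ranges below are illustrative guesses, not real estimates.
    campaign_cost = random.uniform(1e5, 1e6)      # $ spent on one corporate campaign
    p_success = random.uniform(0.1, 0.8)          # chance the target goes cage-free
    hens_affected = random.uniform(1e5, 5e6)      # hens covered if it succeeds
    years_sooner = random.uniform(1, 10)          # years before it would've happened anyway
    expected_hen_years = p_success * hens_affected * years_sooner
    return campaign_cost / expected_hen_years     # $ per hen-year improved

samples = sorted(simulate_cost_per_hen_year() for _ in range(10_000))
print(f"median: ${samples[len(samples) // 2]:.3f} per hen-year")
print(f"90% interval: ${samples[500]:.3f} to ${samples[9500]:.3f}")
```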
Of these methods, I’d argue that the biggest opportunities lie in working more on empirical cause prioritization. Why?
- Without empirical work, we have very little concrete evidence to use to make the other methods work. Theoretical and synthesis work both build on empirical work, aggregating its results. Philosophical work is best when telling us how to judge empirical work or what considerations empirical work may have missed. And forecasting work is best benchmarked against how accurately it predicts future empirical results (for one concrete scoring approach, see the sketch after this list).
- Empirical cause prioritization is neglected relative to its importance. I’d expect all five methods to be useful for cause prioritization, but there has lately only been significant investment in philosophical, theoretical, and synthesis-related work. No one appears to do any empirical work[4], which suggests it could be a neglected opportunity worth taking[5].
- Empirical work is more tractable. Tough problems like quantifying the current level of existential risk are not only unlikely to be resolved soon, but unlikely even to find satisfying frameworks for resolution. While empirical work is challenging, we at least already have a framework for how progress can be made[6].
- There are studies worth doing now that can be done now but aren’t being done, such as high-quality studies to determine whether certain interventions work to improve animal welfare (potentially combined with philosophical work to establish a relative value against global poverty work) and higher-quality work to establish the impact of EA movement building.
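As one example of how forecasting could be benchmarked against empirical results, here is a minimal sketch scoring hypothetical probabilistic predictions with the Brier score once outcomes resolve; the forecasts and outcomes below are invented:

```python
def brier_score(forecasts, outcomes):
    """Mean squared error of probabilities vs. 0/1 outcomes; lower is better."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

forecaster_a = [0.9, 0.7, 0.2, 0.6]   # predicted P(campaign succeeds)
forecaster_b = [0.5, 0.5, 0.5, 0.5]   # an uninformative baseline
resolved =     [1,   1,   0,   1]     # what actually happened

print(brier_score(forecaster_a, resolved))  # 0.075
print(brier_score(forecaster_b, resolved))  # 0.25
```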
Endnotes
[1]: Of course, I’m pretty skeptical this is the case. I don’t think anyone claims AI risk is literally 10^51 times more important, but I’m skeptical even of the 15,000x number.
[2]: This was adapted from Paul Christiano’s Untitled Prezi on cause prioritization.
[3]: One aspect I did not mention that OpenPhil does a lot of (and ACE does some of) is learning by giving. This could be considered empirical cause prioritization work that is a lot less structured and rigorous than an RCT but potentially allows one to learn the effects of particular interventions with high external validity.
[4]: Looking through the five outlined approaches to cause prioritization, here are some organizations that I think best fit each (certainly I am missing some organizations):
Empirical -- the Institute for Health Metrics and Evaluation, IDinsight, the Center for Global Development, the World Health Organization, the Centers for Disease Control, Cochrane, the Campbell Collaboration, the Abdul Latif Jameel Poverty Action Lab, Innovations for Poverty Action, the International Initiative for Impact Evaluation, and AI Impacts.
Theoretical/Philosophical -- Machine Intelligence Research Institute, Future of Humanity Institute, Future of Life Institute, Global Catastrophic Risks Institute, Foundational Research Institute, Global Priorities Project
Synthesis -- GiveWell, Open Philanthropy Project, Copenhagen Consensus, the Disease Control Priorities Project, Our World in Data, Animal Charity Evaluators
Forecasting -- Centre for Applied Rationality, Philip Tetlock
While there are a good number of organizations working on producing empirical studies, with the exception of AI Impacts these organizations operate entirely within global poverty as a cause, and none of them work to compare between causes.
Likewise, for the question of cage-free campaigns, there exists marginal investment in synthesis, theoretical approaches, and philosophical approaches, but no investment in empirical or forecasting approaches.
[5]: Similarly, forecasting work is also neglected, though I expect it to be somewhat less useful and tractable than empirical work. When thinking through how neglected something is we must always wonder if it is neglected for good reason. I don’t think this is true for empirical work -- empirical work seems to have important value for establishing initial numbers to work from. Instead, I think empirical work is neglected because it is very difficult and relatively expensive.
[6]: Now excuse me while I try to figure out what kind of methodology best fits an empirical study to figure out whether empirical methods to cause prioritization outperform other methods of cause prioritization.
undefined @ 2016-06-07T16:45 (+7)
> If you believe Michael Dickens’s calculations, AI safety work could be 10^51 times more important than work on global poverty if the estimate provided is literally correct
A common response here is that this looks at the long-term effects of AI safety but only the direct effects of global poverty work, when really you should look at the long-term effects of global poverty alleviation as well. That said, this may make global poverty look worse, because I believe its long-term effects look slightly more likely to be bad than good.
undefined @ 2016-06-09T16:47 (+4)
I feel like this post still doesn't square the discrepancy between causes where empirical work is more or less viable, but I applaud getting more 'harder'(?) evidence in any cause. The fact seems to be that existential risk reduction won't ever be amenable to the level of empirical evidence of other causes. We're both familiar with Katja Grace's seminal essay 'estimation is the best we have'. I guess my response to this for some time has been "your estimates aren't very good", which is why I'm glad Grace is working at AI Impacts to get better ones.
I feel like there is a history of organizations in some causes claiming that the theoretical value of their work is so high that other considerations are moot, but they don't make the case as to why their specific organization is effective. If, say, MIRI has the potential to be 15,000x more effective than the best poverty intervention, I'm concerned as to why nobody has been trying to evaluate whether in practice the work MIRI is doing actually fulfills that potential. Lately, this has been changing, which keeps me hopeful.
undefined @ 2016-06-09T01:58 (+3)
> There are studies worth doing now that can be done now but aren’t being done, such as high-quality studies to determine whether certain interventions work to improve animal welfare
There is a planned program to fund empirical research for animal advocacy: https://www.animalcharityevaluators.org/blog/introducing-our-new-advocacy-research-program-officer/