Cost-effectiveness of operations management in high-impact organisations

By Vasco Grilo🔸 @ 2022-11-27T10:33 (+48)

Summary

Following up on the challenge to quantify the impact of 80,000 Hours’ top career paths introduced by Nuño Sempere, I have estimated the cost-effectiveness of operations management in high-impact organisations (OM), which arguably include 80,000 Hours’ top-recommended organisations.

The mean cost-effectiveness estimates from my preferred method are summarised in the table below, in bp/G$ of existential risk reduction (1 bp = 0.01 % = 0.0001, and 1 G$ = 1 billion dollars). I present all results with 3 digits, but I think their resilience is such that they only represent order of magnitude estimates (i.e. they may well be wrong by a factor of 10^0.5 ≈ 3).

The full results are in this Sheet, and the calculations in this Colab[1].

| Mean cost-effectiveness (bp/G$) of… | Method 3 with truncation |
| --- | --- |
| Global health and development | 0.431 |
| Longtermism and catastrophic risk prevention | 3.95 |
| Animal welfare | 1.62 |
| Effective altruism infrastructure | 3.20 |
| The effective altruism community | 1.55 |
| Operations management in high-impact organisations | 7.01 |
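For readers unfamiliar with the units, the conversion from bp/G$ to existential risk reduction per dollar can be sketched as below. The function and constant names are illustrative and do not come from the Colab.

```python
BASIS_POINT = 1e-4  # 1 bp = 0.01 % = 0.0001
GIGA = 1e9          # 1 G$ = 1 billion dollars


def bp_per_gdollar_to_fraction_per_dollar(value_bp_per_gdollar: float) -> float:
    """Convert bp/G$ into absolute existential risk reduction per dollar."""
    return value_bp_per_gdollar * BASIS_POINT / GIGA


om_mean = 7.01  # bp/G$, mean for OM under method 3 with truncation
per_dollar = bp_per_gdollar_to_fraction_per_dollar(om_mean)
print(f"{om_mean} bp/G$ = {per_dollar:.3g} of existential risk per dollar")
print(f"Risk reduction bought by 1 G$: {per_dollar * GIGA:.4%}")
```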

Acknowledgements

Thanks to Abraham Rowe, Dan Hendrycks, Luke Freeman, Matt Lerner, Nuño Sempere, Sawyer Bernath, Stien van der Ploeg, and Tamay Besiroglu.

An oil painting by Matisse of operations management in high-impact organisations. Generated by OpenAI's DALL-E.

Methods

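I estimated the cost-effectiveness[2] from the product between:

  • The cost-effectiveness of the effective altruism community.
  • The multiplier of OM, i.e. the ratio between the expected marginal cost-effectiveness of operations management positions and that of all positions.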

This method assumes the cost-effectiveness distribution of the high-impact organisations is represented by the one theorised for the effective altruism community in the next section. Moreover, the cost-effectiveness estimates are only accurate to the extent that future opportunities are as valuable as recent ones.
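As a rough consistency check, the sketch below multiplies the mean cost-effectiveness of the effective altruism community by the mean multiplier of OM and compares the result with the reported mean for OM (truncation case, values copied from the Results tables). It assumes the two factors are sampled independently, in which case the mean of the product equals the product of the means. The variable names are illustrative.

```python
# Means (bp/G$) for the truncation case, copied from the Results tables.
community_mean = {"method 1": 1.31, "method 2": 1.79, "method 3": 1.55}
reported_om_mean = {"method 1": 5.92, "method 2": 8.10, "method 3": 7.01}
multiplier_mean = 4.53  # mean multiplier of OM with truncation

# Assumption: independence, so E[X * Y] = E[X] * E[Y].
implied_om_mean = {m: ce * multiplier_mean for m, ce in community_mean.items()}

for method, implied in implied_om_mean.items():
    reported = reported_om_mean[method]
    print(f"{method}: implied {implied:.2f} vs reported {reported:.2f} bp/G$")
```

The small differences that remain are consistent with rounding and Monte Carlo noise.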

The calculations are in this Colab[1].

Cost-effectiveness of the effective altruism community

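I calculated the cost-effectiveness of the effective altruism community from the mean cost-effectiveness weighted by cumulative spending between 1 January 2020 and 15 August 2022 of 4 cause areas:

  • Global health and development.
  • Longtermism and catastrophic risk prevention.
  • Animal welfare.
  • Effective altruism infrastructure.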

These are the areas for which Tyler Maule collected data here[3] (see EA Forum post here). I adjusted the 2020 and 2021 values for inflation using the calculator from in2013dollars.

I computed the cost-effectiveness of each area using 3 methods. All rely on distributions which are either truncated to the 99 % confidence interval[4] (CI) or not truncated, in order to understand the effect of outliers. The parameters of the pre-truncation distributions, which are the final distributions for the non-truncation cases, are provided below.

Method 1

I defined the cost-effectiveness of longtermism and catastrophic risk prevention as a truncated lognormal distribution with pre-truncation 5th and 95th percentiles equal to 1 and 10 bp/G$ in terms of existential risk reduction. These are the lower and upper bounds proposed here by Linchuan Zhang.
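A minimal sketch of this distribution, using only the Python standard library. It assumes "truncated to the 99 % CI" means keeping the central interval between the 0.5th and 99.5th percentiles of the pre-truncation distribution; the function names are illustrative. Under that assumption the closed-form means agree with the 4.04 and 3.95 bp/G$ reported in the Results.

```python
from math import exp, log
from statistics import NormalDist

STD_NORMAL = NormalDist()


def lognormal_params(p5: float, p95: float) -> tuple[float, float]:
    """Return (mu, sigma) of ln(X) for a lognormal with the given 5th/95th percentiles."""
    z95 = STD_NORMAL.inv_cdf(0.95)
    mu = (log(p5) + log(p95)) / 2
    sigma = (log(p95) - log(p5)) / (2 * z95)
    return mu, sigma


def lognormal_mean(mu: float, sigma: float, ci: float = 1.0) -> float:
    """Mean of a lognormal truncated to its central `ci` interval (ci=1: no truncation)."""
    full_mean = exp(mu + sigma**2 / 2)
    if ci >= 1.0:
        return full_mean
    z = STD_NORMAL.inv_cdf((1 + ci) / 2)  # about 2.576 for the 99 % CI
    # Closed form for truncation at exp(mu - z * sigma) and exp(mu + z * sigma).
    kept_mass = STD_NORMAL.cdf(z - sigma) - STD_NORMAL.cdf(-z - sigma)
    return full_mean * kept_mass / ci


mu, sigma = lognormal_params(1, 10)
mean_full = lognormal_mean(mu, sigma)
mean_trunc = lognormal_mean(mu, sigma, ci=0.99)
print(f"not truncated: {mean_full:.2f} bp/G$, truncated: {mean_trunc:.2f} bp/G$")
```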

I assumed the ratio between the cost-effectiveness of i) longtermism and catastrophic risk prevention and ii) global health and development to be a truncated lognormal distribution with pre-truncation 5th and 95th percentiles equal to 10 and 100. These are the lower and upper bounds guessed here by Benjamin Todd for the ratio between the cost-effectiveness of the Long-Term Future Fund (LTFF) and Global Health and Development Fund (search for “10-100x more cost-effective”).
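For the non-truncation case, the resulting cost-effectiveness of global health and development has a closed form, sketched below. It assumes the longtermism cost-effectiveness and the ratio are independent, so their quotient is again lognormal. The variable names are illustrative. The outputs agree with the mean and percentiles reported for method 1 without truncation (0.163, 0.0196 and 0.509 bp/G$).

```python
from math import exp, log, sqrt
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.95)


def lognormal_params(p5, p95):
    """Return (mu, sigma) of ln(X) from the 5th and 95th percentiles."""
    mu = (log(p5) + log(p95)) / 2
    sigma = (log(p95) - log(p5)) / (2 * Z95)
    return mu, sigma


mu_long, sigma_long = lognormal_params(1, 10)      # longtermism, bp/G$
mu_ratio, sigma_ratio = lognormal_params(10, 100)  # longtermism / global health

# ln(X / R) = ln(X) - ln(R): for independent X and R the quotient is lognormal.
mu_ghd = mu_long - mu_ratio
sigma_ghd = sqrt(sigma_long**2 + sigma_ratio**2)

ghd_mean = exp(mu_ghd + sigma_ghd**2 / 2)
ghd_p5 = exp(mu_ghd - Z95 * sigma_ghd)
ghd_p95 = exp(mu_ghd + Z95 * sigma_ghd)
print(f"mean {ghd_mean:.3g}, 5th {ghd_p5:.3g}, 95th {ghd_p95:.3g} bp/G$")
```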

I considered the ratio between the cost-effectiveness of i) animal welfare and ii) global health and development to be a truncated lognormal distribution with pre-truncation 5th and 95th percentiles equal to 270 μ and 211. I computed these multiplying:

This adjustment is analogous to assuming the mean moral weight is directly proportional to the number of neurons.

I set the cost-effectiveness of effective altruism infrastructure to the mean cost-effectiveness weighted by cumulative spending between 1 January 2020 and 15 August 2022 of the other 3 areas.

Method 2

I obtained the cost-effectiveness of each area based on the 27 answers regarding the mean cost-effectiveness of the Effective Altruism Funds given in the EA talent needs survey - 2018. Such answers are in the table below, whose last column was calculated by me.

Mean cost-effectiveness relative to the Effective Altruism Infrastructure Fund (%)

| Fund | 10th percentile | Median | 90th percentile | Geometric mean between the 10th and 90th percentiles |
| --- | --- | --- | --- | --- |
| Global Health and Development Fund | 1 | 5 | 63 | 7.94 |
| Long-Term Future Fund | 16 | 167 | 283 | 67.3 |
| Animal Welfare Fund | 3 | 10 | 107 | 17.9 |
| Effective Altruism Infrastructure Fund | 100 | 100 | 100 | 100 |
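The last column can be reproduced with a few lines of Python (the dictionary layout is illustrative):

```python
from math import sqrt

# (10th percentile, 90th percentile) in % of the Effective Altruism Infrastructure Fund.
percentiles = {
    "Global Health and Development Fund": (1, 63),
    "Long-Term Future Fund": (16, 283),
    "Animal Welfare Fund": (3, 107),
    "Effective Altruism Infrastructure Fund": (100, 100),
}

# Geometric mean of two numbers: the square root of their product.
geometric_means = {fund: sqrt(p10 * p90) for fund, (p10, p90) in percentiles.items()}

for fund, value in geometric_means.items():
    print(f"{fund}: {value:.3g}")
```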

I defined the cost-effectiveness of longtermism and catastrophic risk prevention as in method 1.

For the other areas, I assumed a truncated lognormal distribution with pre-truncation 10th and 90th percentiles of the area relative to those of longtermism and catastrophic risk prevention based on the 10th and 90th percentiles in the table above:
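A sketch of one possible reading of this step, which is an assumption rather than something confirmed by the Colab: each percentile of an area equals the matching percentile of longtermism scaled by the survey ratio between the area's fund and the Long-Term Future Fund. For the non-truncation case, this reading gives means close to the 0.762 and 1.35 bp/G$ reported in the Results for global health and development and animal welfare. All names are illustrative.

```python
from math import exp, log
from statistics import NormalDist

N = NormalDist()
Z90, Z95 = N.inv_cdf(0.90), N.inv_cdf(0.95)

# Longtermism: lognormal with 5th and 95th percentiles of 1 and 10 bp/G$ (method 1).
mu_long = (log(1) + log(10)) / 2
sigma_long = (log(10) - log(1)) / (2 * Z95)
long_p10 = exp(mu_long - Z90 * sigma_long)
long_p90 = exp(mu_long + Z90 * sigma_long)

# Survey percentiles (% of the EA Infrastructure Fund): (10th, 90th).
survey = {
    "Long-Term Future Fund": (16, 283),
    "Global Health and Development Fund": (1, 63),
    "Animal Welfare Fund": (3, 107),
}
ltff_p10, ltff_p90 = survey["Long-Term Future Fund"]


def area_mean(fund: str) -> float:
    """Mean (bp/G$) of the lognormal with 10th/90th percentiles scaled from longtermism."""
    p10, p90 = survey[fund]
    area_p10 = long_p10 * p10 / ltff_p10  # assumption: scale percentile by percentile
    area_p90 = long_p90 * p90 / ltff_p90
    mu = (log(area_p10) + log(area_p90)) / 2
    sigma = (log(area_p90) - log(area_p10)) / (2 * Z90)
    return exp(mu + sigma**2 / 2)


for fund in ("Global Health and Development Fund", "Animal Welfare Fund"):
    print(f"{fund}: mean {area_mean(fund):.3g} bp/G$")
```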

Method 3

I defined the cost-effectiveness of each area as the mean between those of methods 1 and 2. My best guesses are the results for the truncation case of this method.
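Since the mean of an average is the average of the means, the reported means for method 3 can be checked directly against those of methods 1 and 2. The sketch below does so for the truncation case, with values copied from the Results tables and shortened area labels chosen for illustration.

```python
# Means (bp/G$) with truncation, copied from the Results tables.
method_1 = {"global health": 0.156, "longtermism": 3.95, "animal welfare": 1.95,
            "infrastructure": 1.31, "community": 1.31}
method_2 = {"global health": 0.705, "longtermism": 3.95, "animal welfare": 1.29,
            "infrastructure": 5.10, "community": 1.79}
reported_method_3 = {"global health": 0.431, "longtermism": 3.95, "animal welfare": 1.62,
                     "infrastructure": 3.20, "community": 1.55}

# Method 3: arithmetic mean of methods 1 and 2, area by area.
method_3 = {area: (method_1[area] + method_2[area]) / 2 for area in method_1}

for area, value in method_3.items():
    print(f"{area}: computed {value:.3f} vs reported {reported_method_3[area]:.3f}")
```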

Multiplier of operations management in high-impact organisations

I defined the multiplier of OM as the median of the 11 distributions described in the table below, and also experimented with truncating to the 99 % CI the component distributions of each of them. I used the median with the intention of following Jaime Sevilla’s best guess on how to aggregate forecasts:

I obtained the distributions via asking i) 75 people working at 80,000 Hours’ top-recommended organisations (on October 30 and 31), and ii) 259 people in the Slack “EA Forecasting & Epistemics” (on November 2) for the multiplier of OM of their organisations and the effective altruism community. You can see here the list of emails I contacted, and the messages regarding i) and ii) (see “Emails” and “Slack message”, respectively).

I should emphasise the multiplier of OM may depend a lot on the organisation (e.g. its size, maturity, cause area, and what it understands as operations[8]), specific position (e.g. seniority), and personal fit[9]. Consequently, aggregating all estimates as I did has serious limitations.
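To illustrate why the choice of aggregation matters, the sketch below pools three of the distributions from the tables that follow, once with the sample-wise median and once with the sample-wise mean. It is a simplified illustration and not the Colab code: it assumes the pooling is applied draw by draw, uses only 3 of the 11 distributions, applies no truncation, and all names are illustrative.

```python
import random
from math import log
from statistics import NormalDist, mean, median

rng = random.Random(0)  # fixed seed so the sketch is reproducible
Z95 = NormalDist().inv_cdf(0.95)
Z75 = NormalDist().inv_cdf(0.75)


def sample_lognormal(x_lo, x_hi, z):
    """Draw from a lognormal whose quantiles at -z and +z (in sd units) are x_lo and x_hi."""
    mu = (log(x_lo) + log(x_hi)) / 2
    sigma = (log(x_hi) - log(x_lo)) / (2 * z)
    return rng.lognormvariate(mu, sigma)


def draw_estimates():
    """One joint draw from three respondents' distributions (illustrative subset)."""
    return [
        sample_lognormal(0.75, 3.5, Z95),  # 5th/95th percentiles 0.75 and 3.5
        sample_lognormal(100, 1000, Z75),  # 25th/75th percentiles 100 and 1 k
        rng.gauss(7, 83),                  # normal with mean 7 and sd 83
    ]


draws = [draw_estimates() for _ in range(20_000)]
pooled_by_median = [median(d) for d in draws]
pooled_by_mean = [mean(d) for d in draws]

print(f"mean after median pooling: {mean(pooled_by_median):.1f}")
print(f"mean after mean pooling:   {mean(pooled_by_mean):.1f}")
```

The heavy-tailed second distribution dominates the mean-pooled result, whereas the median-pooled result is far less sensitive to it.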

Multiplier of OM for own organisations (N = 7)

Distribution (without truncation)

Mean (5th to 95th percentile)

Product between[10]:

  • Lognormal[11] with 5th and 95th percentiles 0.25 and 30.
  • Reciprocal of lognormal[11] with 5th and 95th percentiles 0.6 and 1.3.

9.19 (0.274 to 35.1)

Lognormal with 5th and 95th percentiles 0.75 and 3.5.

1.81 (0.750 to 3.50)

Product between[12]:

  • Lognormal with 5th and 95th percentiles 5.6 and 28.
  • Reciprocal of lognormal with 5th and 95th percentiles 50 and 75.
  • Reciprocal of lognormal with 5th and 95th percentiles 0.1 and 10.
  • Lognormal with 5th and 95th percentiles 20 and 90.

29.1 (0.672 to 112)

Lognormal with 25th and 75th percentiles 100 and 1 k.

1.36 k (19.1 to 5.25 k)

Normal[11] with mean and standard deviation 7 and 83.

6.99 (-129 to 144)

Product between[13]:

  • Normal with 5th and 95th percentiles 90 and 110.
  • Reciprocal of normal with 5th and 95th percentiles 50 and 70.
  • Reciprocal of the sum between:
    • Normal with 5th and 95th percentiles 3.5 and 4.
    • Reciprocal of normal with 5th and 95th percentiles 3.25 and 3.6.

0.420 (0.339 to 0.512)

Product between[14]:

  • Lognormal with 5th and 95th percentiles 0.25 and 2.
  • Reciprocal of lognormal with 5th and 95th percentiles 10 and 25.
  • Lognormal with 5th and 95th percentiles 30 and 70.

2.69 (0.609 to 6.89)

Multiplier of OM for the effective altruism community (N = 4)

Distribution (without truncation)

Mean (5th to 95th percentile)

Lognormal with 5th and 95th percentiles 0.75 and 3.5.

1.81 (0.750 to 3.50)

Product between[12]:

  • Normal with 5th and 95th percentiles -0.87 and 29.
  • Reciprocal of lognormal with 5th and 95th percentiles 20 and 60.
  • Reciprocal of lognormal with 5th and 95th percentiles 0.2 and 10.
  • Lognormal with 5th and 95th percentiles 20 and 60.

22.5 (-0.354 m to 88.8)

Lognormal with 25th and 75th percentiles 10 and 100.

136 (1.90 to 524)

Normal[11] with mean and standard deviation 30 and 161.

30.0 (-235 to 295)
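The summary statistics of the single-lognormal rows in the two tables above can be approximately recovered in closed form from the quoted quantiles, as sketched below (the function name is illustrative). Small differences in the third digit relative to the tables are expected, since the tabulated values presumably come from random samples.

```python
from math import exp, log
from statistics import NormalDist

N = NormalDist()


def lognormal_summary(q_lo, x_lo, q_hi, x_hi):
    """Mean, 5th and 95th percentiles of the lognormal passing through two quantiles."""
    z_lo, z_hi = N.inv_cdf(q_lo), N.inv_cdf(q_hi)
    sigma = (log(x_hi) - log(x_lo)) / (z_hi - z_lo)
    mu = log(x_lo) - z_lo * sigma
    z95 = N.inv_cdf(0.95)
    return exp(mu + sigma**2 / 2), exp(mu - z95 * sigma), exp(mu + z95 * sigma)


rows = {
    "5th/95th percentiles 0.75 and 3.5": lognormal_summary(0.05, 0.75, 0.95, 3.5),
    "25th/75th percentiles 100 and 1 k": lognormal_summary(0.25, 100, 0.75, 1000),
    "25th/75th percentiles 10 and 100": lognormal_summary(0.25, 10, 0.75, 100),
}

for label, (avg, p5, p95) in rows.items():
    print(f"{label}: {avg:.3g} ({p5:.3g} to {p95:.3g})")
```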

The low number of estimates is evidence of:

In addition, what OM refers to is somewhat unclear. Based on what 80,000 Hours describes here, I think it can refer either to operations more broadly, or to senior operations positions which are further down the career path.

I also thought about estimating the multiplier based on the number of vacancies and candidates for operations management and all positions, but decided not to, given their unclear relationship with value. As vacancies decrease and candidates increase for a given position, the difference between the factual and counterfactual decreases, but the value of the factual increases.

Results

The tables below have the results for the mean, 5th percentile, and 95th percentile of the multiplier of OM and cost-effectiveness metrics. This Sheet contains more results (see tab “TOC”).

Multiplier of operations management

| Multiplier of OM… | Mean | 5th percentile | 95th percentile |
| --- | --- | --- | --- |
| Without truncation | 4.55 | 1.30 | 13.2 |
| With truncation | 4.53 | 1.31 | 13.0 |

Cost-effectiveness

Method 1

Without truncation

| Cost-effectiveness (bp/G$) of… | Mean | 5th percentile | 95th percentile |
| --- | --- | --- | --- |
| Global health and development | 0.163 | 0.0196 | 0.509 |
| Longtermism and catastrophic risk prevention | 4.04 | 1.00 | 10.0 |
| Animal welfare | 41.1 | 3.85 μ | 4.42 |
| Effective altruism infrastructure | 5.61 | 0.283 | 3.41 |
| The effective altruism community | 5.61 | 0.283 | 3.41 |
| Operations management in high-impact organisations | 22.5 | 0.643 | 20.6 |

With truncation

| Cost-effectiveness (bp/G$) of… | Mean | 5th percentile | 95th percentile |
| --- | --- | --- | --- |
| Global health and development | 0.156 | 0.0208 | 0.481 |
| Longtermism and catastrophic risk prevention | 3.95 | 1.03 | 9.71 |
| Animal welfare | 1.95 | 4.67 μ | 3.64 |
| Effective altruism infrastructure | 1.31 | 0.291 | 3.14 |
| The effective altruism community | 1.31 | 0.291 | 3.14 |
| Operations management in high-impact organisations | 5.92 | 0.666 | 18.7 |

Method 2

Without truncation

| Cost-effectiveness (bp/G$) of… | Mean | 5th percentile | 95th percentile |
| --- | --- | --- | --- |
| Global health and development | 0.762 | 0.0522 | 2.66 |
| Longtermism and catastrophic risk prevention | 4.04 | 1.00 | 10.0 |
| Animal welfare | 1.35 | 0.170 | 4.18 |
| Effective altruism infrastructure | 5.14 | 2.35 | 9.39 |
| The effective altruism community | 1.85 | 0.766 | 3.80 |
| Operations management in high-impact organisations | 8.42 | 1.54 | 26.0 |

With truncation

| Cost-effectiveness (bp/G$) of… | Mean | 5th percentile | 95th percentile |
| --- | --- | --- | --- |
| Global health and development | 0.705 | 0.0549 | 2.53 |
| Longtermism and catastrophic risk prevention | 3.95 | 1.03 | 9.71 |
| Animal welfare | 1.29 | 0.177 | 4.01 |
| Effective altruism infrastructure | 5.10 | 2.39 | 9.23 |
| The effective altruism community | 1.79 | 0.773 | 3.59 |
| Operations management in high-impact organisations | 8.10 | 1.56 | 24.7 |

Method 3

Without truncation

| Cost-effectiveness (bp/G$) of… | Mean | 5th percentile | 95th percentile |
| --- | --- | --- | --- |
| Global health and development | 0.463 | 0.0660 | 1.43 |
| Longtermism and catastrophic risk prevention | 4.04 | 1.00 | 10.0 |
| Animal welfare | 21.2 | 0.101 | 4.11 |
| Effective altruism infrastructure | 5.37 | 1.60 | 5.76 |
| The effective altruism community | 3.73 | 0.567 | 3.58 |
| Operations management in high-impact organisations | 15.5 | 1.17 | 23.8 |

With truncation

| Cost-effectiveness (bp/G$) of… | Mean | 5th percentile | 95th percentile |
| --- | --- | --- | --- |
| Global health and development | 0.431 | 0.0676 | 1.36 |
| Longtermism and catastrophic risk prevention | 3.95 | 1.03 | 9.71 |
| Animal welfare | 1.62 | 0.104 | 3.49 |
| Effective altruism infrastructure | 3.20 | 1.62 | 5.48 |
| The effective altruism community | 1.55 | 0.574 | 3.27 |
| Operations management in high-impact organisations | 7.01 | 1.19 | 21.6 |

Discussion

Multiplier of operations management

The means of the multiplier of OM for the non-truncation and truncation cases are 4.55 and 4.53. I thought organisations would organise themselves such that the expected (marginal) cost-effectiveness would be similar for all positions, so I was somewhat surprised to get values roughly 5 times as high as 1.

The p-values for the null hypothesis that the multiplier of OM follows distributions with the same shape as the ones I obtained, but with a mean of 1, are 1.27 % and 1.16 % for the non-truncation and truncation cases[15]. So one can be reasonably confident that the multiplier is higher than 1, but only if the 11 answers I got are representative of the effective altruism community, which is far from certain.

The mean multiplier of OM for the truncation case is 99.5 % of that for the non-truncation case. This means the outliers of each of the individual estimates practically do not affect the results.

Cost-effectiveness

For the truncation case, the mean cost-effectiveness metrics as a fraction of that of longtermism and catastrophic risk prevention for methods 1, 2 and 3 are:
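The lines below recompute these fractions from the means in the Results tables. The dictionary layout and variable names are illustrative only.

```python
LONGTERMISM_MEAN = 3.95  # bp/G$, truncation case (the same for all 3 methods)

# Means (bp/G$) with truncation, copied from the Results tables.
means_with_truncation = {
    "method 1": {"Global health and development": 0.156, "Animal welfare": 1.95,
                 "Effective altruism infrastructure": 1.31,
                 "The effective altruism community": 1.31, "OM": 5.92},
    "method 2": {"Global health and development": 0.705, "Animal welfare": 1.29,
                 "Effective altruism infrastructure": 5.10,
                 "The effective altruism community": 1.79, "OM": 8.10},
    "method 3": {"Global health and development": 0.431, "Animal welfare": 1.62,
                 "Effective altruism infrastructure": 3.20,
                 "The effective altruism community": 1.55, "OM": 7.01},
}

fractions = {
    method: {metric: value / LONGTERMISM_MEAN for metric, value in metrics.items()}
    for method, metrics in means_with_truncation.items()
}

for method, metrics in fractions.items():
    for metric, fraction in metrics.items():
        print(f"{method}, {metric}: {fraction:.3g}")
```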

Consequently, for the truncation case:

For the non-truncation case:

I believe the results for the truncation case are more accurate because it is hard to represent outliers well based on subjective 90 % CIs. For example, I think the cost-effectiveness of animal welfare for the non-truncation case is too heavy-tailed, with its mean being 9.98 k (= 41.1/0.00411) times its median. The heavy-tailedness of this same metric for the truncation case seems more reasonable, with its mean being 474 (= 1.95/0.00411) times its median.

The mean cost-effectiveness of OM for the truncation case as a fraction of that for the non-truncation case is:
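These fractions follow from the means for OM in the Results tables, as the short sketch below shows (variable names are illustrative):

```python
# Mean cost-effectiveness of OM (bp/G$), copied from the Results tables.
om_mean_without_truncation = {"method 1": 22.5, "method 2": 8.42, "method 3": 15.5}
om_mean_with_truncation = {"method 1": 5.92, "method 2": 8.10, "method 3": 7.01}

truncation_fraction = {
    method: om_mean_with_truncation[method] / om_mean_without_truncation[method]
    for method in om_mean_with_truncation
}

for method, fraction in truncation_fraction.items():
    print(f"{method}: {fraction:.1%}")
```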

This means the outliers have a material effect for methods 1 and 3, but not for 2.

The 5th percentile, median, and 95th percentile of the cost-effectiveness of OM for method 3 with truncation are 17.0 %, 58.1 % and 3.09 times the mean of 7.01 bp/G$. I expected the distribution to be more heavy-tailed, but I arguably had in mind the wider distribution of potential applicants instead of the narrower one of those selected for working in the positions.

Further work

Some potential avenues for further work are, in my descending order of importance:

Footnotes

  1. For 10 M random samples, each truncation and non-truncation case takes me 5 min to run and save the results.

  2. In this text, cost-effectiveness refers to marginal cost-effectiveness.

  3. This is a link to my copy, which contains data last updated on August 15. You can find Tyler’s Sheet here.

  4. If X and X_pre_trunc are the truncated and pre-truncation distributions, and p is the probability of X_pre_trunc being between the minimum and maximum of X, the probability of X being between a and b is 1/p times as large as that of X_pre_trunc being between a and b, which are 2 values between the minimum and maximum of X.

  5. These 2 values consider a wide moral weight distribution whose 95th percentile is 60 k (= 17.2 / (270 μ)) times as large as the 5th percentile.

  6. According to Jaime:

     When the data includes poorly calibrated outliers, if it's possible exclude them and take the geometric mean. If not, we should use a pooling method resistant to outliers. The median is one such popular aggregation method.

  7. The median significantly attenuates the effect of the outliers. For the truncation (non-truncation) case, the mean multiplier with all estimates is 1.54 (1.97) times that without the 4th and 8th estimates using the median, but 11.0 (12.5) times using the mean.

  8. According to Stien van der Ploeg, Animal Charity Evaluators’ Executive Director:

     Some organisations consider any non-program related positions to fall under operations, including communications, strategy, HR, finance, and fundraising roles. Other groups only consider specific administrative jobs like finance, personnel, and organisational support as operations, and some interpret it even narrower.

  9. The mean person working in OM has a much better fit than the respective mean applicant, but there may still be material variation amongst workers.

  10. The 1st/2nd distribution represents the marginal impact/cost per unit time of OM relative to all positions.

  11. The respondent mentioned this type of distribution was an approximation.

  12. The 1st/2nd distribution represents the marginal impact/cost per unit time of OM, and the 3rd/4th that of all positions.

  13. The 1st distribution represents the marginal impact per unit time of OM as a multiple of the mean marginal impact of all positions, and the 2nd/3rd the marginal cost per unit time of OM / all positions.

  14. The 1st/2nd distribution represents the marginal impact/cost per unit time of OM, and the 3rd/4th that of all positions.

  15. Calculated in J2 of the last 2 tabs of the Sheet.

  16. According to the data collected by Tyler Maule, the spending as a fraction of the total of the FTX Foundation between January 1 and August 15 on longtermism and catastrophic risk prevention, and effective altruism infrastructure was 73.5 % and 26.5 %.

  17. According to the data collected in July 2021 by Benjamin Todd here, the “FTX team” represented 35.8 % (= 16.5/46.1) of the funds committed to effective altruism. Assuming the cost-effectiveness is inversely proportional to the committed funds, losing those from FTX leads to it being 1.56 (= 1/(1 - 35.8 %)) times as high.


CBConfessions @ 2022-11-27T23:57 (+11)

Hey there! Really appreciate you doing work on this! 

As someone who is not well-versed in cost-effectiveness analysis, but is very keen on learning about this work - could you make the summary a bit more accessible? When reading it I was like: 1) what the hell is bp/g$

(I know there is a wiki page linked, but I think most people don't want to click on hyperlinks while reading a summary, they just want to decide whether to commit to reading the post.)

After I checked the wiki link I realised bp means 0.0001 but after a quick glance I'm still unsure what giga is (note that I'm writing this comment at 1 am, so the fault can definitely be mine)

Vasco Grilo @ 2022-11-28T07:18 (+4)

Hi CB,

Thanks for asking, and being keen to learn about this work!

I understand this notation may not be the most easily comprehensible at first sight. Using it more often will arguably make it more understandable in the long run.

As you say, 1 bp = 0.01 % = 0.0001. 1 G = 10^9, so 1 G$ means 1 billion dollars.

I see this is your 1st comment. Welcome to the EA Forum!

CBConfessions @ 2022-11-28T20:33 (+3)

thanks!

NunoSempere @ 2023-02-13T12:03 (+5)

So coming back and looking at this, one central mystery is: why is the multiplier so high? Some possible answers might be:

I'm also confused about whether operations roles are all similar enough that they can be modelled the same way.

So if I was working more on this, I'd probably:

CristinaSchmidtIbáñez @ 2022-11-27T18:34 (+5)

Hi Vasco,

Thanks so much for all the effort put into attempting to do this calculation. I really appreciate it!

I have one main question (+ a meta comment) around the calculation of the cost-effectiveness of OM:

In the email you sent you asked for "(I_OM / C_OM) / (I_A / C_A), where:

What impact metric is meant by I? I read through your post, but maybe I missed something...

It would've helped a lot to see this information in the post itself to follow reasoning transparency.

Vasco Grilo @ 2022-11-27T19:26 (+2)

Hi Cristina,

Thanks for the kind words!

What impact metric is meant by I?

Good point. I have not mentioned what is meant by impact. Some thoughts:

  • It depends on one's moral views, but I guess leaving it undefined was a fine way of accounting for moral uncertainty. People presumably gave their estimates based on their moral views.
  • In practice, each organisation has its own heuristics for impact. For Against Malaria Foundation, it may be the number of distributed bednets. For a research organisation, it may include the quantity and quality of publications. More generally, success in the objectives and key results of the organisation would be an indication of producing impact.

In the email you sent you asked for "(I_OM / C_OM) / (I_A / C_A)

I would say I did not exactly ask people to use that formula. I asked:

For both [organisation] and the effective altruism community, I would be happy to know your best guesses for the ratio between the expected marginal cost-effectiveness of:

  • Operations management positions.
  • All positions.

Then, I gave that formula as an example:

One example way of estimating the ratio is from (I_OM / C_OM) / (I_A / C_A)

Only 3 people explicitly used this formula (in agreement with the tables of this section).

It would've helped a lot to see this information in the post itself to follow reasoning transparency.

Thanks for the feedback. I thought linking to it was fine, as the formula was just a suggestion intended to illustrate what I meant by ratio between the expected marginal cost-effectiveness of operations management and all positions.

ChanaMessinger @ 2022-12-14T12:13 (+4)

Out of curiosity, what programs did you use for your calculations? Squiggle, sheets, other?

Vasco Grilo @ 2022-12-14T22:28 (+4)

Hi Chana,

I used this Colab. The link was in the section Methods, but I have added a sentence to the Summary with it such that it is more visible now. Thanks.

ChanaMessinger @ 2022-12-14T22:30 (+4)

Sorry, missed that! Thanks so much.

Max Ghenis @ 2022-11-29T00:47 (+3)

Really interesting post. Not to hijack it, but I didn't know about the EA Forecasting & Epistemics Slack. Can you point me to info on it or how to join?

Vasco Grilo @ 2022-12-14T22:37 (+3)

Thanks, Max!

Regarding the Slack, from here (I appreciate it is not the most visible place; I do not know whether there is more information elsewhere):

There's some public discussion on Github. There's also a private EA/epistemics Slack with a few channels for Squiggle; message Ozzie to join.

I guess you can reach out to Ozzie here.