GiveWell's updated estimate of deworming and decay

By GiveWell @ 2023-04-03T21:07 (+46)

This is a linkpost to https://docs.google.com/document/d/1Vh-bhYkGteK4z48LpJHIbV_CbsSnp6__gYZRGZSQlfE/edit

Author: Alex Cohen, GiveWell Senior Researcher

This document describes the rationale for the decay adjustment in our deworming cost-effectiveness analysis. We have incorporated this adjustment thanks to criticism from the Happier Lives Institute.

Editor's note: In our earlier comment, we said we should have characterized the results from Lång and Nystedt (2018) as mixed rather than positive. We have now updated the spreadsheet so that study is correctly color-coded, and we have updated the relevant part of the post. In the "Prior for decay" section, we edited one sentence as indicated below.

Original text: "Of those 10 studies, 3 found decreasing effects on income, 3 found increasing effects, and 4 found mixed effects (either similar effects across time periods, different patterns across males and females, or increases and then decreases over the life cycle)."

Revised text: "Of those 10 studies, 3 suggest decreasing effects over time, 2 suggest increasing effects over time, and 5 show mixed effects (either similar effects across time periods, different patterns across males and females, or increases and then decreases over the life cycle)."

In a nutshell

What we did previously

The main piece of evidence we use for the long-term effects of deworming is an RCT in Kenya that measures effects on income at ~10 years (KLPS-2), ~15 years (KLPS-3) and ~20 years (KLPS-4) after children receive deworming treatment.[1]

Our typical approach has been to pool effects on earnings and consumption across three survey rounds, which suggests an effect of 0.109 on ln income.

Because deworming has limited high-quality evidence for an impact on income, we substantially discount this observed effect from the three survey rounds.[2] Our prior is that a plausible effect of deworming in the RCT in Kenya is ~1%.[3] The RCT evidence, which finds an effect of ~10%, updates us slightly from that prior.[4] Using an informal Bayesian updating framework, our best guess is that the effect for individuals in the RCT is ~1.4%, i.e., we apply a replicability adjustment of 13% to the findings from the RCT in Kenya.[5]
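
As a rough check on the arithmetic in this paragraph, the figures can be combined as in the sketch below. The variable names are illustrative. The "implied weight" line reads the update as simple linear shrinkage between prior and observation, which is an assumption made for illustration and not a description of the actual informal Bayesian procedure.

```python
# Figures quoted in the paragraph above.
prior_effect = 0.01              # prior: ~1% effect on income
observed_effect = 0.109          # pooled RCT estimate on ln income
replicability_adjustment = 0.13  # 13% replicability adjustment

# Adjusted effect: observed effect scaled by the replicability adjustment.
adjusted_effect = observed_effect * replicability_adjustment

# Illustrative assumption: treat the update as linear shrinkage,
#   adjusted = prior + weight * (observed - prior),
# and solve for the weight placed on the RCT evidence.
implied_weight = (adjusted_effect - prior_effect) / (observed_effect - prior_effect)

print(f"adjusted effect: {adjusted_effect:.4f}")
print(f"implied weight on RCT evidence: {implied_weight:.3f}")
```

Under that reading, the adjusted effect comes out at about 1.4%, and the RCT moves the estimate only a few percent of the way from the prior toward the observed effect.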

We then assume that any effects of deworming last for 40 years once an individual enters the labor force (assumed to be 8 years after receiving deworming treatment). We assume these follow-ups provide noisy estimates of the same effect, and our prior is that effects should be constant over this 40-year period.
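
The timing assumptions above amount to valuing a delayed, constant stream of yearly benefits. A minimal sketch of that calculation follows. The `present_value` helper and the 4% discount rate are hypothetical choices for illustration, so the output is not meant to reproduce the net present values in the cost-effectiveness analysis, which depend on the CEA's own parameters.

```python
def present_value(effects, start_year, discount_rate):
    """Sum a stream of yearly effects, discounting each one back to year 0."""
    return sum(
        effect / (1 + discount_rate) ** (start_year + i)
        for i, effect in enumerate(effects)
    )

YEARS_OF_BENEFIT = 40  # from the text: effects last 40 years
START_YEAR = 8         # from the text: labor force entry 8 years after treatment
DISCOUNT_RATE = 0.04   # hypothetical value, for illustration only

# Constant-effect prior: the same ~1.4% effect in every working year.
constant_stream = [0.014] * YEARS_OF_BENEFIT
pv_constant = present_value(constant_stream, START_YEAR, DISCOUNT_RATE)
print(f"present value of a constant 1.4% effect: {pv_constant:.3f}")
```

Passing a list that shrinks over time instead of `constant_stream` gives the corresponding value under decay, which is always lower than the constant case when the starting effect is the same.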

Incorporating the possibility that there is decay over time

An alternative interpretation is that the estimates across surveys reflect different effects of childhood deworming over time. If we take the survey estimates at face value, there appears to be a decline in effect over time (0.234 to 0.069 to 0.039 in ln earnings, KLPS-2 to KLPS-4, and 0.30 to 0.09 in ln consumption, KLPS-3 to KLPS-4).
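
To make the size of the apparent decline concrete, the round-to-round changes in these point estimates can be computed directly. This is a sketch using only the figures quoted above; the `relative_change` helper is illustrative, and the calculation ignores the uncertainty around each estimate.

```python
def relative_change(earlier, later):
    """Proportional change from an earlier estimate to a later one."""
    return later / earlier - 1

# Point estimates quoted above, by survey round.
ln_earnings = {"KLPS-2": 0.234, "KLPS-3": 0.069, "KLPS-4": 0.039}
ln_consumption = {"KLPS-3": 0.30, "KLPS-4": 0.09}

for label, series in [("earnings", ln_earnings), ("consumption", ln_consumption)]:
    rounds = list(series)
    # Compare each survey round with the one that follows it.
    for earlier, later in zip(rounds, rounds[1:]):
        change = relative_change(series[earlier], series[later])
        print(f"ln {label}, {earlier} -> {later}: {change:+.0%}")
```

Taken at face value, the earnings estimate falls by roughly 70% between KLPS-2 and KLPS-3 and by a further 43% or so between KLPS-3 and KLPS-4.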

We think it’s possible these changes reflect true declines in effect over time and that we should account for this possibility in our CEA. We do this by (i) putting some weight on these providing estimates of different effects over time, (ii) updating from a prior that effects are constant over time, and (iii) applying separate replicability adjustments for each survey round and using effects from KLPS-2 to KLPS-4 to extrapolate declines 40 years out.

Weight on decay

When we look at evidence like this, we typically favor pooled results when there is no a priori reason to believe effects differ over time, across geographies, etc. (e.g., a meta-analysis of RCTs for a malaria prevention program). In cases where there’s more reason to believe the effects vary across time or geographies, we’re more likely to focus on “sub-group” results, rather than pooled effects. In either case, this is often a subjective assessment.

In this case, we’re uncertain about whether to pool results or not and think there are reasons for and against putting more weight on decline in effects over time. As a result, we put 50% weight on the surveys capturing noisy estimates of the same effect and 50% weight on surveys capturing true changes in effects on earnings and consumption over time.

Reasons for putting more weight on effects varying over time:

Reasons to put less weight on effects varying over time:

We’re uncertain about the appropriate weight to put on the interpretation that income effects are different (and declining) over time, and this is a key judgment call in our analysis.

Prior for decay

A key assumption is that we’re updating from a prior that the effects on increased income are 1% and constant over 40 years. If we had reason to believe instead that effects should decay, based on evidence from similar interventions, then we’d be updating from a prior of decay and would include steeper decay. We would also put more weight on the interpretation that the different estimates for the effect of deworming over time are capturing true differences and less on the interpretation that these are noisy estimates of the same effect.

In order to assess whether the impact of deworming on income increases, decreases, or remains the same over the lifecourse of those receiving deworming treatment as children, we carried out a shallow literature review and consulted with experts and GiveWell researchers regarding studies of childhood interventions with multiple adult follow-ups. We looked for studies that examined long-term effects of improvements in early-life health (e.g., weight/height), cognition, and education, which we think are some of the plausible mechanisms through which deworming leads to impacts on later-life income.

We found 10 longitudinal studies with at least two adult follow-ups from a number of countries examining the impact of a range of childhood interventions or conditions (see this table), in addition to the deworming study (Hamory et al. 2021).

Of those 10 studies, 3 suggest decreasing effects over time, 2 suggest increasing effects over time, and 5 show mixed effects (either similar effects across time periods, different patterns across males and females, or increases and then decreases over the life cycle).

Based on this, we think it makes sense to continue to assume as a prior that income effects would be constant over time. I have low confidence in these estimates, though, and it’s possible further work could lead to a different conclusion. Specific areas of uncertainty and areas for further investigation are:

Replicability adjustment for each survey

We use a replicability adjustment in our deworming CEA to capture our best guess at the portion of the income effects of deworming found in the Kenya RCT that would be found if a perfect experiment could be run again under the same conditions. To create this adjustment, we use a broadly Bayesian framework.[13] Our “prior” in this context is our best guess at what we would have expected the effect size of deworming on developmental effects to be in absence of results from the Kenya RCT. We then update our prior using the Kenya RCT and our views on the strengths and limitations of the evidence base.

To incorporate decay into our estimates, we apply separate replicability adjustments for each follow-up survey from the Kenya RCT (KLPS-2, KLPS-3, and KLPS-4). Under each story (different estimates over time vs. noisy estimates of the same effect), we update from a prior of 1% impact on consumption over time. I updated replicability adjustments for each of the estimates (10 years, 15 years, 20 years) by running the same replicability adjustment calculations for each year. In the case where we interpret these as different estimates over time, I follow a similar approach to our current CEA but update separately for each time period.

Our current approach in the deworming CEA:

The alternative approach (which views KLPS-2, KLPS-3, and KLPS-4 as capturing different income effects and so allows there to be decay):

Like our current replicability adjustments, these estimates hinge on judgment calls and assumptions.

More broadly, there may be alternative approaches to updating from priors on both the average effect of deworming and decay over time that are more accurate. We’ve chosen to model decay by (i) specifying a prior on the effects of deworming on income over time and (ii) updating from this prior by putting some weight on the RCT in Kenya finding decay in effects over time and some weight on the RCT capturing noisy estimates of the same effect over time. There may be better or more formal approaches to model decay (e.g., by putting priors on the initial effect of deworming and a prior on decay, then updating both based on the KLPS surveys). Ultimately, we chose the current approach because it seems like the most straightforward and most consistent with what we’re currently doing, but it’s possible alternative approaches are better.

Bottom line adjustment factor

Our best guess is that we should apply a -10% adjustment due to the possibility of decay in effects over time.

In the model where we assume KLPS 2-4 provide noisy estimates of the same effect, we estimate an average effect of deworming of 0.109 on ln income. When we update from our skeptical prior, our best guess is ~1.4% over 40 years for a net present value of 0.115.

In the model where we assume KLPS 2-4 provide different estimates over time, we estimate effects of 0.23, 0.19, and 0.07 on ln income at 10, 15, and 20 years post-deworming. When we update from our skeptical prior, our best guess is ~1.6%, ~1.5%, and ~1.3% at years 10, 15, and 20, for a net present value of 0.093 over the full time period.

We put 50% weight on each of these interpretations, which lowers the total effect by about 10% (relative to putting 100% weight on KLPS 2-4 capturing noisy estimates of the same effect).
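
The weighting step can be reproduced from the two net present values given in this section. The sketch below uses only those figures; the variable names are illustrative.

```python
# Net present values quoted in this section.
npv_pooled = 0.115  # KLPS 2-4 as noisy estimates of the same effect
npv_decay = 0.093   # KLPS 2-4 as different effects over time
weight_decay = 0.5  # weight placed on the decay interpretation

# Weighted average of the two interpretations.
npv_blended = (1 - weight_decay) * npv_pooled + weight_decay * npv_decay

# Change relative to putting 100% weight on the pooled interpretation.
adjustment = npv_blended / npv_pooled - 1

print(f"blended NPV: {npv_blended:.3f}")
print(f"adjustment: {adjustment:+.1%}")
```

The blended value is 0.104, a reduction of a little under 10% relative to the pooled case, which rounds to the -10% adjustment.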

Sources

Title Link
Baird et al. 2016 https://doi.org/10.1093%2Fqje%2Fqjw022
GiveWell, "2023 cost-effectiveness analysis - version 2" https://docs.google.com/spreadsheets/d/10JFJaWnFAEKmsv5XjXqGqEoMUx0eM7x3WYwu_vC7FRw/edit#gid=472531943
GiveWell, "Context on deworming replicability adjustment (2020)" https://docs.google.com/document/d/1-F5sZBq6FD6E73SWkKFhwMR9gCdKUCTfp9dOe0I-1vw/edit
GiveWell, "Deworming decay adjustment: deworming effect calculation (2023)" https://docs.google.com/spreadsheets/d/1iUcIjfudwQlPOftbG_e5axbiAfRr15ie7rvtTiiDFvU/edit#gid=1321957472
GiveWell, "Deworming decay adjustment: KLPS 4 Deworming Effect Size Parameter Update (2023)" https://docs.google.com/spreadsheets/d/1bbZWTjklQ5hc2i4zynCR6gq2TqR0x3HCgC-ssK6i0FI/edit#gid=1667455426
GiveWell, "Deworming decay adjustment: replicability adjustment (2023)" https://docs.google.com/spreadsheets/d/1u6kDrFbns-2_M46G_POro09RcKQZ0TSy8qq-IXK6Z1o/edit#gid=2002315610
GiveWell, "Deworming decay adjustment: replicability adjustment (informal Bayesian analysis, 2023)" https://docs.google.com/spreadsheets/d/1kOh43pku33n7AQyAv6X43Z9bTUThcE43ZZcOeD-YUoo/edit#gid=251688210
GiveWell, "Deworming Effect Size Parameter Update - KLPS 4 Results 11.07.19" https://docs.google.com/spreadsheets/d/1MNEPqRhIndfpeJT3LxCK1Bvrn-N81n0PlvsXRjc3fb4/edit#gid=0
GiveWell, "Deworming replicability adjustment (2020)" https://docs.google.com/document/d/1PZfYXegWco0qrmQnQBjclZeq4uUdWfpE5xdBYx0cAEU/edit
GiveWell, "Deworming replicability adjustment 2019" https://docs.google.com/spreadsheets/d/1ZvX6XI5AKxTYQJbyxlEkmuf6LyYlt18kyfeqRGs-aaM/edit#gid=0
GiveWell, "Long-term effects literature review" https://docs.google.com/spreadsheets/d/1n1fRU77jvxFlIkiHF4zoQn3I6n3KcpjGfLih_W_JniM/edit#gid=0
GiveWell, "UC Berkeley — KLPS-4 Survey" https://www.givewell.org/research/incubation-grants/uc-berkeley/april-2017-grant
GiveWell, "Why we can’t take expected value estimates literally (even when they’re unbiased)," 2011 https://blog.givewell.org/2011/08/18/why-we-cant-take-expected-value-estimates-literally-even-when-theyre-unbiased/
Hamory et al. 2021 https://doi.org/10.1073/pnas.2023185118
McGuire, Dupret, and Plant, "Deworming and decay: replicating GiveWell’s cost-effectiveness analysis," 2022 https://web.archive.org/web/20230221162055/https://forum.effectivealtruism.org/posts/MKiqGvijAXfcBHCYJ/deworming-and-decay-replicating-givewell-s-cost

Notes


  1. “Wage earnings and self-employment profits were collected in KLPS-2, KLPS-3, and KLPS-4; agricultural profits were collected in KLPS-3 and KLPS-4. Annual per capita household earnings are calculated as the sum of wage employment earnings, self-employment profits, and agricultural profits across all household members, divided by the number of household members. Household earnings are only available in KLPS-4.” Hamory et al. 2021, Table 1. ↩︎

  2. We describe the rationale for this here and here. ↩︎

  3. This is based on evidence from health and other possible mechanisms that might contribute to deworming’s long term effects. Our calculations are in this spreadsheet. 1% is the weighted average of effects from different mechanisms (these cells) with the weights on these different mechanisms (these cells). ↩︎

  4. The treatment effect of deworming on ln(income) in the Miguel and Kremer 2004 study population is 0.109, based on our pooling of results across rounds. We describe the rationale for this parameter in the documents linked from this cell in our cost-effectiveness analysis. ↩︎

  5. We describe our informal Bayesian approach here and here. The rationale for our 13% replicability adjustment for deworming is in the documents linked from this cell. ↩︎

  6. Hamory et al. 2021, Appendix, Fig. S3. ↩︎

  7. “It is worth noting that one quarter of both the treatment and control groups are still in school by the time of the survey (Table II), and labor market outcomes are less meaningful for this group.” Baird et al. 2016, IV.C. “Impact on Labor Hours and Occupation,” paragraph 1. ↩︎

  8. Hamory et al. 2021, Appendix, Fig. S3. “Deworming Treatment Effects by Survey Round, B. Annual Individual Earnings.” ↩︎

  9. “Annual individual earnings are calculated as the sum of wage employment across all jobs; nonagricultural self-employment profit across all business; and individual farming profit, defined as net profit generated from noncrop and crop farming activities for which the respondent provided all reported household labor hours and was the main decision maker within the last 12 mo. Wage earnings and self-employment profits were collected in KLPS-2, KLPS-3, and KLPS-4; agricultural profits were collected in KLPS-3 and KLPS-4.” Hamory et al. 2021, Table 2. ↩︎

  10. Hamory et al. 2021, Appendix, Fig. S3. “Deworming Treatment Effects by Survey Round, B. Annual Individual Earnings.” ↩︎

  11. Hamory et al. 2021, Appendix, Fig. S3. “Deworming Treatment Effects by Survey Round, A. Annual Per-Capita Consumption.” ↩︎

  12. “The measurement of economic outcomes was also improved: KLPS round 4 (KLPS-4) incorporates a detailed consumption expenditure questionnaire (modeled on the World Bank Living Standards Measurement Survey; see ref. 32) for all respondents, and round 3 collected this for a representative subsample.” Hamory et al. 2021, Introduction, paragraph 5. ↩︎

  13. See this blog post for further discussion of GiveWell's approach to using broadly Bayesian frameworks in our analyses. ↩︎

  14. 1.4% equals 0.109 treatment effect * 13% replicability adjustment. ↩︎

  15. See discussion above, under "Reasons to put less weight on effects varying over time." ↩︎


JoelMcGuire @ 2023-04-03T23:07 (+36)

Hi Alex, I’m heartened to see GiveWell engage with and update based on our previous work! 

[Edited to expand on takeaway]

My overall impression is:

[Note: I threw this comment together rather quickly, but I wanted to get something out there that gave my approximate views.]

1. There are several things I like about this update: 

 

2. There are a few things that I think could be a bit clearer:

 

My next two comments are related to some limitations of this update that Alex acknowledges: 

  • It’s possible we’ve missed some relevant studies altogether.
  • We have not tried to formally combine these to get point estimates over time or attempted to weight studies based on relevance, study quality, etc.
  • We are combining studies that may have little ability to inform what we’d expect from deworming (twin studies, childcare programs, etc.).
  • It could be possible to re-assess other studies measuring long-term benefits of early childhood health interventions. When we set our prior, we excluded studies that did not report separate effects on income at different time periods. We guess that for several of these studies, it would be possible to re-analyze the primary data and create estimates of the effect on income at different time periods.

3. After briefly looking over the literature review GiveWell uses to build a prior on the long-term effects of deworming, it seems like further research would lead to different results. 

 

4. Progress towards building a firmer prior seems straightforward. Is GiveWell planning on refining its prior for deworming's trajectory? Or incentivizing more research on this topic, e.g., via a prize or a bounty? Here are some reasons why I think further progress may not be difficult:


  1. ^

    “Higher ln earnings effects from KLPS-2 to KLPS-3 are driven by lower control group earnings in KLPS-2 ($330 vs. $1165).[8] In KLPS-3, researchers started measuring farming profits in addition to other forms of earnings,[9] so part of the apparent increase in control group earnings from KLPS-2 to KLPS-3 is likely driven by a change in measurement, not real standards of living or catch-up growth.”

  2. ^

    “We found 10 longitudinal studies with at least two adult follow-ups from a number of countries examining the impact of a range of childhood interventions or conditions (see this table), in addition to the deworming study (Hamory et al. 2021). Of those 10 studies, 3 found decreasing effects on income, 3 found increasing effects, and 4 found mixed effects (either similar effects across time periods, different patterns across males and females, or increases and then decreases over the life cycle). Based on this, we think it makes sense to continue to assume as a prior that income effects would be constant over time. I have low confidence in these estimates, though, and it’s possible further work could lead to a different conclusion.”

GiveWell @ 2023-04-13T13:53 (+7)

Hi, Joel,

Alex here, responding to your comment. Thank you for taking the time to give us this feedback!

In response to some of your specific points: 

  • You're right that we should have characterized the results from Lång and Nystedt (2018) as mixed rather than positive. Thanks for pointing out that mistake. We will update the spreadsheet so that study is correctly color-coded, and update the relevant part of the post. With this adjustment, among the studies we looked at, 3 suggest decreasing effects over time, 2 suggest increasing effects over time, and 5 show mixed effects. This still doesn't seem like it adds up to strong evidence for either increasing or decreasing effects, so my prior of a flat effect over time remains the same. 
  • We excluded Duflo et al. 2021 because it didn't appear to include much about life cycle impacts on income from the intervention. It does report some increases in income for women in the treatment group between 2019 and 2020. However, I'd be reluctant to interpret that as evidence for increases over adulthood, because it represents only one year and because it compares pre-COVID results with results during COVID, which means other factors are probably at play.
  • That said, I agree that a more in-depth analysis might lead to a different prior for how we should expect early-life health interventions to affect income over the life cycle. We didn't prioritize an in-depth analysis for this adjustment, but we would be open to more work to create a better-informed prior of deworming's income effects over time. This would require deeper engagement with the studies we looked at to better understand their methodologies, relevance to deworming, and other factors. At the moment, it's not a high-priority project for GiveWell staff, but we're considering an external partnership to explore this further. We imagine that having a better grasp on how income effects change over time could inform our analysis not just of deworming but also of other programs we support, including vitamin A supplementation and seasonal malaria chemoprevention.  

We'll continue to share here if more work on this leads us to further updates. 

Best,
Alex

Kaleem @ 2023-04-03T21:21 (+9)

Hi Alex, thanks for this really detailed post, and for the work you put into the analysis! It's a really nice example of how internal critique in the EA community has led to a tangible update.

My question: (How) Should the average reader/non-expert update on this -10% re-weighting? Like, if ~-10% is decided on as the official re-weighting, will this have a non-negligible effect on how we should view the cost-effectiveness of deworming programs etc?

Guy Raveh @ 2023-04-03T22:24 (+5)

And furthermore, will it change how funds from the 'all grants' fund are spent?

GiveWell @ 2023-04-07T22:58 (+10)

Hi, Kaleem and Guy!

This is Miranda Kaplan, communications associate at GiveWell. I'll answer both questions here, since they're closely related.

This adjustment updated GiveWell's overall impression of deworming by around 10%. But the bottom-line takeaway on deworming—which is that it's one of the most cost-effective programs we know of in some locations, but we have a higher degree of uncertainty about it than we do our top charities—hasn't changed much, and we think that should probably continue to be the takeaway for followers of our work. 

You can see the effect of our adjustment across all locations and all deworming programs we've supported in our cost-effectiveness analysis change tracker. Before this adjustment, there was already wide variation in our cost-effectiveness estimates for these programs—as high as 38.3x cash for Deworm the World's program in Kenya, and as low as -1x cash for SCI Foundation's program on Unguja, Zanzibar.

We can't say yet what the impact of the decay adjustment will be on GiveWell's overall grantmaking in the deworming space, either using All Grants Fund donations or using other sources. Our approach to grantmaking hasn't changed—we will continue to assess funding gaps for deworming on a case-by-case basis, and consider filling those gaps that clear our cost-effectiveness bar. In a few cases, locations that previously looked cost-effective enough to meet our bar for funding (currently 10x cash) now don't meet that standard. For example, as a result of this adjustment, the estimated cost-effectiveness of Deworm the World's program in Lagos state, Nigeria, dropped to 8.9x cash from 9.9x cash. But for most locations, this change didn't cause a decisive shift in cost-effectiveness that would affect a funding decision.

I hope that's helpful! 

Best,

Miranda

Guy Raveh @ 2023-04-08T20:04 (+2)

Hi Miranda, thanks for the very clear answer!

I don't necessarily agree with the method of allocation, but from a broad perspective I'm happy to see that a small change in estimates translates to a small, but still meaningful, adjustment in allocation.