How we use back-of-the-envelope calculations in our grantmaking

By Coefficient Giving @ 2025-05-28T23:22 (+79)

*Emma Buckland, Program Associate in Farm Animal Welfare, drafts a quick back-of-the-envelope calculation for her colleagues.*

This post was written by Open Philanthropy Global Health and Wellbeing Staff. Original blog post here.

At Open Philanthropy, our mission is to help others as much as we can with the resources available to us. When making tough calls about which grants will help the most, we rely on a tool that helps clarify expected impact: back-of-the-envelope calculations, or BOTECs.

BOTECs are rough quantitative models that estimate potential grants’ social return on investment (SROI). Open Phil program staff use them to compare a grant's expected benefits to its estimated costs.

Despite the name, most BOTECs wouldn’t actually fit on the back of an envelope. They often involve dozens of assumptions, nested calculations, and scenario planning — especially when we’re modeling larger grants or entire focus areas.

Staff across our Global Health and Wellbeing (GHW) focus areas use BOTECs to help answer a key question: Will this grant likely clear our bar for GHW grants?^[1] That bar is currently set at a return of slightly over 2,000x in “Open Phil dollars.”^[2]

A key purpose of the BOTEC exercise is to figure out: what would have to be true for this grant to meet or exceed the bar? This helps us focus our attention on the most decision-relevant assumptions.

What BOTECs can (and can’t) tell us

BOTECs are flexible tools, and their structure varies depending on the type of grant.

For health-focused grants, we often estimate how many disability-adjusted life years (DALYs) a grant might avert.
For farm animal welfare (FAW), we might model reductions in suffering.
For movement-building work, we might estimate how much funding an organization could raise for highly cost-effective charities.

Some BOTECs are backward-looking, based on a grantee’s track record. Others are forward-looking, built around a grantee’s theory of change and how likely we think it is to succeed.

To explore the range of possible outcomes, program staff often build multiple versions of a BOTEC — using optimistic and pessimistic inputs to see how sensitive our estimates are to certain parameters.^[3] This isn’t about reaching a precise “answer,” but about stress-testing our thinking under different plausible scenarios.

We don’t rely on BOTECs alone to make grant decisions; they are one piece of a broader picture. We also weigh the grantee’s leadership and track record, strategic considerations the BOTEC can’t easily quantify, and potential for “unknown upside” or unusual leverage.^[4] Grantmaking is always uncertain, and BOTECs don’t remove that uncertainty — but they do help us navigate it.

Sample BOTECs

Below are four sample BOTECs, simplified a bit for readability. We use pseudonyms to preserve confidentiality, but all draw from real grants we made across vastly different focus areas. We are sharing these examples to give a sense of how we think, where we (frequently!) debate, and how we try to allocate resources to help others as much as we can. As always, we welcome your feedback here.

Trialing statins for treating tuberculosis

This grant falls within our focus area of global health R&D.

A recent Phase IIa/b trial tested atorvastatin as an additional treatment for tuberculosis (TB), which kills more than a million people each year.^[5] The trial found that atorvastatin helped to reduce TB infection levels more quickly and showed promising improvements in reducing lung damage, based on chest X-rays.

This grant would support a Phase IIc trial to evaluate atorvastatin as an addition to standard TB treatments.^[6] The proposed trial would investigate whether different dosing strategies of atorvastatin improve survival rates, accelerate the reduction of TB infection levels, and reduce long-term damage to the lungs. The grant would also support efforts to develop a novel sweat-based diagnostic test for TB.

Benefits of the grant

How many people could benefit? To estimate this, we projected the number of people who would be on drug-susceptible TB treatment by the time results from a future Phase III confirmatory trial would be available. We started with the current number of new cases and made three assumptions: that new cases will decline by 2% per year until the Phase III trial (which we assumed wouldn’t finish until eight years after making this grant), that the current proportion of treated cases would remain constant, and that 96% of cases would be drug-susceptible. This resulted in an estimate of 5.5 million people who might benefit from the treatment.
Speedup: We estimate that funding this trial would result in a one-year speedup of atorvastatin being adopted as a TB treatment.^[7]
How effective could the treatment be? Among patients with drug-susceptible TB who are receiving treatment, the current mortality rate is approximately 6%. We do not know how effective this new treatment might be (this is the reason for the trial), but we estimate that a positive result might achieve a 10% relative reduction in mortality (i.e. an absolute reduction of 0.6 percentage points, from 6% to 5.4%).
What share of patients would actually adopt this treatment? We estimate that 60% of adult patients would add atorvastatin if it proves effective at reducing TB mortality. This would be a relatively high uptake — higher than the figures we use in most of our therapeutic development BOTECs — but we thought it was justifiable given that atorvastatin is already generic, cheap, and widely available.
Implication for lives saved: Combining the estimates above implies that a successful trial would allow a 0.6 percentage point absolute reduction in mortality for 3.3 million people (60% of the 5.5 million on treatment), which equates to preventing roughly 19,900 TB deaths per year once the intervention has been scaled up.
DALYs: This impact would represent 539,500 DALYs (~19,900 deaths prevented x 27 DALYs per death from TB).^[8] This would yield a total value of $54 billion in “Open Phil dollars” (539,500 DALYs x $100,000, the amount of value we assign to averting a DALY).
Total benefit: Since the grant would bring forward one additional year of benefits worth $OP 54 billion per year, the total value of the grant is $OP 54 billion.

Costs: The current trial would cost $320,000. However, it is only a Phase IIc trial, so there would be additional research and rollout costs to consider.^[9] We roughly estimate these costs to be $7 million.^[10]

Probability of success: We estimate a 40% probability that Phase II and Phase III trials will succeed.

Final SROI: The grant’s SROI is the benefit ($OP 54 billion) divided by the all-in program costs ($7 million), all multiplied by the estimated chance of success (40%). This yields ~3,000x SROI, well above our ~2,000x bar. That is, we expect the grant to generate 3,000 times as much value — in Open Phil dollars — as it costs.
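
The arithmetic above can be retraced in a few lines. This is a minimal sketch rather than the actual spreadsheet: the variable names are illustrative, and the inputs are the rounded figures quoted in the text, so the intermediate results land slightly below the published ones (roughly 19,800 deaths averted rather than 19,900), presumably because the published figures were computed from unrounded inputs. The last step also asks the "what would have to be true" question for the probability of success.

```python
# Rounded inputs quoted in the post; all variable names are illustrative.
people_on_treatment = 5_500_000   # projected drug-susceptible TB patients on treatment
adoption_share = 0.60             # share of patients who would add atorvastatin
baseline_mortality = 0.06         # mortality among treated drug-susceptible patients
relative_reduction = 0.10         # assumed relative reduction in mortality
dalys_per_death = 27              # DALYs per death from TB
value_per_daly = 100_000          # $OP assigned to averting one DALY
speedup_years = 1                 # years of earlier adoption credited to the grant
all_in_cost = 7_000_000           # all-in program costs in USD
p_success = 0.40                  # chance that Phase II and III trials succeed
bar = 2_000                       # approximate GHW bar

# Benefits: deaths averted per year, converted to DALYs and then to $OP.
deaths_averted_per_year = (
    people_on_treatment * adoption_share * baseline_mortality * relative_reduction
)
annual_benefit = deaths_averted_per_year * dalys_per_death * value_per_daly
total_benefit = annual_benefit * speedup_years

# SROI: benefit divided by all-in costs, weighted by the chance of success.
sroi = total_benefit / all_in_cost * p_success

# Break-even: the probability of success at which the grant exactly meets the bar.
break_even_p = bar * all_in_cost / total_benefit

print(f"Deaths averted per year: {deaths_averted_per_year:,.0f}")
print(f"Total benefit: $OP {total_benefit / 1e9:.1f} billion")
print(f"SROI: {sroi:,.0f}x")
print(f"Break-even probability of success: {break_even_p:.0%}")
```

With these rounded inputs the sketch gives an SROI of a little over 3,000x, and the grant would still meet the ~2,000x bar as long as the chance of success stayed above roughly 26%.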

Some limitations of this BOTEC

This BOTEC leaves out some additional benefits — such as the potential value of a new TB diagnostic tool — as well as any improvements to quality of life beyond preventing deaths.

In addition, we only have very rough estimates of what subsequent trials/studies will cost, and we don’t include the possibility that newer TB treatments in development reduce the TB burden beyond the current downward trend.

Updates since making the grant

More than a year has passed since we made this grant, and during that time, we’ve continued to learn about the wider TB landscape. If we were modeling it today, here’s what we’d change about the BOTEC:

In retrospect, we should have assigned a lower chance of success — closer to 30% than 40%.
Conversely, we should have estimated a more dramatic speed-up for atorvastatin adoption — closer to three years than one. We think this is a more appropriate estimate because there weren’t many funding options available for this work, and because we supported the lead researcher at a relatively early stage of his research (before his most promising results were published).
We would also model a longer time period before impact to account for widespread adoption of the drug (rather than just completion of a final clinical trial). We tentatively think widespread adoption might take 18 years, versus 8 years for the clinical trial. The extra 10 years would reduce the grant's SROI, because we expect TB prevalence to decrease over time. That means a drug widely available in 2042 rather than 2032 wouldn't have as many potential patients to reach.

While the lower chance of success and longer time to impact would reduce the SROI, the larger speedup would substantially increase it. Taken together, these changes would increase the final SROI.
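
To see why the net effect is positive, here is a rough sketch of how the three changes might combine. It rests on assumptions that are not in the original BOTEC: that SROI scales linearly with the chance of success, the years of speedup, and the number of patients still on treatment at the time of impact, and that the 2% annual decline in cases continues through the extra ten years. The 30% and three-year values stand in for "closer to 30%" and "closer to three years".

```python
# Illustrative comparison of the original and updated assumptions.
# Assumption (not from the post): SROI is proportional to
# p_success * speedup_years * patients remaining at the time of impact.
annual_decline = 0.02  # assumed yearly decline in new TB cases

original = {"p_success": 0.40, "speedup_years": 1, "years_to_impact": 8}
updated = {"p_success": 0.30, "speedup_years": 3, "years_to_impact": 18}


def relative_value(params):
    """Return a quantity proportional to SROI under the stated assumption."""
    patients_factor = (1 - annual_decline) ** params["years_to_impact"]
    return params["p_success"] * params["speedup_years"] * patients_factor


ratio = relative_value(updated) / relative_value(original)
print(f"Updated SROI relative to original: {ratio:.2f}x")
```

Under these assumptions the updated estimate comes out a bit under twice the original, which matches the direction described above. The exact multiple depends heavily on the assumed decline rate and on how far the revised inputs actually move.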

Reducing the cost of pXRF screening

This grant falls within our focus area of global public health policy.

Current methods for testing lead in paint are resource- and skill-intensive. Portable XRF-based measurement (pXRF) uses a handheld device to quickly and cheaply detect lead in paint and other materials. If validated, the method would allow inspectors to more easily identify lead in paint and prevent it from causing harm.

This grant would fund the nonprofit organization Lead the Way^[11] to:

Collect 200-300 painted samples, measured using both pXRF and sophisticated lab techniques (at two universities for additional validation), and develop standard methods for using pXRF.
Put the method into practice by engaging closely with a national government, training local personnel, and using pXRF for measurement.
Test out the method for rapid screening of wet paints.

The key assumptions in the theory of change^[12] are:

Lead in paint represents 5% (confidence range: 5-20%) of the overall lead burden in low- and middle-income countries (LMICs).
For this grant to drive reductions in leaded paint, it needs to increase detection (and through detection, enforcement) by actors using pXRF to test paints with lead levels above regulatory standards.
- There is a 50% chance (range: 40-75%) that this grant will validate pXRF enough for it to be published as a reference method.
- If pXRF is published as a reference method, we assume that countries representing 20% (range: 10-40%) of the overall burden of lead exposure in LMICs will adopt pXRF to test for lead in paint.
- In total, we estimate a 50% chance that this grant affects sources responsible for about 1% of the overall lead burden in LMICs.
  - (I.e. 5% of lead exposure comes from paint x 20% of countries adopting the method = 1%.)
However, “affecting” doesn’t mean “eliminating.” Even if some countries adopt pXRF, only a fraction of lead exposure from paint will be eliminated in those countries. We’ll need to estimate that fraction:
- We assume that half the paint by market volume has high lead levels. We also assume that the formal^[13] leaded paint market accounts for 60% (range: 40-80%) of the lead problem and that pXRF, if adopted, will shrink that market by 20% (range: 10-30%).
- Additionally, if pXRF is adopted, we assume it will be used to make a reasonable dent in the informal market. We assume 40% (range: 20-60%) by volume of the leaded paint market is from smaller, informal producers, and that this method could reduce the size of that market by 50% (range: 25-90%).
- So overall, we would expect a 20% reduction in the 60% of the market that is formal (12% of the total), plus a 50% reduction in the other 40% of the market that is informal (20% of the total), for an overall reduction of 32% in lead exposure from paint (assuming that LMICs widely adopt pXRF).
Based on the above calculations, what fraction of lead exposure would we expect this intervention to eliminate?
- 50% (chance of pXRF becoming a reference method) x (20% of the paint burden in LMICs, which in turn is 5% of the overall lead burden) x [(20% of the 60% market share from the formal sector) + (50% of the remaining 40% from the informal sector)] = 0.16%
How much benefit will come from a 0.16% reduction?
- We estimate that impacts from scanning will eliminate 0.16% of the total lead burden in LMICs.
- We estimate that the total lead burden in LMICs = 117M babies x 5 ug/dL x $OP 3,900 per ug/dL^[14] = ~$OP 2.3 trillion.
- So the total expected benefit is 0.16% of ~$OP 2.3 trillion (2,281,500,000,000), which equals ~$OP 3.6 billion (3,650,400,000).
- We also make important but speculative assumptions that (i) pXRF development only deserves 10% (range: 5-20%) of the credit for the total impact of pXRF scanning,^[15] and (ii) that the grant speeds up development of the method by 8 years.
  - Considering these factors, we multiply $OP 3.6 billion by 0.1 to account for the development share of credit for method use = $OP 360 million.
  - We then multiply by 8 to account for years of speedup = $OP 2.9 billion.
- Finally, we multiply by 0.8 to account for Open Phil’s share of credit for development = $OP 2.3 billion.^[16]

The expected value of this grant is ($OP 2.3 billion in benefit) / (total cost of project [~$350,000]) = ~6,500x SROI in expectation. For any one parameter, changing the best guess to the more pessimistic end of the confidence range still yields an SROI above the ~2,000x bar.
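
The chain of multiplications above, and the one-parameter-at-a-time check, can be sketched as follows. This is an illustrative reconstruction, not the original model: the function and parameter names are made up, and the choice of which end of each range counts as "pessimistic" is an assumption. The formal and informal market shares are left at their best guesses because they have to move together, and the paint share is already at the low end of its range.

```python
def pxrf_botec(
    paint_share=0.05,       # paint's share of the overall lead burden
    p_reference=0.50,       # chance pXRF is published as a reference method
    adoption_share=0.20,    # share of the LMIC lead burden in adopting countries
    formal_share=0.60,      # formal producers' share of the leaded paint market
    formal_cut=0.20,        # reduction in the formal leaded paint market
    informal_share=0.40,    # informal producers' share of the market
    informal_cut=0.50,      # reduction in the informal leaded paint market
    babies=117_000_000,     # births per year in LMICs
    blood_lead=5,           # ug/dL
    value_per_unit=3_900,   # $OP per ug/dL per birth
    dev_credit=0.10,        # credit assigned to method development
    speedup_years=8,        # years by which the grant speeds up the method
    op_credit=0.80,         # Open Phil's share of credit for development
    cost=350_000,           # total project cost in USD
):
    """Return (share of lead burden eliminated, benefit in $OP, SROI)."""
    market_cut = formal_share * formal_cut + informal_share * informal_cut
    burden_cut = p_reference * adoption_share * paint_share * market_cut
    total_burden = babies * blood_lead * value_per_unit
    benefit = burden_cut * total_burden * dev_credit * speedup_years * op_credit
    return burden_cut, benefit, benefit / cost


burden_cut, benefit, sroi = pxrf_botec()
print(f"Share of lead burden eliminated: {burden_cut:.2%}")
print(f"Expected benefit: $OP {benefit / 1e9:.1f} billion")
print(f"SROI: {sroi:,.0f}x")

# Move one parameter at a time to the assumed pessimistic end of its range.
pessimistic = {
    "p_reference": 0.40,
    "adoption_share": 0.10,
    "formal_cut": 0.10,
    "informal_cut": 0.25,
    "dev_credit": 0.05,
}
one_at_a_time = {
    name: pxrf_botec(**{name: value})[2] for name, value in pessimistic.items()
}
for name, value in one_at_a_time.items():
    print(f"{name:>15}: {value:,.0f}x")
```

Without intermediate rounding the sketch lands at roughly 6,700x, in line with the ~6,500x figure obtained from the rounded $OP 2.3 billion benefit. Each single pessimistic change in this sketch leaves the SROI above 3,000x.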

Supporting a fundraising organization for effective charities

This grant falls within our focus area of effective giving & careers.

Giving Ground is the only professional organization promoting effective giving in its country. With just 3.5 full-time employees, it raises money almost exclusively for GiveWell's top charities.

In many cases, Open Philanthropy program staff use backward-looking BOTECs to estimate future impact — so to assess a potential new grant, we created a BOTEC based on Giving Ground’s past performance:

With ~$300,000 in operating expenses, Giving Ground raised ~$800,000 for GiveWell's top charities over roughly the last 2.5 years.
We think that most of this sum is directly attributable to Giving Ground. A survey of Giving Ground donors suggested that 15% of the money allocated came from people who had already encountered effective giving in other contexts, so we applied a 15% reduction to the funds that Giving Ground raised when we created our estimate.

After the 15% adjustment, our best guess is that over the next two years, Giving Ground will raise $2.27 for GiveWell for every $1 it spends: $800,000 x (1 - 0.15 [counterfactuality discount]) / $300,000 [operating expenses] = 2.27x.^[17] Our bar for effective giving interventions is a 2x ratio,^[18] so this clears our bar.
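
As a quick check on this ratio, the sketch below recomputes the multiplier and also asks how large the counterfactuality discount would have to be before the grant fell to the 2x bar. The variable names are illustrative, and the break-even step is an addition for intuition rather than part of the original BOTEC.

```python
funds_raised = 800_000           # raised for GiveWell's top charities
operating_expenses = 300_000     # Giving Ground's operating expenses
counterfactual_discount = 0.15   # share of donations assumed to have happened anyway
bar = 2.0                        # bar for effective giving interventions

# Money moved per dollar spent, after the counterfactuality discount.
multiplier = funds_raised * (1 - counterfactual_discount) / operating_expenses

# Largest discount at which the multiplier would still equal the bar.
break_even_discount = 1 - bar * operating_expenses / funds_raised

print(f"Multiplier: {multiplier:.2f}x")
print(f"Break-even counterfactuality discount: {break_even_discount:.0%}")
```

On these figures the multiplier is about 2.27x, and the discount would need to rise from 15% to 25% before the grant only just met the bar.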

Improving conditions for broiler chickens

This grant falls within our focus area of farm animal welfare. Note that the FAW team uses a different bar than the rest of our GHW focus areas.

Feather Forward runs national campaigns to convince corporations to improve their animal welfare policies. The campaigns focus on stopping the use of caged hens to produce eggs and on improving the welfare of chickens raised for meat (called “broiler chickens”).

Just like the grant above, this is a backward-looking BOTEC. When considering a new grant to Feather Forward, we modeled their impact under their previous grant as follows:

Costs: $1 million — Feather Forward’s total expenses across the period the grant covered (two years).
Animals impacted: We estimated that Feather Forward impacted 34 million animals alive at any time through a combination of:^[19]
- 13 new country-level cage-free commitments from companies.
- Two major global cage-free commitments to which Feather Forward contributed.
- Six country-level broiler welfare commitments, including five major retailers.
- Implementation progress of cage-free commitments in their country.
Credit adjustment: We gave Feather Forward the following credit for each set of wins:^[20]
- 80% of the credit for national cage-free and broiler wins: they were the major group behind those; the remaining 20% of credit goes to another organization in the country.
- 10% of the credit for global cage-free wins: the organization directed almost all of the campaigns in their part of the world, which comprised roughly 10% of wins globally.
Commitment/implementation effort adjustment: In most cases, after animal protection groups secure a corporate pledge, they need to get the pledging company to follow through on their commitment.
- For new commitments, we apply an adjustment (usually 40-60%) to account for the cost of subsequent work.
- When we credit groups for companies’ progress on implementing commitments, we similarly apply an adjustment to give credit for the work of securing the commitment in the first place.

Given these adjustments, we estimated that Feather Forward’s work improved the welfare of 5.4 million animals alive at any time.^[21] To determine the full value of this impact, we apply additional considerations that include:

How many years these campaigns accelerated adoption of reforms
The degree of welfare improvement per animal
The extent to which each chicken can suffer

After incorporating these factors, we calculated that this grant easily exceeded our internal bar for farm animal welfare interventions. While our FAW BOTECs incorporate different moral weights and welfare improvement assessments than our other focus areas, the core approach remains consistent.

^[1]
The ~2,000x bar means that for every dollar we spend, we now aim to create as much value as giving ~$2,000 to someone earning $50,000/year in the U.S. We value improvements in income using a logarithmic utility function, so that an x% improvement in income is valued at $50,000 * ln(1+x). For health improvements, we use $100,000 as the value of a disability-adjusted life year (DALY) averted; an intervention that averted 3 DALYs would be valued at $300,000. See here for more on these methods.
^[2]
Our fundamental unit of benefits is what we call the “Open Phil dollar” ($OP), which is equivalent to giving $1 to someone earning $50,000/year in the U.S. We value a DALY at $100,000 in Open Phil dollars ($OP 100,000).
^[3]
In rare cases, we model our uncertainty more explicitly, specifying distributions for key input parameters and using Monte Carlo simulations to model the distribution of possible outcomes. We sometimes use carlo.app for this.
^[4]
For example: funding projects in emerging fields that build critical infrastructure or develop expertise, where direct impacts may only follow years later and be hard to attribute to any project.
^[5]
Phase IIa/b trials assess a treatment’s efficacy and safety. Atorvastatin is a widely used cholesterol-lowering drug (i.e. a statin). TB is a leading global health concern that primarily affects the lungs.
^[6]
Phase IIc trials are late-stage exploratory trials that build on earlier Phase II results to further confirm dosing, safety, and efficacy in a larger or more specific patient population. They aim to reduce uncertainties before progressing to larger, definitive Phase III trials.
^[7]
Our framework for R&D grants is that we do not create technologies which would never have existed in the absence of our grant. Rather, we assume that our funding allows researchers to develop the technology earlier than they would have without our funding. The exact number of years we would speed up a desired outcome versus a counterfactual is highly debatable; most of our R&D BOTECs assume between 2 and 7 years.
^[8]
Our estimate of DALYs per death from TB takes into account the expected years of healthy life remaining for the average TB patient.
^[9]
Including one or more Phase III trials, guideline changes, research dissemination, and potential validation studies.
^[10]
To avoid overestimating SROI, we look at the all-in program costs rather than our own contribution. For this grant, you could think of $OP 54 billion as “the benefits our funding would generate if we put in $7 million.” If we put in ~4.5% of that funding, we should only “credit” ourselves with ~4.5% of the benefits.
^[11]
All organizations mentioned in this piece are pseudonymous to preserve confidentiality.
^[12]
For some of these points, we show the range of potential outcomes our investigator thought was plausible, as well as the figure they used in the actual calculation. So “5% (range: 5-20%)” implies that the plausible odds were thought to range from 5% to 20%, and the figure actually used was 5%.
^[13]
By “formal,” we mean a licensed manufacturer operating within the regulatory framework of the country. By “informal,” we mean those operating outside of it (and often selling at cheaper price points).
^[14]
$3,900 in Open Phil dollars per 1 ug/dL per birth is a value we use across many lead grants. It is calculated here and is based on effects of lead in a child’s blood on later life income and health.
^[15]
The other 90% of the credit goes to the regulators who implement the method, the advocates who push for its adoption, etc.
^[16]
There is also a second theory of change: the cheaper, easier method reduces measurement costs that would have otherwise been borne by either philanthropic funds or governments. We assume that potential impact from speeding up enforcement (as above) dwarfs the cost reduction factor, so we haven’t factored this second theory of change into our SROI.
^[17]
If a significant share of the funding had gone toward charities not recommended by GiveWell, we’d also have applied a “cost-effectiveness adjustment,” since we assume the average charity (even in the global health space) is somewhat less impactful than one of GiveWell’s top charities. For example, some charities might get a 50% adjustment, meaning we think each dollar to those charities has roughly the same impact as $0.50 going to a top charity. (Since all funding went to GiveWell top charities in this case, we didn’t apply any cost-effectiveness adjustment.)
^[18]
For giving opportunities that act as money multipliers, we typically look for a return of twice our funding (even though this conceptually is higher than our normal ROI bar since we think effective giving is >0.5 times as effective as our bar). The extent of this higher bar is subjective, but we believe a higher bar is warranted for a few reasons:
1. We think the way we model "meta" funding opportunities, which are more distant from impact, is more likely to overestimate cost-effectiveness by failing to account for additional efforts required by other actors to achieve impact. (If we give money directly to GiveWell, we’re responsible for 100% of that. If our grantee spends $1,000 and raises $2,000 for GiveWell, our $1,000 may be only partly responsible — for example, the donors reached by the grantee may also have been influenced by GiveWell’s own advertising, or by efforts from other actors in the effective giving world.)
2. Our model is likely to overestimate the impact from additional donations to “high-impact” charities because it assumes that any funding attributed to the grantee would otherwise not have gone to charity at all (rather than going to a less impactful charity).
3. Intuitively, we aren't excited about supporting opportunities that could spend an additional $1 to generate only slightly more than that for high-impact charities, even if all donations are counterfactual. Supporting these opportunities would mean that small errors in our calculations (or unforeseen expenses) could result in negative impact (e.g. spending $1 to raise $0.90), and potentially send us into a meta trap.
^[19]
We estimate that 34 million animals are alive at any time in the supply chains of the companies that have made commitments. In most cases, we lack data on companies’ animal product usage and the number of animals in companies’ supply chains. When individual company data is not available, we use data from a comparable company (adjusted for revenues or outlets) or use national and sector animal product usage and company sector market share (e.g. how many chickens are sold in retail, and the share of the retail market the company accounts for).
^[20]
Outside of advocacy groups, no other institutional actors are pushing corporations to adopt these commitments. So while it is relatively easy to attribute almost all of the credit to advocacy groups, crediting specific groups involves highly uncertain judgment calls.
^[21]
Because broiler chickens are typically slaughtered after less than two months, well over 5.4 million animals are affected over the course of a year.

Knight Lee @ 2025-05-30T07:51 (+6)

Can you do a back-of-the-envelope calculation, on the costs and benefits of doing more back-of-the-envelope calculations? E.g. getting a different person to independently replicate a back-of-the-envelope calculation, in order to average out errors and biases specific to one individual and make it more robust.

SummaryBot @ 2025-05-29T16:09 (+1)

Executive summary: Open Philanthropy explains how it uses back-of-the-envelope calculations (BOTECs) to estimate the cost-effectiveness of grants across focus areas like global health, lead exposure reduction, animal welfare, and effective giving, illustrating their approach through detailed examples and emphasizing both the utility and limitations of these rough but decision-critical models.

Key points:

BOTECs clarify expected impact by estimating a grant’s social return on investment (SROI), helping Open Phil determine whether a grant clears its cost-effectiveness threshold — currently ~2,000x in “Open Phil dollars” for Global Health and Wellbeing grants.
The models vary by grant type — DALYs averted for health, suffering reduced for animals, or funds raised for effective charities — and may be forward- or backward-looking depending on available data and theory of change.
BOTECs guide but don’t dictate decisions; qualitative factors like leadership, track record, and unusual upside are also considered, and multiple BOTEC versions test the robustness of conclusions across different scenarios.
Examples illustrate application and nuance: A tuberculosis R&D grant modeled to avert nearly 20,000 deaths annually showed a 3,000x SROI; a lead detection method grant had an expected 6,500x SROI; an effective giving org cleared a 2x bar for fundraising ROI; and a broiler welfare campaign surpassed the animal welfare team's separate bar.
Open Phil adjusts BOTECs over time as new information arises — for example, reassessing speedup timelines or success probabilities post-grant — and openly acknowledges uncertainties, estimation challenges, and speculative assumptions in modeling.
The post invites community feedback and aims to demystify Open Phil’s quantitative thinking, while signaling that BOTECs are one tool among many in a broader evaluative process.

This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.